Exploring Art Data 5

Let’s look at some institutional data. We can scrape the Tate Galleries attendance figures from here and make a csv file of them. The first few lines of attendance.csv look like this:

```
"Year","Tate Britain","Tate Modern","Tate Liverpool","Tate St Ives","BHM","Total"
2009,1595000,4788000,523000,219000,N/A,7125000
2008,1587655,4647881,1035958,203700,N/A,7475194
2007,1533217,5236702,694228,243993,N/A,7708140
2006,1597359,4895073,556976,193700,46220,7289328
2005,1729692,3902017,666258,180771,43502,6522240
```

Now we can load the data into R and start working with it:

```
## Read the csv file, N/A values and all, allowing spaces in column names
attendance<-read.csv("attendance.csv", na.strings="N/A", check.names=FALSE)
## Give the BHM a more descriptive name
names(attendance)[names(attendance) %in% c("BHM")]<-"Barbara Hepworth Museum"
## Get the years
years<-attendance[,1]
## Get the individual site counts (last column is total)
sites<-attendance[,2:(length(attendance) -1)]
```
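Before plotting, it is worth checking that the scraped figures are consistent. Here is a minimal sketch in Python (the language used for the conversion script later in this series) that checks whether the site counts in each row add up to the Total column. It only uses the five rows shown above; `parse_count` and `check_totals` are names made up for this example.

```python
import csv
import io

# The five rows shown above, copied from attendance.csv
ATTENDANCE_CSV = '''"Year","Tate Britain","Tate Modern","Tate Liverpool","Tate St Ives","BHM","Total"
2009,1595000,4788000,523000,219000,N/A,7125000
2008,1587655,4647881,1035958,203700,N/A,7475194
2007,1533217,5236702,694228,243993,N/A,7708140
2006,1597359,4895073,556976,193700,46220,7289328
2005,1729692,3902017,666258,180771,43502,6522240
'''

def parse_count(text):
    """Convert a count to an int, treating N/A as missing (None)."""
    return None if text == "N/A" else int(text)

def check_totals(csv_text):
    """Return {year: bool}, True where the site counts sum to Total."""
    results = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        year = int(row.pop("Year"))
        total = parse_count(row.pop("Total"))
        # Everything left in the row is an individual site
        sites = [parse_count(value) for value in row.values()]
        results[year] = sum(v for v in sites if v is not None) == total
    return results

totals_ok = check_totals(ATTENDANCE_CSV)
print(totals_ok)
```

Missing values are skipped rather than treated as zero attendance, which is the same distinction the `N/A` handling in the R code is there to preserve.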

We can draw a multiple line graph of the attendance figures:

```
## Create lists of line properties so we can use them in the graph and legend
line.types<-c("solid", "dashed", "dotted", "dotdash", "longdash", "twodash")
line.colours<-c("cyan", "blue", "purple", "red", "orange", "green")
## Suppress the y axis so we can draw one that doesn't use scientific notation
matplot(years, sites, type = "l", yaxt="n",
        xlab="Year",ylab="Attendance",
        col=line.colours, lty=line.types)
## Draw the y axis using full numbers rather than scientific notation
axis(2, axTicks(2), format(axTicks(2), scientific = FALSE))
## Add a key to the lines
legend("topleft", names(sites), col=line.colours, lty=line.types)
## Title the graph
title(main="Tate Galleries Attendance 1980-2010")
```

And we can use an area chart to show the combined attendance. It’s not the best way of examining information, but in this case it shows how the attendance figures stack up, literally:

```
## Import the ggplot2 library so we can use ggplot
library("ggplot2")
## To get an area plot, we need to flatten the data to year/museum/attendance
attendance.expanded<-data.frame(Year=rep(years, ncol(sites)),
                                Museum=rep(names(sites), each=length(years)),
                                Attendance=unlist(sapply(names(sites),
                                  function(col) {sites[col]}, simplify=TRUE)))
## We use the levels of the Museum factor to order the areas and legend labels
## We do this by calculating the range of attendance at each museum and ordering
## the factor names based on that
attendance.expanded$Museum<-
  factor(attendance.expanded$Museum,
         levels=names(sites)[order(sapply(names(sites),
           function(x){max(sites[x], na.rm=TRUE) -
                       min(sites[x], na.rm=TRUE)}))])
## A utility function to format numbers in English non-scientific format
nonscientific<-function(x, ...)
  format(x, big.mark = ',', scientific = FALSE, ...)
## Plot the areas
ggplot(attendance.expanded, aes(x=Year, y=Attendance)) +
  geom_area(aes(fill=Museum)) +
  ## Label the y axis with full numbers rather than scientific notation
  scale_y_continuous(labels=nonscientific) +
  ## Specifying the breaks orders the legend properly
  scale_fill_brewer(name="Site", palette=2,
                    breaks=rev(levels(attendance.expanded$Museum))) +
  ## Set a nice title
  ggtitle("Tate Galleries Attendance 1980-2010")
```
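The flattening step is the least obvious part of the code above, so here is a rough sketch of the same reshaping in plain Python. It turns a "wide" table (one column per museum) into one year/museum/attendance row per cell. The figures are two years from the csv shown earlier; the `flatten` function and the dictionary layout are assumptions made for this example.

```python
# Two years of the attendance figures shown earlier, in "wide" form
wide = {
    "Year": [2009, 2008],
    "Tate Britain": [1595000, 1587655],
    "Tate Modern": [4788000, 4647881],
}

def flatten(wide_table, id_column="Year"):
    """Return one (year, museum, attendance) tuple per cell of the wide table."""
    years = wide_table[id_column]
    rows = []
    for museum, counts in wide_table.items():
        if museum == id_column:
            continue
        # All the years for one museum, then all the years for the next
        for year, count in zip(years, counts):
            rows.append((year, museum, count))
    return rows

long_rows = flatten(wide)
for row in long_rows:
    print(row)
```

The rows come out museum by museum, the same order that `rep(years, ncol(sites))` and `rep(names(sites), each=length(years))` produce in the R version.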

Art Magazines, Journals and Catalogues at archive.org

Scans of old (19th and early 20th century) art magazines, journals, and catalogues can be found on archive.org along with text extracted from them. These are a very useful resource for study of the history of art.

Be wary of later editions as these may only be out of copyright in the US.

The Yellow Book

The Magazine Of Art

The Illustrated Magazine Of Art

The Burlington Magazine

ArtNews Annual

Art In America

Studio International

Special Numbers 1897-8

The Print Connoisseur

Art Prices Current

Various Exhibition Catalogues

The Armory Show Catalogue

If anyone can suggest other items in the archive, names to search for, more avant-garde publications, or other kinds of periodicals that might have information relevant to art (particularly show listings, sale information) let me know in the comments!

Art Freedom Of Information Requests

WhatDoTheyKnow is an excellent website that allows you to make, check on, and search Freedom of Information (FoI) requests in the UK.

Some of those FoI requests concern art.

Art organizations:

http://www.whatdotheyknow.com/search/art/bodies

The National Gallery:

http://www.whatdotheyknow.com/body/the_national_gallery

The NPG:

http://www.whatdotheyknow.com/body/the_national_portrait_gallery

And of course The Tate:

http://www.whatdotheyknow.com/body/the_tate_gallery

It’s interesting to see not just the answers but what kinds of things people are asking which organizations about (and whether those organizations are answering).

Exploring Art Data 4

Let’s draw some more graphs.

Here’s the matrix of form and genre rendered graphically:

```
## Load the tab separated values for the table of artworks
## Get rows with both genre and form
## This loses most of the data :-/
art<-artwork[artwork$art_genre != "" & artwork$art_form != "",
             c("art_genre", "art_form")]
## Drop unused factors
art$art_genre<-as.factor(as.character(art$art_genre))
art$art_form<-as.factor(as.character(art$art_form))
## Get table
art.table<-table(art) ##as.table(ftable(art))
## Strip rows and columns where the total < tolerance
tolerance<-3
art.table.cropped<-art.table[rowSums(art.table) >= tolerance,
                             colSums(art.table) >= tolerance]
## Print levelplot
## Levelplot is in the "lattice" library
library("lattice")
## Rotate x labels, and set colour scale to white/blue to improve readability
levelplot(art.table.cropped, xlab="Genre", ylab="Form",
          scales=list(x=list(rot=90)),
          col.regions=colorRampPalette(c("white", "blue")))
```

The highest frequencies leap out of the graph. We should do a version without painting to look for subtleties in the rest of the data.
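As a rough sketch of what a version without painting would involve, here is the filtering step in Python. The counts are a handful of rows copied from the form/genre table printed in "Exploring Art History Data 2", not the full dataset, so the column totals differ from the real ones. `drop_form` and `crop` are names invented for this example.

```python
# A few rows copied from the printed form/genre table (zero cells omitted)
counts = {
    "Abstract art": {"Drawing": 2, "Installation art": 6, "Painting": 36, "Sculpture": 5},
    "Decorative art": {"Metalworking": 6, "Relief": 3, "Tapestry": 4},
    "History painting": {"Fresco": 10, "Painting": 207},
    "Monument": {"Sculpture": 8},
    "Portrait": {"Drawing": 2, "Fresco": 1, "Painting": 230, "Photography": 5},
    "Still life": {"Painting": 35},
}

def drop_form(table, form):
    """Return a copy of the table without the given art form column."""
    return {genre: {f: n for f, n in forms.items() if f != form}
            for genre, forms in table.items()}

def crop(table, tolerance):
    """Keep genres and forms whose totals reach the tolerance."""
    column_totals = {}
    for forms in table.values():
        for form, n in forms.items():
            column_totals[form] = column_totals.get(form, 0) + n
    cropped = {}
    for genre, forms in table.items():
        if sum(forms.values()) >= tolerance:
            cropped[genre] = {f: n for f, n in forms.items()
                              if column_totals[f] >= tolerance}
    return cropped

without_painting = crop(drop_form(counts, "Painting"), tolerance=3)
for genre, forms in without_painting.items():
    print(genre, forms)
```

Genres that only ever appear as paintings, such as still life in this sample, drop out entirely once the painting column is removed.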

And here are some of the basic frequencies from the data:

```
## Load the tab separated values for the table of artworks
## Function to plot a summary of the most frequent values
topValuePlot<-function(values, numValues){
  ## Get a count of the number of times each value name appears in the list
  values.summary<-summary(values)
  ## Draw a graph, allowing enough room for the rotated labels
  par(mar=c(10,4,1,1))
  barplot(values.summary[1:numValues], las=2)
}
## Artists
topValuePlot(artwork$artist[artwork$artist != ""], 20)
## Subject
topValuePlot(artwork$art_subject[artwork$art_subject != ""], 20)
```
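For comparison, the counting half of `topValuePlot` can be sketched in Python with `collections.Counter`. The sample values below are hypothetical placeholders, not entries from the Freebase data, and `top_values` is a name made up for this example.

```python
from collections import Counter

def top_values(values, num_values):
    """Return the num_values most frequent non-empty values with their counts."""
    counts = Counter(value for value in values if value != "")
    return counts.most_common(num_values)

# Hypothetical placeholder values, not taken from the Freebase data
sample = ["Artist A", "Artist B", "Artist A", "", "Artist C", "Artist A", "Artist B"]
top_two = top_values(sample, 2)
print(top_two)
```

Unlike the R version, the empty values are filtered inside the function, so the caller does not need to subset the column first.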

The dataset is clearly dominated by Western art.

Exploring Art Data 3

Let’s look at how much the “Grants For The Arts” programme of Arts Council England (ACE) gives to each region.

First of all we’ll need the data. That’s available from data.gov.uk under the new CC-BY compatible Crown Copyright here. It’s in XLS format, which R doesn’t load on GNU/Linux, but we can convert that to comma-separated values using OpenOffice.org Calc.

Next we’ll need a map to plot the data on. Ideally we’d use a Shapefile of the English regions, which R would be able to load and render easily, but there isn’t a freely available one. There’s a public domain SVG map of the English regions here, but R doesn’t load SVG either. We can convert the SVG to a table of co-ordinates that we can plot from R using a Python script:

```
#!/usr/bin/python
from BeautifulSoup import BeautifulStoneSoup
import re
# We know that the file consists of a single top-level g
# containing a flat list of path elements.
# Each path consists of subpaths only using M/L/z
# So use this knowledge to extract the polylines
# Convert svg class names to gfta region names
names = {"east-midlands":"East Midlands", "east-england":"East of England",
         "london":"London", "north-east":"North East",
         "north-west":"North West", "south-east":"South East",
         "south-west":"South West", "west-midlands":"West Midlands",
         "yorkshire-and-humber":"Yorkshire and The Humber"}
svg = open("map/England_Regions_-_Blank.svg")
soup = BeautifulStoneSoup(svg)
# Get the canvas size, to use for flipping the y co-ordinate
height = float(soup.svg["height"])
# Get the containing g
g = soup.find("g")
# Get the translate in the transform
transform = re.match(r"translate\((.+), (.+)\)", g["transform"])
transform_x = float(transform.group(1))
transform_y = float(transform.group(2))
# Get the paths in the g
paths = g.findAll("path")
print("region,subpath,x,y")
for path in paths:
    # Get the id and convert to region name
    region_name = names[path["id"]]
    # Get the path data to process
    path_d = path["d"]
    # Split around M commands to get subpaths
    path_d_subpaths = path_d.split("M")
    # Keep a count of the subpaths within the id so we can identify them
    subpath_count = 0
    for subpath in path_d_subpaths:
        # The split will result in a leading empty string
        if subpath == "":
            continue
        subpath_count = subpath_count + 1
        # Split around the L commands to get a list of points
        # The first M point already has its command letter removed
        points = subpath.split("L")
        for point in points:
            # Remove trailing z if present
            cleaned_point = point.split()[0]
            # Split out the point components and translate them
            (x, y) = cleaned_point.split(",")
            transformed_x = float(x) + transform_x
            flipped_y = height + (height - float(y))
            transformed_y = flipped_y + transform_y
            # Write a line in the csv
            print("%s,%s,%s,%s" % (region_name, subpath_count, transformed_x,
                                   transformed_y))
```
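The path splitting is the core of that script, and it can be tried out without the SVG file or BeautifulSoup. Here is a small standalone sketch, assuming (as the script does) that the path data uses only M, L and z commands with comma-separated co-ordinates. The path string is a made-up example, not taken from the map, and `parse_path` is a name invented for this sketch.

```python
def parse_path(path_d):
    """Split SVG path data made only of M/L/z commands into lists of (x, y)."""
    subpaths = []
    for subpath in path_d.split("M"):
        # The split leaves an empty string before the first M
        if subpath.strip() == "":
            continue
        points = []
        for point in subpath.split("L"):
            # Drop a trailing z (close path) and any surrounding whitespace
            cleaned = point.replace("z", "").strip()
            x, y = cleaned.split(",")
            points.append((float(x), float(y)))
        subpaths.append(points)
    return subpaths

# Hypothetical path data: two closed triangles
example_d = "M 0,0 L 10,0 L 10,10 z M 20,20 L 30,20 L 30,30 z"
parsed = parse_path(example_d)
for number, points in enumerate(parsed, start=1):
    print(number, points)
```

This leaves out the translation and the y flip, which depend on the canvas height and transform of the particular SVG file.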

Now we can load the grants data and the map into R, calculate the total value of grants for each region, and colour each region of the map accordingly.
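The totalling and colouring steps can be sketched in a few lines of Python first. The award rows below are hypothetical placeholders rather than real grant figures. The 4,000,000 and 12,000,000 bounds and the 0.15 base level are the values used in the R code that follows. The clamping is an extra safeguard added for this sketch, and the function names are made up.

```python
def parse_amount(text):
    """Convert an amount such as "5,000,000" to an int."""
    return int(text.replace(",", ""))

def region_totals(awards):
    """Sum the award amounts for each region."""
    totals = {}
    for region, amount in awards:
        totals[region] = totals.get(region, 0) + parse_amount(amount)
    return totals

def red_shade(total, value_min, value_max, colour_base=0.15):
    """Map a total onto a red level between colour_base and 1.0, as a hex colour."""
    multiplier = (1.0 - colour_base) / (value_max - value_min)
    level = colour_base + (total - value_min) * multiplier
    # Clamp so that totals outside the range still give a valid colour
    level = min(1.0, max(0.0, level))
    return "#%02X0000" % round(level * 255)

# Hypothetical award rows, not real Grants For The Arts figures
awards = [("London", "5,000,000"), ("London", "7,000,000"),
          ("North East", "4,000,000")]
totals = region_totals(awards)
shades = {region: red_shade(total, 4000000, 12000000)
          for region, total in totals.items()}
print(totals)
print(shades)
```

A region at the bottom of the range gets the darkest red and one at the top gets full red, so brighter regions on the map received more money in total.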

Here’s the R code:

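```
## The data used to plot a map of the English regions
## (the file name here is an assumption: the csv written by the Python script)
england<-read.csv("england.csv",
                  colClasses=c("factor", "integer", "numeric", "numeric"))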
## Plot the English regions in the given colours
## See levels(england$region) for the region names
## colours is a list of region="#FF00FF" colours for regions
## range.min and range.max are for the key values
## main.title is the main label for the plot
## key.title is the title for the key
plotEnglandRegions<-function(colours, range.min, range.max, main.title,
                             key.title){
  plot.new()
  ## Reasonable values for the window size
  plot.window(c(0, 600),
              c(0, 600))
  ## For each region name
  lapply(levels(england$region),
         function(region){
           if (region %in% levels(england$region)){
             ## For each subpath of each region
             lapply(1:max(england$subpath[england$region == region]),
                    function(subpath){
                      ## Get the points of that subpath
                      subpath.points<-england[england$region == region &
                                              england$subpath == subpath,]
                      ## And colour it the region's colour
                      polygon(subpath.points$x, subpath.points$y,
                              col=colours[[region]])
                    })
           }
         })
  ## Colour Scale
  ## Turn off scientific notation (for less than 10 digits)
  options(scipen=10)
  ## Sort the colours so they match the values
  colours.sorted<-sort(colours)
  ## The by is set to fit the number of colours and the value range
  legend("topright", legend=seq(from=range.min, to=range.max,
                       by=((range.max - range.min) / (length(colours) - 1))),
         fill=colours.sorted,
         title=key.title)
  title(main.title)
}
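## Load the region award data
## (the file name here is an assumption: the csv exported from the XLS file)
region<-read.csv("grants-for-the-arts.csv",
                 colClasses=c("integer", "character", "character", "character",
                              "character", "factor", "factor", "factor",
                              "factor", "factor"))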
## region$Award.amount contains commas
region$Award.amount<-gsub(",", "", region$Award.amount)
## And we want it as a number
region$Award.amount<-as.integer(region$Award.amount)
## Get the totals by region
region.totals<-tapply(region$Award.amount, list(region$Region), sum)
## But we don't want the "Other" region
region.totals<-region.totals[names(region.totals) != "Other"]
## Calculate the range of colours
## The highest value, to the nearest highest million
value.max<-12000000
## The minimum value, to the nearest lowest million
value.min<-4000000
## The darkest colour (in a range of 0.0 to 1.0)
colour.base<-0.15
## How to get the range of colours between that and 1.0
colour.multiplier<-(1.0 - colour.base) / (value.max - value.min)
## Make the colour levels
levels<-sapply(region.totals,
               function(i){
                 colour.base + (i - value.min) * colour.multiplier})
colours<-rgb(levels, 0, 0)
## Add the region names to the colours
names(colours)<-names(region.totals)
## Plot each region in the given colour
plotEnglandRegions(colours, value.min, value.max, "Grants For The Arts 2009/10",
                   "Total awards in £")
```

And here’s the resulting map:

Who can point out the methodological flaw in this visualisation? 😉

Exploring Art History Data 2

Let’s see how art form and genre relate in the Freebase “Visual Art” dataset of artworks.

```
# read the artwork data
# Get rows with both genre and form
# This loses most of the data :-/
art<-artwork[artwork$art_genre != "" & artwork$art_form != "", c("art_genre", "art_form")]
# Drop unused factors
art$art_genre<-as.factor(as.character(art$art_genre))
art$art_form<-as.factor(as.character(art$art_form))
# Get table
art.table<-table(art) ##as.table(ftable(art))
# Strip rows and columns where the total < tolerance
tolerance<-3
art.table.cropped<-art.table[rowSums(art.table) >= tolerance,colSums(art.table) >= tolerance]
# Print wide table (make sure you resize your terminal window)
options(width=240)
print.table(art.table.cropped)
```

```                                  art_form
art_genre                          Drawing Fresco Installation art Metalworking Painting Photography Relief Sculpture Tapestry
Abstract art                           2      0                6            0       36           0      0         5        0
Allegory                               0      0                0            0        7           0      0         0        0
Animal Painting                        0      0                0            0       14           0      0         0        0
Christian art                          0      0                0            0        1           0      0         1        0
Christian art,History painting         0      0                0            0        2           0      0         0        0
Decorative art                         0      0                0            6        0           0      3         0        4
Fantastic art                          0      0                0            0        4           0      0         0        0
Genre painting                         0      0                0            0      120           0      0         0        0
Genre painting,Landscape art           0      0                0            0        4           0      0         0        0
History painting                       0     10                0            0      207           0      0         0        0
History painting,Landscape art         0      0                0            0        3           0      0         0        0
History painting,Religious image       0      0                0            0        3           0      0         0        0
Landscape art                          0      0                0            0      169           1      0         0        0
Landscape art,Genre painting           0      0                0            0        7           0      0         0        0
Landscape art,Marine art               0      0                0            0        3           0      0         0        0
Marine art                             0      0                0            0       34           1      0         0        0
Marine art,History painting            0      0                0            0        4           0      0         0        0
Marine art,Landscape art               0      0                0            0        3           0      0         0        0
Monument                               0      0                0            0        0           0      0         8        0
Portrait                               2      1                0            0      230           5      0         0        0
Religious image                        0      0                0            0        4           0      0         0        0
Religious image,History painting       0      0                0            0        4           0      0         0        0
Still life                             0      0                0            0       35           0      0         0        0
```

This time painting, rather than photography, has suspiciously more entries than any other medium, because more paintings than works in any other medium have genre information in the dataset.

Found Art Criticism

I present Found Art Criticism:

“Kanye West’s intermittent tweets about art always make my day, so you can imagine my joy when I saw these tweets pop up in my feed”

Here’s an example of Kanye West’s art-related tweets from the post:

“Like yo this Mark Rothko is the shit! You see it works. This is a break through people. I now know how to communicate art! YES!!!!”

Notice how this has more critical content and social context than your average self-identified art criticism or theory blog post. Now we just need to form a Surf Club to nominate found art criticism and theory texts. All conscious critical and theoretical activity on the internet will be rendered irrelevant.

We could even do it as art, which would problematise it and make it resistant to simply being meta-nominated. Chris Anderson may have been wrong about Google replacing the scientific method, but if it can replace art for the cultural management (and God knows they’ve been trying to make it do so) then it can replace criticism and theory (and curation) as well.