Exploring Art Data 6

Let’s access an API and start analysing images.

We’ll use R to get information about a series of works (Monet’s “Haystacks”), and images of them, from Freebase.

In order to do this we’ll need to install some new libraries:

source("http://bioconductor.org/biocLite.R")
biocLite("EBImage")
## RCurl provides getURL() and curlEscape(), which the code below uses
install.packages(c("RJSONIO", "RCurl"))

Then load the libraries:

library(EBImage)
library(RJSONIO)
library(RCurl)

And patch RJSONIO so that an empty list() is serialised as [], which Freebase queries use to ask for a list of values:

## Monkeypatch RJSONIO so list() -> []

oldListMethod<-getMethod("toJSON", "list")
setMethod("toJSON", "list", function(x, ...){
    if(length(x) == 0){
        return("[]")
    } else {
        return(oldListMethod(x, ...))
    }
})
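
The same wrap-and-delegate pattern can be tried without RJSONIO installed. The sketch below uses a toy S4 generic called describe, which is a hypothetical stand-in for toJSON and not part of any package: it saves the existing method for "list", then installs a replacement that special-cases empty lists and otherwise calls the saved method.

```r
library(methods)

## A toy generic standing in for toJSON (hypothetical, for illustration only)
setGeneric("describe", function(x, ...) standardGeneric("describe"))
setMethod("describe", "list", function(x, ...) {
    paste0("list of ", length(x))
})

## Keep a reference to the original method before replacing it
oldDescribe <- getMethod("describe", "list")

## Replace the method: special-case empty lists, otherwise delegate
setMethod("describe", "list", function(x, ...) {
    if (length(x) == 0) {
        "[]"
    } else {
        oldDescribe(x, ...)
    }
})

emptyResult <- describe(list())
fullResult <- describe(list(1, 2))
```

Saving the old method first matters: once setMethod has run, getMethod would return the replacement, and the wrapper would call itself.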

We can then write code to access the Freebase web API:

## Query the freebase API, taking and returning R objects
queryFreebase<-function(query){
    wrappedQuery<-list(query=query)
    queryJSON<-toJSON(wrappedQuery)
    response<-getURL(paste('http://api.freebase.com/api/service/mqlread?query=',
                           curlEscape(queryJSON), sep=""))
    responseJSON<-fromJSON(response)
    stopifnot(responseJSON$status == "200 OK")
    responseJSON$result
}

## Get the series description and list of works from freebase
getSeries<-function(series_name){
    query<-list(name=series_name,
                type="/visual_art/art_series",
                artworks=list())
    queryFreebase(query)
}

## Get the artwork description from freebase
getArtwork<-function(artwork_name){
    query<-list(name=artwork_name,
                type="/visual_art/artwork",
                "*"=NULL)
    queryFreebase(query)
}

## Get the image description from freebase
getImage<-function(entity_id){
    query<-list(id=entity_id,
                "/common/topic/image"=list(id=NULL),
                "*"=NULL)
    queryFreebase(query)
}

## The maximum height or width of a thumbnail
thumbSize<-100

## Use the freebase thumbnail to try and get a thumbnail for the image
## Returns NULL if image couldn't be found
getThumbnail<-function(image, thumbSize){
    # On fail, redirect to a url that's guaranteed not to be an image,
    # we use the api root here
    # Use http as EBImage's use of curl doesn't like https
    url<-paste('http://api.freebase.com/api/trans/image_thumb',
               image[[1]]$id, '?maxwidth=', thumbSize, '&maxheight=',
               thumbSize, '&mode=fit&onfail=/', sep="")
    # readImage may raise an error on a non-image, so turn that into NULL
    tryCatch(readImage(url), error=function(e){NULL})
}
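
To see what queryFreebase actually sends, here is a minimal sketch that builds the request URL for the series query using base R only. The JSON string is written out by hand as an approximation of what toJSON would produce, and URLencode stands in for RCurl's curlEscape; both substitutions are assumptions made so the example has no dependencies. The block only builds the URL and does not send a request.

```r
## The query from getSeries(), wrapped in the "query" envelope and written
## out by hand as JSON (an approximation of what toJSON would produce)
queryJSON <- '{"query":{"name":"Haystacks","type":"/visual_art/art_series","artworks":[]}}'

## Percent-encode the JSON so it can travel in a query string;
## base R's URLencode stands in for RCurl's curlEscape here
encodedQuery <- URLencode(queryJSON, reserved = TRUE)
requestURL <- paste('http://api.freebase.com/api/service/mqlread?query=',
                    encodedQuery, sep = "")

## Decoding the query string recovers the original JSON
decodedQuery <- URLdecode(sub("^[^=]*=", "", requestURL))
```

Note the empty artworks array in the query: this is the [] that the RJSONIO patch exists to produce.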

We can fetch data about Monet’s “Haystacks”, and images where those are available:

## Fetch the series entry
series<-getSeries("Haystacks")
## Fetch the entries for individual artworks in the series
artworks<-lapply(series$artworks, getArtwork)
## Get the names of the retrieved artwork data in order
artworksNames<-lapply(artworks, function(artwork){artwork[["name"]]})
## Get the image resource information for the artworks
artworksImages<-lapply(artworks, function(artwork){getImage(artwork[["id"]])})
## Fetch a thumbnail bitmap where available, and clear out NULLs
## (lapply always returns a list, whereas sapply may try to simplify it)
artworksThumbnails<-lapply(artworksImages,
                           function(image){getThumbnail(image, thumbSize)})
names(artworksThumbnails)<-artworksNames
artworksThumbnails<-Filter(Negate(is.null), artworksThumbnails)
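
The clean-up step at the end can be tried on its own. This sketch uses plain matrices as stand-ins for thumbnails and the placeholder names A, B and C, all of which are assumptions for illustration; it shows that naming the list before filtering keeps each name attached to the right element.

```r
## Stand-in results: two successful "thumbnails" and one failed fetch
## (plain matrices are used here instead of EBImage objects)
thumbs <- list(matrix(0.2, 2, 2), NULL, matrix(0.8, 2, 2))
names(thumbs) <- c("A", "B", "C")

## Negate(is.null) builds a predicate that is TRUE for non-NULL values;
## Filter keeps only the elements for which it is TRUE, names included
thumbs <- Filter(Negate(is.null), thumbs)

keptNames <- names(thumbs)
```

If the names were assigned after filtering instead, the list would be shorter than the vector of names and the labels would no longer line up with their images.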

Having fetched the images, we can convert them to greyscale and produce a box plot of their brightness:

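## Convert each thumbnail to a vector of greyscale pixel values
## (one way to do it, using EBImage's channel())
grayscaleArtworksThumbnails<-lapply(artworksThumbnails,
    function(image){as.vector(channel(image, "gray"))})
## Draw a box plot of the brightness, allowing enough room for rotated labels
par(mar=c(20,4,1,1))
boxplot(grayscaleArtworksThumbnails, las=2)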

Which looks like this:

[Figure: haystacks_boxplots.gif]

It’s interesting to compare the brightness ranges of the paintings, and to see the outliers.
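
The numbers behind such a plot can also be inspected directly. The sketch below is a self-contained illustration using two synthetic RGB arrays in place of real thumbnails. The greyscale conversion is a weighted sum with Rec. 601 luma weights, which is an assumption: EBImage's own conversion may weight the channels differently.

```r
## Two stand-in RGB "thumbnails" as height x width x 3 arrays, values in [0, 1]
set.seed(1)
thumbs <- list(dark  = array(runif(10 * 10 * 3, 0.0, 0.5), c(10, 10, 3)),
               light = array(runif(10 * 10 * 3, 0.5, 1.0), c(10, 10, 3)))

## Convert to greyscale with a weighted sum of the three channels
## (Rec. 601 luma weights, an assumption; EBImage may weight differently)
toGrey <- function(rgb) {
    as.vector(0.299 * rgb[, , 1] + 0.587 * rgb[, , 2] + 0.114 * rgb[, , 3])
}
greyThumbs <- lapply(thumbs, toGrey)

## boxplot() returns the statistics it draws; plot = FALSE skips the drawing
stats <- boxplot(greyThumbs, plot = FALSE)

## Rows of stats$stats: lower whisker, lower hinge, median,
## upper hinge, upper whisker; one column per image
medians <- stats$stats[3, ]

## Values that would be drawn as individual outlier points
outliers <- stats$out
```

Working from the returned statistics makes it possible to compare brightness ranges numerically, for example by sorting the images by median.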