Let’s access an API and start analysing images.
We’ll use R to get information about a series of works (Monet’s “Haystacks”), and images of them, from Freebase.
In order to do this we’ll need to install some new libraries:
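source("http://bioconductor.org/biocLite.R")
biocLite("EBImage")
install.packages("RJSONIO")
## RCurl provides getURL() and curlEscape(), which the API code below calls
install.packages("RCurl")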
Then load the libraries:
library(EBImage)
library(RJSONIO)
library(RCurl)
And patch one of them to work with Freebase:
## Monkeypatch RJSONIO so list() -> []
oldListMethod<-getMethod("toJSON", "list")
setMethod("toJSON", "list", function(x, ...){
  if(length(x) == 0){
    return("[]")
  } else {
    return(oldListMethod(x, ...))
  }
})
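The patch follows a general pattern: keep a reference to the existing S4 method, then register a replacement that handles one special case and delegates everything else to the original. The sketch below shows that pattern in isolation using only base R. The generic `toyJSON` and its deliberately crude list method are hypothetical stand-ins, not RJSONIO's real serialiser.

```r
library(methods)

# Hypothetical generic standing in for RJSONIO's toJSON()
setGeneric("toyJSON", function(x, ...) standardGeneric("toyJSON"))

# A crude "original" list method: emits only the element names in braces
setMethod("toyJSON", "list", function(x, ...) {
  paste("{", paste(names(x), collapse = ","), "}", sep = "")
})

# Keep a reference to the original method before overriding it
oldToyMethod <- getMethod("toyJSON", "list")

# The replacement handles the empty list itself and delegates the rest
setMethod("toyJSON", "list", function(x, ...) {
  if (length(x) == 0) {
    "[]"
  } else {
    oldToyMethod(x, ...)
  }
})

emptyResult <- toyJSON(list())       # handled by the new special case
namedResult <- toyJSON(list(a = 1))  # falls through to the original method
print(c(emptyResult, namedResult))
```

Because the saved method object is an ordinary function, calling it from inside the replacement does not recurse back into the new method.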
We can then write code to access the Freebase web API:
## Query the freebase API, taking and returning R objects
queryFreebase<-function(query){
  wrappedQuery<-list(query=query)
  queryJSON<-toJSON(wrappedQuery)
  response<-getURL(paste('http://api.freebase.com/api/service/mqlread?query=',
                         curlEscape(queryJSON), sep=""))
  responseJSON<-fromJSON(response)
  stopifnot(responseJSON$status == "200 OK")
  responseJSON$result
}

## Get the series description and list of works from freebase
getSeries<-function(series_name){
  query<-list(name=series_name, type="/visual_art/art_series", artworks=list())
  queryFreebase(query)
}

## Get the artwork description from freebase
getArtwork<-function(artwork_name){
  query<-list(name=artwork_name, type="/visual_art/artwork", "*"=NULL)
  queryFreebase(query)
}

## Get the image description from freebase
getImage<-function(entity_id){
  query<-list(id=entity_id, "/common/topic/image"=list(id=NULL), "*"=NULL)
  queryFreebase(query)
}

## The maximum height or width of a thumbnail
thumbSize<-100

## Use the freebase thumbnail to try and get a thumbnail for the image
## Returns NULL if image couldn't be found
getThumbnail<-function(image, thumbSize){
  # On fail, redirect to a url that's guaranteed not to be an image,
  # we use the api root here
  # Use http as EBImage's use of curl doesn't like https
  url<-paste('http://api.freebase.com/api/trans/image_thumb', image[[1]]$id,
             '?maxwidth=', thumbSize, '&maxheight=', thumbSize,
             '&mode=fit&onfail=/', sep="")
  readImage(url)
}
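To make the request format concrete, here is a minimal sketch of the URL that `queryFreebase` would build for the series query. It uses only base R and makes no network call. The JSON string is written by hand as an approximation of what `toJSON` produces (RJSONIO's actual whitespace may differ), and `URLencode` is a stand-in for RCurl's `curlEscape`. Note the `[]` for `artworks`: in MQL an empty array asks for a list of values, which is why an empty R list has to serialise that way.

```r
# Hand-written approximation of the MQL envelope for getSeries("Haystacks")
queryJSON <- '{"query":{"name":"Haystacks","type":"/visual_art/art_series","artworks":[]}}'

# URLencode(reserved = TRUE) is a base-R stand-in for RCurl's curlEscape()
requestURL <- paste("http://api.freebase.com/api/service/mqlread?query=",
                    URLencode(queryJSON, reserved = TRUE), sep = "")
print(requestURL)

# Decoding the query parameter recovers the original JSON
decodedQuery <- URLdecode(sub("^.*query=", "", requestURL))
print(decodedQuery)
```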
We can fetch data about Monet’s “Haystacks”, and images where those are available:
## Fetch the series entry
series<-getSeries("Haystacks")
## Fetch the entries for individual artworks in the series
artworks<-lapply(series$artworks, getArtwork)
## Get the names of the retrieved artwork data in order
artworksNames<-lapply(artworks, function(artwork){artwork[["name"]]})
## Get the image resource information for the artworks
artworksImages<-lapply(artworks, function(artwork){getImage(artwork[["id"]])})
## Fetch a thumbnail bitmap where available, and clear out NULLs
artworksThumbnails<-sapply(artworksImages, function(image){getThumbnail(image, thumbSize)})
names(artworksThumbnails)<-artworksNames
artworksThumbnails<-Filter(Negate(is.null), artworksThumbnails)
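The listing names the thumbnails before filtering, so each surviving image keeps the right title once the NULLs are removed. The sketch below shows that pattern with base R only. Whether `readImage` returns NULL or raises an error for a missing image may depend on the EBImage version, so the `tryCatch` wrapper here is an assumption, and `loadThumb` is a hypothetical loader with placeholder pixel data.

```r
# Hypothetical loader: fails for some inputs, like a thumbnail that does not exist
loadThumb <- function(id) {
  if (id == "missing") stop("no image for ", id)
  matrix(0.5, nrow = 2, ncol = 2)  # placeholder pixel data
}

# Turn any error into NULL so one bad image does not abort the whole batch
safeLoad <- function(id) tryCatch(loadThumb(id), error = function(e) NULL)

ids <- c("stack-a", "missing", "stack-b")
thumbs <- lapply(ids, safeLoad)

# Name the results first, so names stay aligned with their images
names(thumbs) <- ids

# Drop the NULL entries, keeping the names of the ones that loaded
thumbs <- Filter(Negate(is.null), thumbs)
print(names(thumbs))
```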
Having fetched the images, we can convert them to greyscale and produce a box plot of their brightness:
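## Convert the thumbnails to greyscale
## (assumed step, missing from the listing; channel() is EBImage's colour mode converter)
grayscaleArtworksThumbnails<-lapply(artworksThumbnails, function(image){channel(image, "gray")})
## Draw a box plot of the brightness, allowing enough room for rotated labels
par(mar=c(20,4,1,1))
boxplot(grayscaleArtworksThumbnails, las=2)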
Which looks like this:
It’s interesting to compare the brightness ranges of the paintings, and to see the outliers.
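For reading the outliers, it helps to know how R decides what counts as one. By default, `boxplot` marks any value more than 1.5 times the hinge spread beyond a hinge as an outlier, and `boxplot.stats` exposes the same calculation. The sketch below applies it to a made-up vector of brightness values between 0 and 1, standing in for the pixels of a single greyscale thumbnail.

```r
# Synthetic brightness values standing in for one greyscale thumbnail
brightness <- c(0.40, 0.42, 0.45, 0.47, 0.50, 0.52, 0.55, 0.95)

stats <- boxplot.stats(brightness)

# Lower whisker, lower hinge, median, upper hinge, upper whisker
print(stats$stats)

# Values drawn as separate outlier points: here only the 0.95 pixel
print(stats$out)
```

The upper whisker stops at 0.55, the largest value that is not an outlier, rather than at the maximum.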