Categories: Aesthetics, Art History, Art Open Data

Art Data Analysis: Emily Vanderpoel

(Via Ptak Science Blog, which has more information)
cu31924008632964_0322.jpg

“Color problems : a practical manual for the lay student of color” (1902) – Emily Noyes Vanderpoel – Download from archive.org

Vanderpoel’s colour proportion analyses look like colour quantized and re-ordered low-resolution image pixels. They are a useful historical precedent and visual model for computational analysis of images.
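
As a rough sketch of the idea (an illustration, not Vanderpoel’s own method), an image’s pixels can be quantized to a small palette with k-means and the proportion of each colour plotted; this assumes the EBImage library used later in these posts and a local image file called painting.jpg:

## A sketch of a Vanderpoel-style colour proportion analysis.
## Assumes EBImage is installed and "painting.jpg" is a local RGB image.
library(EBImage)
img<-readImage("painting.jpg")
## Flatten the pixels into an n x 3 matrix of red, green and blue values
pixels<-matrix(imageData(img), ncol=3)
## Quantize the colours to a small palette with k-means
quantized<-kmeans(pixels, centers=10)
## The proportion of the image covered by each palette colour
proportions<-table(quantized$cluster) / length(quantized$cluster)
## Plot the palette colours, weighted by their proportions
barplot(as.numeric(proportions),
col=rgb(quantized$centers[,1], quantized$centers[,2], quantized$centers[,3]))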

Categories: Aesthetics, Art History, Art Open Data

Art Data Analysis: Roger de Piles

(Via Ptak Science Books)

“To his last published work, Cours de peinture par principes avec un balance de peintres (1708), de Piles appended a list of fifty-six major painters in his own time, with whose work he had acquainted himself as a connoisseur during his travels.

To each painter in the list he gave marks from 0 to 18 for composition, drawing, color and expression. This gave an overview of aesthetic appreciation hingeing on the balance between color and design. The highest marks went to Raffaello Sanzio and Rubens, with a slight bias on color for Rubens, a slight bias on drawing for Raphaël. Painters who scored very badly in anything but color were Giovanni Bellini, Giorgione and remarkably Michelangelo Caravaggio with 16 on color and 0 (zero) on expression. Painters who fell far behind Rubens and Raphaël but whose balance between color and design was perfect were Lucas van Leyden, Sebastian Bourdon, Albrecht Dürer.”

http://en.wikipedia.org/wiki/Roger_De_Piles

The Wikipedia page reproduces this list as a table from which the values can be easily extracted. Like this:

Painter,Composition,Drawing,Color,Expression
Andrea del Sarto,12,16,9,8
Federico Barocci,14,15,6,10
Jacopo Bassano,6,8,17,0
Giovanni Bellini,4,6,14,0
Sebastian Bourdon,10,8,8,4
Charles Le Brun,16,16,8,16
I Carracci,15,17,13,13
Cavalier D'Arpino,10,10,6,2
Correggio,13,13,15,12
Daniele da Volterra,12,15,5,8
Abraham van Diepenbeeck,11,10,14,6
Il Domenichino,15,17,9,17
Albrecht Dürer,8,10,10,8
Giorgione,8,9,18,4
Giovanni da Udine,10,8,16,3
Giulio Romano,15,16,4,14
Guercino,18,10,10,4
Guido Reni,NA,13,9,12
Holbein,9,10,16,3
Jacob Jordaens,10,8,16,6
Lucas Jordaens,13,12,9,6
Giovanni Lanfranco,14,13,10,5
Leonardo da Vinci,15,16,4,14
Lucas van Leyden,8,6,6,4
Michelangelo,8,17,4,8
Caravaggio,6,6,16,0
Murillo,6,8,15,4
Otho Venius,13,14,10,10
Palma il Vecchio,5,6,16,0
Palma il Giovane,12,9,14,6
Il Parmigianino,10,15,6,6
Gianfrancesco Penni,0,15,8,0
Perin del Vaga,15,16,7,6
Sebastiano del Piombo,8,13,16,7
Primaticcio,15,14,7,10
Raphael,17,18,12,18
Rembrandt,15,6,17,12
Rubens,18,13,17,17
Francesco Salviati,13,15,8,8
Eustache Le Sueur,15,15,4,15
Teniers,15,12,13,6
Pietro Testa,11,15,0,6
Tintoretto,15,14,16,4
Titian,12,15,18,6
Van Dyck,15,10,17,13
Vanius,15,15,12,13
Veronese,15,10,16,3
Taddeo Zuccari,13,14,10,9
Federico Zuccari,10,10,8,8
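
As a quick sketch, the values can be pulled into R (assuming the table above has been saved locally as, say, depiles.csv) and colour plotted against drawing:

## Read the de Piles scores and plot colour against drawing for each painter
## ("depiles.csv" is an assumed local copy of the table above)
depiles<-read.csv("depiles.csv", stringsAsFactors=FALSE)
plot(depiles$Drawing, depiles$Color, xlab="Drawing", ylab="Color")
text(depiles$Drawing, depiles$Color, labels=depiles$Painter, cex=0.6, pos=3)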

Or you can get a scan of the original text from Google Books here. The table of scores starts on page 408 of the PDF version.

This kind of mock-objective scoring of artworks, whatever the system used, is fun, but even at the time it was open to ridicule, by Hogarth for example.

What might be more interesting would be if lots of people scored artworks or artists similarly.

Categories: Art History, Art Open Data

Exploring Art Data 12

Back to Vasari’s Lives.

We can compare Vasari’s description of Giovanni Cimabue to Wikipedia’s article on the artist.

The results show a surprising degree of similarity:


## install.packages("RCurl")
library(RCurl)
## Strip wiki code
deWikify<-function(text){
## Remove {{stuff}}
text<-gsub("\\{\\{[^}]+\\}\\}", "", text)
## Remove [[stuff]]
text<-gsub("\\[\\[[^]]+\\]\\]", "", text)
## Remove [stuff]
text<-gsub("\\[[^]]+\\]", "", text)
## Remove <html> tags
text<-gsub("<[^>]+>", "", text)
## Remove punctuation
#text<-gsub("[[:punct:]]", "", text)
## Lowercase words
text<-tolower(text)
text
}
## Get the text of a page from Wikipedia
getWikipediaArticle<-function(subject){
page<-getURL(paste("http://en.wikipedia.org/w/index.php?title=",
curlEscape(subject), "&action=raw", sep=""),
.opts=list(useragent="Mozilla/5.0 (X11; U; Linux i686; fr; rv:1.9.1.3) Gecko/20090913 Firefox/3.5.3"))
deWikify(page)
}
## Compare Vasari's entry for Cimabue (artists[1], built with tm in the earlier posts) with the Wikipedia article
cimabuePage<-getWikipediaArticle("Cimabue")
cimabue.corpus<-Corpus(VectorSource(c(artists[1], cimabuePage)),
readerControl=list(language="english",
reader=readPlain))
cimabueDtm<-DocumentTermMatrix(cimabue.corpus)
dissimilarity(cimabueDtm, method="cosine")

They seem reassuringly similar (similarity is 1.0 - dissimilarity, so a cosine dissimilarity of about 0.11 means a similarity of roughly 0.89):


          1
2 0.1079431
Categories: Art Computing, Art History, Art Open Data

Exploring Art Data 11

Let’s look at a more contemporary source than Vasari, Cultural Bloggers Interviewed.

We can download the PDF with a shell script:

#!/bin/bash
wget "http://live.labforculture.org/2010/09/cbi/files/cultural_blogger.pdf"

And then load in the data and process it in R using tm again (with a slight modification to the function that cleans up the text):


library(tm)
blogfile<-"./cultural_blogger.pdf"
bloggers.names<-c("Claire Welsby", "Michelle Kasprzak", "Alek Tarkowski",
"Marco Mancuso", "Anne Helmond", "Robert Misik",
"Marta Peirano & José Luis de Vicente", "Alessandro Ludovico",
"Régine Debatty")
bloggers<-data.frame(name=bloggers.names,
from=c(6, 11, 15, 19, 23, 27, 31, 35, 41),
to=c(10, 14, 18, 22, 26, 30, 34, 40, 44))
## Clean footnotes, etc. from article text
cleanArticle<-function(text){
## Remove urls. Would miss final url in a document ;-)
text<-lapply(text, function(line){gsub("http://.+\\s", "", line, perl=TRUE)})
## Remove punctuation
text<-lapply(text, function(line){gsub("[[:punct:]]", "", line)})
## Lowercase words
text<-lapply(text, tolower)
text
}
## Load the blogger texts
bloggers.texts<-apply(bloggers, 1, function(blogger){
reader<-readPDF(PdftotextOptions=paste("-layout", "-f", blogger[2],
"-l", blogger[3]))
reader(elem=list(uri=blogfile), language="en", id=blogger[1])})
## Clean up the blogger texts
bloggers.texts<-lapply(bloggers.texts, cleanArticle)
## Make a corpus of the bloggers texts
bloggers.corpus<-Corpus(VectorSource(bloggers.texts),
readerControl=list(language="english", reader=readPlain))
## Remove whitespace within terms
bloggers.clean<-tm_map(bloggers.corpus, stripWhitespace)
## Remove stopwords
bloggers.clean<-tm_map(bloggers.clean, removeWords, stopwords("english"))
## Stem words
## No, this looks weird in the results
##bloggers.clean<-tm_map(bloggers.clean, stemDocument)
## Term/document matrix
dtm<-DocumentTermMatrix(bloggers.clean)
## Remove infrequent terms to save memory
dtm<-removeSparseTerms(dtm, 0.4)

Then we can find the most common terms:


## Frequent terms in the matrix
findFreqTerms(dtm, 4)

 [1] "art"          "artists"      "arts"         "audience"     "based"
[6] "bit"          "blog"         "blogging"     "blogs"        "community"
[11] "contemporary" "content"      "difficult"    "example"      "experience"
[16] "feel"         "include"      "involved"     "issues"       "mainly"
[21] "media"        "people"       "platform"     "post"         "probably"
[26] "project"      "public"       "regarding"    "scene"        "technology"
[31] "thats"        "time"         "via"          "website"      "world"
[36] "course"       "definitely"   "describe"     "dont"         "facebook"
[41] "focus"        "interview"    "job"          "money"        "personal"
[46] "research"     "started"      "cultural"     "culture"      "digital"
[51] "music"        "write"        "writing"      "active"       "consider"
[56] "critical"     "english"      "following"    "hand"         "information"
[61] "network"      "popular"      "tools"        "actually"     "especially"
[66] "etc"          "hard"         "led"          "live"         "lot"
[71] "question"     "ive"          "online"       "read"         "video"
[76] "book"         "changed"      "european"     "model"        "moment"
[81] "specific"     "start"        "times"        "economic"     "readers"    

Look at associations:


## Frequently associated terms
findAssocs(dtm, "blogging", 0.2)

 blogging      dont      read   usually      chat     video      blog    follow
     1.00      0.74      0.61      0.57      0.56      0.55      0.49      0.45
 research     blogs      hard       via      life       etc      live      role
     0.42      0.38      0.38      0.38      0.37      0.35      0.35      0.33
    scene  cultural       job  question      able interview     money       ive
     0.31      0.30      0.30      0.30      0.27      0.27      0.24      0.23
      led    course
     0.21      0.20

Find similar bloggers:


## Dissimilarity
dis<-dissimilarity(dtm, method="cosine")
## The most similar bloggers for each blogger, in order of similarity
similarityMin<-0.25
mostSimilarBloggers<-apply(dis, 1,
function(row){
sorted<-sort(row)
ordered<-order(row)
## 0.0 == same blogger
ordered[sorted > 0.0 & sorted < similarityMin]
})
for(doc in 1:length(mostSimilarBloggers)){
mostSimilar<-unlist(mostSimilarBloggers[doc])
if(length(mostSimilar) > 0){
count<-min(length(mostSimilar), 5)
similar<-paste(bloggers.names[mostSimilar[1:count]], collapse=", ")
}else{
similar<-"None"
}
cat(bloggers.names[[doc]], ": ", similar, "\n\n")
}

Claire Welsby :  None
Michelle Kasprzak :  Régine Debatty, Anne Helmond
Alek Tarkowski :  Anne Helmond, Régine Debatty
Marco Mancuso :  None
Anne Helmond :  Alek Tarkowski, Michelle Kasprzak, Régine Debatty
Robert Misik :  None
Marta Peirano & José Luis de Vicente :  None
Alessandro Ludovico :  None
Régine Debatty :  Michelle Kasprzak, Alek Tarkowski, Anne Helmond 

Cluster bloggers:


## Clusters of similar bloggers
clusterCount<-3
clusters<-kmeans(dtm, clusterCount)
clusters.bloggers<-lapply(1:clusterCount,
function(cluster){
bloggers.names[clusters$cluster == cluster]})
for(cluster in 1:clusterCount){
cat("Cluster", cluster, ":",
paste(unlist(clusters.bloggers[cluster]), collapse=", "),
"\n\n")
}

Cluster 1 : Michelle Kasprzak, Alek Tarkowski, Anne Helmond, Régine Debatty
Cluster 2 : Claire Welsby, Marco Mancuso
Cluster 3 : Robert Misik, Marta Peirano & José Luis de Vicente, Alessandro Ludovico 

And plot associations between terms used in the text:


## Plot associations between terms
plot(dtm, findFreqTerms(dtm, 6), attrs=list(graph=list(),
node=list(shape="rectangle", fontsize="120", fixedsize="false")))

bloggers-associations.png

Categories: Art History, Art Open Data

Exploring Art Data 10

Let’s make a word cloud for all the artists:

## install.packages('snippets',,'http://www.rforge.net/')
library(snippets)
## Create a word cloud for all artists
freqAll<-termFreq(PlainTextDocument(paste(artists, collapse=" "),
language="en"),
control=list(removePunctuation=TRUE, removeNumbers=TRUE,
stopwords=TRUE, minDocFreq=100))
cloud(freqAll, col = col.br(freqAll, fit=TRUE))

vasari-wordcloud-all.png

And here’s a tag cloud for just one artist (Giovanni Cimabue) to compare it with:

## Create a word cloud for one artist
freq<-termFreq(PlainTextDocument(artists[1], language="en"),
control=list(removePunctuation=TRUE, removeNumbers=TRUE,
stopwords=TRUE, minDocFreq=3))
cloud(freq, col = col.br(freq, fit=TRUE))

vasari-wordcloud-giovanni-cimabue.png

Categories: Art Computing, Art History, Art Open Data

Exploring Art Data 9

Now let’s see which artists are described most similarly by Vasari:

## Dissimilarity
dis<-dissimilarity(dtm, method="cosine")
## The most similar artists for each artist, in order of similarity
similarityMin<-0.2
mostSimilarArtists<-apply(dis, 1,
function(row){
sorted<-sort(row)
ordered<-order(row)
## 0.0 == same artist
ordered[sorted > 0.0 & sorted < similarityMin]
})
for(doc in 1:length(mostSimilarArtists)){
mostSimilar<-unlist(mostSimilarArtists[doc])
if(length(mostSimilar) > 0){
count<-min(length(mostSimilar), 5)
similar<-paste(artists.names[mostSimilar[1:count]], collapse=", ")
}else{
similar<-"None"
}
cat(artists.names[[doc]], ": ", similar, "\n\n")
}

Which gives us (truncated…):

Giovanni Cimabue :  Giotto, Masaccio, Agnolo Gaddi
Arnolfo Di Lapo :  Niccola And Giovanni Of Pisa
Niccola And Giovanni Of Pisa :  Arnolfo Di Lapo, Agostino And Agnolo Of Siena
Andrea Tafi :  None
Gaddo Gaddi :  None
Margaritone :  None
Giotto :  Buonamico Buffalmacco, Pietro Perugino, Giovanni Cimabue, Raffaello Da Urbino, Taddeo Gaddi
...

And let’s cluster the artists together using k-means clustering:

## Clusters of similar artists
clusterCount<-10
clusters<-kmeans(dtm, clusterCount)
clusters.artists<-lapply(1:clusterCount,
function(cluster){
artists.names[clusters$cluster == cluster]})
for(cluster in 1:clusterCount){
cat("Cluster", cluster, ":",
paste(unlist(clusters.artists[cluster]), collapse=", "),
"\n\n")
}

Which gives us:

Cluster 1 : Filippo Brunelleschi, Torrigiano, Antonio Da San Gallo The Younger, Niccolò Called Tribolo, Simone Mosca, Fra Giovanni Agnolo Montorsoli
Cluster 2 : Domenico Puligo, Francesco Mazzuoli, Niccolò Soggi, Giovanni Antonio Bazzi Called Il Sodoma
Cluster 3 : Giotto, Lorenzo Di Bicci, Ercole Ferrarese, Baldassarre Peruzzi, Madonna Properzia De Rossi, Girolamo Da Treviso, Il Rosso, Franciabigio, Giulio Romano, Fra Sebastiano Viniziano Del Piombo, Domenico Beccafumi Of Siena, Cristofano Gherardi Called Doceno Of Borgo San Sepolcro, Michele San Michele, Giovanni Da Udine, Battista Franco, Daniello Ricciarelli
Cluster 4 : Baccio Bandinelli
Cluster 5 : Andrea Tafi, Gaddo Gaddi, Margaritone, Stefano Painter Of Florence And Of Ugolino Sanese, Pietro Laurati, Ambrogio Lorenzetti, Pietro Cavallini, Tommaso Called Giottino, Giovanni Dal Ponte, Agnolo Gaddi, Berna, Duccio, Antonio Viniziano, Jacopo Di Casentino, Gherardo Starnina, Lippo, Don Lorenzo Monaco, Taddeo Bartoli, Niccolò Aretino, Dello, Nanni Dantonio Di Banco, Masolino Da Panicale, Giuliano Da Maiano, Piero Della Francesca, Leon Batista Alberti, Lazzaro Vasari, Antonello Da Messina, Alesso Baldovinetti, Vellano Da Padova, Benozzo Gozzoli15, Galasso Ferrarese17, Desiderio Da Settignano, Mino Da Fiesole, Lorenzo Costa, Benedetto Da Maiano, Jacopo Called Lindaco, Giorgione Da Castelfranco, Antonio Da Correggio, Mariotto Albertinelli, Raffaellino Del Garbo, Simone Called Il Cronaca, Marco Calavrese, Francesco Granacci Il Granaccio, Giuliano Bugiardini
Cluster 6 : Baccio D Agnolo, Perino Del Vaga, Bastiano Da San Gallo Called Aristotile, Francesco Salviati, Taddeo Zucchero
Cluster 7 : Pietro Perugino, Raffaello Da Urbino, Andrea Del Sarto, Jacopo Da Pontormo
Cluster 8 : Arnolfo Di Lapo, Niccola And Giovanni Of Pisa, Agostino And Agnolo Of Siena, Andrea Pisano, Jacopo Della Quercia, Luca Della Robbia, Lorenzo Ghiberti, Donato, Michelozzo Michelozzi, Andrea Verrocchio, Bramante Da Urbino, Andrea Dal Monte Sansovino, Benedetto Da Rovezzano, Pierino Piero Da Vinci, Giovan Francesco Rustici
Cluster 9 : Michelagnolo Buonarroti
Cluster 10 : Giovanni Cimabue, Buonamico Buffalmacco, Simone Sanese, Taddeo Gaddi, Andrea Di Cione Orcagna, Spinello Aretino, Paolo Uccello, Parri Spinelli, Masaccio, Fra Filippo Lippi, Cosimo Rosselli, Sandro Botticelli, Andrea Mantegna, Filippo Lippi Called Filippino, Bernardino Pinturicchio, Francesco Francia, Luca Signorelli Of Cortona, Leonardo Da Vinci10, Piero Di Cosimo, Fra Bartolommeo Di San Marco, Guglielmo Da Marcilla, Lorenzo Di Credi, Giovanni Antonio Sogliani, Giovanni Antonio Lappoli

Michelangelo is clearly singular. 😉

Categories: Art Computing, Art History, Art Open Data

Exploring Art Data 8

Let’s explore the text of Vasari’s Lives of The Artists.

The full text of an English translation can be found on Project Gutenberg.

We can use a shell script to download the files to a local folder:

#!/bin/bash
DESTDIR="vasari"
BOOKS="25326 25759 26860 28420 28421 28422 31845 31938 32362 33203"
mkdir -p "${DESTDIR}"
pushd "${DESTDIR}"
for ebook in ${BOOKS}
do
wget "http://www.gutenberg.org/ebooks/${ebook}.txt"
done
popd

And then we can use R’s “tm” library to load the files:


## For "dissimilar" in tm
## install.packages("proxy")
## For "plot" on dtm in tm
## source("http://bioconductor.org/biocLite.R")
## biocLite("Rgraphviz")
## install.packages("tm")
library(tm)
dir<-"vasari"
prefix<-"pg"
extension<-".txt"
## These are the Project Gutenberg book numbers for Lives Of The Artists
## Skip volume 10 (33203), as it doesn't follow the same format
books<-c(25326, 25759, 26860, 28420, 28421, 28422, 31845, 31938, 32362)
## Make a file path for a book
bookPath<-function(id){
paste(dir, "/", prefix, id, extension, sep="")
}
## Load the file
loadFile<-function(filename){
readChar(filename, file.info(filename)$size)
}
## Load the files
loadFiles<-function(filenames){
sapply(filenames, loadFile, USE.NAMES=FALSE)
}
## Load the texts
texts<-loadFiles(sapply(books, bookPath))

We can then extract the entry for each artist, clean up the data, create a corpus, and then clean up the corpus:
 

## Extract entries on each artist
extractArtists<-function(text){
## Split each file into blocks between "LIFE OF .*\n"
artists<-unlist(strsplit(text, "\nLIFE OF"))
## Discard first block, that is introduction
## Last block will be discarded by the article cleaning function
artists[2:length(artists)]
}
## Clean footnotes, etc. from article text
cleanArticle<-function(text){
## Truncate at \nFOOTNOTES:
text<-unlist(strsplit (text, split="\nFOOTNOTES:"))[1]
## Remove [Text in square brackets]
text<-gsub("\\[[^]]+\\]", "", text)
## Remove punctuation
text<-gsub("[[:punct:]]", "", text)
## Lowercase words
text<-tolower(text)
text
}
## Get the first line of a string
firstLine<-function(text){
unlist(strsplit(text, "\n"))[1]
}
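
The remaining steps can be sketched as follows (one reasonable way to do it, following the same tm pattern as the bloggers corpus above): split the texts into entries, take the names from the first lines, clean the entries, and build the corpus used below.

## Split the volumes into per-artist entries
artists<-unlist(lapply(texts, extractArtists))
## Take the artist names from the first line of each entry
artists.names<-sapply(artists, firstLine, USE.NAMES=FALSE)
## Clean the entries
artists<-sapply(artists, cleanArticle, USE.NAMES=FALSE)
## Build and clean up the corpus
artists.clean<-Corpus(VectorSource(artists),
readerControl=list(language="english", reader=readPlain))
artists.clean<-tm_map(artists.clean, stripWhitespace)
artists.clean<-tm_map(artists.clean, removeWords, stopwords("english"))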

We can then create a term/document matrix (and remove infrequently used terms) to explore the corpus:

## Term/document matrix
dtm<-DocumentTermMatrix(artists.clean)
## Remove infrequent terms to save memory
dtm<-removeSparseTerms(dtm, 0.4)

We can find frequently used terms:

## Frequent terms in the matrix
findFreqTerms(dtm, 60)

 [1] "lorenzo"    "painted"    "pietro"     "life"       "andrea"
[6] "francesco"  "giovanni"   "beautiful"  "executed"   "antonio"
[11] "domenico"   "duke"       "marble"     "jacopo"     "church"
[16] "hand"       "little"     "able"       "afterwards" "age"
[21] "art"        "beauty"     "caused"     "chapel"     "christ"
[26] "city"       "day"        "death"      "del"        "design"
[31] "excellent"  "figure"     "figures"    "finished"   "florence"
[36] "friend"     "head"       "held"       "house"      "judgment"
[41] "left"       "likewise"   "manner"     "master"     "messer"
[46] "painter"    "painting"   "palace"     "pictures"   "placed"
[51] "pope"       "reason"     "rome"       "seen"       "sent"
[56] "set"        "time"       "various"    "whereupon" 

We can see which words are strongly associated:

findAssocs(dtm, "painting", 0.8)

 

  painting    painter   painters       hand     little    painted   pictures
      1.00       0.90       0.90       0.89       0.89       0.89       0.89
 beautiful    figures      grace       lady       save     beauty   executed
      0.87       0.87       0.86       0.86       0.86       0.85       0.85
    manner   portrait       time     worthy  excellent      hands   likewise
      0.85       0.85       0.85       0.85       0.84       0.84       0.84
particular       seen       sent      truth       able        art  craftsmen
      0.84       0.84       0.84       0.84       0.83       0.83       0.83
    friend      house       left      lived     living     return        age
      0.83       0.83       0.83       0.83       0.83       0.83       0.82
   besides     christ        day    finally    mention   received      study
      0.82       0.82       0.82       0.82       0.82       0.82       0.82
    chapel       city  diligence excellence       head     honour     master
      0.81       0.81       0.81       0.81       0.81       0.81       0.81
    nature       rome       true       held  wherefore
      0.81       0.81       0.81       0.80       0.80

And we can plot those associations:

## Plot associations between terms
plot(dtm, findFreqTerms(dtm, 120), attrs=list(graph=list(),
node=list(shape="rectangle",fontsize="72", fixedsize="false")))

Which looks like this:
termgraph.png

Categories: Art History, Art Open Data

Art data Analysis: Unconcealed

unconcealed_final_cover_front.jpg

Unconcealed is an exemplary presentation of previously undisclosed data concerning the exhibition and collection of Conceptual Art in Europe in the 1960s and 1970s, using it to study the social and economic networks of the history of conceptualism. It would be fascinating to see this kind of project for other artistic movements.

See here for more information.

Categories: Aesthetics, Art Computing, Art History, Art Open Data

Exploring Art Data 6

Let’s access an API and start analysing images.

We’ll use R to get information about a series of works (Monet’s “Haystacks”) and images of them from freebase.

In order to do this we’ll need to install some new libraries:

source("http://bioconductor.org/biocLite.R")
biocLite("EBImage")
install.packages("RJSONIO")

Then load the libraries:

library(EBImage)
library(RJSONIO)
## RCurl (installed for an earlier post) is also needed, for getURL and curlEscape
library(RCurl)

And patch one of them to work with freebase:

## Monkeypatch RJSONIO so list() -> []
oldListMethod<-getMethod("toJSON", "list")
setMethod("toJSON", "list",
function(x, ...){
if(length(x) == 0){
return("[]")
} else {
return(oldListMethod(x, ...))
}
})

We can then write code to access the freebase web API:

## Query the freebase API, taking and returning R objects
queryFreebase<-function(query){
wrappedQuery<-list(query=query)
queryJSON<-toJSON(wrappedQuery)
response<-getURL(paste('http://api.freebase.com/api/service/mqlread?query=',
curlEscape(queryJSON), sep=""))
responseJSON<-fromJSON(response)
stopifnot(responseJSON$status == "200 OK")
responseJSON$result
}
## Get the series description and list of works from freebase
getSeries<-function(series_name){
query<-list(name=series_name,
type="/visual_art/art_series",
artworks=list())
queryFreebase(query)
}
## Get the artwork description from freebase
getArtwork<-function(artwork_name){
query<-list(name=artwork_name,
type="/visual_art/artwork",
"*"=NULL)
queryFreebase(query)
}
## Get the image description from freebase
getImage<-function(entity_id){
query<-list(id=entity_id,
"/common/topic/image"=list(id=NULL),
"*"=NULL)
queryFreebase(query)
}
## The maximum height or width of a thumbnail
thumbSize<-100
## Use the freebase thumbnail to try and get a thumbnail for the image
## Returns NULL if image couldn't be found
getThumbnail<-function(image, thumbSize){
# On fail, redirect to a url that's guaranteed not to be an image,
# we use the api root here
# Use http as EBImage's use of curl doesn't like https
url<-paste('http://api.freebase.com/api/trans/image_thumb',
image[[1]]$id, '?maxwidth=', thumbSize, '&maxheight=',
thumbSize, '&mode=fit&onfail=/', sep="")
readImage(url)
}

We can fetch data about Monet’s “Haystacks”, and images where those are available:

## Fetch the series entry
series<-getSeries("Haystacks")
## Fetch the entries for individual artworks in the series
artworks<-lapply(series$artworks, getArtwork)
## Get the names of the retrieved artwork data in order
artworksNames<-lapply(artworks, function(artwork){artwork[["name"]]})
## Get the image resource information for the artworks
artworksImages<-lapply(artworks, function(artwork){getImage(artwork[["id"]])})
## Fetch a thumbnail bitmap where available, and clear out NULLs
artworksThumbnails<-sapply(artworksImages,
function(image){getThumbnail(image, thumbSize)})
names(artworksThumbnails)<-artworksNames
artworksThumbnails<-Filter(Negate(is.null), artworksThumbnails)

Having fetched the images, we can convert them to greyscale and produce a box plot of their brightness:
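
One way to do the conversion (a sketch; channel() from EBImage gives the grey values, which as.numeric() flattens into plain brightness vectors for boxplot()):

## Convert each thumbnail to a vector of greyscale brightness values
grayscaleArtworksThumbnails<-lapply(artworksThumbnails,
function(thumb){as.numeric(channel(thumb, "gray"))})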

## Draw a box plot of the brightness, allowing enough room for rotated labels
par(mar=c(20,4,1,1))
boxplot(grayscaleArtworksThumbnails, las=2)

Which looks like this:

haystacks_boxplots.gif

It’s interesting to compare the brightness ranges of the paintings, and to see the outliers.
Categories: Art History, Art Open Data

archive.org Art History 3

Some searches that give good results:

Victoria & Albert Museum

Painting Catalogue

Art History

Art Exhibition

Art Gallery

Artists

Modern Art
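
These searches can also be run programmatically. Something like the following sketch might work, assuming archive.org’s advancedsearch.php JSON endpoint (not used elsewhere in these posts, so check the current archive.org documentation):

## A sketch of querying archive.org's advanced search API from R.
## The endpoint and its parameters are an assumption; check the archive.org docs.
library(RCurl)
library(RJSONIO)
searchArchive<-function(query, rows=50){
url<-paste("http://archive.org/advancedsearch.php?q=", curlEscape(query),
"&fl%5B%5D=identifier&fl%5B%5D=title&rows=", rows, "&output=json", sep="")
fromJSON(getURL(url))$response$docs
}
results<-searchArchive("Victoria & Albert Museum")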