Art Art History Art Open Data Free Culture Free Software Projects

Art Open Data – Government Art Collection Dataset

I have written a script to download a dataset containing collection information from the UK Government Art Collection site and save it as tab-separated-value files and an SQLite database for easy access. As the data is from a UK government agency, it’s under the OGL.

You don’t need to run the script; a downloaded dataset is included in the project archive:

The dataset doesn’t feature as many collections as the GAC website claims to feature, but the script does omit many duplicates. This project was inspired by Kasabi’s scraper, adding the ability to download code and data in an easy-to-use format.
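As a rough illustration of how tab-separated values and an SQLite database relate to each other, here is a minimal Python sketch. It is not the project's script: the column names (`id`, `artist`, `title`), the table name `works`, and the sample rows are all placeholders invented for the example, and the real dataset's layout may differ.

```python
import csv
import io
import sqlite3

# Hypothetical sample rows; the real dataset's columns may differ.
SAMPLE_TSV = (
    "id\tartist\ttitle\n"
    "1\tExample Artist\tExample Landscape\n"
    "2\tAnother Artist\tExample Portrait\n"
    "3\tExample Artist\tExample Still Life\n"
)


def load_tsv(text):
    """Parse tab-separated text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def build_database(rows):
    """Copy the parsed rows into an in-memory SQLite table called 'works'."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE works (id INTEGER PRIMARY KEY, artist TEXT, title TEXT)"
    )
    conn.executemany(
        "INSERT INTO works (id, artist, title) VALUES (:id, :artist, :title)",
        rows,
    )
    conn.commit()
    return conn


def titles_by_artist(conn, artist):
    """Return the titles recorded for one artist, in id order."""
    cursor = conn.execute(
        "SELECT title FROM works WHERE artist = ? ORDER BY id", (artist,)
    )
    return [row[0] for row in cursor]


rows = load_tsv(SAMPLE_TSV)
conn = build_database(rows)
print(titles_by_artist(conn, "Example Artist"))
```

The same query could be pointed at a database file by passing its path to `sqlite3.connect` instead of `":memory:"`.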

Aesthetics Art Art Computing Art Open Data Free Culture Free Software Generative Art Howto Projects Satire

Psychogeodata (2/3)


Geodata represents maps as graphs of nodes joined by edges (…as points joined by lines). This is a convenient representation for processing by computer software. Other data can be represented in this way, including words and their relationships.

We can map the names of streets into the semantic graph of WordNet using NLTK. We can then establish how similar words are by searching the semantic graph to find how far apart they are. This semantic distance can be used instead of geographic distance when deciding which nodes to choose when pathfinding.

Mapping between these two spaces (or two graphs) is a conceptual mapping, and searching lexicographic space using hypernyms allows abstraction and conceptual slippage to be introduced into what would otherwise be simple pathfinding. This defamiliarizes and conceptually enriches the constructed landscape, two key elements of Psychogeography.

The example above was created by the script derive_sem, which creates random walks between semantically related nodes. It’s easy to see the relationship between the streets it has chosen. You can see the HTML version of the generated file here, and the script is included with the Psychogeodata project at .

(Part one of this series can be found here; part three will cover potential future directions for Psychogeodata.)
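To make the idea of semantic distance concrete, here is a small self-contained Python sketch. It is not derive_sem and it does not use NLTK: the hand-written `HYPERNYMS` table is a toy stand-in for WordNet, and the function names are invented for illustration. It measures how far apart two words are by breadth-first search over the hypernym graph, then uses that distance to choose the next street name.

```python
from collections import deque

# Toy stand-in for WordNet: each word maps to its hypernym (a more general term).
HYPERNYMS = {
    "oak": "tree",
    "elm": "tree",
    "tree": "plant",
    "rose": "flower",
    "flower": "plant",
    "plant": "organism",
    "fox": "animal",
    "animal": "organism",
}


def build_graph(hypernyms):
    """Turn the child -> parent table into an undirected adjacency map."""
    graph = {}
    for child, parent in hypernyms.items():
        graph.setdefault(child, set()).add(parent)
        graph.setdefault(parent, set()).add(child)
    return graph


def semantic_distance(graph, start, goal):
    """Number of edges on the shortest path between two words, or None."""
    if start not in graph or goal not in graph:
        return None
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        word, distance = queue.popleft()
        if word == goal:
            return distance
        for neighbour in graph[word]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, distance + 1))
    return None


def closest_name(graph, current, candidates):
    """Pick the candidate name that is semantically nearest to the current one."""
    scored = [(semantic_distance(graph, current, name), name) for name in candidates]
    scored = [(distance, name) for distance, name in scored if distance is not None]
    return min(scored)[1] if scored else None


GRAPH = build_graph(HYPERNYMS)
print(semantic_distance(GRAPH, "oak", "elm"))
print(closest_name(GRAPH, "oak", ["fox", "rose", "elm"]))
```

In a pathfinder, `closest_name` would take the place of the usual "nearest node by geographic distance" choice, applied to the key word of each neighbouring street's name.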

Art Art Open Data Free Culture

A Balloon Dog Print

Here’s a picture of the Balloon Dog printed on a Makerbot:

I love the “nose”.

Thanks Lunpa!

Art Computing Art Open Data Free Software Projects

R Cultural Analytics Library Update

The R Cultural Analytics library has been updated to remove any dependency on EBImage (which in turn has a dependency on ImageMagick that complicates installation on many systems). It now uses raster images instead. This has also made the code faster.

You can find the new version and installation instructions here:

Aesthetics Art Computing Art Open Data Projects

The R Cultural Analytics Library

I have gathered together much of the code from my series of posts on Exploring Art Data as a library for the R programming language which is now available as a package on R-Forge:

I will be adding more code to the library over time. It’s very easy to install: just enter the following into an R session:

install.packages("CulturalAnalytics", repos="")

The library includes code for ImagePlot-style image scatter plots, colour histograms and colour clouds and other useful functions. The examples in the documentation should help new users to get started quickly.

R is the lingua franca for statistical computing, and I believe that it’s important for art and digital humanities computing to avail itself of its power.

Art Open Data

Art Text Data Analysis 2 – Themes And Topics

Discovering Themes

Topic Models

Topic Modelling Toolbox

MALLET (and a good example of using it)

Art History Art Open Data

Art Text Data Analysis 1

Network Analysis and the Art Market: Goupil 1880 – 1895 [PDF]

Wyndham Lewis’s Art Criticism in The Listener, 1946-51

Tools For Exploring Text

Auto Converting Project Gutenberg Text to TEI

Art Art History Art Open Data

Open Art Data – Datasets Update

Here’s a new OGL-licensed list of works in the UK government’s art collection, scraped for a Culture Hack Day –

The JISC OpenART Project is making good progress and considering which ODC licence to use. It should be both a great resource and a great case study –

I’ve mentioned it before but this Seattle government list of public art with geolocation information is really good –

And Freebase keep adding new information about visual art –

Europeana are ensuring that all the metadata they provide is CC0 –

Their API isn’t publicly available yet, though! 🙁 –

Finally, for now, some of the National Gallery’s data now seems to be under an attempt at a BSD-style licence. The OGL would be even better… –

Art Computing Art Open Data Free Software

Exploring Art Data 23

Having written a command-line interface (CLI), we will now write a graphical user interface (GUI). GUIs can be an effective way of managing the complexity of software, but their disadvantage is that they usually cannot be effectively scripted like CLI applications and that they usually cannot be extended or modified as simply or as deeply as code run from a REPL.

That said, if software is intended as a stand-alone tool for performing tasks that will not be repeated and do not require much setup, a GUI can be very useful. So we will write one for the code in image-properties.r.

As with the CLI version, we will run this code using Rscript. The script can be run from the command line, or an icon for it can be created in the operating system’s applications menu or dock.

#!/usr/bin/env Rscript
## -*- mode: R -*-

The GUI framework that we will use is the cross-platform gWidgets library. I have set it up to use Gtk here, but Qt and Tk versions are available as well. You can find out more about gWidgets at

## install.packages("gWidgetsRGtk2", dep = TRUE)

We source properties-plot.r to load the code that we will use to plot the image once we have gathered all the configuration information we need using the GUI.


The first part of the GUI that we define is the top level window and layout. The layout of the top level window is a tabbed pane of the kind used by preferences dialogs and web browsers. We use this to organise the large number of configuration options for the code and to present them to the user in easily understood groupings.
Notice the use of “layout” objects as matrices to arrange interface widgets such as buttons within the window and later within each page of the “notebook” tabbed view.


The first tab contains code to create and handle input from user interface elements for selecting the kind of plot, the data file and folder of images to use, and, if required, the file to save the plot as. It also allows the user to specify which properties from the data file to plot.


We use functions to allow the user to choose the data file, image folder, and save file. Using the GUI framework's built-in support for file choosing makes this code remarkably compact.


Often part of the GUI must be updated, enabled or disabled in response to changes in another part. When the user selects a "Display" plot we need not require the user to select a file to save the plot in, as the plot will be displayed in a window on the screen. The next functions implement this logic.


The second tab contains fields to allow the user to configure the basic visual properties of the plot: its height, width, and background colour.


The third tab allows the user to control the plotting of images, labels, points and lines.


The fourth (and final) tab allows the user to manage how the axes are plotted.


Having created the contents of each tab, we set the initial tab that will be shown to the user and display the window on the screen.


Next we will write code to set the values of the global variables from the GUI, and perform a render. Until then, we can define a do-nothing renderImage function to allow us to run and test the GUI code.


If we save this code in a file called propgui and make it executable using the shell command:

chmod +x propgui

We can call the script from the command line like this:


We can enter values into the fields of the GUI, choose files, and press buttons (although pressing the Render button will of course have no effect yet).

Art Computing Art Open Data Satire

Digital Evaluation Of The Humanities

Humanities Computing dates back to the use of mainframe computers with museum catalogues in the 1950s. The first essays on Humanities Computing appeared in academic journals in the 1960s, the first conventions on the subject (and the Icon programming language) emerged in the 1970s, and ChArt was founded in the 1980s. But it wasn’t until the advent of Big Data in the 2000s and the rebranding of Humanities Computing as the “Digital Humanities” that it became the subject of moral panic in the broader humanities.

The literature of this moral panic is an interesting cultural phenomenon that deserves closer study. The claims that critics from the broader humanities make against the Digital Humanities fall into two categories. The first is material and political: the Digital Humanities require and receive more resources than the broader humanities, and these resources are often provided by corporate interests that may have a corrupting influence. The second is effectual and categorical: it’s all well and good making pretty pictures with computers or coming up with some numbers free of any social context, but the value of the broader humanities is in the narratives and theories that they produce.

We can use the methods of the Digital Humanities to characterise and evaluate this literature. Doing so will test the Digital Humanities in a way that bears directly on the claims this literature makes against them. I propose a very specific approach to this evaluation. Rather than using the Digital Humanities to evaluate the broader humanities’ claims against them, we should use these claims to identify the key features of the broader humanities’ self-image that they use to contrast themselves with the Digital Humanities, and then evaluate the extent to which the literature of the broader humanities actually embodies these features.

This project has five stages:

1. Determine the broader humanities’ claims of properties that they possess in contrast to the Digital Humanities.
2. Identify models or procedures that can be used to evaluate each of these claims.
3. Identify a corpus or canon of broader humanities texts to evaluate.
4. Evaluate the corpus or canon using the models or procedures.
5. Use the results of these evaluations as direct constraints on a theory of the broader humanities.

Notes on each stage:

Stage 1

Above, I outlined those claims of the broader humanities against the Digital Humanities that I am familiar with. We can perform a Digital Humanities analysis of texts critical of the Digital Humanities in order to test the centrality of these claims to the case against the Digital Humanities and to identify further claims for evaluation.

Stage 2

There are well-defined computational and non-computational models of narrative, for example. There are also models of theories, and of knowledge. To the extent that the broader humanities find these insufficient to describe what they do, and regard their use in a Digital critique as inadequate, they will have to explain why they feel this is so. This will help both to improve such models and to advance the terms of the debate within the humanities.

One characteristic of broader humanities writing that is outside the scope of the stated aims of this project, but that I believe is worth investigating, is the extent to which humanities writing is simply social grooming and ideological normativity within an educational institutional bureaucracy. This can be evaluated using measures of similarity, referentiality and distinctiveness.
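As one example of what a measure of similarity might look like, here is a minimal Python sketch of cosine similarity over word counts. This is only one of many possible measures, chosen for brevity; the function names are invented for the example, and a serious evaluation would need better tokenisation and weighting.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Count lower-cased words, ignoring punctuation and digits."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between two texts' word-count vectors (0.0 to 1.0)."""
    counts_a = word_counts(text_a)
    counts_b = word_counts(text_b)
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[word] * counts_b[word] for word in shared)
    norm_a = math.sqrt(sum(n * n for n in counts_a.values()))
    norm_b = math.sqrt(sum(n * n for n in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


print(cosine_similarity("narrative and theory", "theory and narrative"))
print(cosine_similarity("narrative", "number"))
```

Texts that share vocabulary in similar proportions score close to 1, while texts with no words in common score 0.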

Stage 3

It is the broader humanities’ current self-image (in contrast to its image of the Digital Humanities) that concerns us, so we should identify a defensible set of texts for analysis.

There are well-established methods for establishing a corpus or canon. We can take the most read, most cited, most awarded or most recommended articles established by a particular service or institution from a given date range (for example 2000-2009 inclusive or the academic year for 2010). We can take a reading list from a leading course on the subject. Or we can try to locate every article published online within a given period. Whichever criterion we choose, we will need to explicitly identify and defend it.
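The "most cited within a date range" criterion can be sketched in a few lines of Python. Everything here is hypothetical: the `Article` record, its fields, and the placeholder entries stand in for whatever a real citation service would provide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Article:
    """Hypothetical record; a real citation service would supply richer data."""
    title: str
    year: int
    citations: int


def select_corpus(articles, first_year, last_year, size):
    """Return the `size` most cited articles published in the inclusive range."""
    in_range = [a for a in articles if first_year <= a.year <= last_year]
    # Sort by citations (highest first), breaking ties by title for repeatability.
    ranked = sorted(in_range, key=lambda a: (-a.citations, a.title))
    return ranked[:size]


# Placeholder entries, not real publications.
CANDIDATES = [
    Article("Placeholder A", 1998, 120),
    Article("Placeholder B", 2003, 45),
    Article("Placeholder C", 2007, 80),
    Article("Placeholder D", 2009, 80),
    Article("Placeholder E", 2011, 200),
]

corpus = select_corpus(CANDIDATES, 2000, 2009, 2)
print([article.title for article in corpus])
```

Writing the criterion down as code has the side effect of making it explicit, which is the first step towards defending it.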

Stage 4

Evaluating the corpus or canon will require an iterative process of preparing data and running software, then correcting for flaws in the software, data, and models or processes. This process should be recorded publicly online in order to engender trust and gain input. To support this and to allow recreation of results, the software used to evaluate the corpus or canon, and the resulting data, must be published in a free and open source manner and maintained in a publicly readable version control repository.

Stage 5

Stage five is a deceptive moment of jouissance for the broader humanities. It percolates number and model into narrative and theory, but in doing so it provides a test of the broader humanities’ self-image.

Criticising the results of the project will require critics from the broader humanities to understand more of the Digital Humanities and of their own position than they currently do. Therefore, even if the project fails to demonstrate or persuade, it will succeed in advancing the terms of the debate.