After some slightly frantic last-minute corrections and checking, we have published the three main outputs of the project today.
1. Two free online training courses, introducing historians to semantic markup and text mining.
2. Case studies illustrating how historical projects have used various digital research techniques.
3. A tools audit, listing some of the most common digital tools for historical research (most of which are free).
The courses are the most substantial part of the project, and we hope that they give historians a solid basis for going further with semantic markup and text mining if they think these approaches would be useful to their research.
We originally promised five case studies and we have published four. That is because our favourite historical visualisation site, Mapping the Republic of Letters, is currently being revamped. We will put up a case study on this site as soon as the new version is finished.
Finally, the tools audit is by no means comprehensive. We never intended it to be, since listing too many tools would make the audit less useful. But if there is a tool that you think we really should have included, let us know (either in the comments below or using the webform for the audit) and we’ll certainly look into it.
The Histore workshop took place last Thursday. Thanks to everyone who came and made it an engaging and thought-provoking afternoon; the breakout session generated all kinds of ideas for the IHR Digital team, and we hope that they were equally useful for other participants. We’d especially like to thank our external speakers, Matteo Romanello and Pip Willcox, who came to give talks on text mining and semantic markup.
You can find pdfs of the presentations below. Histore introduction1 and Histore introduction2 are the slides from talks by me and Matt Phillpott about the project. Pip’s talk is called new tools for old books and Matteo’s is introduction to text mining.
We’re planning to add more materials from the workshop over the next couple of weeks. We’ll post the summaries of the breakout session (which we hope might generate further discussion), links to the audio files of the talks and, finally, links to the videos.
introduction to text mining
new tools for old books
On Thursday 21 June we are going to hold an afternoon workshop for historians interested in using digital tools for research. It will be held in Senate House, London, and will run from 2pm to about 4.30pm.
The project team will discuss the work we’ve done on Histore to date. Then there will be talks on semantic markup and text mining, followed by a break-out session for group discussion of different techniques. Attendees are encouraged to bring digital project ideas to discuss during the break-out. There will also be an opportunity to discuss your projects with us one-to-one, if you’d like to.
This workshop is free but places are limited. If you’d like to come to the workshop, or have any questions about it, just drop me an email at email@example.com.
In the rapid analysis phase of the project we looked at what tools are available to historians for digital research in the five areas we are concerned with (visualisation, text mining, linked data, cloud computing and semantic data) and tried to decide which we thought would be most useful to historians as the focus of introductory training courses.
In the end we decided in favour of text mining and semantic data. Visualisation was a strong candidate too. Cloud computing, by contrast, seemed somewhat nebulous as the subject of a training course: if we could do it at all, it seemed to us, it would essentially be training people to use particular tools rather than general techniques, which isn’t what the project has undertaken to do.
As it happens I am currently working on a JISC-funded linked data project, Liparm, which will use linked metadata to create a union catalogue of UK parliamentary material. This will be a good proof of concept for linked data in a historical context and I have already learned a lot about linked data from working on the project. But the point of Histore is to address the lack of take-up of digital resources by historians, and linked data already assumes (doesn’t it?) that those digital resources have been, or are being, created by historians. Linked data looks like a next step, rather than the kind of initial impetus that Histore seeks to provide.
That was our conclusion, but readers might disagree. We’d be keen to hear your thoughts.
We thought it might be a good idea, at the beginning of the project, to attempt to define what we’re talking about in the five digital research areas that will be covered: cloud computing, linked data, semantic data, text mining and visualisation.
These are the definitions that the team has come up with, but there is bound to be room for improvement. If you can suggest improvements we’d love to hear about them in the comments.
Cloud computing: storage and processing of data entirely on third-party servers, with no local working or copies.
Linked data: exposed data which can be read by machines along with other data in the same format.
Semantic data: data marked up, however lightly or heavily, in ways which reflect the semantic content of a text, rather than its structure.
Text mining: the derivation of meaningful data from a large body of unstructured data, using automated methods to reveal structure and associations.
Visualisation: the visual representation of data in an attempt to show otherwise hidden patterns and relationships.
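To make the text mining definition a little more concrete, here is a minimal sketch in Python of what "automated methods to reveal structure and associations" might look like at the smallest possible scale. It is only an illustration, not a method the project uses: the sample sentences, the stopword list and the function names (`tokenize`, `term_frequencies`, `cooccurrences`) are all invented for this example. It counts how often each word occurs and how often pairs of words appear in the same document.

```python
import re
from collections import Counter

# Hypothetical sample sentences, invented purely for illustration.
DOCUMENTS = [
    "The parliament met in London to debate the bill.",
    "The bill was debated again when parliament returned to London.",
    "Merchants in Bristol petitioned parliament about the bill.",
]

# A tiny, illustrative stopword list; real projects use much longer ones.
STOPWORDS = {"the", "in", "to", "was", "when", "about", "again"}


def tokenize(text):
    """Lowercase the text and return its alphabetic words, minus stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOPWORDS]


def term_frequencies(documents):
    """Count how many times each term occurs across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return counts


def cooccurrences(documents):
    """Count how many documents each pair of distinct terms shares."""
    pairs = Counter()
    for doc in documents:
        terms = sorted(set(tokenize(doc)))
        for i, first in enumerate(terms):
            for second in terms[i + 1:]:
                pairs[(first, second)] += 1
    return pairs


if __name__ == "__main__":
    # The most frequent terms, then the most frequent term pairs.
    print(term_frequencies(DOCUMENTS).most_common(3))
    print(cooccurrences(DOCUMENTS).most_common(3))
```

Even on three sentences, the counts pick out "parliament" and "bill" as terms that recur together. Real text mining applies the same idea to far larger bodies of text, usually with more careful handling of word forms (this sketch treats "debate" and "debated" as different words).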