On February 6 and 7, CLARIAH WP3 organized a workshop to discuss the application of Linked Data to linguistic research. The workshop, which went under the appropriate acronym LD4LR, invited presentations from a number of foreign experts as well as from representatives of CLARIN centers that had gained some experience using Linked Data in their projects. The workshop concentrated on the perspective of the linguistic researcher who is increasingly confronted with all kinds of information about Linked Data and who needs to know what Linked Data can bring to her research. A number of prominent Dutch linguists were invited to present their current research topics, after which our experts could suggest how to apply Linked Data paradigms to the researcher’s benefit. The invited experts, who besides being Linked Data experts are also active linguists, presented their work with Linked Data in the fields of lexica, phonetics and treebanks.
Overall there was sufficient time for good discussions, in which the experts tried to avoid overly specific terminology and to concentrate on user needs. In the round-up summaries of days one and two, Sjef Barbiers and Jan Odijk concluded that although interesting things are happening with Linked Data in linguistics, it does not seem immediately usable for the end-user researcher unless they are as familiar with Linked Data as the invited experts are. To let a broader group of linguists benefit from the potential of Linked Data, we need a better bridge between technologists and researchers. Dedicated pilots in WP3 should stimulate investigation of the usefulness of Linked Data for different types of linguistic research, especially lexical resources (DUELME). From a data provisioning perspective, the benefits of Linked Data for interoperability are clear.
|13:00 – 13:15||Welcome, (Why this Workshop)||Daan Broeder|
|13:15 – 14:00||Why should I use LD for my research?, LD in Comparative Syntax||Nicoline van der Sijs|
|14:00 – 15:30||Broad overview: What kind of (L)L(O)D is available? What linguistic research has been done using it?|
|14:00 – 14:30||John McCrae|
|14:30 – 15:00||Steven Moran|
|15:00 – 15:30||Giuseppe Celano|
|15:30 – 16:00||break|
|16:00 – 17:30||Experiences of CLARIAH/CLARIN centers|
|16:00 – 16:30||Antske Fokkens, Willem van Hage|
|16:30 – 17:00||Matej Durco|
|17:00 – 17:30||Thomas Eckart|
|17:30 – 17:15||Wrap-up day 1||Sjef Barbiers|
|09:00 – 09:15||Outlook Day 2||Menzo Windhouwer|
|09:15 – 11:00||Linguistic research case studies|
|09:15 – 09:45||Introduction||Marjo van Koppen|
|09:45 – 11:00||How would you use LD for this research? Expert responses||John McCrae|
|11:00 – 11:30||break|
|11:30 – 12:00||Dieter van Uytvanck|
|12:00 – 13:00||lunch|
|13:00 – 13:30||Linked Data opportunities & limitations||Daan Broeder|
|13:30 – 14:00||Conclusions||Jan Odijk|
On 27 October 2016, the University of Amsterdam opened its doors to The Humanities And Technology Camp (THATCamp).
In recent years the THATCamp formula has crossed the Atlantic and spread over Europe. At THATCamp Amsterdam we came to fully understand the reason for THATCamp’s success: THATCamp is a playful, informal and fun event where programmers and humanities scholars are able to meet, learn about each other's work, toy around with different types of software, and make plans for collaborative projects in the future.
At THATCamp Amsterdam topics ranged widely: from the web’s unboundedness to the use of crowdsourcing in research, from the spread of cinemas in the Netherlands to the role of machines on the work floor. Linked Open Data practitioners exchanged working techniques, while Art Historians explored best computational research practices and an Amsterdam historical GIS hotspot took shape. In between there was coffee, salad and "broodjes", and by the end of the day new plans had emerged for collaborative work on Amsterdam’s Creative Industries, from various perspectives and on multiple scales.
For anyone organizing a THATCamp, the catch in the formula is that THATCamp does not really want to be organized top-down. As can be read on the official website, THATCamp is an "unconference": it is participatory ("there are no spectators at a THATCamp"), informal (there are "no lengthy proposals, papers, presentations"), productive (the focus is on "collegial work or free-form discussion"), flat in structure ("non-hierarchical"), and crucially bottom-up: at THATCamp, the program is created by all participants together, "on the spot", as part of a collective voting session.
For the record: we need not have worried. THATCamp recommends avoiding web-based technology to facilitate the voting, arguing that “the in-person method works well and is fun.” In Amsterdam, this participatory, personal approach of the first session resonated well with the generally enthusiastic and constructive attitude of the THATCamp participants. As it turns out, a small collection of post-its, clothes pegs, a few sheets of paper and a large dose of enthusiasm and curiosity may just be the perfect toolkit to start a day of collectively exploring the intersections of humanities scholarship and technology.
THATCamp Amsterdam was hosted by the research project Creative Amsterdam: an E-Humanities Perspective (CREATE), at the Amsterdam Centre for Cultural Heritage and Identity. An impression of THATCamp Amsterdam, including a list of session proposals, may be found on the THATCamp Amsterdam webpage and the CREATE blog.
For more information on other events and research projects carried out within the CREATE Program, please visit the CREATE page.
It all seemed rather funny to them, until the very moment they laid eyes upon the prison block. As ‘Team Clariah’, Marieke van Erp (VU, WP3) and Richard Zijdeman (IISG, WP4) participated in the National Library's HackaLOD on 11-12 November. Alongside seven other teams they faced the challenge of building a cool (prototype) application using Linked Open Data made available especially for this event by the National Library and heritage partners. It had to be done within 24 hours… Inside a former prison… Here’s their account of the event.
We set out on Friday, somewhat dispirited as our third team mate Melvin Wevers (UU) had been struck down by a cold. Upon arrival, it turned out we had two cells: one for hacking and one for sleeping (well, more like three hours of tossing and turning). As you'd expect, the cells were not exactly cosy, but the organisers had provided goodie bags whose contents were put to good use, including for a midnight jaw harp concert.
With that, and our pre-set plan to tell stories around buildings, we set out to build our killer app. We found several datasets that contain information about buildings. The BAG, for example, contains addresses, geo-coordinates and information about how a building is used (as a shop or a gathering place) and 'mutations' (things that happened to the building). However, what it doesn't contain is building names (for example Rijksmuseum or Wolvenburg), which are contained in the Rijksmonumenten dataset. The Rijksmonumenten dataset in turn doesn't contain addresses, but as both datasets contain geo-coordinates, they can be linked. Yay for Linked Data!
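Linking two datasets on their geo-coordinates can be sketched roughly as follows. This is a minimal illustration and not the code we wrote at the hackathon: the record layout, the field names (`address`, `name`, `lat`, `lon`), the 25-metre threshold and the sample records are all assumptions made up for the example, and neither dataset's real schema is used.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6_371_000  # mean Earth radius in metres
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * r * asin(sqrt(a))


def link_by_location(addresses, monuments, max_distance_m=25.0):
    """Pair each named monument with the nearest address record, if close enough."""
    links = []
    for monument in monuments:
        if not addresses:
            break

        def distance_to(address):
            return haversine_m(monument["lat"], monument["lon"],
                               address["lat"], address["lon"])

        nearest = min(addresses, key=distance_to)
        distance = distance_to(nearest)
        # The threshold is an assumption; a real setting depends on data quality.
        if distance <= max_distance_m:
            links.append({
                "name": monument["name"],
                "address": nearest["address"],
                "distance_m": round(distance, 1),
            })
    return links


# Made-up illustrative records (hypothetical, not taken from either dataset).
addresses = [
    {"address": "Examplestraat 1", "lat": 52.3700, "lon": 4.8900},
    {"address": "Examplestraat 99", "lat": 52.3800, "lon": 4.9000},
]
monuments = [
    {"name": "Example House", "lat": 52.37005, "lon": 4.89005},
    {"name": "Remote Mill", "lat": 52.5000, "lon": 5.2000},
]

links = link_by_location(addresses, monuments)
for link in links:
    print(link["name"], "->", link["address"])
```

A nearest-neighbour match with a distance cut-off is only one possible choice; with larger datasets a spatial index would be needed, and buildings that share a coordinate would require extra disambiguation.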
To tell the stories, we wanted to find some more information in the National Library's newspaper collection. With some help from other hackers we managed to efficiently bring up news articles that mention a particular location. With some manual analysis we found, for example, that for Kloveniersburgwal 73 there was a steady stream of ads asking for ‘decent’ kitchen maids up until 1890, followed by a sudden spike in ads announcing real estate. It turns out a notary had moved in, for whom another (not linked) dataset could also provide a marriage license, confirmed by a wedding ad in the newspaper. These sorts of stories can give us more insight into what happened in a particular building at a given time.
We took some first steps towards analysing these ads automatically, so that such changes can be detected and timelines for locations can be generated automatically, but we didn't get that done in 24 hours. However, the audience was sufficiently pleased with our idea for us to win the audience award! (Admittedly to our great surprise, as the other teams' ideas were all really awesome as well.) We’re now looking for funding to complete the prototype.
In summary, it was all great fun, not least thanks to the great organisation by the National Library as well as the nice ‘bonding’ atmosphere among the teams. So, our lessons learnt:
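To give an idea of what such an automatic timeline could look like, here is a minimal sketch. It is not the prototype itself: it assumes the ads for one location have already been dated and assigned a category (which is the hard part), and the field names, category labels and sample ads are invented purely for illustration.

```python
from collections import Counter, defaultdict


def dominant_category_by_year(ads):
    """Return {year: most frequent ad category}, with years in ascending order."""
    counts = defaultdict(Counter)
    for ad in ads:
        counts[ad["year"]][ad["category"]] += 1
    return {year: counts[year].most_common(1)[0][0] for year in sorted(counts)}


def change_points(dominant):
    """Keep only the years in which the dominant category differs from the year before."""
    timeline = []
    previous = None
    for year, category in dominant.items():
        if category != previous:
            timeline.append((year, category))
            previous = category
    return timeline


# Invented toy ads for a single location (hypothetical, not real newspaper data).
ads = [
    {"year": 1888, "category": "domestic staff"},
    {"year": 1888, "category": "domestic staff"},
    {"year": 1889, "category": "domestic staff"},
    {"year": 1890, "category": "real estate"},
    {"year": 1890, "category": "real estate"},
    {"year": 1890, "category": "domestic staff"},
    {"year": 1891, "category": "real estate"},
]

timeline = change_points(dominant_category_by_year(ads))
for year, category in timeline:
    print(year, category)
```

Taking the most frequent category per year is a deliberately crude signal; a real version would have to deal with OCR noise, sparse years and ads that mention the address only in passing.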
- prison food is really not that bad (and there was lots of it)
- 24 hours of hacking is heaps of fun
- the data always turn out to behave differently from what you'd expect
- isolated from the daily routine, events like these prove crucial to foster new ideas and relations, in order to keep the field in motion.
(by Marieke van Erp)
This year, the 15th International Semantic Web Conference took place in Kobe, Japan. The conference itself lasted 3 days, with 3 parallel sessions as well as a 3-hour poster and demo session one evening. In the two days prior to the main conference, 5 tutorials and 16 workshops took place.
For NLP aficionados there was the LD4IE (Linked Data for Information Extraction) workshop, which I attended on Tuesday morning, the NLP&DBpedia workshop that I co-organised on Tuesday afternoon, the keynote by Kathleen McKeown (Columbia University) on Wednesday and the NLP session in the main conference on Friday. There were also other NLP papers dispersed across the conference programme.
For the CLARIAH community some of the work McKeown presented on computational analysis of novels is probably most relevant. It was also nice to see that more research is moving towards event extraction, for example in the work of Valentina Presutti and Aldo Gangemi (presented at the LD4IE workshop). They presented a new resource called Framester that links up all types of resources such as FrameNet, VerbNet and DBpedia to help describe events. New at the conference was the journal papers track, where I got to present our work on building Event-centric Knowledge Graphs [slides] [paper] to a pretty big room.
Sentiment analysis was also a hot topic, with several interesting papers such as On the Role of Semantics for Detecting pro-ISIS Stances on Social Media by Hassan Saif, Miriam Fernandez, Matthew Rowe and Harith Alani, and A Replication Study of the Top Performing Systems in SemEval Twitter Sentiment Analysis by Efstratios Sygkounas, Giuseppe Rizzo and Raphaël Troncy. Incidentally, the latter was one of only two replication papers in the conference.
There weren’t that many papers this year dealing with humanities research questions. Next year’s conference will take place in Vienna; perhaps CLARIAH can help change that?
[This post is based on Maartje Kruijt‘s Media Studies Bachelor thesis: “Supporting exploratory search with features, visualizations, and interface design: a theoretical framework“.]
In today’s network society there is a growing need to share, integrate and search the collections of various libraries, archives and museums. Tools need to be developed for researchers interpreting these interconnected media collections. In the exploratory phase of research, the media researcher has no clear focus and is uncertain what to look for in an integrated collection. Data visualization technology can be used to support the strategies and tactics that researchers employ in exploratory research.
The DIVE tool is an event-based linked media browser that allows researchers to explore interconnected events, media objects, people, places and concepts (see screenshot). Maartje Kruijt’s research project involved investigating to what extent and in what way the construction of narratives can be made possible in DIVE, in such a way that it contributes to the interpretation process of researchers. Such narratives can be either automatically generated on the basis of existing event-event relationships, or be constructed manually by researchers.
The research proposes an extension of the DIVE tool in which selections made during the exploratory phase can be presented in narrative form. This allows researchers to publish a narrative, but also to share narratives or reuse other people’s narratives. The interactive presentation of a narrative is complementary to its presentation in a text, and it can also serve as a starting point for further exploration by other researchers who make use of the DIVE browser.
Within DIVE and CLARIAH, we are currently extending the user interface based on the recommendations made in this thesis. You can read more about it in Maartje Kruijt’s thesis (Dutch). The user stories that describe the needs of media researchers are written in English and can be found in Appendix I.
- 23-02-2017 LD4LR: Linked Data for Linguistic Research
- 17-11-2016 THATCamp Amsterdam 2016: happy afterthoughts
- 16-11-2016 Team CLARIAH wins Audience Award at Hackalod 2016
- 16-11-2016 ISWC 2016
- 07-10-2016 The Role of Narratives in DIVE
- 16-09-2016 CLARIAH Linked Data Workshop
- 24-07-2016 Audiovisual Data And Digital Scholarship: Towards Multimodal Literacy
- 19-06-2016 Trip Report Language Resources and Evaluation Conference 2016 (LREC 2016)
- 09-07-2013 Workshop Research Infrastructures towards 2020
- 09-07-2013 Teaching Digital Humanities to Students