Fondazione Giorgio Cini. Image credit: Johan Oomen CC-BY

We are excited to announce that DIVE+ has been awarded the Grand Prize at the LODLAM Summit, held at the Fondazione Giorgio Cini this week. The summit brought together around 100 experts from the vibrant, global community of Linked Open Data in Libraries, Archives and Museums. It has been organised every two years since 2011. Earlier editions were held in the US, Canada and Australia, making the 2017 edition the first in Europe.

The Grand Prize (USD 2,000) was awarded by the LODLAM community in recognition of how DIVE+ demonstrates the social, cultural and technical impact of linked data. The Open Data Prize (USD 1,000) was awarded to WarSampo for its groundbreaking approach to publishing open data.

Finalists

Five finalists were invited to present their work, selected from 21 submissions after an open call published earlier this year. Johan Oomen, head of research at the Netherlands Institute for Sound and Vision, presented DIVE+ on day one of the summit. The slides of his pitch have been published, as well as the demo video that was submitted to the open call. Besides DIVE+ (Netherlands) and WarSampo (Finland), the finalists were Oslo public library (Norway), Fishing in the Data Ocean (Taiwan) and Genealogy Project (China). The diversity of the finalists is a clear indication that the use of linked data technology is gaining momentum. Throughout the summit, delegates have been capturing the outcomes of various breakout sessions. Please look at the overview of session notes and follow @lodlam on Twitter to keep track.

Pictured: Johan Oomen (@johanoomen) pitching DIVE+. Photo: Enno Meijers. https://twitter.com/ennomeijers/status/880035572798062593/photo/1

What is DIVE+

DIVE+ is an event-centric linked data browser for digital collections that aims to provide integrated and interactive access to multimedia objects from various heterogeneous online collections. It enriches the structured metadata of online collections with linked open data vocabularies, with a focus on the events, people, locations and concepts that are depicted in or associated with particular collection objects. DIVE+ is the result of a truly interdisciplinary collaboration between computer scientists, humanities scholars, cultural heritage professionals and interaction designers. DIVE+ is integrated in the national CLARIAH (Common Lab Research Infrastructure for the Arts and Humanities) research infrastructure.

Pictured: each day experts shape the agenda for that day, following the OpenSpace format. Image credit: Johan Oomen (cc-by)

DIVE+ is a collaborative effort of the VU University Amsterdam (Victor de Boer, Oana Inel, Lora Aroyo, Chiel van den Akker, Susane Legene), Netherlands Institute for Sound and Vision (Jaap Blom, Liliana Melgar, Johan Oomen), Frontwise (Werner Helmich), University of Groningen (Berber Hagendoorn, Sabrina Sauer) and the Netherlands eScience Centre (Carlos Martinez). It is supported by CLARIAH.

About the challenge

The challenge is sponsored by Synaptica. We would also like to thank the organisers, especially Valentine Charles and Antoine Isaac of Europeana and Ingrid Mason of Aarnet, for all of their efforts.
LODLAM 2017 has been a truly unforgettable experience for the DIVE+ team.

Description

These awards, named in honour of Steven Krauwer (the first executive director of CLARIN ERIC) are given annually to outstanding scientists or engineers in recognition of outstanding contributions toward CLARIN goals in the areas of language resource building, tool or service development, exemplary use cases, user involvement or knowledge sharing.

Categories

In 2017, two Steven Krauwer Awards will be given, one in each of the following categories:

  1. Young CLARIN Researcher. The intended goal of this award is to promote a young person below the age of 35.
  2. CLARIN Achievements. The intended goal of this award is to recognize the work of a person or a group of cooperating people.

Prize

A certificate will be given to each recipient; in addition, the prize for Young CLARIN Researcher will be 500 Euro.

Eligibility

The award recipients must be researchers or engineers working in any of the countries or regions covered by CLARIN ERIC members, CLARIN ERIC observers or partners that have an agreement with CLARIN ERIC according to article 18 of the Statutes.

The recipient of the award for Young CLARIN Researcher should not have reached the age of 35 years on December 31, 2017.

The recipient of the award for CLARIN achievements should be a single person or a group of people who have closely cooperated with each other.

Selection

The recipients will be selected by the National Coordinators’ Forum (NCF) of CLARIN ERIC. The NCF is responsible for establishing the procedure for reviewing the nominees’ work and for selecting the award recipients.

Presentation

There will be a prize ceremony at the CLARIN Annual Conference.

Nomination

Candidates must be nominated by at least two people who are not affiliated with the nominees' home institution. Nominators and nominees must be working in the countries or regions covered by CLARIN ERIC members or by CLARIN ERIC observers. Ideally, the nominators should come from two different countries.

Nominators should provide a written account of the nominees' work and the reasons it is felt to be an outstanding contribution to CLARIN. A list of bibliographic references to the nominees' work and relevant links to language resources, language tools, use cases, user involvement, knowledge sharing and/or other related activities by the nominees are desirable.

Nominations should state the names, affiliations and contact details of all nominators and should be sent to the Chair of the NCF, Koenraad De Smedt, by e-mail to .

Nomination Deadline for 2017

July 31, 2017, 13:00 CET.


The eScience Center and CLARIAH are pleased to announce the initiation of four new projects in the Arts and Humanities. The four projects will pursue new scientific domain challenges and enhance and accelerate the process of scientific discovery within the Arts and Humanities using computer science, data science, and eScience technologies.

Scheduled to start in the second half of 2017, the projects are collaborations with research teams from multiple Dutch academic groups. The granted projects will use, adapt, and integrate existing methods and tools, as made available through the CLARIAH and eScience Center software infrastructures. Newly developed tools will be made available through the eScience Technology Platform of the Netherlands eScience Center and the CLARIAH Infrastructure for potential use in other studies.

The granted projects are:

Bridging the gap: Digital Humanities and the Arabic-Islamic corpus

Who: Prof. dr. Christian Lange
Where: Utrecht University
What:

Despite some pioneering efforts in recent times, the longue durée analysis of conceptual history in the Islamic world remains a largely unexplored field of research. Researchers of Islamic intellectual history still tend to study a certain canon of texts, made available by previous Western researchers of the Islamic world largely based on considerations of the relevance of these texts for Western theories, concepts and ideas. Indigenous conceptual developments and innovations are therefore insufficiently understood, particularly as concerns the transition from premodern to modern thought in Islam.

This project seeks to harness state-of-the-art Digital Humanities approaches and technologies to make pioneering forays into the vast corpus of digitised Arabic texts that has become available in the last decade. This is done along the lines of four case studies, each of which examines a separate genre of Arabic and Islamic literary history (jurisprudence, inter-faith literature, early modern and modern journalism, and Arabic poetry).

This project seeks to develop a web-based application that will (a) enable easy access to existing Arabic corpora on GitHub and other online repositories and offer the opportunity for researchers to upload their own corpus, (b) offer a set of tools for Arabic text mining and computational analysis, and (c) provide opportunities to link search results to the datasets in Islamic and Middle Eastern Studies of Brill Publishers, Europe’s leading publisher in this area.

The project will be embedded in two ongoing ERC projects on Islamic intellectual history housed at the Department of Philosophy and Religious Studies at Utrecht University, and will collaborate closely with international initiatives in the field of Arabic Digital Humanities.

TICCLAT: Text-Induced Corpus Correction and Lexical Assessment Tool

Who: Dr. Martin Reynaert
Where: Tilburg University
What:

The Text-Induced Corpus Clean-up tool TICCL, an integral part of the CLARIN infrastructure, is globally unique in using corpus-derived word form statistics to attempt fully automatic post-correction of texts digitized by means of Optical Character Recognition.

The NWO 'Groot' project Nederlab will deliver by the end of 2017 a uniformly processed and linguistically enriched diachronic corpus of Dutch containing an estimated 5-6 billion word tokens. We aim to extend TICCL's correction capabilities with classification facilities based on specific data collected from the full Nederlab corpus: word statistics, document and time references and linguistic annotations, i.e. Part-of-Speech and Named-Entity labels. These data will complement a solid, renewed basis composed of the available validated lexicons and name lists for Dutch.

In this way, TICCL as a post-correction tool will be transformed into TICCLAT, a lexical assessment tool capable of delivering not only correction candidates, but also, for example, more accurately dated diachronic Dutch word forms and more securely classified person and place names. To achieve this at scale, the TICCLAT project will seek a successful merger of TICCL's anagram hashing with bit-vectorization techniques. TICCLAT's capabilities will also be evaluated in comparison to human performance by an expert psycholinguist.
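To give a rough idea of what anagram hashing means, the following Python sketch assigns each word a numeric key that depends only on the characters it contains, not on their order, so that candidate corrections can be found by arithmetic on keys rather than by comparing strings pairwise. This is an illustration, not TICCL's actual code: the exponent of 5, the use of Unicode code points, and the function names anagram_value, build_index and substitution_candidates are all assumptions made for this example.

```python
from collections import defaultdict

POWER = 5  # assumed exponent; chosen here only to keep key collisions rare


def anagram_value(word: str) -> int:
    """Order-independent key: sum of each character's code point to a fixed power."""
    return sum(ord(ch) ** POWER for ch in word)


def build_index(lexicon):
    """Map each anagram key to the set of lexicon words that share it."""
    index = defaultdict(set)
    for word in lexicon:
        index[anagram_value(word)].add(word)
    return index


def substitution_candidates(word, index, alphabet):
    """Find lexicon words whose key equals the key of `word` with one character swapped.

    Matches are made on keys only, so anagrams of the intended form can also
    be returned; a real system would filter them with an edit-distance check.
    """
    base = anagram_value(word)
    found = set()
    for old in set(word):
        for new in alphabet:
            if new == old:
                continue
            target = base - ord(old) ** POWER + ord(new) ** POWER
            found |= index.get(target, set())
    return found


# Hypothetical mini-lexicon and an OCR-style error ("u" misread as "n").
lexicon = ["huis", "boom", "water"]
index = build_index(lexicon)
candidates = substitution_candidates("hnis", index, "abcdefghijklmnopqrstuvwxyz")
```

Because a substitution changes the key by a fixed amount per character pair, every lookup is a single dictionary access, which is what makes this kind of approach attractive for very large corpora.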

The data collected will be exportable for storage in a data repository, as RDF triples, for broad reuse. The project will greatly contribute to a more comprehensive overview of the lexicon of Dutch since its earliest days and of the person and place names that share its history. Its partners are the Dutch experts in Lexicology, Person Names and Toponyms.

News Genres: Advancing Media History by Transparent Automatic Genre Classification (NEWSGAC)

Who: Prof. dr. Marcel J. Broersma
Where: University of Groningen
What:

This project studies how genres in newspapers and television news can be detected automatically using machine learning in a transparent manner. This will enable us to capture the often hypothesized but, due to the highly time consuming nature of manual content analysis, largely understudied shift from opinion-based to fact-centred reporting. Moreover, we will open the black box of machine learning by comparing, predicting and visualizing the effects of applying various algorithms on heterogeneous data with varying quality and genre features that shift over time. This will enable scholars to do large-scale analyses of historic texts and other media types as well as critically evaluate the methodological effects of various machine learning approaches.

This project brings together expertise of journalism history scholars (RUG), specialists in data modelling, integration and analysis (CWI), digital collection experts (KB & NISV) and e-science engineers (eScience Center). It will first use a big manually annotated dataset (VIDI-project PI) to develop a transparent and reproducible approach to train an automatic classifier. Building upon this, the project will generate three outcomes:

  1. A study that revises our current understanding of the interrelated development of genre conventions in print and television journalism based upon large-scale automated content analysis via machine learning;
  2. Metrics and guidelines for evaluating the bias and error of the different preprocessing and machine learning approaches and off-the-shelf software packages;
  3. A dashboard that integrates, compares and visualises different algorithms and underlying machine learning approaches which can be integrated in the CLARIAH media suite.

EviDENce: Ego Documents Events modelliNg. How individuals recall mass violence

Who: Dr. Susan Hogervorst
Where: Open Universiteit Nederland
What:

Much of our historical knowledge is based on oral or written accounts of eyewitnesses, particularly in cases of war and mass violence, when regular ways of documentation and record keeping are often absent. Although oral history and the study of ego documents both value these individual perspectives on history and its meaning, these research fields tend to operate separately. However, the digital revolution has shaken up the balance between spoken and written text. The paradigm emerging in the application of search technology to digitised oral history is characterised by a post-documentary sensibility: away from text and sensitive to other dimensions of human expression than language. Nonetheless, ‘mining’ of oral history accounts remains valuable in humanities research, especially considering the re-use of digital interview collections throughout the humanities.

EviDENce explores new ways of analysing and contextualising historical sources by applying event modelling and semantic web technologies. Our project suggests a systematic and integral content analysis of ‘ego-sources’ by applying state-of-the-art entity and event modelling methods and tools, in order to explore the nature and value of ego-sources and to disclose existing collections. We focus on representations of mass-violence in two case studies to generate and explore different kinds of events: 1) a synchronic analysis of WW2 events, centered around the oral history collection ‘Getuigenverhalen’ [1] and using the WW2 thesaurus [2], and 2) a diachronic analysis of ego-documents (1573-2012) from Nederlab [3]. In both cases, we use content-related contextual sources from Nederlab [4].

About the ADAH Call

The four projects result from the recent ADAH call (Accelerating Scientific Discovery in the Arts and Humanities). The purpose of the 2016 ADAH call is to enable researchers working in the Arts and Humanities to address compute-intensive and/or data-driven problems within their research and to contribute to a generic and sustainable research software infrastructure.

About the Netherlands eScience Center

The eScience Center is the national hub for the development and application of domain overarching software and methods for the scientific community. The eScience Center develops crucial bridges between increasingly complex modern e-infrastructures and the growing demands and ambitions of scientists from across all scientific disciplines. 

About CLARIAH

CLARIAH is a national project that is designing, constructing and exploiting the Dutch parts of the European CLARIN and DARIAH infrastructures. CLARIAH covers the humanities as a whole but has three core discipline areas: linguistics, media studies, and socio-economic history.

Contact information

Prof. dr. Jan Odijk, Program Director CLARIAH
+31 (0)30 253 5745

Dr. Frank Seinstra, Director eScience Program, Netherlands eScience Center
+31 (0)20 4604770

We invite submissions of papers to a special issue of the journal “Language Resources and Evaluation”. The special issue will focus on the use of language technology for digital humanities and will have the title: Language Technology for Digital Humanities.

MOTIVATION

The use of digital resources and tools across humanities disciplines has steadily increased, giving rise to new research paradigms and associated methods that are commonly subsumed under the term “digital humanities”. Digital humanities does not constitute a new discipline in itself, but rather a new approach to humanities research that cuts across different existing humanities disciplines.
While digital humanities extends well beyond language-based research, textual resources and spoken language materials play a central role in most humanities disciplines. Applying language technology (LT) tools and data to digital humanities research implies new perspectives on these resources regarding domain adaptation, interoperability, technical requirements, documentation, and usability of user interfaces.

TOPICS

We invite original contributions on completed work, not published before and not under consideration for publication elsewhere. Specific topics include, but are not limited to:

  • Case studies of using language technology and/or language resources with the goal of finding new answers to existing research questions in a particular humanities discipline or addressing entirely new research questions
  • Case studies of expanding the functionality of existing language processing tools in order to be able to address research questions in digital humanities
  • The design of new language processing tools as well as annotation tools for spoken and written language, showcasing their use in digital humanities research
  • Domain adaptation of rule-based, statistical, or machine-learning models for language processing tools in digital humanities research
  • Challenges posed for language processing tools when used on diachronic data, language variation data, or literary texts
  • Showcasing the use of language processing tools in humanities disciplines such as anthropology, gender studies, history, literary studies, philosophy, political science, and theology

SUBMISSION

Accepted papers will have a length of 20-30 pages, excluding references.

Authors are advised to use the journal's online manuscript submission system. Make sure to select the special issue when asked to provide the article type. More information, including formatting instructions for authors, can be found on the journal's webpage at: http://www.springer.com/education+%26+language/linguistics/journal/10579#

Authors are requested to send a brief email to the guest editors () indicating their intention to participate as soon as possible, including their contact information and the topic they intend to address in their submission. Questions regarding the special issue should be sent to the same address.

IMPORTANT DATES

GUEST EDITORS

  • Erhard Hinrichs, University of Tübingen
  • Marie Hinrichs, University of Tübingen
  • Sandra Kübler, Indiana University
  • Thorsten Trippel, University of Tübingen

CONTACT

Date: 13 June 2017
Time: 09.30-18.30
Location: Spinhuis, Oudezijds Achterburgwal 185, 1012 DK Amsterdam

Are you intrigued by Linked (Open) Data but don’t know how to get started with it? Do you have tabular data in the Arts and Humanities domain lying around? Then this workshop might just be the thing for you.

In September 2016, CLARIAH organised its first Linked Open Data Workshop, which introduced Linked Data to the wider arts and humanities community. On 13 June 2017, we will organise a second CLARIAH-wide workshop in which you can obtain hands-on experience with the CLARIAH tools for Linked Data. In the workshop, we will show you how to convert your tabular data to RDF, connect it to other datasets and explore, analyse and visualise the resulting enriched dataset.
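As a rough illustration of what converting tabular data to RDF involves, the Python sketch below turns each row of a CSV file into a set of N-Triples statements, using only the standard library. It is not the workflow of the CLARIAH tools used in the workshop: the http://example.org/ namespace, the column names in the sample, and the function names escape_literal and csv_to_ntriples are assumptions made for this example.

```python
import csv
import io
from urllib.parse import quote

BASE = "http://example.org/"  # hypothetical namespace, replace with your own


def escape_literal(value: str) -> str:
    """Escape the characters that may not appear raw inside an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def csv_to_ntriples(csv_text: str, id_column: str) -> list:
    """Turn every non-empty cell into one triple: row -> column -> cell value."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # The identifier column names the subject of all triples for this row.
        subject = f"<{BASE}resource/{quote(row[id_column], safe='')}>"
        for column, value in row.items():
            if column is None or column == id_column or not value:
                continue  # skip the identifier itself, empty cells and stray fields
            predicate = f"<{BASE}vocab/{quote(column, safe='')}>"
            triples.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return triples


# Hypothetical sample data.
sample = "id,name,birthplace\np1,Anna,Utrecht\np2,Jan,\n"
for line in csv_to_ntriples(sample, "id"):
    print(line)
```

Every value is emitted as a plain string literal here. Connecting the data to other datasets would mean replacing some of those literals with URIs from shared vocabularies, which is where most of the actual modelling work lies.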

We aim to support two user groups explicitly: technically savvy but novice users, and more advanced users. Novice users are typically familiar with software tools such as Excel and Access; more advanced users may already be using Python, R and MySQL.

The plan

  • Sign up via the EventBrite page
  • Send us a half-page description of your dataset one week prior to the workshop (6 June), preferably with a small data sample so we can get an idea of the types of data that participants bring
  • Bring a laptop to the workshop with your data
  • We will show you how to convert your data to Linked Data using Timbuctoo, and help you do it
  • We will teach you how to link your data to other sources
  • We will help you figure out how to explore, analyse and visualise your data so you can answer your data-driven research questions
  • We will present some short use cases to give you an idea of the types of things you can do, but after that you get to work on your own dataset
  • Staff from the different CLARIAH domains will be on hand to give you advice

Info

  • Contact  for more information

Organisers

  • Marnix van Berchum (WP2, Engineering)
  • Jauco Noordzij (WP2, Engineering)
  • Marieke van Erp (WP3, Linguistics)
  • Albert Meroño Peñuela (WP4, Social and Economic History)
  • Victor de Boer (WP5, Media Studies)