On Tuesday 9 April, the Institute for the Dutch Language launched a new version of the OpenSoNaR web application, which allows you to search in large quantities of written and spoken Dutch. The application provides access to data from the SoNaR Corpus, a collection of written texts of more than 500 million words, and the Corpus Gesproken Nederlands (CGN), a collection of 900 hours of Dutch speech.

The new web application makes it possible to search the data of the two collections (corpora). The texts are provided with additional linguistic information such as part of speech and lemma, and the sound fragments of the Corpus Gesproken Nederlands can be played. In the application you can easily search for a word, or do a more complex search by selecting a specific annotation or by using regular expressions. It is also possible to save the search results, consult the search history and view frequency lists.

CGN and SoNaR

The Corpus Gesproken Nederlands (CGN) is a collection of 900 hours (almost 9 million words) of contemporary Dutch speech from Flemish and Dutch speakers. The speech fragments (spontaneous and prepared) are provided with various transcriptions (e.g. orthographic, phonetic) and annotations (syntactic, POS-tags). The SoNaR Corpus contains more than 500 million words of text from various domains and genres.
All texts were automatically tokenized, POS-tagged and lemmatized. The named entities were also labelled.

OpenSoNaR can be accessed free of charge with a university user account, or with a CLARIN account. The application was developed by a team from the Institute for the Dutch Language, Tilburg University and Radboud University, within the CLARIN-NL and CLARIAH projects.


Call for Course and Workshop Proposals

32nd European Summer School in Logic, Language and Information - ESSLLI 2020
3-14 August, 2020, Utrecht, The Netherlands

Important dates

1 June 2019 Proposal submission deadline
14 September 2019 Notification
3-14 August, 2020 32nd European Summer School in Logic, Language and Information - ESSLLI 2020, Utrecht, The Netherlands

Under the auspices of FoLLI, the European Summer School in Logic, Language, and Information (ESSLLI) is organized every year in a different European country. It takes place over two weeks in the European summer and hosts approximately 50 different courses at both the introductory and advanced levels, attracting around 400 participants each year from all over the world.

The main focus of the program of the summer schools is the interface between linguistics, logic and computation, with special emphasis on human linguistic and cognitive ability. Courses, both introductory and advanced, cover a wide variety of topics within the combined areas of interest: Logic and Computation, Computation and Language, and Language and Logic. Workshops are also organized, providing opportunities for in-depth discussion of issues at the forefront of research, as well as a series of invited lectures.

For all relevant information, see the ESSLLI website.

A workshop on a proposed standard format for parliamentary data is organised by the CLARIN Interoperability Committee in close collaboration with Tomaž Erjavec and Andrej Pančur from CLARIN Slovenia. 

The workshop will take place on 23 May and 24 May 2019 at Utrecht University in Utrecht, The Netherlands. 

Workshop goal

The goal of the workshop is to propose an outline of a standard format for parliamentary data to the research community, to assess the support for it and to identify potential or real problems for its development and wide adoption. It is intended as a preparation for the work on the CLARIN endorsed proposal for a standard encoding of parliamentary data. If the workshop shows sufficient support for the proposed standard format, it may be followed in the future by a shared task and accompanying workshop.

To read more about the workshop and how you can participate, please visit the ParlaFormat event page.

Dates 30 September – 2 October 2019
Location Leipzig, Germany
Submission Deadline 15 April 2019
Submission link


CLARIN ERIC is happy to announce the CLARIN Annual Conference 2019 and calls for the submission of extended abstracts. CLARIN is a research infrastructure that makes digital language resources available to scholars, researchers, students and citizen-scientists from all disciplines, coordinates work on collecting language resources and tools, and offers advanced tools to discover, explore, exploit, annotate, analyse or combine such datasets, wherever they are located.


The CLARIN Annual Conference is organized for the wider Humanities and Social Sciences community in order to exchange experiences with and plans for the CLARIN infrastructure. This includes the design, construction and operation of the CLARIN infrastructure, the data, tools and services that it contains or should contain, its actual use by researchers, its relation to other infrastructures and projects, and the CLARIN Knowledge Sharing Infrastructure.


Special topic:

Humanities and Social Science research enabled by language resources and technology

We especially invite papers for a thematic session that reports on research carried out in the Humanities or the Social Sciences that crucially made use of language resources, technology or services from the CLARIN infrastructure. Perspectives addressed by these papers include, but are not limited to, use cases, data life cycles, service life cycles, and demonstrations, for instance the following:

  • a use case of language documentation enabled by CLARIN resources, tools and services
  • illustrations of data life cycles and of service life cycles for different types of CLARIN resources and tools
  • workflows for CLARIN resources and/or tools that support data-driven research in different SSH disciplines
  • a demonstration of an application in CLARIN that played a crucial role in addressing a specific research question in the Humanities and Social Sciences

Other topics: 

Use of the CLARIN infrastructure, e.g.

  • Use of the CLARIN infrastructure in Humanities and Social Sciences research;
  • Usability studies and evaluations of CLARIN services;
  • Analysis of the CLARIN infrastructure usage, identification of user audience and impact studies;
  • Showcases, demonstrations and research projects in Humanities and Social Sciences that are relevant to CLARIN;

Design and construction of the CLARIN infrastructure, e.g.

  • Recent tools and resources added to the CLARIN infrastructure
  • Metadata and concept registries, cataloguing and browsing
  • Persistent identifiers and citation mechanisms
  • Access, including single sign-on authentication and authorisation
  • Search, including Federated Content Search
  • Web applications, web services, workflows
  • Standards and solutions for interoperability of language resources, tools and services
  • Models for the sustainability of the infrastructure, including issues in curation, migration, financing and cooperation
  • Legal and ethical issues in operating the infrastructure

CLARIN Knowledge Infrastructure and Dissemination, e.g.

  • User assistance (help desks, user manuals, FAQs)
  • CLARIN portals and outreach to users
  • Videos, screencasts, recorded lectures
  • Researcher training activities
  • Knowledge infrastructure centres

CLARIN in relation with other infrastructures and projects, e.g.

  • Relations with other SSH research infrastructures such as DARIAH, CESSDA, etc.
  • Relations with meta-infrastructure projects such as EUDAT, RDA and Digital Humanities
  • Relations with national and regional initiatives


The programme of both the general sessions and the thematic session may include oral presentations, posters, and demos. The type of session for which a paper is selected will not depend on the quality of the paper but only on the appropriateness of the type of communication (more or less interactive) in view of the content of the paper. The authors of accepted submissions will be given an additional opportunity to demo their work.


Proposals for oral or poster presentations (optionally with demo) must be submitted as extended abstracts (length: 3-4 pages A4 including references) in PDF format, in accordance with the template (ZIP-archive, online Overleaf template). Authors can freely choose between anonymous and non-anonymous submission. 

Extended abstracts should address one or more topics that are relevant to the CLARIN activities, resources, tools or services, and this relevance should be explicitly articulated in the submission, as well as in the presentation at the conference. Contributions addressing desiderata for the CLARIN infrastructure that are currently not in place are also eligible. It is not required that the authors are or have been directly involved in national or cross-national CLARIN projects.

Extended abstracts must be submitted through the EasyChair submission system and will be reviewed by the Programme Committee. 

All proposals will be reviewed on the basis of both individual criteria and global criteria.

Individual acceptance criteria are the following:

  • Appropriateness: the contribution must pertain to the CLARIN infrastructure or be relevant for it (e.g., its use, design, construction, operation, exploitation, illustration of possible applications, etc.), and this relevance should be explicitly articulated in the submission. In addition, submissions to the special thematic session will be selected on the basis of their appropriateness to the special topic.
  • Soundness and correctness: the content must be technically and factually correct and methods must be scientifically sound, according to best practice, and preferably evaluated.
  • Meaningful comparison: the abstract must indicate that the author is aware of alternative approaches, if any, and highlight relevant differences.
  • Substance: concrete work and experiences will be preferred over ideas and plans.
  • Impact: contributions with a higher impact on the research community and society at large will be preferred over papers with lower impact.
  • Clarity: the abstract should be clearly written and well structured.
  • Timeliness and novelty: the work must convey relevant new knowledge to the audience at this event.



  • Lars Borin, University of Gothenburg, Sweden
  • António Branco, University of Lisbon, Portugal
  • Griet Depoorter, Dutch Language Institute, The Netherlands/Flanders
  • Koenraad De Smedt, University of Bergen, Norway
  • Roald Eiselen, South African Centre for Digital Language Resources, South Africa
  • Tomaž Erjavec, Jožef Stefan Institute, Slovenia
  • Eva Hajičová, Charles University Prague, Czech Republic
  • Erhard Hinrichs, University of Tübingen, Germany
  • Nicolas Larrousse, Huma-Num, France
  • Krister Lindén, University of Helsinki, Finland
  • Monica Monachini, Institute of Computational Linguistics «A. Zampolli», Italy
  • Karlheinz Mörth, Austrian Academy of Sciences, Austria
  • Costanza Navaretta, University of Copenhagen, Denmark
  • Jan Odijk, Utrecht University, the Netherlands
  • Maciej Piasecki, Wrocław University of Science and Technology, Poland
  • Stelios Piperidis, ILSP, Athena Research Center, Greece
  • Eirikur Rögnvaldsson, University of Iceland, Iceland
  • Kiril Simov, IICT, Bulgarian Academy of Sciences, Bulgaria (Chair)
  • Inguna Skadiņa, University of Latvia, Latvia 
  • Marko Tadič, University of Zagreb, Croatia
  • Jurgita Vaičenonienė, Vytautas Magnus University, Lithuania
  • Tamás Váradi, Research Institute for Linguistics, Hungarian Academy of Sciences
  • Kadri Vider, University of Tartu, Estonia
  • Martin Wynne, University of Oxford, United Kingdom


3rd Workshop on Humanities in the Semantic Web - WHiSe III

Date May 20, 21, or 22, 2019 (to be confirmed)
Venue Leipzig, Germany (co-located with LDK 2019)
Hashtag #whise2019
Twitter @whiseworkshop

Workshop chairs:

  • Alessandro Adamou - Data Science Institute, NUI Galway, Ireland
  • Marieke van Erp - KNAW Humanities Cluster, The Netherlands
  • Albert Meroño Peñuela - Vrije Universiteit Amsterdam, The Netherlands


The emergence of tractable and affordable methods for the collection, enhancement and analysis of data generated en masse has helped shape several research fields, such as social sciences, into structured research fields. Digital Humanities are enjoying such a transformation to the point that their very boundaries and methodological foundations are being called into question. The quality and relevance of findings obtained from the thorough, human-driven analysis of a few sources, compared to unsupervised large-scale analytics on masses of data, is a fervent ongoing debate; and yet, the latter cannot prescind from a conscious effort in shaping the world to which the analyses need to relate.
This has largely taken the form of knowledge modelling efforts, from which many ontologies, controlled vocabularies and conceptual models like CIDOC-CRM, the Europeana Data Model and FRBRoo have arisen. However, other fields traditionally less reliant on machine-readable data have seen the emergence of "ecological" communities with an approach to the Web of Data. Recent examples include the 2014 ISAW papers for the ancient world, Transforming Musicology for music and musicology and Linked Pasts for history and archaeology. As these emerging research networks deal with the reality of the Semantic Web and the ever-growing Linked Data Cloud, the WHiSe workshop series was conceived from a reflection on the extent to which the Semantic Web community is serving the needs of historians, philologists, cultural critics, musicologists and other humanists that generally:

  1. cannot rely on structured data generated en masse through social networks or online media platforms;
  2. deal with vague, fragmentary, uncertain, contradictory and yet still valuable evidence that poses a challenge even to Artificial Intelligence research per se;
  3. have good reason to value the systematic investigation of a few sources over the (semi-)automated analytical findings on masses of content.

WHiSe addresses this need by promoting dialogue between humanists who employ or are contemplating Semantic Web technologies, and Semantic Web scholars providing accounts of applied research in the Humanities. It will also be a forum for raising opportunities to explore novel research problems that can be relevant to both communities.

WHiSe III welcomes original research contributions crossing Humanities and the Semantic Web. Scholars who have conducted research or developed impactful applications are invited to submit full papers (12 pages, Springer LNCS typeset) with appropriately evaluated contributions. WHiSe III also welcomes short vision or position papers (6 pages, Springer LNCS typeset) on novel challenges or approaches to existing problems.

Topics include, but are not limited to:

  • Knowledge base generation from classical texts
  • Linking data within and across gazetteers
  • Semantic enrichment of data from historical records and biographies
  • Ecosystems and process descriptions for linking data in the humanities
  • Linked Digital Libraries and semantic archives
  • Ontology adoption in specific domains in the humanities
  • Knowledge graph construction and exploitation within and across domains
  • Computational methods for the prosopography of historical figures
  • Capturing, modelling and reasoning on musical data
  • The role of ontologies and controlled vocabularies in data preservation
  • Criticism of Semantic Web standards from the point of view of humanities scholarship
  • Ethical issues in using Semantic Web and Linked Data and their impact on the openness of traditional research data
  • Notions on integrating digital humanities and data science
  • Knowledge bottlenecks and practical difficulties in using Semantic Web technologies by humanities scholars
  • Utopian / dystopian visions of the Semantic Web of the future

Submissions in all the categories mentioned above (both full and short papers) will be peer-reviewed by acknowledged researchers familiar with both scientific communities.

Accepted papers will be published as online proceedings courtesy of


Submission deadline: Tuesday, March 5, 2019
Notification to authors: Tuesday, April 2, 2019
Camera-ready due on: Wednesday, April 17, 2019
Workshop day: May 20, 2019
All deadlines are 23:59 Hawaii time


Papers will be evaluated according to their significance, originality, technical content, style, clarity, and relevance to the workshop. We welcome the following types of contributions:

  • Full papers (up to 12 pages)
  • Short papers (up to 6 pages)

All submissions must be PDF documents written in English and formatted according to the LNCS instructions for authors.

Page limits are inclusive of references and appendices, if any. Papers are to be submitted through the EasyChair Conference Management System.

Please note that paper submissions to WHiSe III are not anonymous.

At least one author of each accepted paper must register for both the workshop and the conference in order to present the paper at the workshop. For further instructions please refer to the LDK 2019 page.


Every submitted paper must represent original and unpublished work: it must not be under review or accepted elsewhere and there must be a significantly clear element of novelty distinguishing a submitted paper from any other prior publication or current submission.

PROGRAM COMMITTEE (to be extended)

  • Elton Barker, The Open University
  • Francesca Benatti, The Open University
  • Victor de Boer, Vrije Universiteit Amsterdam
  • Enrico Daga, The Open University
  • Rossana Damiano, University of Turin
  • Marilena Daquino, Alma Mater Studiorum Università di Bologna
  • Paula Granados-Garcia, The Open University
  • Eero Hyvönen, Aalto University and University of Helsinki (HELDIG)
  • Ioanna Kyvernitou, National University of Ireland Galway
  • Paul Mulholland, The Open University
  • Silvio Peroni, Alma Mater Studiorum Università di Bologna
  • Rainer Simon, Austrian Institute of Technology
  • Konstantin Todorov, University of Montpellier
  • Francesca Tomasi, Alma Mater Studiorum Università di Bologna
  • François Vignale, Université du Maine