The increasing availability of massive quantities of digital data is one of the main reasons why an infrastructure project such as CLARIAH CORE is needed. Such amounts of data are impossible to investigate in the traditional way: researchers need software to find potentially relevant parts, to set irrelevant ones aside, and to analyse the data. At the same time, using software to search in and analyse massive amounts of digital data creates new opportunities for breakthroughs in humanities research, because research can now draw on more data than was ever possible before, and because automatic analysis software is, for certain search and analysis tasks, more reliable than humans are or can ever be (although in other tasks humans still outperform software).
Data come in many types. The major types are natural language texts, audio-visual data and structured data (databases). All three types are represented in CLARIAH. Though all types occur in all of CLARIAH’s core disciplines, each core discipline has its own dominant data type:
- Linguistics: natural language texts (see: CLARIAH/CLARIN tools and services)
- Social-economic history: structured (often quantitative) data (see: Data Legend)
- Media Studies and oral history: audio-visual data (see: CLARIAH Media Suite)
In addition, a discipline-independent work package deals with data that are useful or needed for all humanities disciplines.
The use of data in each of these disciplines is described below.
Linguistics
In the linguistics work package (WP3), natural language texts play a central role. For some research questions the texts suffice as they are, but in most cases they must be enriched with linguistic annotations, such as part-of-speech tags for word occurrences, full syntactic structures for sentence occurrences (treebanks), and many other types of linguistic annotation. Searching in these linguistically enriched data requires special applications. Both the software to enrich the textual corpora (Frog, Alpino, Namescape, etc.)[1] and the applications for searching in the enriched corpora (OpenSONAR, PaQu, GrETEL, MIMORE, and others) were available before the start of CLARIAH-CORE or are being developed in independent projects (e.g. Nederlab), but many of them are extended and improved in CLARIAH. Because the data are large and distributed over multiple centres, CLARIAH will also develop search applications that support distributed or even federated search. The major centres for natural language texts are the Meertens Institute, the Huygens Institute, the Institute for the Dutch Language, and DANS.
[1] See http://portal.clarin.nl/ for more examples
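To give an impression of what searching in an enriched corpus involves, here is a minimal sketch in Python. It assumes a simple tab-separated annotation format (one token per line with word, lemma and part-of-speech tag, and blank lines between sentences); the file name, the column layout and the tag "WW" (the CGN tag for verbs, as used by tools such as Frog) are illustrative assumptions, not the actual formats or interfaces of the CLARIAH applications mentioned above.

```python
# Minimal sketch of searching a POS-annotated corpus.
# Assumed input: tab-separated lines "word<TAB>lemma<TAB>POS",
# with blank lines separating sentences (illustrative format only).

def read_sentences(path):
    """Yield sentences as lists of (word, lemma, pos) tuples."""
    sentence = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:                      # blank line ends a sentence
                if sentence:
                    yield sentence
                    sentence = []
                continue
            word, lemma, pos = line.split("\t")[:3]
            sentence.append((word, lemma, pos))
    if sentence:
        yield sentence

def sentences_with_lemma(path, lemma, pos_prefix):
    """Return sentences containing the lemma with a POS tag starting with pos_prefix."""
    return [s for s in read_sentences(path)
            if any(l == lemma and p.startswith(pos_prefix) for _, l, p in s)]

# Example: sentences in which the lemma "lopen" occurs as a verb ("WW" in the CGN tagset).
hits = sentences_with_lemma("corpus_annotated.tsv", "lopen", "WW")
print(len(hits), "matching sentences")
```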
Social-Economic History
In the social-economic history work package (WP4), structured databases play the dominant role: information on social-economic history is encoded in databases. The relevant information concerns several levels: the micro level (individuals and families), the meso level (organisations, trade unions, guilds, etc.) and the macro level (national and supranational data). The problem is that each database has its own structure and uses its own vocabulary, so there is neither syntactic nor semantic interoperability. WP4 addresses this problem through the Linked Data (LD) paradigm. In this approach, all information is encoded as triples consisting of a predicate and two arguments (usually called the 'subject' and the 'object'). This resolves the syntactic interoperability problem, since all databases then have the same structure: one big table of triples. The triples can be encoded in different ways; RDF is the most widely used encoding and is also the one used in WP4. Semantic interoperability is addressed by harmonising the vocabularies used and by ensuring that the elements of each triple (the predicate, the subject and the object) are associated with clearly defined concepts. Turning to the LD paradigm also makes it possible to link to external data sources published as Linked Data, a collection that is already huge and continues to grow.
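As an illustration of this triple encoding, the sketch below uses the Python library rdflib to express a few facts about a fictitious person as (subject, predicate, object) triples. The ex: vocabulary and the person data are invented for the example; they are not WP4's actual schemas.

```python
# Minimal sketch of encoding database records as RDF triples with rdflib.
# The ex: vocabulary and the person data are invented for illustration.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, FOAF, XSD

EX = Namespace("http://example.org/vocab/")

g = Graph()
person = URIRef("http://example.org/person/12345")

# Every statement is one (subject, predicate, object) triple.
g.add((person, RDF.type, FOAF.Person))
g.add((person, FOAF.name, Literal("Jan Jansen")))
g.add((person, EX.birthYear, Literal(1872, datatype=XSD.integer)))
g.add((person, EX.occupation, Literal("dock worker")))

# Serialising as Turtle shows that the whole 'database' is just a set of triples.
print(g.serialize(format="turtle"))
```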
Once all data are encoded as triples, relations can be sought across different databases, possibly from different levels. This makes it possible to test hypotheses about correlations that could not be investigated before, and data mining the combined LD databases may reveal new correlations.
Searching and analysing Linked Data requires a dedicated query language. Such a language exists (SPARQL), and CLARIAH will experiment with it and with its suitability for formulating queries in the social-economic history domain.
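The following sketch shows what such a SPARQL query can look like, again with invented data and an invented ex: vocabulary, evaluated in memory with rdflib; real WP4 data would live in a dedicated triple store rather than in an in-memory graph.

```python
# Minimal sketch of a SPARQL query over a small in-memory RDF graph (rdflib).
# The data and the ex: vocabulary are illustrative only.
from rdflib import Graph

data = """
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ex:   <http://example.org/vocab/> .

<http://example.org/person/1> a foaf:Person ;
    foaf:name "Jan Jansen" ;
    ex:birthYear 1872 .

<http://example.org/person/2> a foaf:Person ;
    foaf:name "Aaltje de Vries" ;
    ex:birthYear 1901 .
"""

g = Graph()
g.parse(data=data, format="turtle")

# Select everyone born before 1900.
query = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX ex:   <http://example.org/vocab/>

SELECT ?name ?year WHERE {
    ?person a foaf:Person ;
            foaf:name ?name ;
            ex:birthYear ?year .
    FILTER(?year < 1900)
}
"""
for row in g.query(query):
    print(row.name, row.year)
```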
Since triples are very fine-grained, a huge number of them is needed to encode all the information. This, in turn, imposes special requirements on storage and on systems that enable efficient search in such large sets of triples. Research on these matters is also carried out in CLARIAH. The major data centre for WP4 is the International Institute of Social History.