MasterSearch aims to train the next generation of data-driven masters of concepts using AutoSearch. It develops a suite of 'how-to' tools that facilitate answering a concept-focused research question using corpus-based techniques on a machine-readable corpus.

Principal investigator
  • Arianna Betti

In what way can a group of MA students in philosophy and related disciplines rely on AutoSearch to facilitate answering a concept-focused research question of their choice, using corpus-based techniques on a machine-readable corpus of their choice in a language of their choice? In particular, we ask the following subquestions:
  • What support do students need to be aptly guided in implementing a state-of-the-art workflow, from corpus identification to the answering of their research question?
  • How can they best be taught to identify and gather their corpora and convert them into FoLiA XML?
  • How can students best be guided in learning to use AutoSearch's modes to answer a concept-focused, corpus-based question?

CLARIAH Components

We address these questions by setting up a series of real-life classroom experiments during the course Data-Driven History of Ideas, taught at the University of Amsterdam, and by (re)designing the course to fit the Fellowship's aims. We maintain but adapt the current setup, which is organized around two challenges (corpus building, and research design and execution), as well as the current role-playing format. The didactic aims will be adjusted and new teaching materials will be introduced.

As part of their required training, the students will now be asked to execute a workflow that runs from identifying a research question, to building a corpus in FoLiA XML for upload to AutoSearch, to answering a concept-focused research question through corpus analysis via AutoSearch's modes, and finally to reporting on the results. Execution of the workflow will be supported by instructions in the form of lectures, manuals, scripts and tutorials.
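The corpus-conversion step of such a workflow might look roughly like the following sketch. It is an illustration only and not one of the project's own scripts: the function name `text_to_folia`, the identifier scheme, and the paragraph-splitting rule are assumptions made for this example. The output is a deliberately simplified FoLiA-style skeleton (paragraphs with text content) that omits the metadata and annotation declarations a complete FoLiA document carries, so it should be checked with a FoLiA validator before any upload.

```python
import xml.etree.ElementTree as ET

# Namespace used by FoLiA documents, plus the standard XML namespace for xml:id.
FOLIA_NS = "http://ilk.uvt.nl/folia"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def text_to_folia(doc_id, raw_text):
    """Wrap blank-line-separated paragraphs of plain text in a simplified
    FoLiA-style skeleton. Hypothetical helper: metadata and annotation
    declarations are intentionally left out of this sketch."""
    # Serialize FoLiA elements without a prefix (default namespace).
    ET.register_namespace("", FOLIA_NS)

    root = ET.Element(f"{{{FOLIA_NS}}}FoLiA", {f"{{{XML_NS}}}id": doc_id})
    body = ET.SubElement(
        root, f"{{{FOLIA_NS}}}text", {f"{{{XML_NS}}}id": f"{doc_id}.text"}
    )

    # Assumption: paragraphs in the source text are separated by blank lines.
    paragraphs = [p.strip() for p in raw_text.split("\n\n") if p.strip()]
    for i, para in enumerate(paragraphs, start=1):
        p = ET.SubElement(
            body, f"{{{FOLIA_NS}}}p", {f"{{{XML_NS}}}id": f"{doc_id}.p.{i}"}
        )
        t = ET.SubElement(p, f"{{{FOLIA_NS}}}t")
        # Collapse line breaks and repeated spaces inside a paragraph.
        t.text = " ".join(para.split())

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = "First  paragraph\nline two.\n\nSecond paragraph."
    print(text_to_folia("demo", sample))
```

Giving every paragraph a stable identifier is a design choice meant to make later search hits traceable back to their place in the source text.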

The workflow will be modular, with different entry points tailored to different students' needs, and will be represented in a flowchart for overview. We will also test, against students' needs, a new CLARIAH subcomponent currently scheduled for integration in AutoSearch+, an enhanced version of AutoSearch that we are co-developing. Ideally, our modular workflow will refer directly to the integrated environment of AutoSearch+, including HitPaRank and an annotation environment, rather than to three separate components, as the integration makes for a more user-friendly and less error-prone experience.