Acoustic-phonetic features for research on atypical/non-native speech
To analyse speech recordings, we created the so-called AASPAA pipeline. Briefly, it consists of the following steps: forced alignment to obtain a segmentation, extraction of up to 110 features, outlier detection, data reduction, classification, and statistical analysis. The AASPAA pipeline was used to analyse two data sets of atypical speech: non-native speech and dysarthric speech.
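As an illustration of the outlier-detection step, the following is a minimal sketch that assumes a Tukey-fence (interquartile range) rule applied to one feature at a time. The abstract does not state which outlier criterion the pipeline uses, so the rule, the function name `tukey_outliers`, the factor `k` and the example values are all assumptions made for illustration.

```python
from statistics import quantiles


def tukey_outliers(values, k=1.5):
    """Return the indices of values outside [Q1 - k*IQR, Q3 + k*IQR].

    Assumed rule for illustration only; needs at least two values.
    """
    # Quartiles with linear interpolation between the sorted data points.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]


# Hypothetical articulation-rate values (syllables per second), not real data.
rates = [4.1, 4.3, 4.0, 4.2, 4.4, 9.8]
flagged = tukey_outliers(rates)
print(flagged)  # [5]
```

Flagged indices could then be inspected manually or excluded before data reduction; which of the two is appropriate depends on whether the extreme value reflects an alignment error or genuine speaker behaviour.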
Non-native Dutch speech from the JASMIN corpus was compared to native speech. For 96 of the 106 features, we observed significant differences, e.g. for loudness-related features and articulation-rate features (both significantly lower for non-natives).
With only 11 features, the classification accuracy was 0.987.
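The per-feature group comparison described above can be sketched with a two-sided permutation test on the difference in means. This is a stand-in: the abstract does not say which statistical test was used, and the function name `permutation_p_value`, its parameters and the example values below are hypothetical.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation test on the absolute difference in means."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n_a = len(group_a)

    def abs_mean_diff(values):
        a, b = values[:n_a], values[n_a:]
        return abs(sum(a) / len(a) - sum(b) / len(b))

    pooled = list(group_a) + list(group_b)
    observed = abs_mean_diff(pooled)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random reassignment of the group labels
        if abs_mean_diff(pooled) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical articulation rates (syllables per second), not corpus data.
native = [5.1, 5.3, 4.9, 5.2, 5.0, 5.4]
non_native = [3.9, 4.1, 3.8, 4.2, 4.0, 3.7]
p = permutation_p_value(native, non_native, n_permutations=2000)
print(p < 0.05)  # True
```

When such a test is repeated for all 106 features, a correction for multiple comparisons (for example, comparing each p-value against 0.05 / 106) would normally be applied; whether and how this was done in the study is not stated here.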
We also compared recordings of 8 dysarthric speakers before and after treatment with an ASR-based game for speech therapy, which we developed in the CHASING project. Our results show that the effect of the treatment differs substantially between speakers, and that 4 of the 8 speakers show the intended effect of the treatment: speaking louder while limiting pitch.
For our analyses, we wanted to classify words as content or function words. To do so, we compared two POS taggers: Alpino and Frog.
About 11% of the words were tagged differently by the two taggers. Since we observed more errors for Frog, we decided to use Alpino in our analyses.
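The tagger comparison and the content/function split can be sketched as follows. The sketch assumes that both taggers emit one CGN-style tag per token for the same tokenisation, and that only the coarse category before the parenthesis is compared. The assignment of categories to content and function words, the helper names and the example tags are illustrative assumptions, not the mapping used in the study.

```python
# Assumed mapping of coarse CGN-style categories (illustration only).
CONTENT_TAGS = {"N", "ADJ", "WW", "BW"}           # noun, adjective, verb, adverb
FUNCTION_TAGS = {"LID", "VZ", "VG", "VNW", "TW"}  # article, preposition, conjunction, pronoun, numeral


def coarse_tag(tag):
    """Strip the feature list: 'N(soort,ev)' becomes 'N'."""
    return tag.split("(", 1)[0]


def word_class(tag):
    """Map a tag to 'content', 'function', or 'other' (e.g. punctuation)."""
    coarse = coarse_tag(tag)
    if coarse in CONTENT_TAGS:
        return "content"
    if coarse in FUNCTION_TAGS:
        return "function"
    return "other"


def disagreement_rate(tags_a, tags_b):
    """Fraction of tokens whose coarse tags differ between two taggers."""
    if len(tags_a) != len(tags_b):
        raise ValueError("both taggers must tag the same token sequence")
    if not tags_a:
        return 0.0
    differing = sum(coarse_tag(a) != coarse_tag(b) for a, b in zip(tags_a, tags_b))
    return differing / len(tags_a)


# Hypothetical, simplified tagger output for a four-token utterance.
alpino_tags = ["LID(bep)", "N(soort)", "WW(pv)", "BW()"]
frog_tags = ["LID(bep)", "N(soort)", "WW(pv)", "ADJ(vrij)"]
print(disagreement_rate(alpino_tags, frog_tags))  # 0.25
```

Note that a disagreement between two content categories (as in the last token above) does not change the content/function label, so the share of tokens for which the tagger choice affects the downstream analysis can be lower than the raw disagreement rate.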
The same pipeline has also been used to analyse other types of atypical speech, e.g. speech of COPD patients. Additional results will be shown during the presentation.
Xing Wei, Rosa Bosland, Catia Cucchiarini, Roeland van Hout, Helmer Strik (2022). Distinctive Features for Classifying Spoken Native Versus Non-native Dutch. Submitted to Interspeech 2022.
Chiara Pesenti, Loes van Bemmel, Roeland van Hout, Helmer Strik (2022). The effect of eHealth training on dysarthric speech. Accepted for the LREC 2022 workshop RAPID.
Loes van Bemmel, Catia Cucchiarini, Helmer Strik (2021). Using feature selection to evaluate pathological speech after training with a serious game. Proceedings of the 12th International Conference of Experimental Linguistics, 11-13 October 2021, Athens, Greece, pp. 231-234.
Loes van Bemmel, Wieke Harmsen, Catia Cucchiarini, Helmer Strik (2021). Automatic Selection of the Most Characterizing Features for Detecting COPD in Speech. 23rd International Conference on Speech and Computer, SPECOM 2021, St. Petersburg, Russia, September 27–30, 2021, pp. 737-748.