Detection of voicing and place of articulation of fricatives with deep learning in a virtual speech and language therapy tutor

Ivo Anjos, Maxine Eskenazi, Nuno Marques, Margarida Grilo, Isabel Guimarães, João Magalhães, Sofia Cavaco

Research output: Contribution to journal › Conference article › peer-review


Abstract

Children with fricative distortion errors have to learn how to use the vocal folds correctly and which place of articulation to use in order to produce the different fricatives correctly. Here we propose a virtual tutor for fricative distortion correction: a speech and language therapy tool that helps children understand their fricative production errors and how to use their speech organs correctly. The virtual tutor uses log Mel filter banks and deep learning techniques with spectral-temporal convolutions of the data to classify the fricatives in children's speech by place of articulation and voicing. It achieves an accuracy of 90.40% for place of articulation and 90.93% for voicing on children's speech. Furthermore, this paper discusses a multidimensional advanced data analysis of the first-layer convolutional kernel filters that validates the usefulness of performing the convolution on the log Mel filter bank.
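
The feature pipeline named in the abstract (log Mel filter bank features followed by a convolution over both time and frequency) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the 16 kHz sample rate, 25 ms frames with a 10 ms hop, 40 Mel bands, the 5x5 kernel, and the helper names `mel_filter_bank`, `log_mel_features`, and `conv2d_valid`. The abstract does not describe the network architecture, so the sketch applies a single random kernel rather than a trained model.

```python
import numpy as np

def hz_to_mel(hz):
    # Common HTK-style Mel scale formula.
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filter_bank(n_mels, n_fft, sample_rate):
    """Triangular filters evenly spaced on the Mel scale.

    Returns an array of shape (n_mels, n_fft // 2 + 1).
    """
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    bank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(n_mels):
        left, center, right = hz_points[m], hz_points[m + 1], hz_points[m + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        bank[m] = np.maximum(0.0, np.minimum(rising, falling))
    return bank

def log_mel_features(signal, sample_rate=16000, frame_len=400, hop=160,
                     n_fft=512, n_mels=40):
    """Log Mel filter bank energies, shape (n_frames, n_mels).

    All default parameter values are illustrative assumptions.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2
    mel_energy = power @ mel_filter_bank(n_mels, n_fft, sample_rate).T
    return np.log(mel_energy + 1e-10)  # small offset avoids log(0)

def conv2d_valid(features, kernel):
    """Slide one kernel over time (axis 0) and frequency (axis 1).

    This is cross-correlation without padding, which is what deep learning
    libraries usually call a 2D convolution.
    """
    windows = np.lib.stride_tricks.sliding_window_view(features, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)

# Stand-in input: one second of noise instead of recorded children's speech.
rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)
feats = log_mel_features(signal)
response = conv2d_valid(feats, rng.standard_normal((5, 5)))
print(feats.shape, response.shape)
```

Because the kernel spans several frames and several Mel bands at once, each output value depends on a local time-frequency patch. That is the sense in which a convolution over the log Mel filter bank is spectral-temporal. In a real classifier the kernel weights would be learned from labelled fricative recordings.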

Original language: English
Pages (from-to): 3156-3160
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 2020-October
Publication status: Published - 2020
Event: 21st Annual Conference of the International Speech Communication Association, INTERSPEECH 2020 - Shanghai, China
Duration: 25 Oct 2020 to 29 Oct 2020

Keywords

  • Convolutional neural networks
  • Fricatives
  • Speech and language therapy
