Author: "Deniz Jafari" - Searchworks@Jio Institute Digital Library Search Results

Search keyword "Deniz Jafari": 4 results

1. Improving Dysarthric Speech Segmentation With Emulated and Synthetic Augmentation

Author: Saeid Alavi Naeini, Leif Simmatis, Deniz Jafari, Yana Yunusova, and Babak Taati
Subjects: Dysarthria, speech segmentation, speech recognition, orofacial assessment, data augmentation, Computer applications to medicine. Medical informatics, R858-859.7, Medical technology, R855-855.5
Abstract: Acoustic features extracted from speech can help with the diagnosis of neurological diseases and monitoring of symptoms over time. Temporal segmentation of audio signals into individual words is an important pre-processing step needed prior to extracting acoustic features. Machine learning techniques could be used to automate speech segmentation via automatic speech recognition (ASR) and sequence to sequence alignment. While state-of-the-art ASR models achieve good performance on healthy speech, their performance significantly drops when evaluated on dysarthric speech. Fine-tuning ASR models on impaired speech can improve performance in dysarthric individuals, but it requires representative clinical data, which is difficult to collect and may raise privacy concerns. This study explores the feasibility of using two augmentation methods to increase ASR performance on dysarthric speech: 1) healthy individuals varying their speaking rate and loudness (as is often used in assessments of pathological speech); 2) synthetic speech with variations in speaking rate and accent (to ensure more diverse vocal representations and fairness). Experimental evaluations showed that fine-tuning a pre-trained ASR model with data from these two sources outperformed a model fine-tuned only on real clinical data and matched the performance of a model fine-tuned on the combination of real clinical data and synthetic speech. When evaluated on held-out acoustic data from 24 individuals with various neurological diseases, the best performing model achieved an average word error rate of 5.7% and a mean correct count accuracy of 94.4%. In segmenting the data into individual words, a mean intersection-over-union of 89.2% was obtained against manual parsing (ground truth). It can be concluded that emulated and synthetic augmentations can significantly reduce the need for real clinical data of dysarthric speech when fine-tuning ASR models and, in turn, for speech segmentation.
Published: 2024

2. Analytical Validation of a Webcam-Based Assessment of Speech Kinematics: Digital Biomarker Evaluation following the V3 Framework

Author: Leif Simmatis, Saeid Alavi Naeini, Deniz Jafari, Michael (Kai Yue) Xie, Chelsea Tanchip, Niyousha Taati, Scotia McKinlay, Rupinder Sran, Justin Truong, Diego L Guarin, Babak Taati, and Yana Yunusova
Subjects: speech, remote assessment, facial tracking, validation, Biology (General), QH301-705.5
Abstract: Introduction: Kinematic analyses have recently revealed a strong potential to contribute to the assessment of neurological diseases. However, the validation of home-based kinematic assessments using consumer-grade video technology has yet to be performed. In line with best practices for digital biomarker development, we sought to validate webcam-based kinematic assessment against established, laboratory-based recording gold standards. We hypothesized that webcam-based kinematics would possess psychometric properties comparable to those obtained using the laboratory-based gold standards. Methods: We collected data from 21 healthy participants who repeated the phrase “buy Bobby a puppy” (BBP) at four different combinations of speaking rate and volume: Slow, Normal, Loud, and Fast. We recorded these samples twice back-to-back, simultaneously using (1) an electromagnetic articulography (“EMA”; NDI Wave) system, (2) a 3D camera (Intel RealSense), and (3) a 2D webcam for video recording via an in-house developed app. We focused on the extraction of kinematic features in this study, given their demonstrated value in detecting neurological impairments. We specifically extracted measures of speed/acceleration, range of motion (ROM), variability, and symmetry using the movements of the center of the lower lip during these tasks. Using these kinematic features, we derived measures of (1) agreement between recording methods, (2) test-retest reliability of each method, and (3) the validity of webcam recordings to capture expected changes in kinematics as a result of different speech conditions. Results: Kinematics measured using the webcam demonstrated good agreement with both the RealSense and EMA (ICC-A values often ≥0.70). Test-retest reliability, measured using the absolute agreement (2,1) formulation of the intraclass correlation coefficient (i.e., ICC-A), was often “moderate” to “strong” (i.e., ≥0.70) and similar between the webcam and EMA-based kinematic features. Finally, the webcam kinematics were typically as sensitive to differences in speech tasks as EMA and the 3D camera gold standards. Discussion and Conclusions: Our results suggested that webcam recordings display good psychometric properties, comparable to laboratory-based gold standards. This work paves the way for a large-scale clinical validation to continue the development of these promising technologies for the assessment of neurological diseases via home-based methods.
Published: 2023

3. Automated Temporal Segmentation of Orofacial Assessment Videos.

Author: Saeid Alavi Naeini, Leif E. R. Simmatis, Deniz Jafari, Diego L. Guarin, Yana Yunusova, and Babak Taati
Published: 2022

4. 3D Video Tracking Technology in the Assessment of Orofacial Impairments in Neurological Disease: Clinical Validation

Author: Deniz Jafari, Leif Simmatis, Diego Guarin, Liziane Bouvier, Babak Taati, and Yana Yunusova
Subjects: Speech and Hearing, Linguistics and Language, Language and Linguistics
Abstract: Purpose: This study sought to determine whether clinically interpretable kinematic features extracted automatically from three-dimensional (3D) videos were correlated with corresponding perceptual clinical orofacial ratings in individuals with orofacial impairments due to neurological disorders. Method: 45 participants (19 diagnosed with motor neuron diseases [MNDs] and 26 poststroke) performed two nonspeech tasks (mouth opening and lip spreading) and one speech task (repetition of a sentence “Buy Bobby a Puppy”) while being video-recorded in a standardized lab setting. The color video recordings of participants were assessed by an expert clinician, a speech language pathologist, on the severity of three orofacial measures: symmetry, range of motion (ROM), and speed. Clinically interpretable 3D kinematic features, linked to symmetry, ROM, and speed, were automatically extracted from video recordings, using a deep facial landmark detection and tracking algorithm for each of the three tasks. Spearman correlations were used to identify features that were significantly correlated (p value < .05) with their corresponding clinical scores. Clinically significant kinematic features were then used in the subsequent multivariate regression models to predict the overall orofacial impairment severity score. Results: Several kinematic features extracted from 3D video recordings were associated with their corresponding perceptual clinical scores, indicating clinical validity of these automatically derived measures. Different patterns of significant features were observed between MND and poststroke groups; these differences were aligned with clinical expectations in both cases. Conclusions: The results show that kinematic features extracted automatically from simple clinical tasks can capture characteristics used by clinicians during assessments. These findings support the clinical validity of video-based automatic extraction of kinematic features.
Published: 2023
