7 results on '"Roland Goecke"'
Search Results
2. Synthesis of a six-bar mechanism for generating knee and ankle motion trajectories using deep generative neural network
- Author
Akim Kapsalyamov, Shahid Hussain, Nicholas A.T. Brown, Roland Goecke, Munawar Hayat, and Prashant K. Jamwal
- Subjects
Artificial Intelligence; Control and Systems Engineering; Electrical and Electronic Engineering
- Published
2023
- Full Text
View/download PDF
3. Automatic depression classification based on affective read sentences: Opportunities for text-dependent analysis
- Author
Brian Stasak, Roland Goecke, and Julien Epps
- Subjects
Protocol (science); Linguistics and Language; Communication; Speech recognition; 020206 networking & telecommunications; 02 engineering and technology; 01 natural sciences; Language and Linguistics; Computer Science Applications; Speech disfluency; Protocol design; Modeling and Simulation; 0103 physical sciences; Evaluation methods; otorhinolaryngologic diseases; 0202 electrical engineering, electronic engineering, information engineering; Feature (machine learning); Computer Vision and Pattern Recognition; Valence (psychology); Psychology; 010301 acoustics; Software; Depression (differential diagnoses); Affective stimuli
- Abstract
In the future, automatic speech-based analysis of mental health could become widely available to help augment conventional healthcare evaluation methods. For speech-based patient evaluations of this kind, protocol design is a key consideration. Read speech provides an advantage over other verbal modes (e.g. automatic, spontaneous) by providing a clinically stable and repeatable protocol. Further, text-dependent speech helps to reduce phonetic variability and delivers controllable linguistic/affective stimuli, therefore allowing more precise analysis of recorded stimuli deviations. The purpose of this study is to investigate speech disfluency behaviors in non-depressed/depressed speakers using read aloud text containing constrained affective-linguistic criteria. Herein, using the Black Dog Institute Affective Sentences (BDAS) corpus, analysis demonstrates statistically significant feature differences in speech disfluencies, whereby when compared to non-depressed speakers, depressed speakers show relatively higher recorded frequencies of hesitations (55% increase) and speech errors (71% increase). Our study examines both manually and automatically labeled speech disfluency features, demonstrating that detailed disfluency analysis leads to considerable gains, of up to 100% in absolute depression classification accuracy, especially with affective considerations, when compared with the affect-agnostic acoustic baseline (65%).
- Published
2019
- Full Text
View/download PDF
4. An investigation of linguistic stress and articulatory vowel characteristics for automatic depression classification
- Author
Roland Goecke, Brian Stasak, and Julien Epps
- Subjects
Psychomotor retardation; Computer science; 020206 networking & telecommunications; 02 engineering and technology; behavioral disciplines and activities; 01 natural sciences; Linguistics; Theoretical Computer Science; Human-Computer Interaction; Vowel; 0103 physical sciences; Stress (linguistics); 0202 electrical engineering, electronic engineering, information engineering; Feature (machine learning); medicine; medicine.symptom; Set (psychology); 010301 acoustics; psychological phenomena and processes; Software; Depression (differential diagnoses)
- Abstract
The effects of psychomotor retardation associated with clinical depression are linked to a reduction in variability in acoustic parameters. However, linguistic stress differences between non-depressed and clinically depressed individuals have yet to be investigated. In this paper, by examining vowel articulatory parameters, statistically significant differences in articulatory characteristics are found at a paraphonetic level. For articulatory characteristic features, tongue height and advancement in terms of ‘mid’ and ‘front’ vowel sets show similar depression classification performance trends for both the DAIC-WOZ (English) and AViD (German) databases. Considering linguistic stress feature components, for both databases, depressed speakers exhibit shorter vowel durations and less variance for ‘low’, ‘back’, and ‘rounded’ vowel positions. Results for the DAIC-WOZ and AViD datasets using a small set of linguistic stress based features derived from multiple vowel articulatory parameter sets show absolute, statistically significant, gains of 7% and 20% in two-class depression classification performance over baseline approaches. Linguistic stress feature results indicate that specific vowel set analysis provides better discrimination of clinically depressed and non-depressed speakers. Knowledge gleaned from this research allows the design of more effective automatic depression disorder classification systems.
- Published
2019
- Full Text
View/download PDF
5. Efficient multi-target tracking via discovering dense subgraphs
- Author
Roland Goecke and Behzad Bozorgtabar
- Subjects
Smoothness (probability theory); BitTorrent tracker; business.industry; Context (language use); 02 engineering and technology; Link (geometry); 021001 nanoscience & nanotechnology; Tracking (particle physics); Discriminative model; Signal Processing; 0202 electrical engineering, electronic engineering, information engineering; Trajectory; 020201 artificial intelligence & image processing; Computer vision; Computer Vision and Pattern Recognition; Artificial intelligence; 0210 nano-technology; Set (psychology); business; Software; Mathematics
- Abstract
Multi-target tracking is formulated as a dense subgraph discovering problem. Both local and global cues are exploited to represent the tracklet affinity model. The distinguishable appearance based models are learned for the targets.
In this paper, we cast multi-target tracking as a dense subgraph discovering problem on the undirected relation graph of all given target hypotheses. We aim to extract multiple clusters (dense subgraphs), in which each cluster contains a set of hypotheses of one particular target. In the presence of occlusion or similar moving targets or when there is no reliable evidence for the target's presence, each target trajectory is expected to be fragmented into multiple tracklets. The proposed tracking framework can efficiently link such fragmented target trajectories to build a longer trajectory specifying the true states of the target. In particular, a discriminative scheme is devised via learning the targets' appearance models. Moreover, the smoothness characteristic of the target trajectory is utilised by suggesting a smoothness tracklet affinity model to increase the power of the proposed tracker to produce persistent target trajectories revealing different targets' moving paths. The performance of the proposed approach has been extensively evaluated on challenging public datasets and also in the context of team sports (e.g. soccer, AFL), where team players tend to exhibit quick and unpredictable movements. Systematic experimental results conducted on a large set of sequences show that the proposed approach performs better than the state-of-the-art trackers, in particular, when dealing with occlusion and fragmented target trajectory.
- Published
2016
- Full Text
View/download PDF
6. Ordered trajectories for human action recognition with large number of classes
- Author
Roland Goecke and O. V. Ramana Murthy
- Subjects
business.industry; Optical flow; Pattern recognition; Feature selection; Support vector machine; Bag-of-words model; Feature (computer vision); Signal Processing; Trajectory; Benchmark (computing); Computer vision; Computer Vision and Pattern Recognition; Artificial intelligence; Representation (mathematics); business; Mathematics
- Abstract
Recently, a video representation based on dense trajectories has been shown to outperform other human action recognition methods on several benchmark datasets. The trajectories capture the motion characteristics of different moving objects in space and temporal dimensions. In dense trajectories, points are sampled at uniform intervals in space and time and then tracked using a dense optical flow field over a fixed length of L frames (optimally 15) spread overlapping over the entire video. However, among these base (dense) trajectories, a few may continue for longer than duration L, capturing motion characteristics of objects that may be more valuable than the information from the base trajectories. Thus, we propose a technique that searches for trajectories with a longer duration and refer to these as 'ordered trajectories'. Experimental results show that ordered trajectories perform much better than the base trajectories, both standalone and when combined. Moreover, the uniform sampling of dense trajectories does not discriminate objects of interest from the background or other objects. Consequently, a lot of information is accumulated, which actually may not be useful. This can especially escalate when there is more data due to an increase in the number of action classes. We observe that our proposed trajectories remove some background clutter, too. We use a Bag-of-Words framework to conduct experiments on the benchmark HMDB51, UCF50 and UCF101 datasets containing the largest number of action classes to date. Further, we also evaluate three state-of-the-art feature encoding techniques to study their performance on a common platform.
A technique that captures information of objects with longer duration. A feature-selection-like approach that delivers better performance than several trajectory variants. Removal of a large number of trajectories related to background noise. We apply our technique on the action datasets HMDB51, UCF50 and UCF101, which contain the largest number of classes to date.
- Published
2015
- Full Text
View/download PDF
7. Regression based automatic face annotation for deformable model building
- Author
Akshay Asthana, Simon Lucey, and Roland Goecke
- Subjects
Ground truth; Facial expression; Computer science; business.industry; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; Image processing; Expression (mathematics); Active appearance model; Artificial Intelligence; Face (geometry); Signal Processing; Computer vision; Computer Vision and Pattern Recognition; Artificial intelligence; business; Correspondence problem; Software
- Abstract
A major drawback of statistical models of non-rigid, deformable objects, such as the active appearance model (AAM), is the required pseudo-dense annotation of landmark points for every training image. We propose a regression-based approach for automatic annotation of face images at arbitrary pose and expression, and for deformable model building using only the annotated frontal images. We pose the problem of learning the pattern of manual annotation as a data-driven regression problem and explore several regression strategies to effectively predict the spatial arrangement of the landmark points for unseen face images, with arbitrary expression, at arbitrary poses. We show that the proposed fully sparse non-linear regression approach outperforms other regression strategies by effectively modelling the changes in the shape of the face under varying pose and is capable of capturing the subtleties of different facial expressions at the same time, thus, ensuring the high quality of the generated synthetic images. We show the generalisability of the proposed approach by automatically annotating the face images from four different databases and verifying the results by comparing them with a ground truth obtained from manual annotations.
- Published
2011
- Full Text
View/download PDF
Discovery Service for Jio Institute Digital Library