6 results for "Wang, James Z."
Search Results
2. ARBEE: Towards Automated Recognition of Bodily Expression of Emotion in the Wild
- Author
Luo, Yu, Ye, Jianbo, Adams, Jr., Reginald B., Li, Jia, Newman, Michelle G., and Wang, James Z.
- Published
- 2020
- Full Text
- View/download PDF
3. PaDNet: Pan-Density Crowd Counting.
- Author
Tian, Yukun, Lei, Yiming, Zhang, Junping, and Wang, James Z.
- Subjects
Computer vision; Crowds; Density; Counting
- Abstract
Crowd counting is a highly challenging problem in computer vision and machine learning. Most previous methods have focused on crowds of consistent density, i.e., either sparse or dense, and therefore performed well in global estimation while neglecting local accuracy. To make crowd counting more useful in the real world, we propose a new perspective, named pan-density crowd counting, which aims to count people in crowds of varying density. Specifically, we propose the Pan-Density Network (PaDNet), which is composed of the following critical components. First, the Density-Aware Network (DAN) contains multiple subnetworks pretrained on scenarios with different densities; this module is capable of capturing pan-density information. Second, the Feature Enhancement Layer (FEL) effectively captures the global and local contextual features and generates a weight for each density-specific feature. Third, the Feature Fusion Network (FFN) embeds spatial context and fuses these density-specific features. Further, the metrics Patch MAE (PMAE) and Patch RMSE (PRMSE) are proposed to better evaluate performance on global and local estimation. Extensive experiments on four crowd counting benchmark datasets (ShanghaiTech, UCF_CC_50, UCSD, and UCF-QNRF) indicate that PaDNet achieves state-of-the-art recognition performance and high robustness in pan-density crowd counting.
- Published
- 2020
- Full Text
- View/download PDF
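The Patch MAE / Patch RMSE idea described in the PaDNet abstract can be sketched as follows. This is a hypothetical implementation: the grid size, the use of density-map sums as per-patch counts, and the function name are assumptions for illustration, not the paper's exact definition.

```python
import numpy as np

def patch_metrics(pred, gt, grid=(4, 4)):
    """Hypothetical Patch MAE / Patch RMSE: split predicted and
    ground-truth density maps into a grid of patches and average
    the per-patch count errors, so local mistakes are not hidden
    by a good global total."""
    rows, cols = grid
    h_idx = np.array_split(np.arange(pred.shape[0]), rows)
    w_idx = np.array_split(np.arange(pred.shape[1]), cols)
    errs = []
    for hi in h_idx:
        for wi in w_idx:
            p = pred[np.ix_(hi, wi)].sum()  # predicted count in patch
            g = gt[np.ix_(hi, wi)].sum()    # ground-truth count in patch
            errs.append(p - g)
    errs = np.array(errs)
    pmae = np.abs(errs).mean()
    prmse = np.sqrt((errs ** 2).mean())
    return pmae, prmse
```

With a 1x1 grid these reduce to the usual global MAE/RMSE; finer grids penalize methods that over-count in one region and under-count in another.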
4. MILES: Multiple-Instance Learning via Embedded Instance Selection.
- Author
Chen, Yixin, Bi, Jinbo, and Wang, James Z.
- Subjects
Machine learning; Computational learning theory; Computer vision; Labels; Application software; Algorithms
- Abstract
Multiple-instance problems arise in situations where training class labels are attached to sets of samples (called bags) instead of to individual samples within each bag (called instances). Most previous multiple-instance learning (MIL) algorithms are developed based on the assumption that a bag is positive if and only if at least one of its instances is positive. Although this assumption works well for drug activity prediction, it is rather restrictive for other applications, especially in computer vision. We propose a learning method, MILES (Multiple-Instance Learning via Embedded instance Selection), which converts the multiple-instance learning problem into a standard supervised learning problem without imposing the assumption relating instance labels to bag labels. MILES maps each bag into a feature space defined by the instances in the training bags via an instance similarity measure. This feature mapping often produces a large number of redundant or irrelevant features, so a 1-norm SVM is applied to select important features and construct classifiers simultaneously. We have performed extensive experiments. In comparison with other methods, MILES demonstrates competitive classification accuracy, high computational efficiency, and robustness to labeling uncertainty.
- Published
- 2006
- Full Text
- View/download PDF
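The bag-to-feature-space mapping the MILES abstract describes can be sketched as below. The Gaussian similarity (a bag's coordinate for a candidate instance is its closest instance's similarity, exp(-||x_ij - c||^2 / sigma^2)) and the value of sigma are assumptions standing in for the paper's instance similarity measure; the function name is also an assumption.

```python
import numpy as np

def embed_bags(bags, concept_instances, sigma=1.0):
    """Map each bag (a set of instance vectors) to a fixed-length
    feature vector: one coordinate per candidate concept instance,
    valued by the bag's most-similar instance. The resulting matrix
    feeds a standard supervised learner (MILES uses a 1-norm SVM)."""
    feats = []
    for bag in bags:
        bag = np.asarray(bag, dtype=float)
        row = []
        for c in concept_instances:
            d2 = ((bag - c) ** 2).sum(axis=1)          # squared distances
            row.append(np.exp(-d2 / sigma ** 2).max())  # closest instance wins
        feats.append(row)
    return np.array(feats)
```

After this embedding, bag labels become ordinary sample labels, which is how the method sidesteps the "positive iff one instance is positive" assumption.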
5. Automatic Linguistic Indexing of Pictures by a Statistical Modeling Approach.
- Author
Li, Jia, and Wang, James Z.
- Subjects
Image processing; Indexing; Computer vision; Markov processes
- Abstract
Automatic linguistic indexing of pictures is an important but highly challenging problem for researchers in computer vision and content-based image retrieval. In this paper, we introduce a statistical modeling approach to this problem. Categorized images are used to train a dictionary of hundreds of statistical models, each representing a concept. Images of a given concept are regarded as instances of a stochastic process that characterizes the concept. To measure the extent of association between an image and the textual description of a concept, we compute the likelihood of the occurrence of the image under the characterizing stochastic process; a high likelihood indicates a strong association. In our experimental implementation, we focus on a particular family of stochastic processes, the two-dimensional multiresolution hidden Markov models (2D MHMMs). We implemented and tested our ALIP (Automatic Linguistic Indexing of Pictures) system on a photographic image database of 600 concepts, each with about 40 training images. The system is evaluated quantitatively using more than 4,600 images outside the training database and compared with a random annotation scheme. Experiments have demonstrated the good accuracy of the system and its high potential for linguistic indexing of photographic images.
- Published
- 2003
- Full Text
- View/download PDF
6. DeepStroke: An efficient stroke screening framework for emergency rooms with multimodal adversarial deep learning.
- Author
Cai, Tongan, Ni, Haomiao, Yu, Mingli, Huang, Xiaolei, Wong, Kelvin, Volpi, John, Wang, James Z., and Wong, Stephen T.C.
- Subjects
Hospital emergency services; Deep learning; Multimodal user interfaces; Facial paralysis; Speech disorders; Face; Physicians; Medical screening
- Abstract
• A powerful multimodal deep learning framework for stroke screening in ER settings
• Spatiotemporal facial frame proposal tackles "in-the-wild" patient conditions
• Multi-level fusion of visual and audio features achieves better overall performance
• Adversarial training mitigates "face-remembering" and learns stroke features
• Transfer learning reduces facial-attribute bias and improves generalizability
In an emergency room (ER) setting, stroke triage or screening is a common challenge. A quick CT is usually done instead of MRI due to MRI's slow throughput and high cost. Clinical tests are commonly referred to during the process, but the misdiagnosis rate remains high. We propose a novel multimodal deep learning framework, DeepStroke, to achieve computer-aided stroke presence assessment by recognizing patterns of minor facial muscle incoordination and speech inability for patients with suspicion of stroke in an acute setting. DeepStroke takes one-minute facial video data and audio data readily available during stroke triage for local facial paralysis detection and global speech disorder analysis. Transfer learning was adopted to reduce face-attribute biases and improve generalizability. We leverage multimodal lateral fusion to combine the low- and high-level features and provide mutual regularization for joint training. Novel adversarial training is introduced to obtain identity-free and stroke-discriminative features. Experiments on our video-audio dataset with actual ER patients show that DeepStroke outperforms state-of-the-art models and performs better than both a triage team and ER doctors, attaining 10.94% higher sensitivity and 7.37% higher accuracy than traditional stroke triage when specificity is aligned. Meanwhile, each assessment can be completed in less than six minutes, demonstrating the framework's great potential for clinical translation.
- Published
- 2022
- Full Text
- View/download PDF
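The comparison "when specificity is aligned" in the DeepStroke abstract amounts to choosing the model's decision threshold so its specificity matches the baseline's, then reading off sensitivity at that threshold. A minimal sketch of that evaluation step (the function name and the exhaustive threshold search are assumptions, not the paper's code):

```python
import numpy as np

def sensitivity_at_specificity(scores, labels, target_specificity):
    """Pick the decision threshold whose specificity is closest to
    the target, then report sensitivity at that threshold."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    best_t, best_gap = None, np.inf
    for t in np.unique(scores):          # every achievable cutoff
        pred = scores >= t
        tn = ((~pred) & (labels == 0)).sum()
        fp = (pred & (labels == 0)).sum()
        spec = tn / max(tn + fp, 1)
        if abs(spec - target_specificity) < best_gap:
            best_t, best_gap = t, abs(spec - target_specificity)
    pred = scores >= best_t
    tp = (pred & (labels == 1)).sum()
    fn = ((~pred) & (labels == 1)).sum()
    return tp / max(tp + fn, 1)
```

Aligning specificity before comparing sensitivity keeps the comparison fair: otherwise a screener could inflate sensitivity simply by flagging more patients.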
Discovery Service for Jio Institute Digital Library