Author: "Jiang, Yu-Gang" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Jiang, Yu-Gang"' showing total 974 results

201. Long-Term Cloth-Changing Person Re-identification

Author: Qian, Xuelin, Wang, Wenxuan, Zhang, Li, Zhu, Fangrui, Fu, Yanwei, Xiang, Tao, Jiang, Yu-Gang, and Xue, Xiangyang
Founding Editor: Goos, Gerhard and Hartmanis, Juris
Editorial Board Member: Bertino, Elisa, Gao, Wen, Steffen, Bernhard, Woeginger, Gerhard, and Yung, Moti
Editor: Ishikawa, Hiroshi, Liu, Cheng-Lin, Pajdla, Tomas, and Shi, Jianbo
Published: 2021
Full Text: View/download PDF

202. Hierarchical Visual-Textual Graph for Temporal Activity Localization via Language

Author: Chen, Shaoxiang and Jiang, Yu-Gang
Founding Editor: Goos, Gerhard and Hartmanis, Juris
Editorial Board Member: Bertino, Elisa, Gao, Wen, Steffen, Bernhard, Woeginger, Gerhard, and Yung, Moti
Editor: Vedaldi, Andrea, Bischof, Horst, Brox, Thomas, and Frahm, Jan-Michael
Published: 2020
Full Text: View/download PDF

203. A Coarse-to-Fine Framework for Resource Efficient Video Recognition

Author: Wu, Zuxuan, Li, Hengduo, Zheng, Yingbin, Xiong, Caiming, Jiang, Yu-Gang, and Davis, Larry S
Published: 2021
Full Text: View/download PDF

204. Deep Learning for Video Classification and Captioning

Author: Wu, Zuxuan, Yao, Ting, Fu, Yanwei, and Jiang, Yu-Gang
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Multimedia
Abstract: Accelerated by the tremendous increase in Internet bandwidth and storage space, video data has been generated, published and spread explosively, becoming an indispensable part of today's big data. In this paper, we focus on reviewing two lines of research aiming to stimulate the comprehension of videos with deep learning: video classification and video captioning. While video classification concentrates on automatically labeling video clips based on their semantic contents like human actions or complex events, video captioning attempts to generate a complete and natural sentence, enriching the single label as in video classification, to capture the most informative dynamics in videos. In addition, we also provide a review of popular benchmarks and competitions, which are critical for evaluating the technical progress of this vibrant field.
Comment: Book chapter in Frontiers of Multimedia Research
Published: 2016
Full Text: View/download PDF

205. The THUMOS Challenge on Action Recognition for Videos 'in the Wild'

Author: Idrees, Haroon, Zamir, Amir R., Jiang, Yu-Gang, Gorban, Alex, Laptev, Ivan, Sukthankar, Rahul, and Shah, Mubarak
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Automatically recognizing and localizing wide ranges of human actions has crucial importance for video understanding. Towards this goal, the THUMOS challenge was introduced in 2013 to serve as a benchmark for action recognition. Until then, video action recognition, including THUMOS challenge, had focused primarily on the classification of pre-segmented (i.e., trimmed) videos, which is an artificial task. In THUMOS 2014, we elevated action recognition to a more practical level by introducing temporally untrimmed videos. These also include 'background videos' which share similar scenes and backgrounds as action videos, but are devoid of the specific actions. The three editions of the challenge organized in 2013-2015 have made THUMOS a common benchmark for action classification and detection and the annual challenge is widely attended by teams from around the world. In this paper we describe the THUMOS benchmark in detail and give an overview of data collection and annotation procedures. We present the evaluation protocols used to quantify results in the two THUMOS tasks of action classification and temporal detection. We also present results of submissions to the THUMOS 2015 challenge and review the participating approaches. Additionally, we include a comprehensive empirical study evaluating the differences in action recognition between trimmed and untrimmed videos, and how well methods trained on trimmed videos generalize to untrimmed videos. We conclude by proposing several directions and improvements for future THUMOS challenges.
Comment: Preprint submitted to Computer Vision and Image Understanding
Published: 2016
Full Text: View/download PDF

206. DB-LSTM: Densely-connected Bi-directional LSTM for human action recognition

Author: He, Jun-Yan, Wu, Xiao, Cheng, Zhi-Qi, Yuan, Zhaoquan, and Jiang, Yu-Gang
Published: 2021
Full Text: View/download PDF

207. Semi-supervised Single-View 3D Reconstruction via Prototype Shape Priors

Author: Xing, Zhen, Li, Hengduo, Wu, Zuxuan, and Jiang, Yu-Gang
Published: 2022
Full Text: View/download PDF

208. MORE: Multi-Order RElation Mining for Dense Captioning in 3D Scenes

Author: Jiao, Yang, Chen, Shaoxiang, Jie, Zequn, Chen, Jingjing, Ma, Lin, and Jiang, Yu-Gang
Published: 2022
Full Text: View/download PDF

209. A Survey on Video Diffusion Models.

Author: Xing, Zhen, Feng, Qijun, Chen, Haoran, Dai, Qi, Hu, Han, Xu, Hang, Wu, Zuxuan, and Jiang, Yu-Gang
Published: 2025
Full Text: View/download PDF

210. Ultrafast non-volatile flash memory based on van der Waals heterostructures

Author: Liu, Lan, Liu, Chunsen, Jiang, Lilai, Li, Jiayi, Ding, Yi, Wang, Shuiyuan, Jiang, Yu-Gang, Sun, Ya-Bin, Wang, Jianlu, Chen, Shiyou, Zhang, David Wei, and Zhou, Peng
Published: 2021
Full Text: View/download PDF

211. FDGaussian: Fast Gaussian Splatting from Single Image via Geometric-aware Diffusion Model

Author: Feng, Qijun, Xing, Zhen, Wu, Zuxuan, and Jiang, Yu-Gang
Abstract: Reconstructing detailed 3D objects from single-view images remains a challenging task due to the limited information available. In this paper, we introduce FDGaussian, a novel two-stage framework for single-image 3D reconstruction. Recent methods typically utilize pre-trained 2D diffusion models to generate plausible novel views from the input image, yet they encounter issues with either multi-view inconsistency or lack of geometric fidelity. To overcome these challenges, we propose an orthogonal plane decomposition mechanism to extract 3D geometric features from the 2D input, enabling the generation of consistent multi-view images. Moreover, we further accelerate the state-of-the-art Gaussian Splatting incorporating epipolar attention to fuse images from different viewpoints. We demonstrate that FDGaussian generates images with high consistency across different views and reconstructs high-quality 3D objects, both qualitatively and quantitatively. More examples can be found at our website https://qjfeng.net/FDGaussian/.
Published: 2024

212. Multi-Trigger Backdoor Attacks: More Triggers, More Threats

Author: Li, Yige, Ma, Xingjun, He, Jiabo, Huang, Hanxun, and Jiang, Yu-Gang
Abstract: Backdoor attacks have emerged as a primary threat to (pre-)training and deployment of deep neural networks (DNNs). While backdoor attacks have been extensively studied in a body of works, most of them were focused on single-trigger attacks that poison a dataset using a single type of trigger. Arguably, real-world backdoor attacks can be much more complex, e.g., the existence of multiple adversaries for the same dataset if it is of high value. In this work, we investigate the practical threat of backdoor attacks under the setting of multi-trigger attacks where multiple adversaries leverage different types of triggers to poison the same dataset. By proposing and investigating three types of multi-trigger attacks, including parallel, sequential, and hybrid attacks, we provide a set of important understandings of the coexisting, overwriting, and cross-activating effects between different triggers on the same dataset. Moreover, we show that single-trigger attacks tend to cause overly optimistic views of the security of current defense techniques, as all examined defense methods struggle to defend against multi-trigger attacks. Finally, we create a multi-trigger backdoor poisoning dataset to help future evaluation of backdoor attacks and defenses. Although our work is purely empirical, we hope it can help steer backdoor research toward more realistic settings.
Published: 2024

213. Automating the Diagnosis of Human Vision Disorders by Cross-modal 3D Generation

Author: Zhang, Li, Yang, Yuankun, Xie, Ziyang, Yuan, Zhiyuan, Feng, Jianfeng, Zhu, Xiatian, and Jiang, Yu-Gang
Abstract: Understanding the hidden mechanisms behind human's visual perception is a fundamental quest in neuroscience, underpins a wide variety of critical applications, e.g. clinical diagnosis. To that end, investigating into the neural responses of human mind activities, such as functional Magnetic Resonance Imaging (fMRI), has been a significant research vehicle. However, analyzing fMRI signals is challenging, costly, daunting, and demanding for professional training. Despite remarkable progress in artificial intelligence (AI) based fMRI analysis, existing solutions are limited and far away from being clinically meaningful. In this context, we leap forward to demonstrate how AI can go beyond the current state of the art by decoding fMRI into visually plausible 3D visuals, enabling automatic clinical analysis of fMRI data, even without healthcare professionals. Innovationally, we reformulate the task of analyzing fMRI data as a conditional 3D scene reconstruction problem. We design a novel cross-modal 3D scene representation learning method, Brain3D, that takes as input the fMRI data of a subject who was presented with a 2D object image, and yields as output the corresponding 3D object visuals. Importantly, we show that in simulated scenarios our AI agent captures the distinct functionalities of each region of human vision system as well as their intricate interplay relationships, aligning remarkably with the established discoveries of neuroscience. Non-expert diagnosis indicate that Brain3D can successfully identify the disordered brain regions, such as V1, V2, V3, V4, and the medial temporal lobe (MTL) within the human visual system. We also present results in cross-modal 3D visual construction setting, showcasing the perception quality of our 3D scene generation.
Comment: 25 pages, 16 figures, project page: https://brain-3d.github.io
Published: 2024

214. Eyes Can Deceive: Benchmarking Counterfactual Reasoning Abilities of Multi-modal Large Language Models

Author: Li, Yian, Tian, Wentao, Jiao, Yang, Chen, Jingjing, and Jiang, Yu-Gang
Abstract: Counterfactual reasoning, as a crucial manifestation of human intelligence, refers to making presuppositions based on established facts and extrapolating potential outcomes. Existing multimodal large language models (MLLMs) have exhibited impressive cognitive and reasoning capabilities, which have been examined across a wide range of Visual Question Answering (VQA) benchmarks. Nevertheless, how will existing MLLMs perform when faced with counterfactual questions? To answer this question, we first curate a novel CounterFactual MultiModal reasoning benchmark, abbreviated as CFMM, to systematically assess the counterfactual reasoning capabilities of MLLMs. Our CFMM comprises six challenging tasks, each including hundreds of carefully human-labeled counterfactual questions, to evaluate MLLM's counterfactual reasoning capabilities across diverse aspects. Through experiments, interestingly, we find that existing MLLMs prefer to believe what they see, but ignore the counterfactual presuppositions presented in the question, thereby leading to inaccurate responses. Furthermore, we evaluate a wide range of prevalent MLLMs on our proposed CFMM. The significant gap between their performance on our CFMM and that on several VQA benchmarks indicates that there is still considerable room for improvement in existing MLLMs toward approaching human-level intelligence. On the other hand, through boosting MLLMs performances on our CFMM in the future, potential avenues toward developing MLLMs with advanced intelligence can be explored.
Published: 2024

215. Heterogeneous Knowledge Transfer in Video Emotion Recognition, Attribution and Summarization

Author: Xu, Baohan, Fu, Yanwei, Jiang, Yu-Gang, Li, Boyang, and Sigal, Leonid
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Multimedia
Abstract: Emotion is a key element in user-generated videos. However, it is difficult to understand emotions conveyed in such videos due to the complex and unstructured nature of user-generated content and the sparsity of video frames expressing emotion. In this paper, for the first time, we study the problem of transferring knowledge from heterogeneous external sources, including image and textual data, to facilitate three related tasks in understanding video emotion: emotion recognition, emotion attribution and emotion-oriented summarization. Specifically, our framework (1) learns a video encoding from an auxiliary emotional image dataset in order to improve supervised video emotion recognition, and (2) transfers knowledge from an auxiliary textual corpora for zero-shot recognition of emotion classes unseen during training. The proposed technique for knowledge transfer facilitates novel applications of emotion attribution and emotion-oriented summarization. A comprehensive set of experiments on multiple datasets demonstrate the effectiveness of our framework.
Comment: 13 pages, 11 figures. Published at the IEEE Transactions on Affective Computing
Published: 2015
Full Text: View/download PDF

216. Fusing Multi-Stream Deep Networks for Video Classification

Author: Wu, Zuxuan, Jiang, Yu-Gang, Wang, Xi, Ye, Hao, Xue, Xiangyang, and Wang, Jun
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Multimedia
Abstract: This paper studies deep network architectures to address the problem of video classification. A multi-stream framework is proposed to fully utilize the rich multimodal information in videos. Specifically, we first train three Convolutional Neural Networks to model spatial, short-term motion and audio clues respectively. Long Short Term Memory networks are then adopted to explore long-term temporal dynamics. With the outputs of the individual streams, we propose a simple and effective fusion method to generate the final predictions, where the optimal fusion weights are learned adaptively for each class, and the learning process is regularized by automatically estimated class relationships. Our contributions are two-fold. First, the proposed multi-stream framework is able to exploit multimodal features that are more comprehensive than those previously attempted. Second, we demonstrate that the adaptive fusion method using the class relationship as a regularizer outperforms traditional alternatives that estimate the weights in a "free" fashion. Our framework produces significantly better results than the state of the arts on two popular benchmarks, 92.2% on UCF-101 (without using audio) and 84.9% on Columbia Consumer Videos.
Published: 2015

217. Evaluating Two-Stream CNN for Video Classification

Author: Ye, Hao, Wu, Zuxuan, Zhao, Rui-Wei, Wang, Xi, Jiang, Yu-Gang, and Xue, Xiangyang
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Videos contain very rich semantic information. Traditional hand-crafted features are known to be inadequate in analyzing complex video semantics. Inspired by the huge success of the deep learning methods in analyzing image, audio and text data, significant efforts are recently being devoted to the design of deep nets for video analytics. Among the many practical needs, classifying videos (or video clips) based on their major semantic categories (e.g., "skiing") is useful in many applications. In this paper, we conduct an in-depth study to investigate important implementation options that may affect the performance of deep nets on video classification. Our evaluations are conducted on top of a recent two-stream convolutional neural network (CNN) pipeline, which uses both static frames and motion optical flows, and has demonstrated competitive performance against the state-of-the-art methods. In order to gain insights and to arrive at a practical guideline, many important options are studied, including network architectures, model fusion, learning parameters and the final prediction methods. Based on the evaluations, very competitive results are attained on two popular video classification benchmarks. We hope that the discussions and conclusions from this work can help researchers in related fields to quickly set up a good basis for further investigations along this very promising direction.
Comment: ACM ICMR'15
Published: 2015
Full Text: View/download PDF

218. Modeling Spatial-Temporal Clues in a Hybrid Deep Learning Framework for Video Classification

Author: Wu, Zuxuan, Wang, Xi, Jiang, Yu-Gang, Ye, Hao, and Xue, Xiangyang
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Multimedia
Abstract: Classifying videos according to content semantics is an important problem with a wide range of applications. In this paper, we propose a hybrid deep learning framework for video classification, which is able to model static spatial information, short-term motion, as well as long-term temporal clues in the videos. Specifically, the spatial and the short-term motion features are extracted separately by two Convolutional Neural Networks (CNN). These two types of CNN-based features are then combined in a regularized feature fusion network for classification, which is able to learn and utilize feature relationships for improved performance. In addition, Long Short Term Memory (LSTM) networks are applied on top of the two features to further model longer-term temporal clues. The main contribution of this work is the hybrid learning framework that can model several important aspects of the video data. We also show that (1) combining the spatial and the short-term motion features in the regularized fusion network is better than direct classification and fusion using the CNN with a softmax layer, and (2) the sequence-based LSTM is highly complementary to the traditional classification strategy without considering the temporal frame orders. Extensive experiments are conducted on two popular and challenging benchmarks, the UCF-101 Human Actions and the Columbia Consumer Videos (CCV). On both benchmarks, our framework achieves to-date the best reported performance: 91.3% on the UCF-101 and 83.5% on the CCV.
Published: 2015

219. Exploiting Feature and Class Relationships in Video Categorization with Regularized Deep Neural Networks

Author: Jiang, Yu-Gang, Wu, Zuxuan, Wang, Jun, Xue, Xiangyang, and Chang, Shih-Fu
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Multimedia
Abstract: In this paper, we study the challenging problem of categorizing videos according to high-level semantics such as the existence of a particular human action or a complex event. Although extensive efforts have been devoted in recent years, most existing works combined multiple video features using simple fusion strategies and neglected the utilization of inter-class semantic relationships. This paper proposes a novel unified framework that jointly exploits the feature relationships and the class relationships for improved categorization performance. Specifically, these two types of relationships are estimated and utilized by rigorously imposing regularizations in the learning process of a deep neural network (DNN). Such a regularized DNN (rDNN) can be efficiently realized using a GPU-based implementation with an affordable training cost. Through arming the DNN with better capability of harnessing both the feature and the class relationships, the proposed rDNN is more suitable for modeling video semantics. With extensive experimental evaluations, we show that rDNN produces superior performance over several state-of-the-art approaches. On the well-known Hollywood2 and Columbia Consumer Video benchmarks, we obtain very competitive results: 66.9% and 73.5% respectively in terms of mean average precision. In addition, to substantially evaluate our rDNN and stimulate future research on large scale video categorization, we collect and release a new benchmark dataset, called FCVID, which contains 91,223 Internet videos and 239 manually annotated categories.
Comment: Please cite the officially published IEEE TPAMI version if you find this work helpful
Published: 2015
Full Text: View/download PDF

220. Text-Driven Video Prediction.

Author: Song, Xue, Chen, Jingjing, Zhu, Bin, and Jiang, Yu-Gang
Subjects: CAUSAL inference, STOCHASTIC processes, SEMANTICS, VIDEOS, NOISE
Abstract: Current video generation models usually convert signals indicating appearance and motion received from inputs (e.g., image and text) or latent spaces (e.g., noise vectors) into consecutive frames, fulfilling a stochastic generation process for the uncertainty introduced by latent code sampling. However, this generation pattern lacks deterministic constraints for both appearance and motion, leading to uncontrollable and undesirable outcomes. To this end, we propose a new task called Text-driven Video Prediction (TVP). Taking the first frame and text caption as inputs, this task aims to synthesize the following frames. Specifically, appearance and motion components are provided by the image and caption separately. The key to addressing the TVP task depends on fully exploring the underlying motion information in text descriptions, thus facilitating plausible video generation. In fact, this task is intrinsically a cause-and-effect problem, as the text content directly influences the motion changes of frames. To investigate the capability of text in causal inference for progressive motion information, our TVP framework contains a Text Inference Module (TIM), producing step-wise embeddings to regulate motion inference for subsequent frames. In particular, a refinement mechanism incorporating global motion semantics guarantees coherent generation. Extensive experiments are conducted on Something-Something V2 and Single Moving MNIST datasets. Experimental results demonstrate that our model achieves better results over other baselines, verifying the effectiveness of the proposed framework. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

221. Unified View Empirical Study for Large Pretrained Model on Cross-Domain Few-Shot Learning.

Author: Zhuo, Linhai, Fu, Yuqian, Chen, Jingjing, Cao, Yixin, and Jiang, Yu-Gang
Subjects: DATA augmentation, GENERALIZATION, EMPIRICAL research
Abstract: The challenge of cross-domain few-shot learning (CD-FSL) stems from the substantial distribution disparities between target and source domain images, necessitating a model with robust generalization capabilities. In this work, we posit that large-scale pretrained models are pivotal in addressing the CD-FSL task owing to their exceptional representational and generalization prowess. To our knowledge, no existing research comprehensively investigates the utility of large-scale pretrained models in the CD-FSL context. Addressing this gap, our study presents an exhaustive empirical assessment of the Contrastive Language–Image Pre-Training model within the CD-FSL task. We undertake a comparison spanning six dimensions: base model, transfer module, classifier, loss, data augmentation, and training schedule. Furthermore, we establish a straightforward baseline model, E-base, based on our empirical analysis, underscoring the importance of our investigation. Experimental results substantiate the efficacy of our model, yielding a mean gain of 1.2% in 5-way 5-shot evaluations on the BSCD dataset. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

222. Non-local NetVLAD Encoding for Video Classification

Author: Tang, Yongyi, Zhang, Xing, Wang, Jingwen, Chen, Shaoxiang, Ma, Lin, and Jiang, Yu-Gang
Series Editor: Hutchison, David, Kanade, Takeo, Kittler, Josef, Kleinberg, Jon M., Mattern, Friedemann, Mitchell, John C., Naor, Moni, Pandu Rangan, C., Steffen, Bernhard, Terzopoulos, Demetri, and Tygar, Doug
Editor: Leal-Taixé, Laura and Roth, Stefan
Published: 2019
Full Text: View/download PDF

223. From Canteen Food to Daily Meals: Generalizing Food Recognition to More Practical Scenarios

Author: Liu, Guoshan, Jiao, Yang, Chen, Jingjing, Zhu, Bin, and Jiang, Yu-Gang
Published: 2024
Full Text: View/download PDF

224. Dynamic Routing and Knowledge Re-Learning for Data-Free Black-Box Attack

Author: Qian, Xuelin, Wang, Wenxuan, Jiang, Yu-Gang, Xue, Xiangyang, and Fu, Yanwei
Abstract: Deep learning models have emerged as strong and efficient tools that can be applied to a broad spectrum of complex learning problems and many real-world applications. However, more and more works show that deep models are vulnerable to adversarial examples. Compared to vanilla attack settings, this paper advocates a more practical setting of data-free black-box attack, for which the attackers can completely not access the structures and parameters of the target model, as well as the intermediate features and any training data associated with the model. To tackle this task, previous methods generate transferable adversarial examples from a transparent substitute model to the target model. However, we found that these works have the limitations of taking static substitute model structure for different targets, only using hard synthesized examples once, and still relying on data statistics of the target model. This may potentially harm the performance of attacking the target model. To this end, we propose a novel Dynamic Routing and Knowledge Re-Learning framework (DraKe) to effectively learn a dynamic substitute model from the target model. Specifically, given synthesized training samples, a dynamic substitute structure learning strategy is proposed to adaptively generate optimal substitute model structure via a policy network according to different target models and tasks. To facilitate the substitute training, we present a graph-based structure information learning to capture the structural knowledge learned from the target model. For the inherent limitation that online data generation can only be learned once, a dynamic knowledge re-learning strategy is proposed to adjust the weights of optimization objectives and re-learn hard samples. Extensive experiments on four public image classification datasets and one face recognition benchmark are conducted to evaluate the efficacy of our DraKe. We can obtain significant improvement compared with state-of-the-art competitors. More importantly, our DraKe consistently achieves attack superiority for different target models (e.g., residual networks, and vision transformers), showing great potential for complex real-world applications.
Published: 2025
Full Text: View/download PDF

225. Two-dimensional materials for next-generation computing technologies

Author: Liu, Chunsen, Chen, Huawei, Wang, Shuiyuan, Liu, Qi, Jiang, Yu-Gang, Zhang, David Wei, Liu, Ming, and Zhou, Peng
Published: 2020
Full Text: View/download PDF

226. Pose-Normalized Image Generation for Person Re-identification

Author: Qian, Xuelin, Fu, Yanwei, Xiang, Tao, Wang, Wenxuan, Qiu, Jie, Wu, Yang, Jiang, Yu-Gang, and Xue, Xiangyang
Series Editor: Hutchison, David, Kanade, Takeo, Kittler, Josef, Kleinberg, Jon M., Mattern, Friedemann, Mitchell, John C., Naor, Moni, Pandu Rangan, C., Steffen, Bernhard, Terzopoulos, Demetri, Tygar, Doug, and Weikum, Gerhard
Editor: Ferrari, Vittorio, Hebert, Martial, Sminchisescu, Cristian, and Weiss, Yair
Published: 2018
Full Text: View/download PDF

227. Long-Term Cloth-Changing Person Re-identification

Author: Qian, Xuelin, Wang, Wenxuan, Zhang, Li, Zhu, Fangrui, Fu, Yanwei, Xiang, Tao, Jiang, Yu-Gang, and Xue, Xiangyang
Published: 2021
Full Text: View/download PDF

228. Generalizing Face Forgery Detection via Uncertainty Learning

Author: Wu, Yanqi, Song, Xue, Chen, Jingjing, and Jiang, Yu-Gang
Published: 2023
Full Text: View/download PDF

229. Relation Triplet Construction for Cross-modal Text-to-Video Retrieval

Author: Song, Xue, Chen, Jingjing, and Jiang, Yu-Gang
Published: 2023
Full Text: View/download PDF

230. On the Importance of Spatial Relations for Few-shot Action Recognition

Author: Zhang, Yilun, Fu, Yuqian, Ma, Xingjun, Qi, Lizhe, Chen, Jingjing, Wu, Zuxuan, and Jiang, Yu-Gang
Published: 2023
Full Text: View/download PDF

231. GCMA: Generative Cross-Modal Transferable Adversarial Attacks from Images to Videos

Author: Chen, Kai, Wei, Zhipeng, Chen, Jingjing, Wu, Zuxuan, and Jiang, Yu-Gang
Published: 2023
Full Text: View/download PDF

232. Suspected Objects Matter: Rethinking Model's Prediction for One-stage Visual Grounding

Author: Jiao, Yang, Jie, Zequn, Chen, Jingjing, Ma, Lin, and Jiang, Yu-Gang
Published: 2023
Full Text: View/download PDF

233. Hierarchical Visual-Textual Graph for Temporal Activity Localization via Language

Author: Chen, Shaoxiang and Jiang, Yu-Gang
Published: 2020
Full Text: View/download PDF

234. Learning Modality Interaction for Temporal Sentence Localization and Event Captioning in Videos

Author: Chen, Shaoxiang, Jiang, Wenhao, Liu, Wei, and Jiang, Yu-Gang
Published: 2020
Full Text: View/download PDF

235. Learning part-based mid-level representation for visual recognition

Author: Yuan, Baodi, Tu, Jian, Zhao, Rui-Wei, Zheng, Yingbin, and Jiang, Yu-Gang
Published: 2018
Full Text: View/download PDF

236. Small footprint transistor architecture for photoswitching logic and in situ memory

Author: Liu, Chunsen, Chen, Huawei, Hou, Xiang, Zhang, Heng, Han, Jun, Jiang, Yu-Gang, Zeng, Xiaoyang, Zhang, David Wei, and Zhou, Peng
Published: 2019
Full Text: View/download PDF

237. HCMS: Hierarchical and Conditional Modality Selection for Efficient Video Recognition

Author: Weng, Zejia, Wu, Zuxuan, Li, Hengduo, Chen, Jingjing, and Jiang, Yu-Gang
Published: 2023
Full Text: View/download PDF

238. CDistNet: Perceiving Multi-domain Character Distance for Robust Text Recognition

Author: Zheng, Tianlun, Chen, Zhineng, Fang, Shancheng, Xie, Hongtao, and Jiang, Yu-Gang
Published: 2023
Full Text: View/download PDF

239. TPS++: Attention-Enhanced Thin-Plate Spline for Scene Text Recognition

Author: Zheng, Tianlun, primary, Chen, Zhineng, additional, Bai, Jinfeng, additional, Xie, Hongtao, additional, and Jiang, Yu-Gang, additional
Published: 2023
Full Text: View/download PDF

240. Zn2+ reduction induces neuronal death with changes in voltage-gated potassium and sodium channel currents

Author: Tian, Kun, He, Cong-cong, Xu, Hui-nan, Wang, Yu-xiang, Wang, Hong-gang, An, Di, Heng, Bin, Pang, Wei, Jiang, Yu-gang, and Liu, Yan-qiang
Published: 2017
Full Text: View/download PDF

241. Genistein inhibits hypoxia, ischemic-induced death, and apoptosis in PC12 cells

Author: Wang, Yu-xiang, Tian, Kun, He, Cong-cong, Ma, Xue-ling, Zhang, Feng, Wang, Hong-gang, An, Di, Heng, Bin, Jiang, Yu-gang, and Liu, Yan-qiang
Published: 2017
Full Text: View/download PDF

242. The THUMOS challenge on action recognition for videos “in the wild”

Author: Idrees, Haroon, Zamir, Amir R., Jiang, Yu-Gang, Gorban, Alex, Laptev, Ivan, Sukthankar, Rahul, and Shah, Mubarak
Published: 2017
Full Text: View/download PDF

243. Extreme vocabulary learning

Author: Dong, Hanze, Sun, Zhenfeng, Fu, Yanwei, Zhong, Shi, Zhang, Zhengjun, and Jiang, Yu-Gang
Published: 2020
Full Text: View/download PDF

244. A comparative study of the effectiveness and safety of combined procarbazine, lomustine, and vincristine as a therapeutic method for recurrent high-grade glioma: A protocol for systematic review and meta-analysis

Author: Cai, Yang, Jiang, Yu-Gang, Wang, Ming, Jiang, Zhuo-Hang, and Tan, Zhi-Gang
Published: 2020
Full Text: View/download PDF

245. Stacked multichannel autoencoder – an efficient way of learning from synthetic data

Author: Zhang, Xi, Fu, Yanwei, Jiang, Shanshan, Xue, Xiangyang, Jiang, Yu-Gang, and Agam, Gady
Published: 2018
Full Text: View/download PDF

246. Microarray expression profiling and co-expression network analysis of circulating LncRNAs and mRNAs associated with neurotoxicity induced by BPA

Author: Pang, Wei, Lian, Fu-Zhi, Leng, Xue, Wang, Shu-min, Li, Yi-bo, Wang, Zi-yu, Li, Kai-ren, Gao, Zhi-xian, and Jiang, Yu-gang
Published: 2018
Full Text: View/download PDF

247. PolarFormer: Multi-Camera 3D Object Detection with Polar Transformer

Author: Jiang, Yanqin, primary, Zhang, Li, additional, Miao, Zhenwei, additional, Zhu, Xiatian, additional, Gao, Jin, additional, Hu, Weiming, additional, and Jiang, Yu-Gang, additional
Published: 2023
Full Text: View/download PDF

248. Look Before You Match: Instance Understanding Matters in Video Object Segmentation

Author: Wang, Junke, primary, Chen, Dongdong, additional, Wu, Zuxuan, additional, Luo, Chong, additional, Tang, Chuanxin, additional, Dai, Xiyang, additional, Zhao, Yucheng, additional, Xie, Yujia, additional, Yuan, Lu, additional, and Jiang, Yu-Gang, additional
Published: 2023
Full Text: View/download PDF

249. SVFormer: Semi-supervised Video Transformer for Action Recognition

Author: Xing, Zhen, primary, Dai, Qi, additional, Hu, Han, additional, Chen, Jingjing, additional, Wu, Zuxuan, additional, and Jiang, Yu-Gang, additional
Published: 2023
Full Text: View/download PDF

250. Masked Video Distillation: Rethinking Masked Feature Modeling for Self-supervised Video Representation Learning

Author: Wang, Rui, primary, Chen, Dongdong, additional, Wu, Zuxuan, additional, Chen, Yinpeng, additional, Dai, Xiyang, additional, Liu, Mengchen, additional, Yuan, Lu, additional, and Jiang, Yu-Gang, additional
Published: 2023
Full Text: View/download PDF
