8 results for "Chen, Ming-Yu"
Search Results
2. Video Classification and Retrieval with the Informedia Digital Video Library System
- Author
Hauptmann, Alexander, Yan, Rong, Qi, Y., Jin, Rong, Christel, Michael G., Derthick, M., Chen, Ming-Yu, Baron, Robert, Lin, Wei-Hao, and Ng, T. D.
- Subjects
FOS: Computer and information sciences, ComputingMilieux_COMPUTERSANDEDUCATION, 89999 Information and Computing Sciences not elsewhere classified, ComputingMilieux_MISCELLANEOUS
- Abstract
Computer Science Department
- Published
- 2002
- Full Text
- View/download PDF
3. Informedia @ TRECVID 2010
- Author
Li, Huan, Bao, Lei, Overwijk, Arnold, Liu, Wei, Zhang, Long-Fei, Yu, Shoou-I, Chen, Ming-Yu, Metze, Florian, and Hauptmann, Alexander
- Subjects
FOS: Computer and information sciences, 89999 Information and Computing Sciences not elsewhere classified
- Abstract
The Informedia group participated in four tasks this year: semantic indexing, known-item search, surveillance event detection, and event detection in the Internet multimedia pilot. For semantic indexing, in addition to training traditional SVM classifiers for each high-level feature using different low-level features, we trained a cascade classifier consisting of four layers, each using a different visual feature. For the known-item search task, we built a text-based video retrieval system and a visual-based video retrieval system, and then used query-class dependent late fusion to combine the runs from the two systems. For surveillance event detection, we focused especially on analyzing motion and humans in videos, detecting events through three channels. First, we adopted a robust new descriptor called MoSIFT, which explicitly encodes appearance features together with motion information, and trained event classifiers in sliding windows using a bag-of-video-word approach. Second, we used human detection and tracking algorithms to locate and track human regions, and then considered only the MoSIFT points inside those regions. Third, after obtaining the decisions, we used the human detection results to filter them. In addition, to further reduce the number of false alarms, we aggregated short positive windows to favor long segmentations and applied a cascade classifier approach. Performance shows dramatic improvement over last year on the event detection task. For event detection in the Internet multimedia pilot, our system is based purely on textual information in the form of Automatic Speech Recognition (ASR) and Optical Character Recognition (OCR). We submitted three runs: a run based on a simple combination of three different ASR transcripts, a run based on OCR only, and a run that combines ASR and OCR. We observed that both ASR and OCR contribute to the goals of this task; however, the video collection is very challenging for these features, resulting in low recall but high precision. (A sketch of the query-class dependent late fusion step follows this entry.)
- Published
- 2010
- Full Text
- View/download PDF
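The query-class dependent late fusion mentioned in the abstract above can be pictured as a per-class weighted combination of the text and visual retrieval runs. The sketch below is a minimal illustration under stated assumptions: the query classes, weights, and score format are hypothetical, not the authors' actual settings.

```python
# Hypothetical sketch of query-class dependent late fusion. The query
# classes and per-class weights below are assumptions for illustration.

def fuse_runs(text_scores, visual_scores, query_class):
    """Combine per-video scores from a text run and a visual run.

    text_scores / visual_scores: dict video_id -> score in [0, 1].
    query_class: coarse query type used to pick fusion weights.
    """
    weights = {                  # (text weight, visual weight); assumed,
        "person":  (0.7, 0.3),   # in practice learned on held-out queries
        "object":  (0.3, 0.7),
        "general": (0.5, 0.5),
    }
    w_text, w_vis = weights.get(query_class, weights["general"])
    videos = set(text_scores) | set(visual_scores)
    fused = {v: w_text * text_scores.get(v, 0.0) +
                w_vis * visual_scores.get(v, 0.0) for v in videos}
    # Return a ranked list, best match first.
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)
```

The design point is that the weighting depends only on the query's class, so each class can lean on whichever modality historically served it better.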
4. Informedia @ TRECVID 2009: Analyzing Video Motions
- Author
Chen, Ming-Yu, Li, Huan, and Hauptmann, Alexander
- Subjects
FOS: Computer and information sciences, 89999 Information and Computing Sciences not elsewhere classified
- Abstract
The Informedia team participated in the tasks of high-level feature extraction and event detection in surveillance video. This year, we focused especially on analyzing motion in videos. We developed a robust new descriptor called MoSIFT, which explicitly encodes appearance features together with motion information. For high-level feature detection, we trained multi-modality classifiers that combine traditional static features with MoSIFT. The experimental results show that MoSIFT performs well on motion-related concepts and is complementary to static features. For event detection, we trained event classifiers in sliding windows using a bag-of-video-word approach. To reduce the number of false alarms, we aggregated short positive windows to favor long segmentations and applied a cascade classifier approach. Performance shows dramatic improvement over last year on the event detection task. (A sketch of the sliding-window scoring and window aggregation follows this entry.)
- Published
- 2009
- Full Text
- View/download PDF
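Both this entry and the TRECVID 2010 entry above score fixed-length sliding windows with a bag-of-video-words classifier and then merge adjacent positive windows into longer segments. A minimal sketch, assuming a quantized-MoSIFT codebook, a probabilistic binary classifier, and illustrative window, stride, and threshold values:

```python
# Sketch of sliding-window bag-of-video-words event scoring plus aggregation
# of short positive windows. Codebook size, window length, stride, and the
# decision threshold are assumed values, not the authors' settings.
import numpy as np

CODEBOOK_SIZE = 1000  # assumed number of visual words

def window_histogram(word_ids):
    """L1-normalized histogram of visual-word ids inside one window."""
    hist = np.bincount(word_ids, minlength=CODEBOOK_SIZE).astype(float)
    return hist / max(hist.sum(), 1.0)

def score_windows(frame_words, clf, win=25, stride=5):
    """frame_words: per-frame arrays of visual-word ids.
    clf: trained binary classifier exposing predict_proba,
    e.g. sklearn.svm.SVC(probability=True)."""
    scores = []
    for start in range(0, max(len(frame_words) - win + 1, 1), stride):
        chunk = [w for w in frame_words[start:start + win] if len(w)]
        words = np.concatenate(chunk) if chunk else np.zeros(0, dtype=int)
        scores.append((start, clf.predict_proba([window_histogram(words)])[0, 1]))
    return scores

def merge_positive_windows(scores, win=25, thresh=0.5):
    """Aggregate overlapping positive windows into longer event segments."""
    segments, cur = [], None
    for start, p in scores:
        if p < thresh:
            continue
        if cur and start <= cur[1]:   # overlaps current segment: extend it
            cur = (cur[0], start + win)
        else:                         # otherwise start a new segment
            if cur:
                segments.append(cur)
            cur = (start, start + win)
    if cur:
        segments.append(cur)
    return segments
```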
5. Discriminative Fields for Modeling Semantic Concepts in Video
- Author
Chen, Ming-Yu and Hauptmann, Alexander
- Subjects
FOS: Computer and information sciences, 89999 Information and Computing Sciences not elsewhere classified
- Abstract
According to some current thinking, a very large number of semantic concepts could provide researchers with a novel way to characterize video and could be utilized for video retrieval and understanding. These semantic concepts are not isolated from each other, so exploiting the relationships between multiple semantic concepts in video is a potentially useful way to enhance concept detection performance. In this paper we present a discriminative learning framework called Multi-concept Discriminative Random Field (MDRF) for building probabilistic models of video semantic concept detection that incorporate related concepts as well as the observation. The proposed model exploits the power of discriminative graphical models to simultaneously capture the associations of concepts with the observed data and the interactions between related concepts. Compared with previous methods, this model not only captures the co-occurrence between concepts but also incorporates the data observation in a unified framework. We also present an approximate parameter estimation algorithm and apply it to TRECVID 2005 data. Our experiments show promising results compared to the single-concept learning approach for video semantic detection. (The assumed model form is sketched after this entry.)
- Published
- 2006
- Full Text
- View/download PDF
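The abstract above describes a discriminative random field over concept labels conditioned jointly on the observation. A plausible form, under the assumption that the model follows the standard discriminative-random-field template (the paper's exact potentials may differ):

```latex
% Assumed MDRF form: y_i is the label of concept i, x the video observation,
% V the concept set, E the edges between related concepts.
P(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})}
  \exp\!\Big( \sum_{i \in V} A_i(y_i, \mathbf{x})
            + \sum_{(i,j) \in E} I_{ij}(y_i, y_j, \mathbf{x}) \Big)
```

Here A_i is an association potential tying concept i to the observed data and I_ij an interaction potential between related concepts; the partition function Z(x) is what makes exact training intractable and motivates the approximate parameter estimation the abstract mentions.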
6. Informedia @ TRECVID2008: Exploring New Frontiers
- Author
Hauptmann, Alexander, Baron, Robert V., Chen, Ming-Yu, Christel, Michael G., Lin, Wei-Hao, Mummert, Lily, Schlosser, Steve, Sun, Xinghua, Valdes, Victor, and Yang, Sun
- Subjects
FOS: Computer and information sciences, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 89999 Information and Computing Sciences not elsewhere classified
- Abstract
The Informedia team participated in the tasks of rushes summarization, high-level feature extraction, and event detection in surveillance video. For rushes summarization, our basic idea was to use video subsampled at an appropriate rate, showing almost the whole video faster, and then modify the result to remove garbage frames. Simply subsampling the frames proved to be the best method for summarizing BBC rushes video; the other improvements neither raised the basic inclusion rate nor appreciably affected the other subjective metrics. For high-level feature detection, we trained exclusively on TRECVID’05 data and tried to assess and predict the reliability of the detectors. A voting scheme for combining multiple classifiers performed best, marginally better than trying to predict the best classifier based on a robustness calculation from within-dataset cross-domain performance. For event detection, we found that the overall approach was effective at characterizing a presegmented event in the training data, but the lack of event segmentation (information about the duration of an event and the existence of a known event) resulted in a dramatically lower score in the official evaluation. (A sketch of the classifier voting scheme follows this entry.)
- Published
- 2008
- Full Text
- View/download PDF
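The voting scheme for combining classifiers, mentioned in the abstract above, can be illustrated by rank-normalizing each classifier's scores and averaging. The rank-normalization step is an assumption, used here so that differently scaled classifier outputs can vote on equal terms.

```python
# Hypothetical sketch of combining multiple high-level feature classifiers
# by voting. Rank normalization is an assumed detail, not from the paper.
import numpy as np

def vote(scores):
    """scores: (n_classifiers, n_shots) array of raw detector outputs.
    Returns one fused score per shot; higher means more classifiers
    rank that shot near the top of their own lists."""
    order = np.argsort(np.argsort(scores, axis=1), axis=1)  # per-row ranks
    normalized = order / max(scores.shape[1] - 1, 1)        # ranks -> [0, 1]
    return normalized.mean(axis=0)
```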
7. MoSIFT: Recognizing Human Actions in Surveillance Videos
- Author
Chen, Ming-Yu and Hauptmann, Alexander
- Subjects
FOS: Computer and information sciences, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 89999 Information and Computing Sciences not elsewhere classified
- Abstract
The goal of this paper is to build robust human action recognition for real-world surveillance videos. Local spatio-temporal features around interest points provide compact but descriptive representations for video analysis and motion recognition. Current approaches tend to extend spatial descriptors by adding a temporal component to the appearance descriptor, which only implicitly captures motion information. We propose an algorithm called MoSIFT, which detects interest points and not only encodes their local appearance but also explicitly models local motion. The idea is to detect distinctive local features through local appearance and motion. We construct MoSIFT feature descriptors in the spirit of the well-known SIFT descriptors, robust to small deformations through grid aggregation. We also introduce a bigram model that captures correlations between local features to represent the more global structure of actions. The method advances the state-of-the-art result on the KTH dataset to an accuracy of 95.8%. We also applied our approach to 100 hours of surveillance data as part of the TRECVID event detection task, with very promising results on recognizing human actions in real-world surveillance videos. (A sketch of the appearance-plus-motion interest-point idea follows this entry.)
- Published
- 2009
- Full Text
- View/download PDF
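MoSIFT's core idea, per the abstract above, is to keep only interest points that both look distinctive and actually move, pairing the appearance descriptor with an explicit motion descriptor. The sketch below approximates this with OpenCV's SIFT and Farneback optical flow; the real MoSIFT builds a SIFT-style histogram over the flow field, whereas here the raw flow at the keypoint stands in for the motion part, and the motion threshold is an assumed value.

```python
# Simplified MoSIFT-style feature extraction: SIFT keypoints filtered by
# optical-flow magnitude, appearance descriptor concatenated with raw flow.
# This is an illustrative approximation, not the published algorithm.
import cv2
import numpy as np

def mosift_like(prev_gray, curr_gray, min_motion=1.0):
    sift = cv2.SIFT_create()
    kps, desc = sift.detectAndCompute(curr_gray, None)
    if desc is None:
        return []
    # Dense optical flow between consecutive grayscale frames.
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    features = []
    for kp, d in zip(kps, desc):
        x, y = int(kp.pt[0]), int(kp.pt[1])
        fx, fy = flow[y, x]
        if np.hypot(fx, fy) >= min_motion:   # keep only moving keypoints
            features.append(np.concatenate([d, [fx, fy]]))
    return features
```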
8. Toward Robust Face Recognition from Multiple Views
- Author
Chen, Ming-Yu and Hauptmann, Alexander
- Subjects
FOS: Computer and information sciences, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 89999 Information and Computing Sciences not elsewhere classified
- Abstract
This paper presents a novel approach to aid face recognition: using multiple views of a face, we construct a 3D model instead of directly using the 2D images for recognition. Our framework is designed for videos, which contain many instances of a target face from a sequence of slightly differing views, as opposed to a single static picture of the face. Specifically, we reconstruct the 3D face shape from two orthogonal views and select features based on pairwise distances between landmark points on the model using Fisher's linear discriminant. While 3D face shape reconstruction is sensitive to the quality of feature point localization, our experiments show that 3D reconstruction together with the regularized Fisher's linear discriminant can provide highly accurate face recognition from multiple facial views. Experiments on the Carnegie Mellon PIE (pose, illumination, and expression) database, containing the faces of 68 people with at least 3 expressions under varying lighting conditions, demonstrate vastly improved performance. (A sketch of the distance-feature and discriminant stage follows this entry.)
- Published
- 2004
- Full Text
- View/download PDF
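The recognition stage described above reduces each reconstructed 3D face to pairwise distances between landmark points and classifies them with a regularized Fisher discriminant. A minimal sketch, assuming landmarks arrive as 3D coordinate arrays and using scikit-learn's shrinkage LDA as a stand-in for the paper's regularized discriminant:

```python
# Sketch: pairwise landmark distances as features, regularized LDA as the
# classifier. Landmark count and the shrinkage choice are assumptions.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def pairwise_distance_features(landmarks):
    """landmarks: (n_points, 3) array of 3-D landmark positions.
    Returns the upper-triangular pairwise distances as a flat vector."""
    diffs = landmarks[:, None, :] - landmarks[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    iu = np.triu_indices(len(landmarks), k=1)  # skip diagonal and duplicates
    return dists[iu]

# Shrinkage LDA stands in for the regularized Fisher linear discriminant.
clf = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto")
# Usage (hypothetical data):
#   X = np.stack([pairwise_distance_features(lm) for lm in face_landmarks])
#   clf.fit(X, identities)
#   predicted = clf.predict(X_test)
```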