269 results for "Video indexing"
Search Results
2. Efficient feature based video retrieval and indexing using pattern change with invariance algorithm.
- Author
-
Namala, Vasu and Karuppusamy, S. Anbu
- Subjects
- *VIDEOS, *FEATURE extraction, *TAGS (Metadata), *GRAPH labelings, *FUZZY logic, *ALGORITHMS
- Abstract
The amount of audio-visual content kept in networked repositories has increased dramatically in recent years. Many video hosting websites exist, such as YouTube, Metacafe, and Google Video. Currently, indexing and categorising these videos is a time-consuming task: the system either asks users to provide tags for the videos they submit, or manual labelling is used. The aim of this research is to develop a classifier that can accurately identify videos. Every video has content that is visual, audio, or textual, and researchers have categorised videos based on any of these three variables. With the Pattern Change with Size Invariance (PCSI) algorithm, this study provides a hybrid model that takes all three components of the video into account: audio, visual, and textual content. The study attempts to classify videos into broad categories such as education, sports, movies, and amateur videos. Key feature extraction and pattern matching are used to accomplish this, and a fuzzy logic and ranking system assigns the tag to the video. The proposed system is tested on both a virtual device and a real distributed cluster to evaluate real-time performance, especially when the number and duration of videos are considerable. Retrieval efficiency, measured with metrics such as accuracy, precision, and recall, exceeds 99%. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
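The fuzzy-logic-and-ranking step described in the abstract above can be illustrated with a minimal sketch: per-modality match scores are fused with a fuzzy AND (minimum), the categories are ranked, and the top one becomes the tag. The category names and score values are hypothetical; this is not the authors' PCSI implementation.

```python
def assign_tag(scores):
    """Fuse per-modality match scores in [0, 1] with a fuzzy AND (min),
    then rank the categories and return the best one.

    scores: {category: {'audio': a, 'visual': v, 'text': t}}
    """
    fused = {cat: min(m.values()) for cat, m in scores.items()}
    # rank categories by fused membership, highest first
    ranking = sorted(fused, key=fused.get, reverse=True)
    return ranking[0], fused

# hypothetical modality scores for one input video
tag, fused = assign_tag({
    'education': {'audio': 0.7, 'visual': 0.8, 'text': 0.9},
    'sports':    {'audio': 0.9, 'visual': 0.4, 'text': 0.3},
    'movies':    {'audio': 0.5, 'visual': 0.6, 'text': 0.2},
})
```

With these scores the fused memberships are 0.7, 0.3, and 0.5 respectively, so the video is tagged 'education'.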
3. Multi-Object Semantic Video Detection and Indexing Using a 3D Deep Learning Model.
- Author
-
Mofreh, Eslam, Abozeid, Amr, Farouk, Hesham, and ElDahshan, Kamal A.
- Subjects
DEEP learning, VIDEOS, OBJECT recognition (Computer vision), FRAMES (Linguistics), SOCIAL media
- Abstract
Raw data production has grown exponentially, attributable in no small part to social media platforms such as Facebook and YouTube. Video is proving to be the most important data type thanks to the substantial amount of raw data it contains, and it requires an efficient way to be understood, organized, structured, and stored for ease of retrieval. An efficient video indexing architecture is thus crucial for video datasets. This paper proposes an efficient Multi-Object Semantic Video Detection (MOSD) model that leverages deep learning to achieve effective indexing at the semantic concept level. MOSD is a multi-detection network for video semantics across multiple frames; it exploits a 3D convolution operation to perform multiple detections over multiple frames with higher performance. The detected semantics are then structured and used for indexing the video segments. MOSD has been trained and evaluated on the ImageNet VID dataset and compared to its peers. It showed efficiency in exploiting the temporal context of a video to perform simultaneous detections on consecutive frames, which speeds up the detection of semantic objects, and achieved a mAP of 85.2%. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
4. INDEXED DATASET FROM YOUTUBE FOR A CONTENT-BASED VIDEO SEARCH ENGINE.
- Author
-
Adly, Ahmad Sedky, Hegazy, Islam, Elarif, Taha, and Abdelwahab, M. S.
- Subjects
SEARCH engines, STREAMING video & television, BIG data, COMPUTER vision
- Abstract
Numerous studies on content-based video indexing and retrieval, as well as video search engines, depend on a large-scale video dataset. Unfortunately, the scarcity of open-source datasets complicates the exploration of novel approaches. Existing video datasets that index files hosted on public video streaming services serve other purposes, such as annotation, learning, classification, and other computer vision areas, with little interest in indexing public video links for searching and retrieval. This paper introduces a novel large-scale dataset based on YouTube video links to evaluate the proposed content-based video search engine. It gathers 1088 videos, representing more than 65 hours of video, 11,000 video shots, and 66,000 unmarked and marked keyframes, with 80 different object names used for marking. It also provides a state-of-the-art feature vector and combinational-based matching, which benefit the accuracy, speed, and precision of the video retrieval process. Each video record in the dataset is represented by three features, a temporal combination vector, an object combination vector with shot annotations, and 6 keyframes, alongside other metadata. Video classification was also imposed on the dataset to improve the efficiency of retrieving video-based queries. A two-phase approach based on object and event classification stores video records in aggregations related to the extracted feature vectors: object aggregation stores video records under the most frequently extracted object/concept across all shots, while event aggregation classifies videos into groups according to the number of shots per video. This study indexed 58 out of 80 different object/concept categories, each with 9 shot-number groups. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
5. Video content analysis and retrieval system using video storytelling and indexing techniques.
- Author
-
Jacob, Jaimon, Elayidom, M. Sudheep, and Devassia, V. P.
- Subjects
VIDEOS, CONTENT analysis, SYSTEM analysis, SEARCH algorithms, WORD frequency, VIDEO excerpts
- Abstract
Videos are often used for communicating ideas, concepts, experiences, and situations because of the significant advances made in video communication technology, and social media platforms have rapidly expanded video usage. At present, a video is recognized using metadata such as its title, description, and thumbnails. Often, however, a searcher requires only a video clip on a specific topic from a long video. This paper proposes a novel methodology that analyses video content and uses video storytelling and indexing techniques to retrieve the intended clip from a long-duration video. The video storytelling technique is used for content analysis and to produce a description of the video. The description thus created is used to prepare an index with the wormhole algorithm, guaranteeing the search of a keyword of definite length L within the minimum worst-case time. This index can then be used by a video search algorithm to retrieve the relevant part of the video based on the frequency of the search keyword in the video index. Instead of downloading or transferring a whole video, the user can download or transfer only the specifically required clip, which considerably eases the network constraints associated with video transfer. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
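The retrieve-by-keyword-frequency idea in the abstract above can be sketched as a plain inverted index over per-clip descriptions. This is not the wormhole algorithm itself (which guarantees worst-case search time for keywords of length L), only the frequency-based lookup, and the clip IDs and texts are hypothetical.

```python
from collections import Counter, defaultdict

def build_index(clip_descriptions):
    """Map each word to {clip_id: term frequency} over per-clip descriptions."""
    index = defaultdict(dict)
    for clip_id, text in clip_descriptions.items():
        for word, n in Counter(text.lower().split()).items():
            index[word][clip_id] = n
    return index

def best_clip(index, keyword):
    """Return the clip in which the keyword occurs most often, or None."""
    postings = index.get(keyword.lower(), {})
    return max(postings, key=postings.get) if postings else None

# hypothetical storytelling descriptions of two clips from one long video
idx = build_index({
    'clip1': 'goal goal replay crowd',
    'clip2': 'interview goal highlights',
})
```

A query for "goal" then resolves to `clip1`, the segment where the word is most frequent, so only that clip needs to be transferred.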
6. Joint Multi-View Hashing for Large-Scale Near-Duplicate Video Retrieval.
- Author
-
Nie, Xiushan, Jing, Weizhen, Cui, Chaoran, Zhang, Chen Jason, Zhu, Lei, and Yin, Yilong
- Subjects
- *HASHING, *STREAMING video & television, *WEB databases, *VIDEOS
- Abstract
Multi-view hashing can well support large-scale near-duplicate video retrieval, thanks to its desirable advantages of mutual reinforcement of multiple features, low storage cost, and fast retrieval speed. However, two limitations still impede its performance. First, existing methods consider only local structures in multiple features; they ignore the global structure that is important for near-duplicate video retrieval and cannot fully exploit the dependence and complementarity of multiple features. Second, existing works always learn hashing functions bit by bit, which increases the time complexity of hash function learning. In this paper, we propose a supervised hashing scheme, termed joint multi-view hashing (JMVH), to address these problems. It jointly preserves the global and local structures of multiple features while learning hashing functions efficiently. Specifically, JMVH treats video features as items, based on which an underlying Hamming space is learned by simultaneously preserving their local and global structures. In addition, a simple but efficient multi-bit hash function learning method based on generalized eigenvalue decomposition is devised to learn multiple hash functions within a single step, significantly reducing the time complexity of conventional learning processes that acquire multiple hash functions sequentially, bit by bit. JMVH is evaluated on two public databases, CC_WEB_VIDEO and UQ_VIDEO. Experimental results demonstrate that it achieves more than a 5 percent improvement over several state-of-the-art methods, indicating its superior performance. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
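The single-step, multi-bit idea in the abstract above can be sketched with a generalized eigendecomposition: all projection directions come out of one `eigh` call instead of a bit-by-bit loop. This is a toy stand-in (a PCA-style covariance against a regularized identity constraint), not the JMVH objective; all names and the regularizer are assumptions.

```python
import numpy as np
from scipy.linalg import eigh

def learn_hash_projections(X, n_bits=4, reg=1e-3):
    """Solve one generalized eigenproblem C v = w B v and keep the top
    n_bits eigenvectors as hash projections, all in a single step."""
    Xc = X - X.mean(axis=0)
    C = Xc.T @ Xc / len(X)                 # data covariance
    B = (1.0 + reg) * np.eye(C.shape[0])   # stand-in constraint matrix
    _, V = eigh(C, B)                      # eigenvalues in ascending order
    return V[:, ::-1][:, :n_bits]          # top n_bits directions at once

def hash_codes(X, W):
    """Binarize the projections by sign to get n_bits-bit codes."""
    return ((X - X.mean(axis=0)) @ W > 0).astype(np.uint8)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 16))             # 100 items, 16-dim features
W = learn_hash_projections(X, n_bits=4)
codes = hash_codes(X, W)
```

The point of the sketch is the cost profile: one eigendecomposition yields all bits, whereas bit-by-bit schemes solve one optimization per bit.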
7. Improved Fuzzy-Based SVM Classification System Using Feature Extraction for Video Indexing and Retrieval.
- Author
-
Gayathri, N. and Mahesh, K.
- Subjects
SUPPORT vector machines, FEATURE extraction, MULTIMEDIA systems, HISTOGRAMS, FUZZY systems
- Abstract
Various studies have addressed video abstraction alongside the constant development of multimedia technology. However, deficiencies remain in the pre-processing of video frames before classified video archives can be obtained. To overcome these drawbacks, feature extraction and classification approaches are considered. Here, video indexing is performed by extracting several features and generating dominant frames from the input video. A fuzzy-based SVM classifier is utilized to categorize the frame set into dominant structures. Multi-dimensional Histogram of Oriented Gradients (HOG) and colour feature extraction are used to extract texture features from the video frames. Using the frame sequence, the vector space of structures is captured, and the dominant frames are utilized in video indexing. Shot transitions are classified with a fuzzy system. Experimental outcomes demonstrate that shot boundary detection accuracy increases with the number of iterations. The simulation was carried out in the MATLAB environment. The technique attains an accuracy of about 95.4%, with precision, recall, and F1 score of 100%; the misclassification rate is 4.6%. The proposed method shows a better trade-off than existing techniques. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
8. Video genre identification using clustering-based shot detection algorithm.
- Author
-
Daudpota, Sher Muhammad, Muhammad, Atta, and Baber, Junaid
- Abstract
Rapid growth in storage technology and data acquisition has significantly increased the volume of multimedia data online, and analyzing such massive quantities of data is a challenging problem. In recent years, content-based indexing of video files has gained popularity in the research community. There have also been attempts to identify whether a video clip belongs to a specific genre, e.g., sports, movie, drama, animation, or talk show. These techniques, however, rely on a long list of audio-visual features to achieve the classification, which decreases processing efficiency. Based on certain patterns in audio-visual features and the basic grammar of talk shows, this research differentiates talk-show content from the other video genres. Our multimodal rule-based classification approach exploits shots and scenes in a video as classification features. Content from popular multimedia sources such as DailyMotion and YouTube, along with movies from Hollywood and Bollywood, is used as the dataset to test the overall genre identification system. The system achieves precision and recall of 98% and 100%, respectively, on 600 selected videos totalling more than 600 h in classifying multimedia content as either 'TalkShow' or 'OtherVideo'. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
9. Solar panel monitoring: real-time system using video watermarking and Mosaicing.
- Author
-
Lafkih, Sara and Zaz, Youssef
- Subjects
SOLAR cells, DISCRETE cosine transforms, DIGITAL watermarking, SOLAR energy, SOLAR power plants
- Abstract
In this paper, we propose a new approach to monitoring a solar plant in real time using an embedded vision system comprised mainly of a Raspberry Pi 3 board, GPS and thermometer sensors, and an HD camera module. The approach applies image processing techniques to solar energy fields and is implemented in two steps. The first applies a digital watermarking technique based on the discrete cosine transform (DCT) to index the captured video: all information from the solar plant (GPS coordinates, temperature, date, and time) is captured, and the related data are embedded in each video frame by transforming the frames from the spatial to the frequency domain using the DCT and embedding data in the coefficients of 8 × 8 blocks. The second step applies a mosaicing technique based on the scale-invariant feature transform (SIFT) to generate a panoramic image of the solar plant from the video frames. Remotely, the supervisor can visualize an image of the whole plant, a part of it, or a single panel, and exploit the embedded data for efficient real-time monitoring. The proposed method offers high efficiency in terms of accuracy, data security, and fast data retrieval, as well as high capacity, imperceptibility, and good robustness. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
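The DCT embedding step in the abstract above can be sketched as quantization index modulation on one mid-frequency coefficient per 8 × 8 block. The paper does not specify the coefficient position, quantization strength, or payload format, so those choices below are assumptions.

```python
import numpy as np
from scipy.fft import dctn, idctn

def embed_bits(frame, bits, coeff=(4, 3), strength=12.0):
    """Embed one bit per 8x8 block: quantize a mid-frequency DCT
    coefficient to an even (bit 0) or odd (bit 1) multiple of `strength`."""
    f = frame.astype(float).copy()
    h, w = f.shape
    i = 0
    for by in range(0, h - 7, 8):
        for bx in range(0, w - 7, 8):
            if i >= len(bits):
                return f
            block = dctn(f[by:by + 8, bx:bx + 8], norm='ortho')
            q = int(np.round(block[coeff] / strength))
            if q % 2 != bits[i]:
                q += 1            # nudge parity to match the bit
            block[coeff] = q * strength
            f[by:by + 8, bx:bx + 8] = idctn(block, norm='ortho')
            i += 1
    return f

def extract_bits(frame, n_bits, coeff=(4, 3), strength=12.0):
    """Read back the parity of the quantized coefficient in each block."""
    h, w = frame.shape
    out = []
    for by in range(0, h - 7, 8):
        for bx in range(0, w - 7, 8):
            if len(out) >= n_bits:
                return out
            block = dctn(frame[by:by + 8, bx:bx + 8], norm='ortho')
            out.append(int(np.round(block[coeff] / strength)) % 2)
    return out
```

In the paper's setting the bit payload would be a serialization of the GPS coordinates, temperature, date, and time captured for each frame.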
10. Histogram difference with Fuzzy rule base modeling for gradual shot boundary detection in video cloud applications.
- Author
-
Kethsy Prabavathy, A. and Devi Shree, J.
- Subjects
- *VIDEO compression, *FUZZY sets, *HISTOGRAMS, *SYSTEM identification, *VIDEOS, *CONTENT analysis
- Abstract
In shot boundary detection, the fundamental step is video content analysis for video indexing, summarization, and retrieval in video cloud-based applications. Although previous work offers several benefits, reliable detection of video shots is still a challenging issue. This paper focuses on the problem of detecting gradual transitions in video. The proposed approach is a fuzzy rule-based system for gradual-transition identification, in which a set of fuzzy rules is evaluated against dissolves and wipes (fade-in and fade-out) during gradual transitions. First, features are extracted from the video frames; then the fuzzy rules are applied to the frames to identify gradual transitions. The main advantage of the proposed method is its increased accuracy in gradual-transition detection. In addition, existing gradual-detection algorithms are mainly threshold-based, whereas the proposed method is rule-based. The method is evaluated on a variety of video sequences from different genres and compared with existing techniques from the literature. Experimental results demonstrate its effectiveness in terms of precision and recall rates. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
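The core signal behind the detector in the abstract above, a frame-to-frame histogram difference that rules then classify, can be sketched with a crisp two-threshold stand-in for the fuzzy rule base. The thresholds are illustrative, not from the paper.

```python
import numpy as np

def hist_diff(f1, f2, bins=16):
    """Normalized L1 distance between intensity histograms, in [0, 1]."""
    h1, _ = np.histogram(f1, bins=bins, range=(0, 256))
    h2, _ = np.histogram(f2, bins=bins, range=(0, 256))
    h1 = h1 / h1.sum()
    h2 = h2 / h2.sum()
    return 0.5 * np.abs(h1 - h2).sum()

def classify_transitions(frames, low=0.1, high=0.5):
    """Label each consecutive frame pair: 'none', 'gradual', or 'cut'."""
    labels = []
    for a, b in zip(frames, frames[1:]):
        d = hist_diff(a, b)
        labels.append('cut' if d >= high else 'gradual' if d >= low else 'none')
    return labels
```

A fuzzy version would replace the two hard thresholds with membership functions over the same difference signal and fire rules on their degrees.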
11. New fusional framework combining sparse selection and clustering for key frame extraction
- Author
-
Mengjuan Fei, Wei Jiang, Weijie Mao, and Zhendong Song
- Subjects
fusional framework, sparse selection, video indexing, video content preservation, mutual information based agglomerative hierarchical clustering, optimal key frame, Computer applications to medicine. Medical informatics, R858-859.7, Computer software, QA76.75-76.765
- Abstract
Key frame extraction can facilitate rapid browsing and efficient video indexing in many applications. However, to be effective, key frames must preserve sufficient video content while also being compact and representative. This study proposes a syncretic key frame extraction framework that combines sparse selection (SS) and mutual information‐based agglomerative hierarchical clustering (MIAHC) to generate effective video summaries. In the proposed framework, the SS algorithm is first applied to the original video sequences to obtain optimal key frames. Then, using content‐loss minimisation and representativeness ranking, several candidate key frames are efficiently selected and grouped as initial clusters. A post‐processor – an improved MIAHC – subsequently performs further processing to eliminate redundant images and generate the final key frames. The proposed framework overcomes issues such as information redundancy and computational complexity that afflict conventional SS methods by first obtaining candidate key frames instead of accurate key frames. Subsequently, application of the improved MIAHC to these candidate key frames rather than the original video not only results in the generation of accurate key frames, but also reduces the computation time for clustering large videos. The results of comparative experiments conducted on two benchmark datasets verify that the performance of the proposed SS–MIAHC framework is superior to that of conventional methods.
- Published
- 2016
- Full Text
- View/download PDF
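The clustering stage of the framework above can be sketched with plain agglomerative clustering on per-frame histograms, taking the frame nearest each cluster centroid as a key frame. This omits the sparse-selection stage and uses Euclidean average linkage instead of the paper's mutual-information criterion, so it is only an illustration of the grouping idea.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def keyframes_by_clustering(frames, n_clusters=3, bins=16):
    """Cluster per-frame intensity histograms hierarchically and return
    the index of the frame closest to each cluster centroid."""
    feats = np.array([
        np.histogram(f, bins=bins, range=(0, 256))[0] / f.size
        for f in frames
    ])
    labels = fcluster(linkage(feats, method='average'),
                      t=n_clusters, criterion='maxclust')
    keys = []
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        centroid = feats[idx].mean(axis=0)
        # representative frame: nearest to the cluster centroid
        keys.append(int(idx[np.argmin(np.linalg.norm(feats[idx] - centroid, axis=1))]))
    return sorted(keys)
```

In the full framework the input here would be the candidate key frames from sparse selection rather than the raw video, which is what keeps the clustering cheap on long videos.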
12. Learning a Multi-Concept Video Retrieval Model with Multiple Latent Variables.
- Author
-
MAZAHERI, AMIR, GONG, BOQING, and SHAH, MUBARAK
- Subjects
VIDEOS, DETECTORS, IMAGE retrieval, MULTIMEDIA systems, DIGITAL media
- Abstract
Effective and efficient video retrieval has become a pressing need in the "big video" era. The objective of this work is to provide a principled model for computing the ranking scores of a video in response to one or more concepts, where the concepts could be directly supplied by users or inferred by the system from the user queries. Indeed, how to deal with multi-concept queries has become a central component in modern video retrieval systems that accept text queries. However, it has been long overlooked and simply implemented by weighted averaging of the corresponding concept detectors' scores. Our approach, which can be considered as a latent ranking SVM, integrates the advantages of various recent works in text and image retrieval, such as choosing ranking over structured prediction, modeling inter-dependencies between querying concepts, and so on. Videos consist of shots, and we use latent variables to account for the mutually complementary cues within and across shots. Concept labels of shots are scarce and noisy. We introduce a simple and effective technique to make our model robust to outliers. Our approach gives superior performance when it is tested on not only the queries seen at training but also novel queries, some of which consist of more concepts than the queries used for training. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
13. Improving the robustness of motion vector temporal descriptor.
- Author
-
Rahmani, Farzaneh, Zargari, Farzad, and Ghanbari, Mohammad
- Abstract
Motion vectors (MVs) are the most common temporal descriptors in video analysis, indexing, and retrieval applications. However, MV-based video indexing and analysis do not perform well for videos at different dimension ratios (DRs) or even different resolutions. As a result, identifying similar videos by their MVs faces many difficulties across DRs and resolutions. In this study, a two-stage algorithm is introduced to make MV descriptors robust against variations first in DR and then in resolution. In experiments performed on motion vector histograms, the proposed method improves the identification of similar videos at various spatial specifications by up to 73%. Moreover, in video retrieval experiments, the proposed modified MV feature vector outperforms the original one, indicating improved differentiation of similar and dissimilar videos by the proposed temporal feature vector. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
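The invariance goal in the abstract above can be illustrated by normalizing each motion vector by the frame dimensions before histogramming, so the same motion rendered at two resolutions yields the same descriptor. This is a generic normalization sketch, not the paper's two-stage algorithm.

```python
import numpy as np

def mv_histogram(mvs, frame_w, frame_h, bins=8):
    """Resolution-robust MV descriptor: normalize each (dx, dy) by the frame
    size, then histogram the motion angle weighted by normalized magnitude."""
    mvs = np.asarray(mvs, dtype=float)
    dx = mvs[:, 0] / frame_w
    dy = mvs[:, 1] / frame_h
    ang = np.arctan2(dy, dx)                 # direction in [-pi, pi]
    mag = np.hypot(dx, dy)                   # dimensionless magnitude
    hist, _ = np.histogram(ang, bins=bins, range=(-np.pi, np.pi), weights=mag)
    s = hist.sum()
    return hist / s if s > 0 else hist
```

Dividing dx by width and dy by height also cancels a dimension-ratio change, since each axis is rescaled by its own frame extent.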
14. A Computational Framework for Simultaneous Real-Time High-Level Video Representation: Extraction of moving objects and related events
- Author
-
Amer, Aishy, Foresti, Gian Luca, editor, Regazzoni, Carlo S., editor, and Varshney, Pramod K., editor
- Published
- 2003
- Full Text
- View/download PDF
15. Bridging the Semantic Gap in Content Management Systems: Computational Media Aesthetics
- Author
-
Dorai, Chitra, Venkatesh, Svetha, Shah, Mubarak, editor, Dorai, Chitra, editor, and Venkatesh, Svetha, editor
- Published
- 2002
- Full Text
- View/download PDF
16. Organizing egocentric videos of daily living activities.
- Author
-
Ortis, Alessandro, Farinella, Giovanni M., D’Amico, Valeria, Addesso, Luca, Torrisi, Giovanni, and Battiato, Sebastiano
- Subjects
- *VIDEOS, *WEARABLE cameras, *ACTIVITIES of daily living, *IMAGE segmentation, *DIGITAL image processing
- Abstract
Egocentric videos are becoming popular owing to the possibility of observing the scene flow from the user's point of view (First Person Vision). Among the different applications of egocentric vision is daily living monitoring of a user wearing the camera. We propose a system able to automatically organize egocentric videos acquired by the user over different days. Through unsupervised temporal segmentation, each egocentric video is divided into chapters based on its visual content. The video segments obtained for the different days are then connected according to the scene context in which the user acts. Experiments on a challenging egocentric video dataset demonstrate the effectiveness of the proposed approach, which outperforms the state of the art by a good margin in both accuracy and computational time. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
17. Surveillance Video Synopsis Techniques: A Review.
- Author
-
Gandhi, Shefali and Ratanpara, Tushar
- Subjects
VIDEO surveillance, CRIMINAL investigation
- Abstract
This is the era of video surveillance, not just security. The arrival of inexpensive surveillance cameras and the increasing demands of security have caused an explosive growth of surveillance videos, which are used by governments and other organizations for the prevention or investigation of crime. As browsing such lengthy videos is very time consuming, most of them are never watched or analyzed. Video synopsis is a technique to represent such lengthy videos in condensed form by showing multiple activities simultaneously. The purpose of this paper is to explore the development stages of video synopsis, its various algorithms, the frameworks and tools used to implement them, the challenges and limitations of existing techniques, and its application in the field of surveillance video analysis. [ABSTRACT FROM AUTHOR]
- Published
- 2017
18. EFFECTIVE VIDEO RETRIEVAL USING CLUSTERING TECHNIQUE.
- Author
-
Joseph, Dennis and Saravanan, D.
- Subjects
INFORMATION retrieval, VIDEOS, DOCUMENT clustering, PSEUDOCODE (Computer program language), DATA mining
- Abstract
Knowledge extraction is one of the fastest growing fields today. Properly extracting the needed information is a challenging job for many researchers, and it becomes even more complicated for multimedia content. Clustering is a useful technique for finding meaningful patterns in a given data set, and video clustering is applied to video files for video data mining. Existing clustering techniques work well for only a few particular types of input. It has been experimentally verified that the proposed clustering technique offers the best clustering solution for a majority of input files. [ABSTRACT FROM AUTHOR]
- Published
- 2017
19. A motion and illumination resilient framework for automatic shot boundary detection.
- Author
-
Kar, T. and Kanungo, P.
- Abstract
Manually detecting and locating desired information in huge amounts of video data is very cumbersome, which necessitates segmenting long videos into shots and finding the boundaries between them. However, shot boundary detection performs unsatisfactorily on video sequences containing flashlight effects and complex object/camera motion. The proposed method automatically recognises abrupt boundaries between shots in the presence of motion and illumination change. Typically, a scene change detection algorithm incorporates temporal separation into a shot-resemblance metric; here, the absolute sum of gradient orientation feature differences is compared against an automatically generated threshold to sense a cut. Experimental study on the TRECVid 2001 data set and other publicly available data sets confirms the potential of the proposed scheme, which identifies scene boundaries efficiently in complex environments while preserving a good trade-off between recall and precision. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
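The pipeline in the abstract above can be sketched as follows: a magnitude-weighted gradient-orientation histogram per frame, an absolute-sum difference between consecutive frames, and a threshold generated automatically from the data. The paper's exact threshold rule is not given here, so the mean-plus-k·std rule below is an assumption.

```python
import numpy as np

def orientation_hist(frame, bins=8):
    """Magnitude-weighted histogram of gradient orientations."""
    gy, gx = np.gradient(frame.astype(float))
    ang = np.arctan2(gy, gx)
    mag = np.hypot(gx, gy)
    h, _ = np.histogram(ang, bins=bins, range=(-np.pi, np.pi), weights=mag)
    s = h.sum()
    return h / s if s > 0 else h

def detect_cuts(frames, k=3.0):
    """Flag a cut where the absolute-sum histogram difference exceeds an
    automatically generated threshold (mean + k * std of all differences)."""
    diffs = np.array([np.abs(orientation_hist(a) - orientation_hist(b)).sum()
                      for a, b in zip(frames, frames[1:])])
    thr = diffs.mean() + k * diffs.std()
    return [i + 1 for i, d in enumerate(diffs) if d > thr]
```

Because the descriptor is an orientation histogram, a global brightness change (e.g., a flash) shifts gradient magnitudes roughly uniformly and leaves the normalized histogram largely intact, which is the motivation for this feature.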
20. A Branch-and-Bound Framework for Unsupervised Common Event Discovery.
- Author
-
Chu, Wen-Sheng, Torre, Fernando, Cohn, Jeffrey, and Messinger, Daniel
- Subjects
- *HUMAN behavior, *SUPERVISED learning, *FACE-to-face communication, *BRANCH & bound algorithms, *GLOBAL optimization
- Abstract
Event discovery aims to discover a temporal segment of interest, such as human behavior, actions or activities. Most approaches to event discovery within or between time series use supervised learning. This becomes problematic when relevant event labels are unknown, are difficult to detect, or not all possible combinations of events have been anticipated. To overcome these problems, this paper explores Common Event Discovery (CED), a new problem that aims to discover common events of variable-length segments in an unsupervised manner. A potential solution to CED is searching over all possible pairs of segments, which would incur a prohibitive quartic cost. In this paper, we propose an efficient branch-and-bound (B&B) framework that avoids exhaustive search while guaranteeing a globally optimal solution. To this end, we derive novel bounding functions for various commonality measures and provide extensions to multiple commonality discovery and accelerated search. The B&B framework takes as input any multidimensional signal that can be quantified into histograms. A generalization of the framework can be readily applied to discover events at the same or different times (synchrony and event commonality, respectively). We consider extensions to video search and supervised event detection. The effectiveness of the B&B framework is evaluated in motion capture of deliberate behavior and in video of spontaneous facial behavior in diverse interpersonal contexts: interviews, small groups of young adults, and parent-infant face-to-face interaction. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
21. Character-based indexing and browsing with movie ontology.
- Author
-
Quang Dieu Tran, Dosam Hwang, and Jung, Jason J.
- Subjects
- *ONTOLOGY, *INFORMATION theory, *THEORY of knowledge, *PHILOSOPHY, *SEMANTICS
- Abstract
Various content-based methods (e.g., audio-visual feature recognition and dialogue analysis) have been proposed for analyzing, indexing, and understanding movies. Our previous approach introduced a method for character indexing based on manual annotation; however, its indexing performance proved unsatisfactory in several respects. To address this issue, this work employs image processing techniques for semi-automatic character-based indexing. In addition, a movie ontological model is created to connect character appearances with characters' roles in the movie, and we propose a system to assist users with manual indexing. A search and browsing tool is also introduced, with which users can issue character-based semantic queries. Experimental results show that the proposed method reduces users' indexing time and provides automatic indexing, searching, and browsing based on semantic queries. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
22. Indexed Captioned Searchable Videos: A Learning Companion for STEM Coursework.
- Author
-
Tuna, Tayfun, Subhlok, Jaspal, Barker, Lecia, Shah, Shishir, Johnson, Olin, and Hovey, Christopher
- Subjects
- *FILMED lectures, *STEM education, *LEARNING, *EDUCATION, *INDEXING, *EQUIPMENT & supplies
- Abstract
Videos of classroom lectures have proven to be a popular and versatile learning resource. A key shortcoming of the lecture video format is accessing the content of interest hidden in a video. This work meets this challenge with an advanced video framework featuring topical indexing, search, and captioning (ICS videos). Standard optical character recognition (OCR) technology was enhanced with image transformations to extract text from video frames in support of indexing and search. The images and text on video frames are analyzed to divide lecture videos into topical segments. The ICS video player integrates indexing, search, and captioning into video playback, providing instant access to the content of interest. The framework has been used by more than 70 courses in a variety of STEM disciplines and assessed by more than 4000 students. Survey results demonstrate the value of the videos as a learning resource, the role they play in a student's learning process, and the value of indexing and search features in an educational video platform. This paper reports on the development and evaluation of the ICS videos framework and over 5 years of usage experience in several STEM courses. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
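The topical segmentation idea in the record above (dividing a lecture video where the on-screen content changes) can be sketched by comparing OCR'd word sets of consecutive sampled frames and starting a new segment when their overlap drops. The Jaccard threshold and the per-frame strings are illustrative assumptions, not the ICS implementation.

```python
def segment_topics(frame_texts, threshold=0.2):
    """Given OCR'd text per sampled frame, start a new topical segment when
    the Jaccard similarity of consecutive word sets drops below `threshold`.
    Returns (start, end) frame-index pairs, inclusive."""
    segments, start = [], 0
    prev = set(frame_texts[0].lower().split())
    for i, text in enumerate(frame_texts[1:], 1):
        cur = set(text.lower().split())
        union = prev | cur
        jac = len(prev & cur) / len(union) if union else 1.0
        if jac < threshold:
            segments.append((start, i - 1))   # close the current topic
            start = i
        prev = cur
    segments.append((start, len(frame_texts) - 1))
    return segments
```

Each resulting segment would then be labeled with its dominant OCR terms to build the topical index shown in the player.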
23. Efficient processing of video containment queries by using composite ordinal features.
- Author
-
Seo, Jung and Kim, Myoung
- Subjects
VIDEO processing, CONTENT-based image retrieval, INDEXING, VIDEO compression, QUERYING (Computer science)
- Abstract
Video containment queries find videos whose frame sequences are similar to a query video clip. Applying sequence matching to all possible subsequences is computationally expensive for large volumes of video data. In this paper, we propose an efficient candidate segment selection scheme that selects only a small set of subsequences to be matched against the query sequence, using clusters of similar frames called frame clusters. We also propose a new type of ordinal feature, the composite ordinal feature, which allows multiple ranks to be assigned to certain cells. In experiments with large-scale video data sets, we show that our method improves query response time by efficiently selecting the set of subsequences for sequence matching. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
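An ordinal feature of the kind the record above builds on can be sketched by ranking grid-cell mean intensities. The "composite" variant additionally lets several cells share a rank, approximated here by quantizing the means before ranking; the grid size and quantization levels are illustrative, not from the paper.

```python
import numpy as np

def ordinal_signature(frame, grid=(2, 2)):
    """Rank of each grid cell's mean intensity (0 = darkest cell)."""
    h, w = frame.shape
    gy, gx = grid
    means = np.array([frame[i*h//gy:(i+1)*h//gy, j*w//gx:(j+1)*w//gx].mean()
                      for i in range(gy) for j in range(gx)])
    return np.argsort(np.argsort(means))   # rank of each cell mean

def composite_signature(frame, grid=(2, 2), levels=4):
    """Composite variant: quantize cell means so near-equal cells share a rank."""
    h, w = frame.shape
    gy, gx = grid
    means = np.array([frame[i*h//gy:(i+1)*h//gy, j*w//gx:(j+1)*w//gx].mean()
                      for i in range(gy) for j in range(gx)])
    return np.floor(means / 256.0 * levels).astype(int)
```

Because only cell-mean orderings are kept, the signature is unchanged when the frame is rescaled, which is what makes ordinal features cheap to compare across videos.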
24. Desafios e avanços na recuperação automática da informação audiovisual / Challenges and advancements in automatic retrieval of audiovisual information
- Author
-
Juliano Serra Barreto
- Subjects
Sistemas de recuperação da informação visual, Indexação de vídeos, Recuperação do conteúdo audiovisual, Content based image retrieval, Video indexing, Multimedia content retrieval, Bibliography. Library science. Information resources, Information resources (General), ZA3040-5185
- Abstract
Presentation of the processes and methods used for textual indexing and retrieval of semantic information in video, based on the identification and classification of its visual and audio content.
- Published
- 2007
- Full Text
- View/download PDF
25. Fuzzy reasoning framework to improve semantic video interpretation.
- Author
-
Zarka, Mohamed, Ben Ammar, Anis, and Alimi, Adel
- Subjects
FUZZY control systems ,MULTIMEDIA communications ,INFORMATION retrieval ,KNOWLEDGE management ,VIDEOS - Abstract
A video retrieval system user hopes to find relevant information even when the proposed queries are ambiguous. A retrieval process based on detecting concepts remains ineffective in such situations. Potential relationships between concepts have been shown to be a valuable knowledge resource that can enhance retrieval effectiveness, even for ambiguous queries. Recent research in multimedia retrieval has focused on ontology modeling as a common framework for managing knowledge. Handling these ontologies has to cope with issues related to generic knowledge management and processing scalability. Considering these issues, we suggest a context-based fuzzy ontology framework for video content analysis and indexing. In this paper, we focus on the way we modeled our fuzzy ontology: first, we automatically populate the generated ontology by gathering various available video annotation datasets; then, the ontology content is used to infer an enhanced semantic interpretation of the video; finally, the content of the ontology is improved by considering user feedback. Experimental results showed that our approach achieves the goal of scalability while allowing a better semantic interpretation of video content. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
26. Temporal mapping of surveillance video for indexing and summarization.
- Author
-
Bagheri, Saeid, Zheng, Jiang Yu, and Sinha, Shivank
- Subjects
MATHEMATICAL mappings ,VIDEO surveillance ,INDEXING ,IMAGE processing ,HUMAN-machine systems ,FRAMES (Video) ,PATTERN recognition systems ,REAL-time computing - Abstract
This work converts surveillance video to a temporal-domain image, called a temporal profile, that is scrollable and scalable for quick searching of long surveillance videos by human operators. The profile is sampled along linear pixel lines located at critical locations in the video frames. It carries precise time stamps of targets passing through those locations in the field of view, shows target shapes for identification, and facilitates target search in long videos. In this paper, we first study the projection and shape properties of dynamic scenes in the temporal profile so as to set the sampling lines. Then, we design methods to capture target motion and preserve target shapes for target recognition in the temporal profile. The profile also provides a uniform resolution for large passing crowds, making it powerful for target counting and flow measuring. We also align multiple sampling lines to visualize the spatial information missed in a single-line temporal profile. Finally, we achieve real-time adaptive background removal and robust target extraction to ensure long-term surveillance. Compared to the original or a shortened video, the temporal profile reduces the data by one dimension while keeping the majority of the information for further video investigation. As an intermediate indexing image, the profile can be transmitted via a network much faster than video for online video search tasks by multiple operators. Because the temporal profile can abstract passing targets with efficient computation, an even more compact digest of the surveillance video can be created. [ABSTRACT FROM AUTHOR]
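The core construction, sampling one pixel line per frame and stacking the samples over time, can be sketched as follows (the function name and list-of-lists frame format are assumptions for illustration):

```python
def temporal_profile(frames, line_x):
    """Build a temporal profile: row t of the output is the vertical
    pixel line at column `line_x` of frame t, so a video of T frames
    of height H collapses to a T x H image."""
    return [[row[line_x] for row in frame] for frame in frames]
```

A target crossing the sampled line leaves a shape in the profile whose row index is its exact passing time, which is what makes the profile searchable, countable, and one dimension smaller than the video.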
- Published
- 2016
- Full Text
- View/download PDF
27. An Efficient Video Retrieval Algorithm Using Key Frame Matching for Video Content Management.
- Author
-
Sang Hyun Kim
- Subjects
DIGITAL video ,INFORMATION retrieval ,ALGORITHMS ,EUCLIDEAN metric ,HISTOGRAMS ,IMAGE registration - Abstract
To manipulate large video contents, effective video indexing and retrieval are required. A large number of video indexing and retrieval algorithms have been presented for frame-wise user queries or video content queries, whereas relatively few video sequence matching algorithms have been proposed for video sequence queries. In this paper, we propose an efficient algorithm that extracts key frames using color histograms and matches video sequences using edge features. To match video sequences effectively with a low computational load, we make use of the key frames extracted by the cumulative measure and the distance between key frames, and compare two sets of key frames using the modified Hausdorff distance. Experimental results with real sequences show that the proposed video sequence matching algorithm using edge features yields higher accuracy and performance than conventional methods such as the histogram difference, Euclidean metric, Bhattacharyya distance, and directed divergence methods. [ABSTRACT FROM AUTHOR]
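The modified Hausdorff distance used to compare the two key-frame sets replaces the max in each directed distance with a mean, which makes it less sensitive to a single outlier frame. A self-contained sketch (representing feature points as coordinate tuples is an assumption for illustration):

```python
def modified_hausdorff(A, B):
    """Modified Hausdorff distance between two point sets.

    Each directed distance is the MEAN (not the max) of nearest-neighbour
    distances; the result is the larger of the two directed distances.
    """
    def euclid(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

    def directed(X, Y):
        # Average distance from each point of X to its nearest point in Y.
        return sum(min(euclid(x, y) for y in Y) for x in X) / len(X)

    return max(directed(A, B), directed(B, A))
```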
- Published
- 2016
- Full Text
- View/download PDF
28. Transform Invariant Approach to Video Fingerprinting.
- Author
-
Pathirana W. P. D. M. and Kodikara N. D.
- Subjects
- *
HUMAN fingerprints , *INDEXING , *VIDEO recording , *VIDEOS , *PHOTOMETRY - Abstract
Estimating the consistency of spatial-temporal transformations at the object level between an original video and its copies is one of the main issues arising in video fingerprinting. The key purpose of this research is to find an efficient approach to this problem, even when the comparison involves long videos under various types of photometric and geometric transformations. The proposed approach converts classical copy detection into a high-level pattern matching task. Here, frame-wise SIFT interest point trajectory data provide precise, invariant information about the video in the pattern matching phase. The classical challenges of video copy detection, such as brightness changes, blur, zooming, size changes, cropping, illumination changes, noise, and aspect-ratio changes, are successfully addressed by the proposed methodology. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
29. Temporal Segmentation of MPEG Video Streams
- Author
-
Janko Calic and Ebroul Izquierdo
- Subjects
shot detection ,video indexing ,compressed domain ,MPEG stream. ,Telecommunication ,TK5101-6720 ,Electronics ,TK7800-8360 - Abstract
Many algorithms for temporal video partitioning rely on the analysis of uncompressed video features. Since the information relevant to the partitioning process can be extracted directly from the MPEG compressed stream, higher efficiency can be achieved utilizing information from the MPEG compressed domain. This paper introduces a real-time algorithm for scene change detection that analyses the statistics of the macroblock features extracted directly from the MPEG stream. A method for extraction of the continuous frame difference that transforms the 3D video stream into a 1D curve is presented. This transform is then further employed to extract temporal units within the analysed video sequence. Results of computer simulations are reported.
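The transform described, reducing the 3D video stream to a 1D frame-difference curve whose peaks mark scene changes, can be sketched as follows (the L1 distance on per-frame feature vectors and the mean-plus-k-sigma peak test are illustrative assumptions; the paper itself works on macroblock statistics taken directly from the MPEG stream):

```python
def difference_curve(features):
    """Reduce a sequence of per-frame feature vectors to a 1-D curve of
    successive L1 distances, one value per frame transition."""
    return [sum(abs(a - b) for a, b in zip(f0, f1))
            for f0, f1 in zip(features, features[1:])]

def detect_cuts(curve, k=2.0):
    """Flag transitions whose difference exceeds mean + k * std."""
    n = len(curve)
    mean = sum(curve) / n
    std = (sum((v - mean) ** 2 for v in curve) / n) ** 0.5
    thr = mean + k * std
    # i + 1 is the index of the first frame of the new shot.
    return [i + 1 for i, v in enumerate(curve) if v > thr]
```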
- Published
- 2002
- Full Text
- View/download PDF
30. SAPTE: A multimedia information system to support the discourse analysis and information retrieval of television programs.
- Author
-
Pereira, Moisés, Souza, Celso, Pádua, Flávio, Silva, Giani, Assis, Guilherme, and Pereira, Adriano
- Subjects
MULTIMEDIA systems ,DISCOURSE analysis ,INFORMATION retrieval research ,TELEVISION programs research ,TELEVISION networks ,METADATA - Abstract
This paper presents a novel multimedia information system, called SAPTE, for supporting the discourse analysis and information retrieval of television programs from their corresponding video recordings. Unlike most common systems, SAPTE uses both content independent and dependent metadata, which are determined by the application of discourse analysis techniques as well as image and audio analysis methods. The proposed system was developed in partnership with the free-to-air Brazilian TV channel Rede Minas in an attempt to provide TV researchers with computational tools to assist their studies about this media universe. The system is based on the Matterhorn framework for managing video libraries, combining: (1) discourse analysis techniques for describing and indexing the videos, by considering aspects, such as, definitions of the subject of analysis, the nature of the speaker and the corpus of data resulting from the discourse; (2) a state of the art decoder software for large vocabulary continuous speech recognition, called Julius; (3) image and frequency domain techniques to compute visual signatures for the video recordings, containing color, shape and texture information; and (4) hashing and k-d tree methods for data indexing. The capabilities of SAPTE were successfully validated, as demonstrated by our experimental results, indicating that SAPTE is a promising computational tool for TV researchers. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
31. Automatic measure of imitation during social interaction: A behavioral and hyperscanning-EEG benchmark.
- Author
-
Delaherche, Emilie, Dumas, Guillaume, Nadel, Jacqueline, and Chetouani, Mohamed
- Subjects
- *
SOCIAL interaction , *ELECTROENCEPHALOGRAPHY , *HUMAN-computer interaction , *PATTERN recognition systems , *SUPERVISED learning - Abstract
Social neuroscience shows a growing interest in the study of social interaction. Investigating its neural underpinnings has been greatly facilitated by the development of hyperscanning, a neuroimaging technique that records simultaneously the brain activity of multiple humans engaged in a social exchange. However, the analysis of spontaneous social interaction requires indexing the ongoing behavior. Since spontaneous exchanges are intrinsically unconstrained, only manual frame-by-frame indexing has been used so far. Here we present an automatic measure of imitation during spontaneous social interaction. Participants' gestures are characterized with Bag-of-Words and one-class SVM models, and a measure of imitation is derived from the likelihood ratio between these models. We apply this method to hyperscanning EEG recordings of spontaneous imitation of bimanual hand movements. The comparison with manual indexing validates the method at both the behavioral and neural levels, demonstrating its ability to significantly discriminate periods of imitation and non-imitation during social interaction. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
32. Automatic detection of slide transitions in lecture videos.
- Author
-
Jeong, Hyun, Kim, Tak-Eun, Kim, Hyeon, and Kim, Myoung
- Subjects
DIGITAL image processing ,DIGITIZATION ,DATA visualization ,COMPUTER simulation ,DIGITAL video - Abstract
This paper presents a method to automatically detect slide changes in lecture videos. For accurate detection, the regions capturing slide images are first identified from video frames. Then, SIFT features are extracted from the regions, which are invariant to image scaling and rotation. These features are used to compare similarity between frames. If the similarity is smaller than a threshold, slide transition is detected. The threshold is estimated based on the mean and standard deviation of sample frames' similarities. Using this method, high detection accuracy can be obtained without any supplementary slide images. The proposed method also supports detection of backward slide transitions that occur when a speaker returns to a previous slide to emphasize its contents. In experiments conducted on our test collection, the proposed method showed 87 % accuracy in forward transition detection and 86 % accuracy in backward transition detection. [ABSTRACT FROM AUTHOR]
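The adaptive threshold described, derived from the mean and standard deviation of sample inter-frame similarities, can be sketched as follows (the constant k and the function names are assumptions, not the paper's exact estimator):

```python
def adaptive_threshold(similarities, k=3.0):
    """Estimate a slide-transition threshold from sample inter-frame
    similarities: a transition is a similarity more than k standard
    deviations below the mean."""
    n = len(similarities)
    mean = sum(similarities) / n
    std = (sum((s - mean) ** 2 for s in similarities) / n) ** 0.5
    return mean - k * std

def find_transitions(similarities, k=3.0):
    """Indices of frame pairs whose similarity drops below the threshold."""
    thr = adaptive_threshold(similarities, k)
    return [i for i, s in enumerate(similarities) if s < thr]
```

In the paper's setting the similarity would come from SIFT feature matching between the slide regions of consecutive frames; here any per-pair similarity score works.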
- Published
- 2015
- Full Text
- View/download PDF
33. Characterization and recognition of dynamic textures based on the 2D+T curvelet transform.
- Author
-
Dubois, Sloven, Péteri, Renaud, and Ménard, Michel
- Abstract
The research context of this article is the recognition and description of dynamic textures. In image processing, the wavelet transform has been successfully used for characterizing static textures. To our best knowledge, only two works are using spatio-temporal multiscale decomposition based on the tensor product for dynamic texture recognition. One contribution of this article is to analyze and compare the ability of the 2D+T curvelet transform, a geometric multiscale decomposition, for characterizing dynamic textures in image sequences. Two approaches using the 2D+T curvelet transform are presented and compared using three new large databases. A second contribution is the construction of these three publicly available benchmarks of increasing complexity. Existing benchmarks are either too small not available or not always constructed using a reference database. Feature vectors used for recognition are described as well as their relevance, and performances of the different methods are discussed. Finally, future prospects are exposed. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
34. Derin öğrenmeye dayalı Türkçe video indeksleme ve bilgi getirimi sistemi (Deep learning-based Turkish video indexing and information retrieval system)
- Author
-
Rasheed, Jawad and Akhtar, Jamil
- Subjects
Text recognition ,Video sunma ,Evrişimsel sinir ağları ,Metin tespiti ,Convolutional neural network ,Deep learning ,Makine öğrenmesi ,Derin öğrenme ,Metin tanıma ,Video indexing ,Machine learning ,Video retrieval ,Video indeksleme ,Text detection - Abstract
The continual technological advancement of handheld devices and personal computers over the past few decades has reshaped the world's communication by enabling humans and robots to capture and share images and videos in digitized form at large. In practice, annotation-based video indexing and retrieval systems are widely used to keep up with the ongoing growth of multimedia content. These systems support multimedia content retrieval through textual annotations, but are limited to predefined annotations/keywords. Online multimedia content libraries require manual annotation of a video while uploading, a tedious and time-consuming task whose result sometimes does not even align with the visual content. This limits search capability, as users may be unable to retrieve a video because of an incomplete description given at annotation time. An efficient and sophisticated video indexing and retrieval system is therefore strongly required. To accomplish this, content-based video indexing that detects text appearing in videos is an optimal solution. This dissertation demonstrates a new text detection system based on an advanced deep learning approach, bridging the gap by building an automatic and efficient content-based video indexing and retrieval system for Turkish videos. The text appearing in videos provides useful information that can be exploited for developing an automatic video indexing and retrieval system. Therefore, this study integrates heuristic and deep learning-based approaches that utilize a CNN for automatic text detection and extraction. To train the proposed CNN-based model, a new dataset was generated by collecting videos from various Turkish channels covering news, finance and business, sports, and cartoons. The dataset is fed to the proposed model, which first generates feature maps and then classifies each image as textual or non-textual.
Extensive trials and experiments were carried out with different structural combinations of convolutional layers, yielding the best of three proposed models for accurate text detection. Next, the extracted text is fed to the publicly available Tesseract OCR engine for recognition and is then indexed in a database along with video information such as the file storage location. Lastly, a web-based user interface is provided for querying. For each user query, the proposed system retrieves the most relevant videos based on the textual content appearing inside them. Besides displaying the retrieved videos in the user interface, the system also informs the user of the times at which the queried words appear inside each retrieved video, so that the user can jump directly to the point of interest using the seek bar. All basic functionality is provided to play, pause, maximize, minimize, and download the retrieved video, with additional controls for volume and the seek bar. Moreover, various conventional machine-learning algorithms such as SVM and LR, and several state-of-the-art image classification models (including VGG16, ResNet50, and DenseNet121), were also implemented and trained with identical datasets. The proposed models outperformed these prior deep learning frameworks and machine learning classifiers.
- Published
- 2021
35. Dynamic detection of visual entities.
- Author
-
Bursuc, Andrei, Zaharia, Titus, and Preteux, Francoise
- Abstract
This paper tackles the issue of retrieving different instances of an object of interest within a given video document or a video database. The principle consists of considering a semi-global image representation based on an over-segmentation of the video frames. An aggregation mechanism is then applied in order to group a set of segments into an object similar to the query, under a global similarity criterion. We test the effectiveness of three different aggregation strategies, two of them based on a greedy approach and the third involving simulated annealing optimization. Experimental results on different color spaces show promising performance, with First Tier and Bull's Eye detection rates of up to 70% and 88%, respectively. The integration of the method into a web-based video navigation system, allowing fast video object retrieval, is finally described. [ABSTRACT FROM PUBLISHER]
- Published
- 2012
36. 2D/3D semantic categorization of visual objects.
- Author
-
Diana Petre, Raluca and Zaharia, Titus
- Abstract
In the context of content-based indexing applications, the automatic classification and interpretation of visual content is a key issue that needs to be solved. This paper proposes a novel approach to semantic video object interpretation. The principle consists of exploiting the a priori information contained in categorized 3D model data sets in order to transfer the semantic labels from such models to unknown video objects. Each 3D model is represented as a set of 2D views, described with the help of shape descriptors. A matching technique is used to associate categorized 3D models with 2D video objects. The experimental evaluation demonstrates the benefit of our approach, which yields recognition rates of up to 92.5%. [ABSTRACT FROM PUBLISHER]
- Published
- 2012
37. Automatic affective video indexing: Sound energy and object motion correlation discovery.
- Author
-
French, Jean H.
- Abstract
No longer are video creation and storage solely in the hands of professionals. Video repositories are growing at an astounding rate due to advances in multimedia technologies. The vast size of video repositories presents challenges for users attempting to identify preferred content, and automated methods for content discovery are necessary to meet their needs. One of the more challenging areas of video content discovery is identifying affective, or emotional, video content. Automatic affective video indexing techniques attempt to use computer-based methods to automatically identify content in videos that is affective in nature. This is the first known automatic affective video indexing study that focuses on slapstick, one of the most popular types of humor. The study shows positive results and contributes to the field by identifying the targeted affective content without relying on actual human emotional responses. [ABSTRACT FROM PUBLISHER]
- Published
- 2012
- Full Text
- View/download PDF
38. Video indexing using salient region based spatio-temporal segmentation approach.
- Author
-
Sameh, Megrhi, Wided, Souidene, Beghdadi, Azeddine, and Amar, Chokri B.
- Abstract
This paper proposes an original method for video indexing based on a spatio-temporal segmentation scheme. The basic idea is to extract salient regions from the video content and use them as scene descriptors for indexing. The obtained results confirm the efficiency of the proposed approach and open new perspectives for video summarizing and indexing. [ABSTRACT FROM PUBLISHER]
- Published
- 2012
- Full Text
- View/download PDF
39. Real swing extraction for video indexing in golf practice video.
- Author
-
Chotimanus, Peera, Cooharojananone, Nagul, and Phimoltares, Suphakant
- Abstract
This paper describes a method that extracts the start of every real golf swing in golf practice video recorded from a face-on camera angle, for video indexing. The method accelerates and improves the process of golf swing analysis by allowing the golfer to quickly navigate to the starting point of each real swing. The real swing is the most important part of a golf practice video because it truly shows the problems in the golfer's swing. Each start of a real swing is extracted by detecting the takeaway action at the start of the swing. The system consists of five main processes: golf club head detection, golf ball detection, takeaway detection, takeaway validation, and hit detection. First, the golf club shaft is detected using frame differencing, morphological operations, Canny edge detection, and the Hough line transform. The golf club head is then extracted and, in order to detect the takeaway action accurately, the system also detects the golf ball's position using the HSV color space to extract near-white colors. The information from the detected golf clubs and golf balls is then used to identify the start of each real swing. Experiments on golf practice videos with different camera setups and lighting conditions show that this method can efficiently extract the beginning of real swings. [ABSTRACT FROM PUBLISHER]
- Published
- 2012
- Full Text
- View/download PDF
40. An hierarchical approach towards road image segmentation.
- Author
-
Rahman, Ashfaqur, Verma, Brijesh, and Stockwell, David
- Abstract
The segmentation of road images from vehicle-mounted video is a challenging and difficult problem. One of the problems is the presence of different types of objects, not all of which are present in the same frame; for example, a road sign is not visible in every frame. In this paper, we propose a novel framework for segmenting road images in a hierarchical manner that can separate the following objects from the video data: sky, road, road signs, and vegetation. Each frame in the video is analysed separately, and the hierarchical approach does not assume the presence of a certain number of objects in a single frame. We have also developed a segmentation framework based on SVM learning. The proposed framework has been tested on the Transport and Main Roads Queensland video data. The experimental results indicate that the proposed framework can detect the different objects with an accuracy of 95.65%. [ABSTRACT FROM PUBLISHER]
- Published
- 2012
- Full Text
- View/download PDF
41. Key technique of video index based on scenario segmentation and content indexing.
- Author
-
Hua Liu
- Abstract
This paper covers the application of corpus-based and corpus-driven methods to the research of collocation in modern Chinese, summarizing and commenting on the basic concepts, terms, formulae, approaches and processes in collocation research. Based on previous studies, this paper raises suggestions in terms of the method of extracting collocates, determination of span, colligation and semantic prosody, and emphasizes on the use of statistics, manual proofreading and the combination of quantitative and qualitative methods in collocation study based on the characteristics of collocation in Chinese. [ABSTRACT FROM PUBLISHER]
- Published
- 2011
- Full Text
- View/download PDF
42. Indexing and keyword search to ease navigation in lecture videos.
- Author
-
Tuna, Tayfun, Subhlok, Jaspal, and Shah, Shishir
- Abstract
Lecture videos have been commonly used to supplement in-class teaching and for distance learning. Videos recorded during in-class teaching and made accessible online are a versatile resource on par with a textbook and the classroom itself. Nonetheless, the adoption of lecture videos has been limited, in large part due to the difficulty of quickly accessing the content of interest in a long video lecture. In this work, we present “video indexing” and “keyword search” that facilitate access to video content and enhances user experience. Video indexing divides a video lecture into segments indicating different topics by identifying scene changes based on the analysis of the difference image from a pair of video frames. We propose an efficient indexing algorithm that leverages the unique features of lecture videos. Binary search with frame sampling is employed to efficiently analyze long videos. Keyword search identifies video segments that match a particular keyword. Since text in a video frame often contains a diversity of colors, font sizes and backgrounds, our text detection approach requires specialized preprocessing followed by the use of off-the-shelf OCR engines, which are designed primarily for scanned documents. We present image enhancements: text segmentation and inversion, to increase detection accuracy of OCR tools. Experimental results on a suite of diverse video lectures were used to validate the methods developed in this work. Average processing time for a one-hour lecture is around 14 minutes on a typical desktop. Search accuracy of three distinct OCR engines - Tesseract, GOCR and MODI increased significantly with our preprocessing transformations, yielding an overall combined accuracy of 97%. The work presented here is part of a video streaming framework deployed at multiple campuses serving hundreds of lecture videos. [ABSTRACT FROM PUBLISHER]
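The "binary search with frame sampling" used to locate a topic change without decoding every frame can be sketched as follows (the descriptor-equality test and the function signature are illustrative assumptions, not the deployed system's interface):

```python
def locate_change(frame_at, lo, hi):
    """Binary-search the first frame index in (lo, hi] whose content
    differs from frame `lo`, given `frame_at(i)` returning a comparable
    frame descriptor. Assumes exactly one scene change in the interval,
    so only O(log n) frames need to be decoded and compared."""
    first = frame_at(lo)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if frame_at(mid) == first:
            lo = mid  # change lies to the right of mid
        else:
            hi = mid  # change lies at or before mid
    return hi
```

In a lecture video the interval endpoints come from coarse sampling, and each segment boundary found this way becomes an index entry.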
- Published
- 2011
- Full Text
- View/download PDF
43. UAV video coverage quality maps and prioritized indexing for wilderness search and rescue.
- Author
-
Morse, Bryan S., Engh, Cameron H., and Goodrich, Michael A.
- Abstract
Video-equipped mini unmanned aerial vehicles (mini-UAVs) are becoming increasingly popular for surveillance, remote sensing, law enforcement, and search and rescue operations, all of which rely on thorough coverage of a target observation area. However, coverage is not simply a matter of seeing the area (visibility) but of seeing it well enough to allow detection of targets of interest, a quality we here call "see-ability". Video flashlights, mosaics, or other geospatial compositions of the video may help place the video in context and convey that an area was observed, but not necessarily how well or how often. This paper presents a method for using UAV-acquired video georegistered to terrain and aerial reference imagery to create geospatial video coverage quality maps and indices that indicate relative video quality based on detection factors such as image resolution, number of observations, and variety of viewing angles. When used for offline post-analysis of the video, or for online review, these maps also enable geospatial quality-filtered or prioritized non-sequential access to the video. We present examples of static and dynamic see-ability coverage maps in wilderness search-and-rescue scenarios, along with examples of prioritized non-sequential video access. We also present the results of a user study demonstrating the correlation between see-ability computation and human detection performance. [ABSTRACT FROM AUTHOR]
- Published
- 2010
- Full Text
- View/download PDF
44. An Effective Video Text Tracking Algorithm Based on SIFT Feature and Geometric Constraint.
- Author
-
Na, Yinan and Wen, Di
- Abstract
Video text provides important clues for semantic-based video analysis, indexing, and retrieval, and text tracking is performed to locate specific text information across video frames and to enhance text segmentation and recognition over time. This paper presents a multilingual video text tracking algorithm based on the extraction and tracking of Scale Invariant Feature Transform (SIFT) descriptors through video frames. SIFT features are extracted from video frames to match regions of interest across frames. Meanwhile, a global matching method using a geometric constraint is proposed to decrease false matches, which effectively improves the accuracy and stability of the text tracking results. Based on the correct matches, the motion of the text is estimated in adjacent frames and a match score for the text is calculated to determine the Text Change Boundary (TCB). Experimental results on a large number of video frames show that the proposed text tracking algorithm is robust to different text forms, including multilingual captions, credits, and scene texts with shift, rotation, and scale change, under complex backgrounds and lighting changes. [ABSTRACT FROM AUTHOR]
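The geometric constraint on SIFT matches exploits the rigidity of caption motion: inlier matches share nearly the same frame-to-frame displacement. A simplified stand-in for the paper's global matching step (the median-displacement filter, the tolerance, and the match format are assumptions for illustration):

```python
def filter_matches(matches, tol=2.0):
    """Reject outlier point matches between two frames. `matches` is a
    list of ((x1, y1), (x2, y2)) pairs; a match is kept only if its
    displacement is within `tol` pixels of the median displacement."""
    def median(xs):
        s = sorted(xs)
        n = len(s)
        return s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2

    dxs = [x2 - x1 for (x1, y1), (x2, y2) in matches]
    dys = [y2 - y1 for (x1, y1), (x2, y2) in matches]
    mdx, mdy = median(dxs), median(dys)
    return [m for m, dx, dy in zip(matches, dxs, dys)
            if abs(dx - mdx) <= tol and abs(dy - mdy) <= tol]
```

The surviving matches then support the text-motion estimate and the match score used for the Text Change Boundary decision.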
- Published
- 2010
- Full Text
- View/download PDF
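The global geometric-constraint step described above can be sketched with NumPy. SIFT extraction itself (e.g., via an image library) is omitted; the sketch assumes putative keypoint matches are already given as point pairs, and the dominant-translation model used here is an illustrative simplification of the paper's constraint, chosen because caption text mostly translates rigidly between frames:

```python
import numpy as np

def filter_matches(pts_a, pts_b, tol=3.0):
    """Reject false SIFT matches with a global geometric constraint.

    pts_a, pts_b: (N, 2) arrays of putatively matched keypoint
    coordinates in two consecutive frames. Matches are kept only if
    their displacement agrees with the dominant (median) translation,
    a robust estimate of the global text motion.
    Returns a boolean inlier mask and the estimated translation.
    """
    a = np.asarray(pts_a, float)
    b = np.asarray(pts_b, float)
    disp = b - a                          # per-match displacement
    dominant = np.median(disp, axis=0)    # robust global motion estimate
    err = np.linalg.norm(disp - dominant, axis=1)
    keep = err < tol                      # inliers agree with global motion
    return keep, dominant
```

The surviving matches could then feed the motion estimation and the match score used to detect a Text Change Boundary.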
45. AUTOMATIC ADAPTIVE ASSESSMENT IN M-LEARNING.
- Author
-
Shuangbao Wang and Behrmann, Michael
- Subjects
MOBILE learning ,COLLABORATIVE learning ,POCKET computers ,INSTRUCTIONAL systems ,INTERNET - Abstract
Adaptive assessment of video-based m-training is an area that involves many technologies. This paper presents the design and implementation of an m-Learning system with video indexing and automatic caption creation features. The system adjusts the m-Learning content based on the progress of the participants. Unlike most mobile systems, in which only downloading is enabled, this system can not only download content to the devices but also upload data, such as learning results and answers to quizzes, back to the remote databases, providing a two-way interactive learning experience. In addition, the system is capable of creating video quizzes and searching videos using text from the speech. The research also focuses on collaborative learning, video indexing and automatic caption creation. The m-Learning modules, together with other resources, form a single-point interface accessible from both PDAs and PCs. The system is being tested by educational professionals on an online system with monthly traffic of over 30,000 hits. [ABSTRACT FROM AUTHOR]
- Published
- 2009
46. Unsupervised Detection of Gradual Video Shot Changes with Motion-Based False Alarm Removal.
- Author
-
Ewerth, Ralph and Freisleben, Bernd
- Abstract
The temporal segmentation of a video into shots is a fundamental prerequisite for video retrieval. There are two types of shot boundaries: abrupt shot changes ("cuts") and gradual transitions. Several high-quality algorithms have been proposed for detecting cuts, but the successful detection of gradual transitions remains a surprisingly difficult problem in practice. In this paper, we present an unsupervised approach for detecting gradual transitions. It has several advantages. First, in contrast to alternative approaches, no training stage and hence no training data are required. Second, no thresholds are needed, since the clustering approach used separates classes of gradual transitions and non-transitions automatically and adaptively for each video. Third, it is a generic approach that does not employ a specialized detector for each transition type. Finally, the issue of removing false alarms caused by camera motion is addressed: in contrast to related approaches, it is not only based on low-level features, but on the results of an appropriate algorithm for camera motion estimation. Experimental results show that the proposed approach achieves very good performance on TRECVID shot boundary test data. [ABSTRACT FROM AUTHOR]
- Published
- 2009
- Full Text
- View/download PDF
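The threshold-free idea above, separating transition candidates from non-transitions adaptively per video, can be sketched with a tiny 1-D 2-means clustering. The scalar dissimilarity score used here is an illustrative stand-in for the paper's actual features:

```python
import numpy as np

def split_candidates(scores, iters=50):
    """Separate gradual-transition candidates from non-transitions by
    2-means clustering of a per-frame dissimilarity score, adaptively
    for each video and without any fixed threshold.
    Returns a boolean mask marking the high-dissimilarity cluster.
    """
    x = np.asarray(scores, float)
    c = np.array([x.min(), x.max()])      # initial centroids at extremes
    for _ in range(iters):
        # Assign each score to its nearest centroid, then re-estimate.
        labels = np.abs(x[:, None] - c[None, :]).argmin(axis=1)
        for k in (0, 1):
            if np.any(labels == k):
                c[k] = x[labels == k].mean()
    return labels == int(c.argmax())      # True marks the transition cluster
```

Because the split adapts to each video's score distribution, no global threshold needs to be tuned, which is the property the abstract emphasizes.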
47. Performance Prediction for Unsupervised Video Indexing.
- Author
-
Ewerth, Ralph and Freisleben, Bernd
- Abstract
Recently, performance prediction has been successfully applied in the field of information retrieval for content analysis and retrieval tasks. This paper discusses how performance prediction can be realized for unsupervised learning approaches in the context of video content analysis and indexing. Performance prediction helps to identify the number of detection errors and can thus support post-processing. This is demonstrated for the example of temporal video segmentation, by presenting an approach for automatically predicting the precision and recall of a video cut detection result. For the unsupervised cut detection approach, the related clustering validity measure is shown to be highly correlated with the precision of the detection result. Three regression methods are investigated to exploit the observed correlation. Experimental results demonstrate the feasibility of the proposed performance prediction approach. [ABSTRACT FROM AUTHOR]
- Published
- 2009
- Full Text
- View/download PDF
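Exploiting the correlation between a clustering validity measure and detection precision could look like the following least-squares sketch. Linear regression is only one of the three regression methods such an approach might investigate, and the data passed in below would come from videos with known ground truth:

```python
import numpy as np

def fit_precision_predictor(validity, precision):
    """Fit a linear model precision = a * validity + b by least squares.

    `validity` holds one clustering validity score per video and
    `precision` the measured cut-detection precision on ground-truth
    data. Returns a callable that predicts precision for new videos
    from their validity score alone.
    """
    v = np.asarray(validity, float)
    p = np.asarray(precision, float)
    A = np.column_stack([v, np.ones_like(v)])      # design matrix [v, 1]
    (a, b), *_ = np.linalg.lstsq(A, p, rcond=None)
    return lambda new_v: a * np.asarray(new_v, float) + b
```

At indexing time the predictor gives an estimated precision without any ground truth, which is what makes post-processing decisions possible.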
48. A Comparison of Wavelet Based Spatio-temporal Decomposition Methods for Dynamic Texture Recognition.
- Author
-
Dubois, Sloven, Péteri, Renaud, and Ménard, Michel
- Abstract
This paper presents four spatio-temporal wavelet decompositions for characterizing dynamic textures. The main goal of this work is to compare the influence of spatial and temporal variables in the wavelet decomposition scheme. Its novelty is to establish a comparison between the only existing method [11] and three other spatio-temporal decompositions. The four decomposition schemes are presented and successfully applied to a large dynamic texture database. The construction of feature descriptors and their relevance are addressed, and the performance of the methods is discussed. Finally, future prospects are outlined. [ABSTRACT FROM AUTHOR]
- Published
- 2009
- Full Text
- View/download PDF
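One way a spatio-temporal decomposition can be realized is a separable 3-D transform over the (t, y, x) video volume. The minimal one-level Haar sketch below, with subband energies as the feature vector, is only illustrative of the general scheme; the paper's four decompositions use their own filters and orderings:

```python
import numpy as np

def haar3d(volume):
    """One-level 3-D Haar decomposition of a (t, y, x) video volume
    (all dimensions must be even). Returns the 8 subbands keyed by an
    'L'/'H' letter per axis: 'LLL' is the approximation, while e.g.
    'HLL' isolates temporal detail and 'LHH' spatial detail.
    """
    def split(a, axis):
        even = a.take(np.arange(0, a.shape[axis], 2), axis)
        odd = a.take(np.arange(1, a.shape[axis], 2), axis)
        return (even + odd) / np.sqrt(2), (even - odd) / np.sqrt(2)
    bands = {'': np.asarray(volume, float)}
    for axis in range(3):                 # filter along t, then y, then x
        bands = {key + tag: sub
                 for key, a in bands.items()
                 for tag, sub in zip('LH', split(a, axis))}
    return bands

def dynamic_texture_descriptor(volume):
    """Energies of the 7 detail subbands as a simple feature vector."""
    return {k: float((b ** 2).sum())
            for k, b in haar3d(volume).items() if 'H' in k}
```

Comparing descriptors built from purely spatial, purely temporal, or mixed subbands mirrors the paper's question of how spatial and temporal variables each contribute.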
49. Extraction of Motion Activity from Scalable-Coded Video Sequences.
- Author
-
Herranz, Luis, Tiburzi, Fabricio, and Bescós, Jesús
- Abstract
This work presents an efficient approach for the calculation of the MPEG-7 descriptor for motion activity from scalable-coded video sequences, which include scalable motion vectors and variable block sizes. We first describe the adaptation of the constant block-size assumption of the MPEG-7 descriptor to this new coding domain. Then we compare the results obtained with those for MPEG-1 videos in the context of a common application of this descriptor: video summarization. The comparable quality of these results and the gain in efficiency support the presented approach. [ABSTRACT FROM AUTHOR]
- Published
- 2006
- Full Text
- View/download PDF
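The MPEG-7 motion-activity intensity is commonly derived from the standard deviation of motion-vector magnitudes, quantized to a level from 1 to 5. The sketch below weights each vector by its block area, a simple way to emulate the constant-block-size assumption when block sizes vary, as the abstract discusses; the threshold values are illustrative placeholders, not the standard's normative ones:

```python
import numpy as np

def motion_activity(mvs, block_areas, thresholds=(3.9, 10.7, 17.1, 32.0)):
    """MPEG-7-style motion-activity intensity from motion vectors.

    `mvs`: (N, 2) motion vectors; `block_areas`: pixel area of each
    block, used as weights so variable block sizes contribute in
    proportion to the picture area they cover. The weighted standard
    deviation of the magnitudes is quantized to an activity level 1..5.
    """
    mv = np.asarray(mvs, float)
    w = np.asarray(block_areas, float)
    mag = np.linalg.norm(mv, axis=1)
    mean = np.average(mag, weights=w)
    std = np.sqrt(np.average((mag - mean) ** 2, weights=w))
    return 1 + int(np.searchsorted(thresholds, std))
```

A static scene yields level 1, while a mix of still and fast-moving blocks pushes the deviation, and hence the level, upward.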
50. An efficient compressed domain video indexing method.
- Author
-
Akrami, Farahnaz and Zargari, Farzad
- Subjects
VIDEO compression ,INFORMATION retrieval ,DECODERS & decoding ,AUTOMATIC indexing ,PIXELS ,FEATURE extraction - Abstract
Video indexing is employed to represent the features of video sequences. Motion vectors derived from compressed video are preferred for video indexing because they can be accessed by partial decoding; thus, they are used extensively in various video analysis and indexing applications. In this study, we introduce an efficient compressed-domain video indexing method and implement it on H.264/AVC coded videos. The experimental evaluations indicate that retrieval based on the proposed indexing method outperforms motion-vector-based video retrieval in 74% of queries, with little increase in computation time. Furthermore, we compared our method with a pixel-level video indexing method that employs both temporal and spatial features; experimental results indicate that our method outperforms the pixel-level method in both performance and speed. Considering the speed and precision characteristics of indexing methods, the proposed method is an efficient indexing method that can be used in various video indexing and retrieval applications. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
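A generic compressed-domain index of the kind described above can be sketched as a normalized direction/magnitude histogram of a sequence's motion vectors, compared by L1 distance at retrieval time. The bin counts and the distance choice here are illustrative assumptions, not the paper's exact feature:

```python
import numpy as np

def mv_signature(mvs, n_dir=8, n_mag=4, mag_max=16.0):
    """Joint direction/magnitude histogram of a sequence's motion
    vectors, normalized to sum to 1, usable as a compact video index
    that needs only partial decoding of the bitstream.
    """
    mv = np.asarray(mvs, float)
    mag = np.linalg.norm(mv, axis=1)
    ang = np.arctan2(mv[:, 1], mv[:, 0]) % (2 * np.pi)
    d = np.minimum((ang / (2 * np.pi) * n_dir).astype(int), n_dir - 1)
    m = np.minimum((mag / mag_max * n_mag).astype(int), n_mag - 1)
    hist = np.zeros((n_dir, n_mag))
    np.add.at(hist, (d, m), 1)            # accumulate joint bin counts
    return hist.ravel() / max(len(mv), 1)

def rank(query_sig, database_sigs):
    """Rank database videos by L1 distance to the query signature."""
    dists = [np.abs(query_sig - s).sum() for s in database_sigs]
    return np.argsort(dists)
```

Because the signature is built directly from entropy-decoded motion vectors, no pixel reconstruction is needed, which is where the speed advantage over pixel-level indexing comes from.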