86 results for "Tian, Qi"
Search Results
2. Contextual modeling on auxiliary points for robust image reranking
- Author
-
Li, Ying, Kong, Xiangwei, Fu, Haiyan, and Tian, Qi
- Published
- 2019
- Full Text
- View/download PDF
3. Sparse Matrix Based Hashing for Approximate Nearest Neighbor Search
- Author
-
Wang, Min, Zhou, Wengang, Tian, Qi, Li, Houqiang, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Chen, Enqing, editor, Gong, Yihong, editor, and Tie, Yun, editor
- Published
- 2016
- Full Text
- View/download PDF
4. Hybrid-Indexing Multi-type Features for Large-Scale Image Search
- Author
-
Luo, Qingjun, Zhang, Shiliang, Huang, Tiejun, Gao, Wen, Tian, Qi, Cremers, Daniel, editor, Reid, Ian, editor, Saito, Hideo, editor, and Yang, Ming-Hsuan, editor
- Published
- 2015
- Full Text
- View/download PDF
5. Deep hashing with top similarity preserving for image retrieval
- Author
-
Li, Qiang, Fu, Haiyan, Kong, Xiangwei, and Tian, Qi
- Published
- 2018
- Full Text
- View/download PDF
6. COGE: A Novel Binary Feature Descriptor Exploring Anisotropy and Non-uniformity
- Author
-
Mao, Zhendong, Zhang, Yongdong, Tian, Qi, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Huet, Benoit, editor, Ngo, Chong-Wah, editor, Tang, Jinhui, editor, Zhou, Zhi-Hua, editor, Hauptmann, Alexander G., editor, and Yan, Shuicheng, editor
- Published
- 2013
- Full Text
- View/download PDF
7. Image Histogram Constrained SIFT Matching
- Author
-
Luo, Ye, Xue, Ping, Tian, Qi, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Qiu, Guoping, editor, Lam, Kin Man, editor, Kiya, Hitoshi, editor, Xue, Xiang-Yang, editor, Kuo, C.-C. Jay, editor, and Lew, Michael S., editor
- Published
- 2010
- Full Text
- View/download PDF
8. Stripe: Image Feature Based on a New Grid Method and Its Application in ImageCLEF
- Author
-
Qiu, Bo, Racoceanu, Daniel, Xu, Chang Sheng, Tian, Qi, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Ng, Hwee Tou, editor, Leong, Mun-Kew, editor, Kan, Min-Yen, editor, and Ji, Donghong, editor
- Published
- 2006
- Full Text
- View/download PDF
9. Combining Visual Features for Medical Image Retrieval and Annotation
- Author
-
Xiong, Wei, Qiu, Bo, Tian, Qi, Xu, Changsheng, Ong, S. H., Foong, Kelvin, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Peters, Carol, editor, Gey, Fredric C., editor, Gonzalo, Julio, editor, Müller, Henning, editor, Jones, Gareth J. F., editor, Kluck, Michael, editor, Magnini, Bernardo, editor, and de Rijke, Maarten, editor
- Published
- 2006
- Full Text
- View/download PDF
10. Tensor index for large scale image retrieval
- Author
-
Zheng, Liang, Wang, Shengjin, Guo, Peizhen, Liang, Hanyue, and Tian, Qi
- Published
- 2015
- Full Text
- View/download PDF
11. Deep Relation Embedding for Cross-Modal Retrieval.
- Author
-
Zhang, Yifan, Zhou, Wengang, Wang, Min, Tian, Qi, and Li, Houqiang
- Subjects
IMAGE retrieval ,FEATURE extraction ,TASK analysis ,COSINE function - Abstract
Cross-modal retrieval aims to identify relevant data across different modalities. In this work, we are dedicated to cross-modal retrieval between images and text sentences, which is formulated as similarity measurement for each image-text pair. To this end, we propose a Cross-modal Relation Guided Network (CRGN) to embed image and text into a latent feature space. The CRGN model uses a GRU to extract text features and a ResNet model to learn the globally guided image feature. Based on the global feature guiding and sentence generation learning, the relation between image regions can be modeled. The final image embedding is generated by a relation embedding module with an attention mechanism. With the image embeddings and text embeddings, we conduct cross-modal retrieval based on the cosine similarity. The learned embedding space well captures the inherent relevance between image and text. We evaluate our approach with extensive experiments on two public benchmark datasets, i.e., MS-COCO and Flickr30K. Experimental results demonstrate that our approach achieves performance better than or comparable to state-of-the-art methods with notable efficiency. [ABSTRACT FROM AUTHOR] (A minimal code sketch of the cosine-similarity ranking step follows this entry.)
- Published
- 2021
- Full Text
- View/download PDF
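The abstract above describes ranking image-text pairs by cosine similarity between learned embeddings. The following is a minimal sketch of that final ranking step only, assuming the embeddings have already been produced by some encoder (the CRGN encoders themselves are not reproduced here); names and dimensions are illustrative.

```python
# Minimal sketch: rank sentence candidates for each image query by cosine similarity.
import numpy as np

def cosine_similarity_matrix(image_emb: np.ndarray, text_emb: np.ndarray) -> np.ndarray:
    """Return an (n_images, n_texts) matrix of cosine similarities."""
    img = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    txt = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    return img @ txt.T

def retrieve(image_emb: np.ndarray, text_emb: np.ndarray, top_k: int = 5) -> np.ndarray:
    """For each image, return indices of the top_k most similar sentences."""
    sims = cosine_similarity_matrix(image_emb, text_emb)
    return np.argsort(-sims, axis=1)[:, :top_k]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    images = rng.normal(size=(4, 256))   # 4 image embeddings (toy data)
    texts = rng.normal(size=(10, 256))   # 10 sentence embeddings (toy data)
    print(retrieve(images, texts, top_k=3))
```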
12. Node-Sensitive Graph Fusion via Topo-Correlation for Image Retrieval.
- Author
-
Li, Ying, Kong, Xiangwei, Fu, Haiyan, and Tian, Qi
- Subjects
IMAGE retrieval ,CONTENT-based image retrieval ,WEIGHTED graphs ,COMPUTATIONAL complexity ,BINARY codes - Abstract
Various kinds of features prove to be effective for content-based image retrieval. However, due to the diversity of image contents, a descriptor may achieve impressive performance on specific images while failing on others. Although some efforts have been made to combine features as complementary counterparts, a proper weighting scheme is still a challenge for fast and accurate retrieval. In this paper, we propose an effective fusion method, termed Topo-correlation (Topo), where the importance of each feature is measured by cross-view correlations on local affinity graphs. Specifically, the weights of similarities are node-sensitive as well as modality-sensitive, thus boosting the results of good cues while depressing adverse factors for individual images. By estimating the consensus of similarity scores with regard to a query-driven criterion, the weighted graphs are generated efficiently with low computational complexity. Extensive experimental results on four benchmarks demonstrate the superiority of the proposed approach over the state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
13. Neighborhood Pyramid Preserving Hashing.
- Author
-
Wang, Min, Zhou, Wengang, Tian, Qi, and Li, Houqiang
- Abstract
In this paper, we devote our efforts to the approximate nearest neighbor (ANN) search problem and propose a new unsupervised binary hashing method, i.e., Neighborhood Pyramid Preserving Hashing (NPH). We represent the nearest neighbors of each data point in a pyramid, and as the learning objective, we impose that the pyramid neighborhood at each level is consistently preserved across the original Euclidean space and the transformed Hamming space. The neighborhood is quantitatively characterized by its size, defined as the average distance from the involved nearest neighbors to the referred data point. Our approach is consistent with the distance-preserving principle of binary hashing and achieves stricter neighborhood structure preservation than previous graph hashing algorithms. Experiments on several large-scale benchmark datasets demonstrate that NPH achieves promising performance compared with existing state-of-the-art unsupervised binary hashing methods. [ABSTRACT FROM AUTHOR] (A minimal code sketch of the Euclidean-versus-Hamming neighborhood check follows this entry.)
- Published
- 2020
- Full Text
- View/download PDF
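The abstract above frames the learning objective as preserving each point's neighborhood across the Euclidean and Hamming spaces. Below is a minimal sketch of that objective only, measured as k-NN overlap; the hashing function here is a random sign projection used purely for illustration, not the NPH learning algorithm.

```python
# Minimal sketch: how well are Euclidean k-NN neighborhoods preserved in Hamming space?
import numpy as np

def hamming_dist(codes: np.ndarray) -> np.ndarray:
    # codes: (n, bits) array of 0/1; pairwise Hamming distances
    return (codes[:, None, :] != codes[None, :, :]).sum(-1)

def knn_sets(dist: np.ndarray, k: int):
    d = dist.astype(float).copy()
    np.fill_diagonal(d, np.inf)                # a point is not its own neighbor
    return [set(np.argsort(row)[:k]) for row in d]

def neighborhood_overlap(X: np.ndarray, codes: np.ndarray, k: int = 5) -> float:
    """Average overlap between Euclidean and Hamming k-NN neighborhoods."""
    eucl = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    ham = hamming_dist(codes)
    e_nn, h_nn = knn_sets(eucl, k), knn_sets(ham, k)
    return float(np.mean([len(a & b) / k for a, b in zip(e_nn, h_nn)]))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 32))
    W = rng.normal(size=(32, 16))              # illustrative random projection
    codes = (X @ W > 0).astype(np.uint8)       # sign binarization -> 16-bit codes
    print("Euclidean/Hamming k-NN overlap:", neighborhood_overlap(X, codes))
```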
14. Effective Image Retrieval via Multilinear Multi-Index Fusion.
- Author
-
Zhang, Zhizhong, Xie, Yuan, Zhang, Wensheng, and Tian, Qi
- Abstract
Multi-index fusion has demonstrated impressive performances in the retrieval task by integrating different visual representations in a unified framework. However, previous works mainly consider propagating similarities via a neighbor structure, ignoring the high-order information among different visual representations. In this paper, we propose a new multi-index fusion scheme for image retrieval. By formulating this procedure as a multilinear-based optimization problem, the complementary information hidden in different indexes can be explored more thoroughly. Specifically, we first build our multiple indexes from various visual representations. Then, a so-called index-specific functional matrix, which aims to propagate similarities, is introduced to update the original index. The functional matrices are then optimized in a unified tensor space to achieve a refinement, such that the relevant images can be pushed closer. The optimization problem can be efficiently solved by the augmented Lagrangian method with a theoretical convergence guarantee. Unlike the traditional multi-index fusion scheme, our approach embeds the multi-index subspace structure into the new indexes with sparse constraint and, thus, it has little additional memory consumption in the online query stage. Experimental evaluation on three benchmark datasets reveals that the proposed approach achieves state-of-the-art performance, that is, N-score 3.94 on UKBench, mAP 94.1% on Holiday, and 62.39% on Market-1501. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
15. Adaptive Hashing With Sparse Matrix Factorization.
- Author
-
Liu, Huawen, Li, Xuelong, Zhang, Shichao, and Tian, Qi
- Subjects
MATRIX decomposition ,SPARSE matrices ,HASHING ,BINARY codes ,REGULARIZATION parameter - Abstract
Hashing offers a desirable and effective solution for efficiently retrieving the nearest neighbors from large-scale data because of its low storage and computation costs. One of the most appealing techniques for hashing learning is matrix factorization. However, most hashing methods focus only on building the mapping relationships between the Euclidean and Hamming spaces and, unfortunately, underestimate the naturally sparse structures of the data. In addition, parameter tuning is always a challenging and head-scratching problem for sparse hashing learning. To address these problems, in this article, we propose a novel hashing method termed adaptively sparse matrix factorization hashing (SMFH), which exploits sparse matrix factorization to explore the parsimonious structures of the data. Moreover, SMFH adopts an orthogonal transformation to minimize the quantization loss while deriving the binary codes. The most distinguished property of SMFH is that it is adaptive and parameter-free, that is, SMFH can automatically generate sparse representations and does not require human involvement to tune the regularization parameters for the sparse models. Empirical studies on four publicly available benchmark data sets show that the proposed method can achieve promising performance and is competitive with a variety of state-of-the-art hashing methods. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
16. Social Anchor-Unit Graph Regularized Tensor Completion for Large-Scale Image Retagging.
- Author
-
Tang, Jinhui, Shu, Xiangbo, Li, Zechao, Jiang, Yu-Gang, and Tian, Qi
- Subjects
TAGS (Metadata) ,IMAGE databases ,ASSOCIATION rule mining ,IMAGE ,MATHEMATICAL regularization - Abstract
Image retagging aims to improve the tag quality of social images by completing the missing tags, rectifying the noise-corrupted tags, and assigning new high-quality tags. Recent approaches simultaneously explore visual, user and tag information to improve the performance of image retagging by mining the tag-image-user associations. However, such methods will become computationally infeasible with the rapidly increasing number of images, tags and users. It has been proven that the anchor graph can significantly accelerate large-scale graph-based learning by exploring only a small number of anchor points. Inspired by this, we propose a novel Social anchor-Unit GrAph Regularized Tensor Completion (SUGAR-TC) method to efficiently refine the tags of social images, which is insensitive to the scale of data. First, we construct an anchor-unit graph across multiple domains (e.g., image and user domains) rather than traditional anchor graph in a single domain. Second, a tensor completion based on Social anchor-Unit GrAph Regularization (SUGAR) is implemented to refine the tags of the anchor images. Finally, we efficiently assign tags to non-anchor images by leveraging the relationship between the non-anchor units and the anchor units. Experimental results on a real-world social image database well demonstrate the effectiveness and efficiency of SUGAR-TC, outperforming the state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
17. Regularized Diffusion Process on Bidirectional Context for Object Retrieval.
- Author
-
Bai, Song, Bai, Xiang, Tian, Qi, and Latecki, Longin Jan
- Subjects
DIFFUSION processes ,IMAGE retrieval ,IMAGE processing ,MACHINE learning ,GRAPHIC methods - Abstract
The diffusion process has greatly advanced object retrieval, as it can capture the underlying manifold structure. Recent studies have experimentally demonstrated that tensor product diffusion can better reveal the intrinsic relationship between objects than other variants. However, the principle remains unclear, i.e., what kind of manifold structure is captured. In this paper, we propose a new affinity learning algorithm called Regularized Diffusion Process (RDP). By deeply exploring the properties of RDP, our first yet basic contribution is providing a manifold-based explanation for tensor product diffusion. A novel criterion measuring the smoothness of the manifold is defined, which simultaneously regularizes four vertices in the affinity graph. Inspired by this observation, we further contribute two variants towards two specific goals. While ARDP can learn similarities across heterogeneous domains, HRDP performs affinity learning on the tensor product hypergraph, considering that the relationships between objects are generally more complex than pairwise. Consequently, RDP, ARDP and HRDP constitute a generic tool for object retrieval in most commonly used settings, whether or not the input relationships between objects are derived from the same domain, and whether or not they are in pairwise form. Comprehensive experiments on 10 retrieval benchmarks, especially on large-scale data, validate the effectiveness and generalization of our work. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
18. Improving Object Retrieval Quality by Integration of Similarity Propagation and Query Expansion.
- Author
-
Pang, Shanmin, Ma, Jin, Zhu, Jihua, Xue, Jianru, and Tian, Qi
- Abstract
Re-ranking is an essential step for accurate image retrieval, due to its well-known power in performance improvement. Although numerous works have been proposed for re-ranking, many of them are only customized for a certain image representation model. In contrast to most existing techniques, in this paper we develop generalized re-ranking algorithms that are applicable to different kinds of image encodings. We first employ the well-established theory of similarity propagation to reconstruct the vectors of a query and its top-ranked images and, subsequently, obtain a re-ranked list by comparing the new image vectors. Furthermore, since this strategy is directly compatible with query expansion, we propose integrating the two into a unified framework to maximize the re-ranking benefits. Our re-ranking algorithms are memory and computation efficient, and experimental results on benchmark datasets demonstrate that they compare favorably with the state of the art. Our code is available at https://github.com/MaJinWakeUp/rerank. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
19. Automatic Ensemble Diffusion for 3D Shape and Image Retrieval.
- Author
-
Bai, Song, Zhou, Zhichao, Wang, Jingdong, Bai, Xiang, Latecki, Longin Jan, and Tian, Qi
- Subjects
DIFFUSION ,IMAGE retrieval ,ARTIFICIAL neural networks ,MACHINE learning ,VISUALIZATION - Abstract
As a post-processing procedure, the diffusion process has demonstrated its ability to substantially improve the performance of various visual retrieval systems. Meanwhile, great effort has also been devoted to similarity (or metric) fusion, since a single type of similarity cannot fully reveal the intrinsic relationship between objects. This has stimulated great research interest in considering similarity fusion in the framework of the diffusion process (i.e., fusion with diffusion) for robust retrieval. In this paper, we first revisit representative methods on fusion with diffusion and provide new insights that were overlooked by previous researchers. Then, observing that existing algorithms are susceptible to noisy similarities, the proposed regularized ensemble diffusion (RED) is bundled with an automatic weight learning paradigm, so that the negative impacts of noisy similarities are suppressed. Though formulated as a convex optimization problem, one advantage of RED is that it converts back into an iteration-based solver with the same computational complexity as the conventional diffusion process. Finally, we integrate several recently proposed similarities with the proposed framework. The experimental results suggest that we can achieve new state-of-the-art performance on various retrieval tasks, including 3D shape retrieval on the ModelNet data set, and image retrieval on the Holidays and UKBench data sets. [ABSTRACT FROM AUTHOR] (A minimal code sketch of a conventional diffusion iteration follows this entry.)
- Published
- 2019
- Full Text
- View/download PDF
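The abstract above notes that the RED solver reduces to the conventional diffusion iteration. The sketch below shows only that conventional iteration on a single affinity matrix, following the standard normalize-and-propagate recipe; the ensemble weighting of RED is not reproduced, and all data are toy values.

```python
# Minimal sketch of a conventional diffusion process on a single affinity matrix.
import numpy as np

def diffusion(W: np.ndarray, alpha: float = 0.9, iters: int = 50) -> np.ndarray:
    """Diffuse a symmetric affinity matrix W over its own graph structure."""
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    S = D_inv_sqrt @ W @ D_inv_sqrt            # symmetrically normalized affinities
    n = W.shape[0]
    A = np.eye(n)                              # start from the identity
    for _ in range(iters):
        A = alpha * S @ A @ S.T + (1 - alpha) * np.eye(n)
    return A                                   # refined pairwise similarities

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    X = rng.normal(size=(50, 8))
    sq = np.linalg.norm(X[:, None] - X[None, :], axis=-1) ** 2
    W = np.exp(-sq / sq.mean())                # Gaussian affinities, data-driven bandwidth
    print(diffusion(W)[0, :5])                 # refined similarities of item 0
```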
20. SIFT Meets CNN: A Decade Survey of Instance Retrieval.
- Author
-
Zheng, Liang, Yang, Yi, and Tian, Qi
- Subjects
CONTENT-based image retrieval ,SIGNAL convolution ,ARTIFICIAL neural networks ,DETECTORS ,DATA visualization - Abstract
In the early days, content-based image retrieval (CBIR) was studied with global features. Since 2003, image retrieval based on local descriptors (de facto SIFT) has been extensively studied for over a decade due to the advantage of SIFT in dealing with image transformations. Recently, image representations based on the convolutional neural network (CNN) have attracted increasing interest in the community and demonstrated impressive performance. Given this time of rapid evolution, this article provides a comprehensive survey of instance retrieval over the last decade. Two broad categories, SIFT-based and CNN-based methods, are presented. For the former, according to the codebook size, we organize the literature into using large/medium-sized/small codebooks. For the latter, we discuss three lines of methods, i.e., using pre-trained or fine-tuned CNN models, and hybrid methods. The first two perform a single pass of an image through the network, while the last category employs a patch-based feature extraction scheme. This survey presents milestones in modern instance retrieval, reviews a broad selection of previous works in different categories, and provides insights on the connection between SIFT and CNN-based methods. After analyzing and comparing retrieval performance of different categories on several datasets, we discuss promising directions towards generic and specialized instance retrieval. [ABSTRACT FROM PUBLISHER]
- Published
- 2018
- Full Text
- View/download PDF
21. Collaborative Index Embedding for Image Retrieval.
- Author
-
Zhou, Wengang, Li, Houqiang, Sun, Jian, and Tian, Qi
- Subjects
IMAGE retrieval ,SIGNAL convolution ,ARTIFICIAL neural networks ,FEATURE extraction ,MATHEMATICAL optimization ,ACCURACY - Abstract
In content-based image retrieval, the SIFT feature and features from deep convolutional neural networks (CNNs) have demonstrated promising performance. To fully explore both visual features in a unified framework for effective and efficient retrieval, we propose a collaborative index embedding method to implicitly integrate their index matrices. We formulate the index embedding as an optimization problem from the perspective of neighborhood sharing and solve it with an alternating index update scheme. After the iterative embedding, only the embedded CNN index is kept for on-line query, which demonstrates significant gain in retrieval accuracy, with very economical memory cost. Extensive experiments have been conducted on the public datasets with million-scale distractor images. The experimental results reveal that, compared with recent state-of-the-art retrieval algorithms, our approach achieves competitive accuracy performance with less memory overhead and efficient query computation. [ABSTRACT FROM PUBLISHER]
- Published
- 2018
- Full Text
- View/download PDF
22. Coherent Semantic-Visual Indexing for Large-Scale Image Retrieval in the Cloud.
- Author
-
Hong, Richang, Li, Lei, Cai, Junjie, Tao, Dapeng, Wang, Meng, and Tian, Qi
- Subjects
IMAGE retrieval ,CLOUD computing ,SEMANTICS ,BINARY codes ,MATHEMATICAL optimization - Abstract
The rapidly increasing number of images on the internet has further increased the need for efficient indexing for digital image searching of large databases. The design of a cloud service that provides high efficiency but compact image indexing remains challenging, partly due to the well-known semantic gap between user queries and the rich semantics of large-scale data sets. In this paper, we construct a novel joint semantic-visual space by leveraging visual descriptors and semantic attributes, which narrows the semantic gap by combining both attributes and indexing into a single framework. Such a joint space embraces the flexibility of coherent semantic-visual indexing, which employs binary codes to boost retrieval speed while maintaining accuracy. To solve the proposed model, we make the following contributions. First, we propose an interactive optimization method to find the joint semantic and visual descriptor space. Second, we prove convergence of our optimization algorithm, which guarantees a good solution after a certain number of iterations. Third, we integrate the semantic-visual joint space system with spectral hashing, which finds an efficient solution to search up to billion-scale data sets. Finally, we design an online cloud service to provide a more efficient online multimedia service. Experiments on two standard retrieval datasets (i.e., Holidays1M, Oxford5K) show that the proposed method is promising compared with the current state-of-the-art and that the cloud system significantly improves performance. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
23. Codebook Guided Feature-Preserving for Recognition-Oriented Image Retargeting.
- Author
-
Yan, Bo, Tan, Weimin, Li, Ke, and Tian, Qi
- Subjects
IMAGE processing ,IMAGE registration ,IMAGE recognition (Computer vision) ,IMAGE retrieval ,BIG data - Abstract
Traditional image resizing methods, such as uniform scaling and content-aware image retargeting, are designed to preserve the visually salient contents of an image while resizing it. In this paper, we propose a novel image resizing approach called recognition-oriented image retargeting. Its goal is to preserve the distinctive local features for recognition instead of the traditional visual saliency during resizing. Moreover, we also apply our approach to image matching and image retrieval applications to verify its performance. Applying our approach to these applications also addresses some of the challenging problems in those fields. In the image matching application, we find that our approach shows promising preservation of local feature descriptors. In the image retrieval task, extensive experiments on Oxford5K, Holidays, Paris, and Flickr100k data sets demonstrate that our approach consistently outperforms other image retargeting methods by large margins in terms of retrieval precision and query bits. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
- Full Text
- View/download PDF
24. Democratic Diffusion Aggregation for Image Retrieval.
- Author
-
Gao, Zhanning, Xue, Jianru, Zhou, Wengang, Pang, Shanmin, and Tian, Qi
- Abstract
Content-based image retrieval is an important research topic in the multimedia field. In large-scale image search using local features, image features are encoded and aggregated into a compact vector to avoid indexing each feature individually. In the aggregation step, sum-aggregation is widely used in many existing works and demonstrates promising performance. However, it is based on a strong and implicit assumption that the local descriptors of an image are identically and independently distributed in the descriptor space and the image plane. To address this problem, we propose a new aggregation method named democratic diffusion aggregation (DDA) with weak spatial context embedded. The main idea of our aggregation method is to re-weight the embedded vectors before sum-aggregation by considering the relevance among local descriptors. Different from previous work, by conducting a diffusion process on the improved kernel matrix, we calculate the weighting coefficients more efficiently without any iterative optimization. Besides considering the relevance of local descriptors from different images, we also discuss an efficient query fusion strategy which uses the initial top-ranked image vectors to enhance the retrieval performance. Experimental results show that our aggregation method exhibits much higher efficiency (about 14× faster) and better retrieval accuracy compared with previous methods, and the query fusion strategy consistently improves the retrieval quality. [ABSTRACT FROM PUBLISHER] (A minimal code sketch of re-weighted sum-aggregation follows this entry.)
- Published
- 2016
- Full Text
- View/download PDF
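The abstract above centers on re-weighting embedded local descriptors before sum-aggregation. The sketch below illustrates where such re-weighting enters relative to plain sum-aggregation; the inverse-total-similarity weights are a simple stand-in heuristic, not the diffusion-based coefficients of DDA, and all data are toy values.

```python
# Minimal sketch: re-weight local descriptors, then sum-aggregate to one image vector.
import numpy as np

def aggregate(descriptors: np.ndarray) -> np.ndarray:
    """Re-weight L2-normalized descriptors, then sum-aggregate to a single vector."""
    X = descriptors / np.linalg.norm(descriptors, axis=1, keepdims=True)
    K = np.maximum(X @ X.T, 0.0)                         # non-negative similarity (kernel) matrix
    weights = 1.0 / np.maximum(K.sum(axis=1), 1e-12)     # bursty descriptors get down-weighted
    agg = (weights[:, None] * X).sum(axis=0)             # weighted sum-aggregation
    return agg / np.linalg.norm(agg)

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    local_feats = rng.normal(size=(300, 128))            # e.g. 300 SIFT-like descriptors
    print(aggregate(local_feats).shape)                  # one compact image vector
```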
25. Scalable Feature Matching by Dual Cascaded Scalar Quantization for Image Retrieval.
- Author
-
Zhou, Wengang, Yang, Ming, Wang, Xiaoyu, Li, Houqiang, Lin, Yuanqing, and Tian, Qi
- Subjects
IMAGE retrieval ,FEATURE extraction ,PATTERN matching ,NEAREST neighbor analysis (Statistics) ,VECTOR analysis - Abstract
In this paper, we investigate the problem of scalable visual feature matching in large-scale image search and propose a novel cascaded scalar quantization scheme in dual resolution. We formulate the visual feature matching as a range-based neighbor search problem and approach it by identifying hyper-cubes with a dual-resolution scalar quantization strategy. Specifically, for each dimension of the PCA-transformed feature, scalar quantization is performed at both coarse and fine resolutions. The scalar quantization results at the coarse resolution are cascaded over multiple dimensions to index an image database. The scalar quantization results over multiple dimensions at the fine resolution are concatenated into a binary super-vector and stored into the index list for efficient verification. The proposed cascaded scalar quantization (CSQ) method is free of the costly visual codebook training and thus is independent of any image descriptor training set. The index structure of the CSQ is flexible enough to accommodate new image features and scalable to index large-scale image database. We evaluate our approach on the public benchmark datasets for large-scale image retrieval. Experimental results demonstrate the competitive retrieval performance of the proposed method compared with several recent retrieval algorithms on feature quantization. [ABSTRACT FROM PUBLISHER]
- Published
- 2016
- Full Text
- View/download PDF
26. Semantic-Aware Co-Indexing for Image Retrieval.
- Author
-
Zhang, Shiliang, Yang, Ming, Wang, Xiaoyu, Lin, Yuanqing, and Tian, Qi
- Subjects
IMAGE retrieval ,INFORMATION storage & retrieval systems ,MATHEMATICAL models ,SEMANTICS ,SUBJECT headings ,VOCABULARY - Abstract
In content-based image retrieval, inverted indexes allow fast access to database images and summarize all knowledge about the database. Indexing multiple clues of image contents allows retrieval algorithms to search for relevant images from different perspectives, which is appealing for delivering satisfactory user experiences. However, when incorporating diverse image features during online retrieval, it is challenging to ensure retrieval efficiency and scalability. In this paper, for large-scale image retrieval, we propose a semantic-aware co-indexing algorithm to jointly embed two strong cues into the inverted indexes: 1) local invariant features that are robust to delineate low-level image contents, and 2) semantic attributes from large-scale object recognition that may reveal image semantic meanings. Specifically, for an initial set of inverted indexes of local features, we utilize semantic attributes to filter out isolated images and insert semantically similar images into this initial set. Encoding these two distinct and complementary cues together effectively enhances the discriminative capability of inverted indexes. Such co-indexing operations are totally off-line and introduce small computation overhead to online retrieval, because only local features but no semantic attributes are employed for the query. Hence, this co-indexing is different from existing image retrieval methods fusing multiple features or retrieval results. Extensive experiments and comparisons with recent retrieval methods manifest the competitive performance of our method. [ABSTRACT FROM PUBLISHER]
- Published
- 2015
- Full Text
- View/download PDF
27. Polar Embedding for Aurora Image Retrieval.
- Author
-
Yang, Xi, Gao, Xinbo, and Tian, Qi
- Subjects
IMAGE retrieval ,EMBEDDING theorems ,DATA visualization ,FEATURE extraction ,SIGNAL quantization - Abstract
Exploring multimedia techniques to assist scientists in their research is an interesting and meaningful topic. In this paper, we focus on large-scale aurora image retrieval by leveraging the bag-of-visual words (BoVW) framework. To refine the unsuitable representation and improve the retrieval performance, the BoVW model is modified by embedding the polar information. The superiority of the proposed polar embedding method lies in two aspects. On the one hand, the polar meshing scheme is conducted to determine the interest points, which is more suitable for images captured by circular fisheye lens. Especially for the aurora image, the extracted polar scale-invariant feature transform (polar-SIFT) feature can also reflect the geomagnetic longitude and latitude, and thus facilitates further data analysis. On the other hand, a binary polar deep local binary pattern (polar-DLBP) descriptor is proposed to enhance the discriminative power of visual words. Together with the 64-bit polar-SIFT code obtained via Hamming embedding, multi-feature indexing is performed to reduce the impact of false positive matches. Extensive experiments are conducted on the large-scale aurora image data set. The experimental results indicate that the proposed method improves the retrieval accuracy significantly with acceptable efficiency and memory cost. In addition, the effectiveness of the polar-SIFT scheme and the polar-DLBP integration are demonstrated separately. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
28. Cross Indexing With Grouplets.
- Author
-
Zhang, Shiliang, Wang, Xiaoyu, Lin, Yuanqing, and Tian, Qi
- Abstract
Most of the current image indexing systems for retrieval view a database as a set of individual images. It limits the flexibility of the retrieval framework to conduct sophisticated cross-image analysis, resulting in higher memory consumption and sub-optimal retrieval accuracy. To conquer this issue, we propose cross indexing with grouplets, where the core idea is to view the database images as a set of grouplets, each of which is defined as a group of highly relevant images. Because a grouplet groups similar images together, the number of grouplets is smaller than the number of images, thus naturally leading to less memory cost. Moreover, the definition of a grouplet could be based on customized relations, allowing for seamless integration of advanced image features and data mining techniques like the deep convolutional neural network (DCNN) in off-line indexing. To validate the proposed framework, we construct three different types of grouplets, which are respectively based on local similarity, regional relation, and global semantic modeling. Extensive experiments on public benchmark datasets demonstrate the efficiency and superior performance of our approach. [ABSTRACT FROM PUBLISHER]
- Published
- 2015
- Full Text
- View/download PDF
29. Fast Image Retrieval: Query Pruning and Early Termination.
- Author
-
Zheng, Liang, Wang, Shengjin, Liu, Ziqiong, and Tian, Qi
- Abstract
Efficiency is of great importance for image retrieval systems. To address this pragmatic issue, this paper proposes a fast image retrieval framework to speed up the online retrieval process. To this end, an impact score for local features is proposed in the first place, which considers multiple properties of a local feature, including TF-IDF, scale, saliency, and ambiguity. Then, to decrease memory consumption, the impact score is quantized to an integer, which leads to a novel inverted index organization, called Q-Index. Importantly, based on the impact score, two closely complementary strategies are introduced: query pruning and early termination. On one hand, query pruning discards less important features in the query. On the other hand, early termination visits only indexed features with high impact scores, resulting in a partial traversal of the inverted index. Our approach is tested on two benchmark datasets augmented with an additional 1 million images that serve as negative examples. Compared with full traversal of the inverted index, we show that our system is capable of visiting less than 10% of the “should-visit” postings, thus achieving a significant speed-up in query time while providing competitive retrieval accuracy. [ABSTRACT FROM PUBLISHER] (A minimal code sketch of query pruning and early termination over an inverted index follows this entry.)
- Published
- 2015
- Full Text
- View/download PDF
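The abstract above describes pruning query features and terminating posting-list traversal early based on per-feature impact scores. Below is a minimal sketch of those two strategies over a toy inverted index; the impact values, thresholds, and scoring rule are illustrative assumptions, not the paper's TF-IDF/scale/saliency formulation.

```python
# Minimal sketch: query pruning + early termination over an impact-sorted inverted index.
from collections import defaultdict

# inverted index: visual word id -> list of (image_id, impact) sorted by impact, descending
index = defaultdict(list)
postings = {
    1: [("imgA", 9), ("imgB", 4), ("imgC", 1)],
    2: [("imgB", 7), ("imgD", 2)],
    3: [("imgA", 8), ("imgD", 6), ("imgE", 1)],
}
for word, plist in postings.items():
    index[word] = sorted(plist, key=lambda p: -p[1])

def search(query_features, keep_top=2, min_impact=3):
    """query_features: list of (visual_word_id, query_impact)."""
    # Query pruning: keep only the most important query features.
    kept = sorted(query_features, key=lambda f: -f[1])[:keep_top]
    scores = defaultdict(int)
    for word, q_imp in kept:
        for image_id, impact in index[word]:
            if impact < min_impact:       # early termination: postings are sorted,
                break                     # so the rest of this list can be skipped
            scores[image_id] += q_imp * impact
    return sorted(scores.items(), key=lambda kv: -kv[1])

if __name__ == "__main__":
    print(search([(1, 5), (2, 1), (3, 4)]))
```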
30. Multi-order visual phrase for scalable partial-duplicate visual search.
- Author
-
Zhang, Shiliang, Tian, Qi, Huang, Qingming, Gao, Wen, and Rui, Yong
- Subjects
IMAGE storage & retrieval systems, INFORMATION storage & retrieval systems, IMAGE retrieval, IMAGE databases, MULTIMEDIA systems - Abstract
A visual phrase considers multiple visual words and captures extra spatial clues among them. Thus, the visual phrase shows better discriminative power than a single visual word in image retrieval and matching. Notwithstanding their success, existing visual phrases still show obvious shortcomings: (1) limited flexibility, i.e., visual phrases are considered for matching only if they contain the same number of visual words; (2) large quantization error and low repeatability, i.e., quantization errors in visual words are aggregated in visual word combinations and visual phrases, making them harder to match than single visual words. To avoid these issues, we propose the multi-order visual phrase (MVP), which contains two complementary clues: the center visual word quantized from the local descriptor of each image keypoint, and the visual and spatial clues of multiple nearby keypoints. Two MVPs are flexibly matched by first matching their center visual words, then estimating a match confidence by checking the spatial and visual consistency of their neighbor keypoints. Therefore, center visual word matching is equivalent to traditional visual word matching, but checking the neighbor spatial and visual clues significantly boosts the discriminative power. MVP does not sacrifice the repeatability of the single visual word and is more robust to quantization error than existing visual phrases. We test our approach in three image retrieval tasks on UKbench, Oxford5K, and 1 million distractor images collected from Flickr. Comparisons with recent retrieval approaches and existing visual phrase features clearly demonstrate the competitive accuracy and significantly better efficiency of MVP. [ABSTRACT FROM AUTHOR] (A minimal code sketch of the MVP matching rule follows this entry.)
- Published
- 2015
- Full Text
- View/download PDF
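The abstract above matches two keypoints by first requiring their center visual words to agree and then scoring the consistency of their neighbor keypoints. The sketch below reduces that idea to its simplest form: the spatial-consistency check is collapsed into neighbor-word set overlap, which is an assumption for illustration rather than the paper's actual confidence estimate.

```python
# Minimal sketch: match center visual words first, then score neighbor-word consistency.
from dataclasses import dataclass

@dataclass
class MVP:
    center_word: int           # visual word of the keypoint itself
    neighbor_words: frozenset  # visual words of nearby keypoints

def match_confidence(a: MVP, b: MVP) -> float:
    """0.0 if center words differ, else neighbor-word Jaccard overlap."""
    if a.center_word != b.center_word:
        return 0.0
    union = a.neighbor_words | b.neighbor_words
    if not union:
        return 1.0             # centers agree and there is nothing to contradict
    return len(a.neighbor_words & b.neighbor_words) / len(union)

if __name__ == "__main__":
    p = MVP(42, frozenset({7, 13, 99}))
    q = MVP(42, frozenset({7, 13, 55}))
    r = MVP(17, frozenset({7, 13, 99}))
    print(match_confidence(p, q))   # centers agree, partial neighbor overlap
    print(match_confidence(p, r))   # centers differ -> 0.0
```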
31. Saliency-Aware Nonparametric Foreground Annotation Based on Weakly Labeled Data.
- Author
-
Cao, Xiaochun, Zhang, Changqing, Fu, Huazhu, Guo, Xiaojie, and Tian, Qi
- Subjects
IMAGE processing ,MATHEMATICAL models ,NONPARAMETRIC estimation ,MATHEMATICAL statistics ,IMAGE databases ,IMAGE storage & retrieval systems - Abstract
In this paper, we focus on annotating the foreground of an image. More precisely, we predict both image-level labels (category labels) and object-level labels (locations) for objects within a target image in a unified framework. Traditional learning-based image annotation approaches are cumbersome, because they need to establish complex mathematical models and be frequently updated as the scale of training data varies considerably. Thus, we advocate the nonparametric method, which has shown potential in numerous applications and turned out to be attractive thanks to its advantages, i.e., lightweight training load and scalability. In particular, we exploit the salient object windows to describe images, which is beneficial to image retrieval and, thus, the subsequent image-level annotation and localization tasks. Our method, namely, saliency-aware nonparametric foreground annotation, practically alleviates the full-label requirement on training data and effectively addresses the problem of foreground annotation. The proposed method only relies on retrieval results from the image database, while pretrained object detectors are no longer necessary. Experimental results on the challenging PASCAL VOC 2007 and PASCAL VOC 2008 demonstrate the advantages of our method. [ABSTRACT FROM PUBLISHER]
- Published
- 2016
- Full Text
- View/download PDF
32. Encoding Spatial Context for Large-Scale Partial-Duplicate Web Image Retrieval.
- Author
-
Zhou, Wen-Gang, Li, Hou-Qiang, Lu, Yijuan, and Tian, Qi
- Subjects
SPATIAL analysis (Statistics) ,IMAGE retrieval ,SCALE invariance (Statistical physics) ,AFFINE geometry ,IMAGE databases - Abstract
Many recent state-of-the-art image retrieval approaches are based on the Bag-of-Visual-Words model and represent an image with a set of visual words by quantizing local SIFT (scale invariant feature transform) features. Feature quantization reduces the discriminative power of local features and unavoidably causes many false local matches between images, which degrades the retrieval accuracy. To filter those false matches, geometric context among visual words has been widely explored for the verification of geometric consistency. However, existing studies with global or local geometric verification are either computationally expensive or achieve limited accuracy. To address this issue, in this paper, we focus on partial-duplicate Web image retrieval, and propose a scheme to encode the spatial context for visual matching verification. An efficient affine enhancement scheme is proposed to refine the verification results. Experiments on partial-duplicate Web image search, using a database of one million images, demonstrate the effectiveness and efficiency of the proposed approach. Evaluation on a 10-million image database further reveals the scalability of our approach. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
33. Fast and accurate near-duplicate image search with affinity propagation on the ImageWeb.
- Author
-
Xie, Lingxi, Tian, Qi, Zhou, Wengang, and Zhang, Bo
- Subjects
IMAGE processing ,COMPUTER algorithms ,IMAGE retrieval ,IMAGE quality analysis ,COMPUTER storage devices - Abstract
Highlights:
- We propose an efficient system for large-scale image search.
- We adopt document retrieval algorithms to improve the image search quality.
- We study the tradeoff strategy in search process to accelerate our algorithm.
- The proposed algorithm achieves the state-of-the-art search performance.
- The time and memory complexity of the algorithm is very low.
[Copyright Elsevier]
- Published
- 2014
- Full Text
- View/download PDF
34. Cascade Category-Aware Visual Search.
- Author
-
Zhang, Shiliang, Tian, Qi, Huang, Qingming, Gao, Wen, and Rui, Yong
- Subjects
SEARCH algorithms, PROBLEM solving, COMPUTER systems, SEMANTICS, INFORMATION retrieval, SYSTEMS design - Abstract
Incorporating image classification into an image retrieval system brings many attractive advantages. For instance, the search space can be narrowed down by rejecting images in irrelevant categories of the query. The retrieved images can be more consistent in semantics by indexing and returning images in the relevant categories together. However, due to their different goals on recognition accuracy and retrieval scalability, it is hard to efficiently incorporate most image classification works into large-scale image search. To study this problem, we propose cascade category-aware visual search, which utilizes a weak category clue to achieve better retrieval accuracy, efficiency, and memory consumption. To capture the category and visual clues of an image, we first learn category-visual words, which are discriminative and repeatable local features labeled with categories. By identifying category-visual words in database images, we are able to discard noisy local features and extract image visual and category clues, which are hence recorded in a hierarchical index structure. Our retrieval system narrows down the search space by: 1) filtering the noisy local features in the query; 2) rejecting irrelevant categories in the database; and 3) performing discriminative visual search in relevant categories. The proposed algorithm is tested on object search, landmark search, and large-scale similar image search on the large-scale LSVRC10 data set. Although the category clue introduced is weak, our algorithm still shows substantial advantages in retrieval accuracy, efficiency, and memory consumption over the state-of-the-art. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
35. ObjectPatchNet: Towards scalable and semantic image annotation and retrieval.
- Author
-
Zhang, Shiliang, Tian, Qi, Hua, Gang, Huang, Qingming, and Gao, Wen
- Subjects
SEMANTICS ,IMAGE retrieval ,IMAGE analysis ,INTERNET ,LARGE scale systems ,APPLICATION software - Abstract
Highlights:
- We build ObjectPatchNet (OPN) from large-scale loosely annotated Internet images.
- OPN preserves region-level semantic labels and contextual clues for image annotation.
- OPN could be utilized as visual words with semantics for scalable image retrieval.
- Though not exhaustively tuned, OPN performs decently in these two applications.
- OPN could be an open platform for different applications and further improvements.
[Copyright Elsevier]
- Published
- 2014
- Full Text
- View/download PDF
36. Image Annotation by Input–Output Structural Grouping Sparsity.
- Author
-
Han, Yahong, Wu, Fei, Tian, Qi, and Zhuang, Yueting
- Subjects
THREE-dimensional imaging ,FEATURE extraction ,STATISTICAL correlation ,SEMANTICS ,FEATURE selection ,ANNOTATIONS ,IMAGE retrieval ,IMAGE processing - Abstract
Automatic image annotation (AIA) is very important to image retrieval and image understanding. Two key issues in AIA are explored in detail in this paper, i.e., structured visual feature selection and the implementation of hierarchical correlated structures among multiple tags to boost the performance of image annotation. This paper simultaneously introduces an input and output structural grouping sparsity into a regularized regression model for image annotation. For input high-dimensional heterogeneous features such as color, texture, and shape, different kinds (groups) of features have different intrinsic discriminative power for the recognition of certain concepts. The proposed structured feature selection by structural grouping sparsity can be used not only to select group-of-features but also to conduct within-group selection. Hierarchical correlations among output labels are well represented by a tree structure, and therefore, the proposed tree-structured grouping sparsity can be used to boost the performance of multitag image annotation. In order to efficiently solve the proposed regression model, we relax the solving process as a framework of the bilayer regression model for multilabel boosting by the selection of heterogeneous features with structural grouping sparsity (Bi-MtBGS). The first-layer regression is to select the discriminative features for each label. The aim of the second-layer regression is to refine the feature selection model learned from the first layer, which can be taken as a multilabel boosting process. Extensive experiments on public benchmark image data sets and real-world image data sets demonstrate that the proposed approach has better performance of multitag image annotation and leads to a quite interpretable model for image understanding. [ABSTRACT FROM AUTHOR]
- Published
- 2012
- Full Text
- View/download PDF
37. Task-Dependent Visual-Codebook Compression.
- Author
-
Ji, Rongrong, Yao, Hongxun, Liu, Wei, Sun, Xiaoshuai, and Tian, Qi
- Subjects
IMAGE compression ,DIGITAL image processing ,COMPUTER vision ,FEATURE extraction ,REGRESSION analysis ,PATTERN recognition systems ,IMAGE retrieval ,DATA dictionaries ,SEMANTICS - Abstract
A visual codebook serves as a fundamental component in many state-of-the-art computer vision systems. Most existing codebooks are built based on quantizing local feature descriptors extracted from training images. Subsequently, each image is represented as a high-dimensional bag-of-words histogram. Such a highly redundant image description lacks efficiency in both storage and retrieval, in which only a few bins are nonzero and distributed sparsely. Furthermore, most existing codebooks are built based solely on the visual statistics of local descriptors, without considering the supervision labels coming from the subsequent recognition or classification tasks. In this paper, we propose a task-dependent codebook compression framework to handle the above two problems. First, we propose to learn a compression function to map an originally high-dimensional codebook into a compact codebook while maintaining its visual discriminability. This is achieved by a codeword sparse coding scheme with Lasso regression, which minimizes the descriptor distortions of training images after codebook compression. Second, we propose to adapt our codebook compression to the subsequent recognition or classification tasks. This is achieved by introducing a label constraint kernel (LCK) into our compression loss function. In particular, our LCK can model heterogeneous kinds of supervision, i.e., (partial) category labels, correlative semantic annotations, and image query logs. We validated our codebook compression in three computer vision tasks: 1) object recognition in PASCAL Visual Object Class 07; 2) near-duplicate image retrieval in UKBench; and 3) web image search in a collection of 0.5 million Flickr photographs. Our compressed codebook has shown superior performance over several state-of-the-art supervised and unsupervised codebooks. [ABSTRACT FROM AUTHOR]
- Published
- 2012
- Full Text
- View/download PDF
38. Generating Descriptive Visual Words and Visual Phrases for Large-Scale Image Applications.
- Author
-
Zhang, Shiliang, Tian, Qi, Hua, Gang, Huang, Qingming, and Gao, Wen
- Subjects
IMAGE retrieval, COMPUTER vision, MULTIMEDIA systems, DATABASES, SEARCH algorithms, MATERIALS, INFORMATION retrieval - Abstract
Bag-of-visual Words (BoWs) representation has been applied to various problems in the fields of multimedia and computer vision. The basic idea is to represent images as visual documents composed of repeatable and distinctive visual elements, which are comparable to the text words. Notwithstanding its great success and wide adoption, a visual vocabulary created from single-image local descriptors is often shown to be not as effective as desired. In this paper, descriptive visual words (DVWs) and descriptive visual phrases (DVPs) are proposed as the visual correspondences to text words and phrases, where visual phrases refer to the frequently co-occurring visual word pairs. Since images are the carriers of visual objects and scenes, a descriptive visual element set can be composed by the visual words and their combinations which are effective in representing certain visual objects or scenes. Based on this idea, a general framework is proposed for generating DVWs and DVPs for image applications. In a large-scale image database containing 1506 object and scene categories, the visual words and visual word pairs descriptive to certain objects or scenes are identified and collected as the DVWs and DVPs. Experiments show that the DVWs and DVPs are informative and descriptive and, thus, are more comparable with the text words than the classic visual words. We apply the identified DVWs and DVPs in several applications including large-scale near-duplicated image retrieval, image search re-ranking, and object recognition. The combination of DVW and DVP performs better than the state of the art in large-scale near-duplicated image retrieval in terms of accuracy, efficiency and memory consumption. The proposed image search re-ranking algorithm, DWPRank, outperforms the state-of-the-art algorithm by 12.4% in mean average precision and is about 11 times faster. [ABSTRACT FROM PUBLISHER]
- Published
- 2011
- Full Text
- View/download PDF
39. Modeling spatial and semantic cues for large-scale near-duplicated image retrieval.
- Author
-
Zhang, Shiliang, Tian, Qi, Hua, Gang, Zhou, Wengang, Huang, Qingming, Li, Houqiang, and Gao, Wen
- Subjects
IMAGE retrieval ,SEMANTICS ,IMAGE processing ,COMPUTER vision ,FEATURE extraction ,CLUSTER analysis (Statistics) - Abstract
Bag-of-visual Words (BoW) image representation has been shown to be one of the most promising solutions for large-scale near-duplicated image retrieval. However, the traditional visual vocabulary is created in an unsupervised way by clustering a large number of image local features. This is not ideal because it largely ignores the semantic and spatial contexts between local features. In this paper, we propose the geometric visual vocabulary which captures the spatial contexts by quantizing local features in bi-space, i.e., in descriptor space and orientation space. Then, we propose to capture the semantic context by learning a semantic-aware distance metric between local features, which could reasonably measure the semantic similarities between image patches, from which the local features are extracted. The learned distance is hence utilized to cluster the local features for semantic visual vocabulary generation. Finally, we combine the spatial and semantic contexts in a unified framework by extracting local feature groups, computing the spatial configurations between the local features inside the group, and learning a semantic-aware distance between groups. The learned group distance is then utilized to cluster the extracted local feature groups to generate a novel visual vocabulary, i.e., the contextual visual vocabulary. The proposed visual vocabularies, i.e., the geometric visual vocabulary, the semantic visual vocabulary and the contextual visual vocabulary, are tested in large-scale near-duplicated image retrieval applications. The geometric visual vocabulary and semantic visual vocabulary achieve better performance than the traditional visual vocabulary. Moreover, the contextual visual vocabulary, which combines both spatial and semantic clues, outperforms the state-of-the-art bundled feature in both retrieval precision and efficiency. [Copyright Elsevier]
- Published
- 2011
- Full Text
- View/download PDF
40. Building descriptive and discriminative visual codebook for large-scale image applications.
- Author
-
Tian, Qi, Zhang, Shiliang, Zhou, Wengang, Ji, Rongrong, Ni, Bingbing, and Sebe, Nicu
- Subjects
IMAGE retrieval ,SEARCH algorithms ,GEOMETRIC quantization ,VISUAL acuity ,VISUAL analytics ,VISUAL communication ,CONTENT analysis ,SCALABILITY ,DESCRIPTIVE cataloging ,DESCRIPTIVE geometry - Abstract
Inspired by the success of textual words in large-scale textual information processing, researchers are trying to extract visual words from images which function similarly to textual words. Visual words are commonly generated by clustering a large number of image local features, and the cluster centers are taken as visual words. This approach is simple and scalable, but results in noisy visual words. Many works have been reported that try to improve the descriptive and discriminative ability of visual words. This paper gives a comprehensive survey on visual vocabulary and details several state-of-the-art algorithms. A comprehensive review and summarization of the related works on visual vocabulary is first presented. Then, we introduce our recent algorithms on descriptive and discriminative visual word generation, i.e., latent visual context analysis for descriptive visual word identification [], descriptive visual words and visual phrases generation [], contextual visual vocabulary which combines both semantic contexts and spatial contexts [], and visual vocabulary hierarchy optimization []. Additionally, we introduce two interesting post-processing strategies to further improve the performance of visual vocabulary, i.e., spatial coding [] is proposed to efficiently remove the mismatched visual words between images for more reasonable image similarity computation; user preference based visual word weighting [] is developed to make the image similarity computed based on visual words more consistent with users' preferences or habits. [ABSTRACT FROM AUTHOR]
- Published
- 2011
- Full Text
- View/download PDF
41. Fast large-scale object retrieval with binary quantization.
- Author
-
Zhou, Shifu, Zeng, Dan, Shen, Wei, Zhang, Zhijiang, and Tian, Qi
- Subjects
IMAGE databases ,IMAGE retrieval ,SIGNAL quantization ,DIGITAL image processing ,SIGNAL processing - Abstract
The objective of large-scale object retrieval systems is to search for images that contain the target object in an image database. Whereas state-of-the-art approaches rely on global image representations to conduct searches, we consider many boxes per image as candidates for searching locally within a picture. In this paper, a feature quantization algorithm called binary quantization is proposed. In binary quantization, a scale-invariant feature transform (SIFT) feature is quantized into a descriptive and discriminative bit-vector, which adapts naturally to the classic inverted file structure for box indexing. The inverted file, which stores the bit-vector and the ID of the box in which the SIFT feature is located, is compact and can be loaded into the main memory for efficient box indexing. We evaluate our approach on available object retrieval datasets. Experimental results demonstrate that the proposed approach is fast and achieves excellent search quality. Therefore, the proposed approach is an improvement over state-of-the-art approaches for object retrieval. [ABSTRACT FROM AUTHOR] (A minimal code sketch of indexing and querying bit-vectors in an inverted file follows this entry.)
- Published
- 2015
- Full Text
- View/download PDF
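The abstract above quantizes each SIFT feature into a bit-vector and stores it, together with a box ID, in an inverted file. The sketch below shows that indexing and lookup pattern under several stated assumptions that are not taken from the paper: the bit-vector comes from a random sign projection, the first 8 bits act as the inverted-file key, and the remaining bits are verified by Hamming distance.

```python
# Minimal sketch: bit-vector quantization plus an inverted file keyed on a coarse prefix.
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(4)
P = rng.normal(size=(128, 64))                  # projection: 128-D SIFT -> 64 bits (assumption)

def binary_quantize(sift: np.ndarray) -> np.ndarray:
    return (sift @ P > 0).astype(np.uint8)      # descriptive 64-bit vector

index = defaultdict(list)                        # key -> [(box_id, residual_bits)]

def add(sift: np.ndarray, box_id: int) -> None:
    bits = binary_quantize(sift)
    key = tuple(bits[:8])                        # coarse key for the inverted file
    index[key].append((box_id, bits[8:]))

def query(sift: np.ndarray, max_hamming: int = 10):
    bits = binary_quantize(sift)
    key, residual = tuple(bits[:8]), bits[8:]
    hits = []
    for box_id, stored in index[key]:
        if int(np.sum(stored != residual)) <= max_hamming:   # Hamming verification
            hits.append(box_id)
    return hits

if __name__ == "__main__":
    db = rng.normal(size=(1000, 128))
    for i, feat in enumerate(db):
        add(feat, box_id=i)
    probe = db[123] + 0.05 * rng.normal(size=128)   # noisy copy of a stored feature
    print(query(probe))                              # should usually include box 123
```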
42. Content-based image retrieval of multiphase CT images for focal liver lesion characterization.
- Author
-
Chi, Yanling, Zhou, Jiayin, Venkatesh, Sudhakar K., Tian, Qi, and Liu, Jimin
- Subjects
COMPUTED tomography ,IMAGE retrieval ,LIVER disease diagnosis ,MEDICAL practice ,RADIOLOGISTS ,IMAGE reconstruction ,MEDICAL databases - Abstract
Purpose: Characterization of focal liver lesions with various imaging modalities can be very challenging in the clinical practice and is experience-dependent. The authors' aim is to develop an automatic method to facilitate the characterization of focal liver lesions (FLLs) using multiphase computed tomography (CT) images by radiologists. Methods: A multiphase-image retrieval system is proposed to retrieve a preconstructed database of FLLs with confirmed diagnoses, which can assist radiologists' decision-making in FLL characterization. It first localizes the FLL on multiphase CT scans using a hybrid generative-discriminative FLL detection method and a nonrigid B-spline registration method. Then, it extracts the multiphase density and texture features to numerically represent the FLL. Next, it compares the query FLL with the model FLLs in the database in terms of the feature and measures their similarities using the L1-norm based similarity scores. The model FLLs are ranked by similarities and the top results are finally provided to the users for their evidence studies. Results: The system was tested on a database of 69 four-phase contrast-enhanced CT scans, consisting of six classes of liver lesions, and evaluated in terms of the precision-recall curve and the Bull's Eye Percentage Score (BEP). It obtained a BEP score of 78%. Compared with any single-phase based representation, the multiphase-based representation increased the BEP scores of the system, from 63%-65% to 78%. In a pilot study, two radiologists performed characterization of FLLs without and with the knowledge of the top five retrieved results. The results were evaluated in terms of the diagnostic accuracy, the receiver operating characteristic (ROC) curve and the mean diagnostic confidence. One radiologist's accuracy improved from 75% to 92%, the area under ROC curves (AUC) from 0.85 to 0.95 (p = 0.081), and the mean diagnostic confidence from 4.6 to 7.3 (p = 0.039). The second radiologist's accuracy did not change, at 75%, with AUC increasing from 0.72 to 0.75 (p = 0.709), and the mean confidence from 4.5 to 4.9 (p = 0.607). Conclusions: Multiphase CT images can be used in content-based image retrieval for FLL's categorization and result in good performance in comparison with single-phase CT images. The proposed method has the potential to improve the radiologists' diagnostic accuracy and confidence by providing visually similar lesions with confirmed diagnoses for their interpretation of clinical studies. [ABSTRACT FROM AUTHOR]
- Published
- 2013
- Full Text
- View/download PDF
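The retrieval step in the entry above ranks database lesions by L1-norm based similarity to the query's feature vector. The snippet below is a minimal sketch of that ranking step only; feature extraction, registration, and the lesion database are assumed to exist elsewhere, and the function names are illustrative.

```python
import numpy as np

def l1_similarity_scores(query_feat, db_feats):
    # Smaller L1 distance means higher similarity; use the negative distance as a score.
    return -np.abs(np.asarray(db_feats) - np.asarray(query_feat)).sum(axis=1)

def retrieve_top_k(query_feat, db_feats, db_labels, k=5):
    # Rank all database lesions by similarity and return the top-k with their scores.
    scores = l1_similarity_scores(query_feat, db_feats)
    order = np.argsort(-scores)[:k]
    return [(db_labels[i], float(scores[i])) for i in order]
```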
43. Fine-residual VLAD for image retrieval.
- Author
-
Liu, Ziqiong, Wang, Shengjin, and Tian, Qi
- Subjects
- *IMAGE retrieval, *CLUSTER analysis (Statistics), *DIMENSIONS, *CODING theory, *AGGREGATION operators - Abstract
This paper revisits the vector of locally aggregated descriptors (VLAD), which aggregates the residuals of local descriptors to their cluster centers. Since VLAD usually adopts a small codebook, the clusters are coarse and the residuals are not discriminative. To address this problem, this paper proposes to generate a number of residual codebooks derived from the original clusters. After quantizing local descriptors with these codebooks, we pool the resulting secondary residuals together with the primary ones to obtain the fine residuals. We show that, with this two-step aggregation, the fine-residual VLAD has the same dimension as the original. Experiments on two image search benchmarks confirm the improved discriminative power of our method: we observe consistent superiority over the baseline and competitive performance against the state of the art. [ABSTRACT FROM AUTHOR] (A minimal sketch of baseline VLAD aggregation follows this entry.)
- Published
- 2016
- Full Text
- View/download PDF
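The entry above builds on standard VLAD, which accumulates the residuals of local descriptors to their nearest cluster centers. The sketch below shows only that baseline first-step aggregation (with the usual power-law and L2 normalization) as an illustration; the paper's secondary residual codebooks and two-step pooling are not implemented here.

```python
import numpy as np

def vlad(descriptors, centers):
    # descriptors: (n, d) local features; centers: (k, d) codebook of cluster centers.
    descriptors = np.asarray(descriptors, dtype=float)
    centers = np.asarray(centers, dtype=float)
    k, d = centers.shape
    agg = np.zeros((k, d))
    # Assign each descriptor to its nearest center and accumulate the residual there.
    assign = np.argmin(((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(-1), axis=1)
    for x, c in zip(descriptors, assign):
        agg[c] += x - centers[c]
    v = agg.ravel()
    # Power-law and L2 normalization, as commonly used with VLAD.
    v = np.sign(v) * np.sqrt(np.abs(v))
    return v / (np.linalg.norm(v) + 1e-12)
```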
44. E2BoWs: An end-to-end Bag-of-Words model via deep convolutional neural network for image retrieval.
- Author
-
Liu, Xiaobin, Zhang, Shiliang, Huang, Tiejun, and Tian, Qi
- Subjects
- *CONVOLUTIONAL neural networks, *IMAGE retrieval, *DEEP learning, *MACHINE learning, *FEATURE extraction - Abstract
The traditional Bag-of-Words (BoWs) model is commonly generated in many steps, including local feature extraction, codebook generation, and feature quantization. These steps are relatively independent of one another and are hard to optimize jointly. Moreover, the dependence on hand-crafted local features keeps the BoWs model from conveying high-level semantics effectively. These issues largely hinder the performance of the BoWs model in large-scale image applications. To address them, we propose an End-to-End BoWs (E2BoWs) model based on a Deep Convolutional Neural Network (DCNN). Our model takes an image as input, identifies and separates the semantic objects in it, and finally outputs visual words with high semantic discriminative power. Specifically, the model first generates Semantic Feature Maps (SFMs) corresponding to different object categories through convolutional layers, and then introduces Bag-of-Words Layers (BoWL) to generate visual words from each individual feature map. We also introduce a novel learning algorithm to reinforce the sparsity of the generated E2BoWs model, which further ensures time and memory efficiency. We evaluate the proposed E2BoWs model on several image search datasets, including MNIST, SVHN, CIFAR-10, CIFAR-100, MIRFLICKR-25K, and NUS-WIDE. Experimental results show that our method achieves promising accuracy and efficiency compared with recent deep learning based retrieval works. [ABSTRACT FROM AUTHOR] (A loose sketch of the SFM-plus-BoW-layer idea follows this entry.)
- Published
- 2020
- Full Text
- View/download PDF
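The entry above describes convolutional layers that produce per-category Semantic Feature Maps, followed by a Bag-of-Words layer emitting sparse visual words. The PyTorch-style sketch below only mirrors that rough structure; the backbone, layer sizes, pooling, and the L1 sparsity penalty are all assumptions and do not reproduce the paper's architecture or training scheme.

```python
import torch
import torch.nn as nn

class E2BoWsSketch(nn.Module):
    def __init__(self, backbone_channels=512, num_categories=10, words_per_category=8):
        super().__init__()
        # One "semantic feature map" per category, from a 1x1 conv over backbone features.
        self.sfm = nn.Conv2d(backbone_channels, num_categories, kernel_size=1)
        # Bag-of-Words layer: each category response produces its own group of visual words.
        self.bowl = nn.Linear(num_categories, num_categories * words_per_category)

    def forward(self, feature_maps):
        maps = self.sfm(feature_maps)          # (B, num_categories, H, W)
        pooled = maps.mean(dim=(2, 3))         # global average pooling per category map
        return torch.relu(self.bowl(pooled))   # non-negative visual-word activations

def sparsity_penalty(words, weight=1e-3):
    # L1 penalty encouraging sparse visual words; a stand-in for the paper's scheme.
    return weight * words.abs().mean()
```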
45. Applying Visual User Interest Profiles for Recommendation and Personalisation
- Author
-
Zhou, Jiang, Albatal, Rami, Gurrin, Cathal, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tian, Qi, editor, Sebe, Nicu, editor, Qi, Guo-Jun, editor, Huet, Benoit, editor, Hong, Richang, editor, and Liu, Xueliang, editor
- Published
- 2016
- Full Text
- View/download PDF
46. Image Retrieval Using Color-Aware Tag on Progressive Image Search and Recommendation System
- Author
-
Ku, Shih-Yu, Chen, Kai-Hsiang, Huang, Jen-Wei, Tsao, Yu, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tian, Qi, editor, Sebe, Nicu, editor, Qi, Guo-Jun, editor, Huet, Benoit, editor, Hong, Richang, editor, and Liu, Xueliang, editor
- Published
- 2016
- Full Text
- View/download PDF
47. Consensus Guided Multiple Match Removal for Geometry Verification in Image Retrieval
- Author
-
Wu, Hong, Heng, Xing, Xu, Zenglin, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tian, Qi, editor, Sebe, Nicu, editor, Qi, Guo-Jun, editor, Huet, Benoit, editor, Hong, Richang, editor, and Liu, Xueliang, editor
- Published
- 2016
- Full Text
- View/download PDF
48. Robust Sketch-Based Image Retrieval by Saliency Detection
- Author
-
Zhang, Xiao, Chen, Xuejin, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tian, Qi, editor, Sebe, Nicu, editor, Qi, Guo-Jun, editor, Huet, Benoit, editor, Hong, Richang, editor, and Liu, Xueliang, editor
- Published
- 2016
- Full Text
- View/download PDF
49. Aggregating hierarchical binary activations for image retrieval.
- Author
-
Li, Ying, Kong, Xiangwei, Fu, Haiyan, and Tian, Qi
- Subjects
- *IMAGE retrieval, *QUANTIZATION (Physics), *DIFFUSION processes, *ARTIFICIAL neural networks, *NASH equilibrium, *DATA structures - Abstract
Highlights • We propose a simple yet effective quantization scheme to embed deep binary codes. • Activations from multiple CNN layers work together through weighted score fusion in the proposed framework. • The handcrafted SIFT local descriptor, a low-level feature, can also be incorporated into our fusion procedure. • A regularized diffusion process is applied to the ranking list so that the similarity estimates vary smoothly. • Extensive experiments are conducted on four public datasets, and state-of-the-art results are obtained on the Holidays and UKBench datasets. Abstract Convolutional Neural Networks (CNNs) have achieved breakthroughs on a large number of image retrieval benchmarks. However, most previous works use CNNs following the image classification strategy, where the last fully connected layer activations of the whole image serve as a single holistic feature vector. To improve the representation power of CNNs, this paper proposes a Multi-layer Fusion (MF) approach to aggregate deep activations for the image retrieval task. The key insight of our approach is that different layers of a CNN are sensitive to specific patterns and are complementary to each other for image representation. Specifically, our approach transforms CNN activations into deep binary codes embedded in the inverted index of a Bag-of-Words structure for fast retrieval. These activations are derived from multiple layers of a CNN on local patches, since features from orderless local regions have proved superior to global ones in the handcrafted low-level case. Layer-specific weights and a diffusion process are then used to penalize and re-rank the individual similarity scores of the layers. Our method is efficient, extracting visual features from the different layers only once. Furthermore, the proposed MF approach can easily be extended to include SIFT features to enhance the representation power. Extensive experiments on four public retrieval datasets quantitatively evaluate the effectiveness of our contributions, and the proposed algorithm proves to be the new state of the art on the Holidays and UKBench datasets. [ABSTRACT FROM AUTHOR] (A minimal sketch of weighted score fusion follows this entry.)
- Published
- 2018
- Full Text
- View/download PDF
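The entry above fuses per-layer similarity scores with layer weights before re-ranking. The sketch below shows a generic weighted late fusion over binary codes; Hamming similarity stands in for the paper's inverted-index lookup, and the layer weights, code lengths, and function names are illustrative assumptions.

```python
import numpy as np

def hamming_similarity(query_code, db_codes):
    # query_code: (n_bytes,) packed uint8 code; db_codes: (n_db, n_bytes) packed codes.
    diff = np.unpackbits(np.bitwise_xor(db_codes, query_code), axis=1)
    return 1.0 - diff.mean(axis=1)  # fraction of matching bits

def fuse_layer_scores(per_layer_scores, weights):
    # per_layer_scores: list of (n_db,) score arrays, one per CNN layer; weights sum to 1.
    fused = np.zeros_like(per_layer_scores[0], dtype=float)
    for scores, w in zip(per_layer_scores, weights):
        fused += w * scores
    return fused
```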
50. Improving context-sensitive similarity via smooth neighborhood for object retrieval.
- Author
-
Bai, Song, Sun, Shaoyan, Bai, Xiang, Zhang, Zhaoxiang, and Tian, Qi
- Subjects
- *IMAGE retrieval, *MANIFOLDS (Mathematics), *THREE-dimensional imaging, *ALGORITHMS, *MACHINE learning - Abstract
Owing to its ability to capture the geometric structure of the data manifold, context-sensitive similarity has demonstrated impressive performance in retrieval tasks. The key idea of context-sensitive similarity is that the similarity between two data points can be estimated more reliably using the local context of other points in the affinity graph. Neighborhood selection is therefore a crucial factor for these algorithms and affects performance dramatically. In this paper, we propose a new algorithm called Smooth Neighborhood (SN) that mines the neighborhood structure to satisfy the manifold assumption. In this way, nearby points on the underlying manifold are encouraged to yield similar neighbor sets as much as possible. Moreover, SN is adapted to handle multiple affinity graphs by imposing a weight-learning paradigm; this is the primary difference from related works, which are applicable to only one affinity graph. Finally, we integrate SN with Sparse Contextual Activation (SCA), a representative context-sensitive similarity measure proposed recently. Extensive experimental results and comparisons show that, with the neighborhood structure generated by SN, the proposed framework yields state-of-the-art performance on shape retrieval, image retrieval, and 3D model retrieval. [ABSTRACT FROM AUTHOR] (A minimal sketch of neighborhood-based contextual similarity follows this entry.)
- Published
- 2018
- Full Text
- View/download PDF
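The entry above re-estimates pairwise similarity from the local context of the affinity graph. The sketch below illustrates the general context-sensitive idea with a simple shared-nearest-neighbor (Jaccard) re-estimation; it is not the Smooth Neighborhood algorithm, and the neighborhood size and function names are assumptions.

```python
import numpy as np

def knn_sets(affinity, k):
    # affinity: (n, n) similarity matrix; return each point's k most similar neighbors.
    return [set(np.argsort(-affinity[i])[:k]) for i in range(len(affinity))]

def contextual_similarity(affinity, k=10):
    # Re-estimate similarity as the Jaccard overlap of the two points' neighbor sets.
    neighbors = knn_sets(np.asarray(affinity, dtype=float), k)
    n = len(neighbors)
    ctx = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            union = len(neighbors[i] | neighbors[j])
            ctx[i, j] = len(neighbors[i] & neighbors[j]) / union if union else 0.0
    return ctx
```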