In this paper we propose a large-scale high-dimensional indexing algorithm based on sparse approximation and inverted indexing. Our goal was to devise a method that smoothly scales to handle databases with over 100 million descriptors on a single machine. To meet this goal, we implemented an inverted indexed based on a sparsifying dictionary with l (0) regression to assign documents to buckets. The sparsifying dictionary is optimized to reduce the data dimensionality, by concentrating the energy of the original vector on a few coefficients of a higher dimensional representation. These descriptors are added to an inverted index explores the locality of the coefficients of sparse representations to enable efficient pruned search. Evaluation on four large-scale datasets with multiple types of features showed that our method compares favorably to state-of-the-art techniques. On a 100 million dataset of SIFT descriptors, our method achieved 47.6 % precision at 50, by inspecting only 1 % of the full dataset, and by using only 1/20 of the time of a linear search.
|Number of pages||24|
|Journal||Multimedia Tools And Applications|
|Publication status||Published - Nov 2017|
- Sparse approximation
- Multimedia indexing
- l(0) penalty