Large-scale high-dimensional indexing by sparse hashing with l 0 approximation

P. Borges, J. Magalhães

Research output: Contribution to journalArticlepeer-review

1 Citation (Scopus)


In this paper we propose a large-scale high-dimensional indexing algorithm based on sparse approximation and inverted indexing. Our goal was to devise a method that smoothly scales to handle databases with over 100 million descriptors on a single machine. To meet this goal, we implemented an inverted indexed based on a sparsifying dictionary with l (0) regression to assign documents to buckets. The sparsifying dictionary is optimized to reduce the data dimensionality, by concentrating the energy of the original vector on a few coefficients of a higher dimensional representation. These descriptors are added to an inverted index explores the locality of the coefficients of sparse representations to enable efficient pruned search. Evaluation on four large-scale datasets with multiple types of features showed that our method compares favorably to state-of-the-art techniques. On a 100 million dataset of SIFT descriptors, our method achieved 47.6 % precision at 50, by inspecting only 1 % of the full dataset, and by using only 1/20 of the time of a linear search.
Original languageEnglish
Pages (from-to)24389-24412
Number of pages24
JournalMultimedia Tools And Applications
Issue number22
Publication statusPublished - Nov 2017


  • Sparse approximation
  • Multimedia indexing
  • k-SVD
  • l(0) penalty


Dive into the research topics of 'Large-scale high-dimensional indexing by sparse hashing with l 0 approximation'. Together they form a unique fingerprint.

Cite this