Paris Descartes University Seminar Series on Data Analytics
in collaboration with the diNo group

Invited Seminar Talk




Vector Similarity Search in the Compressed Domain - Tradeoffs and Algorithms
Dr. Matthijs Douze, Facebook AI Research (France)


when: 7 February 2018, 11am
where: room Turing Reunion, 7th floor, Paris Descartes University, 45 Rue Des Saints Peres, Paris 75006


Abstract

We use a product quantization-based approach for approximate nearest neighbor search. The idea is to decompose the space into a Cartesian product of low-dimensional subspaces and to quantize each subspace separately. A vector is represented by a short code composed of its subspace quantization indices. The Euclidean distance between two vectors can be efficiently estimated from their codes. We introduced several extensions to this method: using an inverted index to make the search non-exhaustive; combining product quantization with binary encodings; and using it with a very efficient k-NN graph-based indexing method. All these methods are available in the Faiss library, a platform for similarity search that covers a wide range of speed/accuracy/memory usage operating points.
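To make the idea in the abstract concrete, here is a minimal sketch of product quantization in plain Python. It is an illustration under stated assumptions, not the Faiss implementation: the function names (split, kmeans, train_pq, encode, decode, distance_tables, adc_distance), the toy data, and the parameters (8 dimensions, m=4 subspaces, k=16 centroids per subspace) are all hypothetical choices made for this example. The distance estimate shown is the asymmetric variant, where the query stays uncompressed and only the database vectors are encoded.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def split(v, m):
    """Cut vector v into m contiguous sub-vectors of equal length."""
    d = len(v) // m
    return [v[i * d:(i + 1) * d] for i in range(m)]

def kmeans(points, k, iters=10, seed=0):
    """Plain Lloyd k-means; returns a list of k centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            buckets[nearest].append(p)
        for j, bucket in enumerate(buckets):
            if bucket:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return centroids

def train_pq(vectors, m, k):
    """Learn one codebook of k centroids for each of the m subspaces."""
    parts = [split(v, m) for v in vectors]
    return [kmeans([p[i] for p in parts], k, seed=i) for i in range(m)]

def encode(v, codebooks):
    """Represent v by the index of the nearest centroid in each subspace."""
    subs = split(v, len(codebooks))
    return [min(range(len(cb)), key=lambda j: sq_dist(sub, cb[j]))
            for sub, cb in zip(subs, codebooks)]

def decode(code, codebooks):
    """Rebuild the approximate vector by concatenating the chosen centroids."""
    out = []
    for idx, cb in zip(code, codebooks):
        out.extend(cb[idx])
    return out

def distance_tables(query, codebooks):
    """Per-subspace squared distances from the query to every centroid."""
    subs = split(query, len(codebooks))
    return [[sq_dist(sub, c) for c in cb] for sub, cb in zip(subs, codebooks)]

def adc_distance(tables, code):
    """Estimate the squared distance to a coded vector with m table lookups."""
    return sum(table[idx] for table, idx in zip(tables, code))

# Toy data: 200 random 8-dimensional vectors (illustrative values only).
rng = random.Random(0)
vectors = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(200)]
codebooks = train_pq(vectors, m=4, k=16)
codes = [encode(v, codebooks) for v in vectors]

# Exhaustive search over the codes for one query.
query = [rng.gauss(0.0, 1.0) for _ in range(8)]
tables = distance_tables(query, codebooks)
best = min(range(len(codes)), key=lambda i: adc_distance(tables, codes[i]))
```

With these settings each vector is stored as 4 indices in the range 0 to 15 instead of 8 floating-point numbers, and the table-based estimate equals the exact squared distance between the query and the decoded (reconstructed) vector. The scan above is still exhaustive; the inverted index mentioned in the abstract would restrict it to a subset of the codes.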

Short Bio

Matthijs has been a research scientist at the Facebook AI Research (FAIR) lab in Paris since November 2015. He obtained a master's degree from the ENSEEIHT engineering school and a PhD from the University of Toulouse in 2004. From 2005 to 2015 he was a member of the LEAR team at INRIA Grenoble, where he worked on a variety of topics, including image indexing, large-scale vector indexing, event recognition in videos and similar video search. From 2010 to 2015 he also managed Kinovis, a large 3D motion capture studio at INRIA, and developed high-performance geometric algorithms for constructive solid geometry operations. At FAIR he works mainly on large-scale indexing and on applications to learning for image datasets with little to no supervision.


Hosted by: Themis Palpanas
