Tools

Pinecone is a managed vector database service that handles vector similarity search for you:

Overview

Facebook (now Meta) has an open-source library for it:

Faiss: A library for efficient similarity search
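
For context, the operation these tools speed up is nearest-neighbor search over vectors. Below is a minimal brute-force sketch in plain Python. It is not Faiss's or Pinecone's API; the function names and the toy vectors are made up for illustration, and it assumes non-zero vectors of equal length. Libraries like Faiss exist because this exhaustive scan gets slow as the collection grows.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k=2):
    """Return the k (index, score) pairs most similar to query, best first."""
    # Exhaustive scan: score the query against every stored vector.
    scored = [(i, cosine_similarity(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional "embeddings" (made-up values, not real model output).
vectors = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
query = [1.0, 0.05, 0.0]

results = top_k(query, vectors, k=2)
for index, score in results:
    print(index, round(score, 4))
```

The scan costs one similarity computation per stored vector per query, which is the cost that approximate indexes trade a little accuracy to avoid.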

Getting Vectors

Here’s what Pinecone recommends:

1. Text Embeddings – API Based

If you'd like to use an API, we recommend OpenAI's text-embedding-ada-002.

2. Text Embeddings – Open Source

If you'd rather opt for an open source model, all-MiniLM-L6-v2 and all-mpnet-base-v2 are good options.
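
As a rough sketch of the open source route, the models above can be loaded through the sentence-transformers package. This assumes the package is installed (`pip install sentence-transformers`) and that the model weights can be downloaded on first use; the helper name `embed_texts` is hypothetical.

```python
def embed_texts(texts, model_name="all-MiniLM-L6-v2"):
    """Return one embedding vector per input string (hypothetical helper)."""
    # Imported lazily so this file still loads without the package installed.
    from sentence_transformers import SentenceTransformer

    # Downloads the model weights on first use, then reads them from cache.
    model = SentenceTransformer(model_name)
    # normalize_embeddings=True returns unit-length vectors, so a plain
    # dot product between two of them equals their cosine similarity.
    return model.encode(texts, normalize_embeddings=True)


# Example usage (needs the package and network access on first run):
# vectors = embed_texts(["How do I reset my password?", "I forgot my login"])
# print(vectors.shape)
```

Passing `model_name="all-mpnet-base-v2"` switches to the larger model; all-MiniLM-L6-v2 produces 384-dimensional vectors and all-mpnet-base-v2 produces 768-dimensional ones, so an index built for one cannot hold vectors from the other.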

3. Images or Multimodal – Open Source

We suggest using OpenAI's CLIP model to generate vector embeddings for images or multimodal use cases. Here's a how-to guide.

Algorithm

Links

Product Quantizers for k-NN Tutorial Part 1 · Chris McCormick