Pinecone is a managed vector database service that will do this.
Facebook has an open-source library for it:
Faiss: A library for efficient similarity search
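As a rough illustration of what a similarity search does, here is a minimal brute-force sketch in plain Python. It is not the Faiss or Pinecone API: the function names `cosine_similarity` and `top_k` and the toy three-dimensional vectors are made up for this example, and real embeddings have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k=2):
    """Return the k (id, score) pairs most similar to the query, best first."""
    scored = [(vid, cosine_similarity(query, vec)) for vid, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy "embeddings", chosen by hand for illustration only.
vectors = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}

# Compare the query against every stored vector (a linear scan).
results = top_k([1.0, 0.0, 0.0], vectors, k=2)
print([vid for vid, _ in results])
```

This scan compares the query against every stored vector, so its cost grows linearly with the collection. That is fine for a few thousand items, and it is the cost that dedicated similarity-search libraries and services aim to reduce on large collections.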
Here’s what Pinecone recommends for generating embeddings:
1. Text Embeddings – API Based
If you'd like to use an API, we recommend OpenAI's text-embedding-ada-002.
2. Text Embeddings – Open Source
If you'd rather opt for an open source model, all-MiniLM-L6-v2 and all-mpnet-base-v2 are good options.
3. Images or Multimodal – Open Source
We suggest using OpenAI's CLIP model to generate vector embeddings for images or multimodal use cases.
Product Quantizers for k-NN Tutorial Part 1 · Chris McCormick