Made possible with Rust and a few optimization tricks.
- Qdrant as a vector search engine
- ONNX inference in Rust
- Embeddings cache & lookup
- Parallel & Batch requests
- Hybrid search with full-text filtering + vector re-scoring
Code repo: https://github.com/qdrant/page-search
This is an interesting article!
How do you build the prefixes when multiple words share the same prefix?
If my understanding is correct, the method is:
1. Find common search terms
2. For each search term (e.g. test), compute its vector [1.23, 4.56...] and its prefixes [t, te, tes...]
3. Store these in Qdrant as t->[1.23, 4.56...], te->[1.23, 4.56...], tes->[1.23, 4.56...] and so on. Here, each prefix is used as a point_id
4. When a search query comes in, call /recommend and pass in the partial query as the point id
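To make the question concrete, here is a rough sketch of steps 2-4 as I understand them, with a plain HashMap standing in for the Qdrant collection. Everything in it is my assumption rather than what the repo does: the helper names (prefix_to_id, prefixes, build_index, lookup) are made up, packing the prefix bytes into a u64 is just one possible encoding (Qdrant point ids are unsigned integers or UUIDs, so a string prefix needs some encoding), and "first term wins" is only one way to resolve a shared prefix. The actual /recommend call is not shown.

```rust
use std::collections::HashMap;

/// Hypothetical encoding: pack the first 8 bytes of a prefix into a u64
/// so it can be used as a numeric point id. Prefixes longer than 8 bytes
/// collide with their first 8 bytes under this scheme.
fn prefix_to_id(prefix: &str) -> u64 {
    let mut bytes = [0u8; 8];
    for (slot, b) in bytes.iter_mut().zip(prefix.bytes()) {
        *slot = b;
    }
    u64::from_le_bytes(bytes)
}

/// All non-empty prefixes of `term`, split on character boundaries.
fn prefixes(term: &str) -> Vec<&str> {
    term.char_indices()
        .map(|(i, c)| &term[..i + c.len_utf8()])
        .collect()
}

/// Build a point_id -> vector map. Assumption: when two terms share a
/// prefix, the term listed first keeps it (e.g. terms sorted by popularity).
fn build_index(terms: &[(&str, Vec<f32>)]) -> HashMap<u64, Vec<f32>> {
    let mut index = HashMap::new();
    for (term, vector) in terms {
        for p in prefixes(term) {
            index
                .entry(prefix_to_id(p))
                .or_insert_with(|| vector.clone());
        }
    }
    index
}

/// Resolve a partial query to the stored vector, if that prefix was indexed.
fn lookup<'a>(index: &'a HashMap<u64, Vec<f32>>, partial: &str) -> Option<&'a Vec<f32>> {
    index.get(&prefix_to_id(partial))
}
```

With the terms test and team indexed in that order, looking up "te" returns the vector for test (it claimed the shared prefix first), "tea" returns the vector for team, and an unindexed prefix like "x" returns None. Whether the shared prefix should go to the more popular term, or be handled some other way, is exactly what I am asking about.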