Vector Database Benchmarks
Compare latency, throughput, recall, memory usage, and indexing speed across 10 popular vector databases. Sort by any metric to find the best performer for your workload.
| Database | P50 Latency(ms) | P99 Latency(ms) | QPS(q/s) | Memory/1M(GB) | Indexing(vec/s) | Recall@10(%) | Index |
|---|---|---|---|---|---|---|---|
![]() | 0.8 | 2.5 | 12.0K | 2.1 | 35.0K | 96.8% | HNSW / FLAT |
![]() | 1.2 | 3.8 | 8.5K | 0.6 | 45.0K | 98.5% | HNSW |
![]() | 1.4 | 4.8 | 7.5K | 0.75 | 50.0K | 97.9% | IVF_FLAT / HNSW |
![]() | 1.5 | 5.2 | 7.2K | 0.8 | 52.0K | 97.8% | IVF_FLAT / HNSW |
![]() | 2 | 7 | 6.0K | 0.85 | 40.0K | 97% | HNSW |
![]() | 1.8 | 6.1 | 5.8K | 0.9 | 38.0K | 97.2% | HNSW |
![]() | 2.5 | 8 | 4.5K | 0.7 | 30.0K | 96.5% | Proprietary |
![]() | 3.5 | 12 | 2.8K | 1.5 | 20.0K | 95.5% | HNSW |
![]() | 3 | 10 | 2.2K | 1 | 15.0K | 96% | HNSW |
![]() | 4.2 | 15 | 1.8K | 1.2 | 12.0K | 95% | HNSW / IVFFlat |
Frequently Asked Questions
Which vector database is the fastest?
Redis Vector offers the lowest latency (sub-millisecond P50) due to its in-memory architecture, but uses the most memory. Qdrant provides the best balance of speed and efficiency with its Rust implementation. For highest QPS throughput, Redis and Qdrant lead the benchmarks.
Which vector database has the best recall?
Most modern vector databases achieve 95-99% recall at k=10 with HNSW indexes. Qdrant leads at ~98.5%, followed closely by Zilliz/Milvus at ~97.9% and Weaviate at ~97.2%. Higher recall typically comes at the cost of increased latency.
How do vector database benchmarks compare for production use?
Benchmark numbers are a starting point — production performance depends on dataset size, dimensionality, hardware, and query patterns. For production workloads, focus on P99 latency (worst case), QPS under load, and memory efficiency rather than just P50 latency.
What is a good QPS for a vector database?
For most applications, 1,000-5,000 QPS is sufficient. High-traffic applications may need 10,000+ QPS. Redis Vector (12K QPS) and Qdrant (8.5K QPS) lead in throughput, while pgvector (~1.8K QPS) is suitable for smaller workloads.









