datastores.ai

Vector Database Benchmarks

Compare latency, throughput, recall, memory usage, and indexing speed across 10 popular vector databases to find the best performer for your workload.

1M vectors, 128 dimensions, k=10, 8 vCPU / 32GB RAM.
| Database | P50 latency (ms) | P99 latency (ms) | QPS (queries/s) | Memory per 1M vectors (GB) | Indexing (vectors/s) | Recall@10 (%) | Index |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Redis Vector | 0.8 | 2.5 | 12.0K | 2.1 | 35.0K | 96.8 | HNSW / FLAT |
| Qdrant | 1.2 | 3.8 | 8.5K | 0.6 | 45.0K | 98.5 | HNSW |
| Zilliz Cloud | 1.4 | 4.8 | 7.5K | 0.75 | 50.0K | 97.9 | IVF_FLAT / HNSW |
| Milvus | 1.5 | 5.2 | 7.2K | 0.8 | 52.0K | 97.8 | IVF_FLAT / HNSW |
| Vespa | 2 | 7 | 6.0K | 0.85 | 40.0K | 97 | HNSW |
| Weaviate | 1.8 | 6.1 | 5.8K | 0.9 | 38.0K | 97.2 | HNSW |
| Pinecone | 2.5 | 8 | 4.5K | 0.7 | 30.0K | 96.5 | Proprietary |
| Elasticsearch | 3.5 | 12 | 2.8K | 1.5 | 20.0K | 95.5 | HNSW |
| ChromaDB | 3 | 10 | 2.2K | 1 | 15.0K | 96 | HNSW |
| pgvector | 4.2 | 15 | 1.8K | 1.2 | 12.0K | 95 | HNSW / IVFFlat |
Note: Benchmarks are approximate and based on publicly available data from ann-benchmarks, vendor documentation, and independent tests. Actual performance varies by hardware, dataset, configuration, and query patterns. Self-hosted databases were tested on comparable infrastructure.
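
To compare the figures offline, the rows can be loaded into a small script and ranked by any metric. The following is a minimal sketch: the values are copied from the benchmark table (with K notation expanded), while the `Row` fields and the `rank_by` and `shortlist` helpers are illustrative names, not part of any vendor tool.

```python
from typing import NamedTuple


class Row(NamedTuple):
    name: str
    p50_ms: float
    p99_ms: float
    qps: int
    mem_gb: float       # memory per 1M vectors
    index_vps: int      # indexing speed, vectors per second
    recall_pct: float   # Recall@10


# Values copied from the benchmark table (approximate, per the note).
ROWS = [
    Row("Redis Vector", 0.8, 2.5, 12000, 2.1, 35000, 96.8),
    Row("Qdrant", 1.2, 3.8, 8500, 0.6, 45000, 98.5),
    Row("Zilliz Cloud", 1.4, 4.8, 7500, 0.75, 50000, 97.9),
    Row("Milvus", 1.5, 5.2, 7200, 0.8, 52000, 97.8),
    Row("Vespa", 2.0, 7.0, 6000, 0.85, 40000, 97.0),
    Row("Weaviate", 1.8, 6.1, 5800, 0.9, 38000, 97.2),
    Row("Pinecone", 2.5, 8.0, 4500, 0.7, 30000, 96.5),
    Row("Elasticsearch", 3.5, 12.0, 2800, 1.5, 20000, 95.5),
    Row("ChromaDB", 3.0, 10.0, 2200, 1.0, 15000, 96.0),
    Row("pgvector", 4.2, 15.0, 1800, 1.2, 12000, 95.0),
]

# Metrics where a smaller value is better; all others are "higher is better".
LOWER_IS_BETTER = {"p50_ms", "p99_ms", "mem_gb"}


def rank_by(metric, rows=ROWS):
    """Return rows ordered best-first for the given metric."""
    return sorted(rows, key=lambda r: getattr(r, metric),
                  reverse=metric not in LOWER_IS_BETTER)


def shortlist(max_p99_ms, max_mem_gb, min_recall_pct, rows=ROWS):
    """Keep rows that meet all three constraints, ordered by QPS (highest first)."""
    ok = [r for r in rows
          if r.p99_ms <= max_p99_ms
          and r.mem_gb <= max_mem_gb
          and r.recall_pct >= min_recall_pct]
    return rank_by("qps", ok)


if __name__ == "__main__":
    for row in shortlist(max_p99_ms=6.0, max_mem_gb=1.0, min_recall_pct=97.5):
        print(row.name, row.qps)
```

Filtering on several constraints at once is usually more informative than sorting on one column, because the leader on a single metric may fail another requirement.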

Frequently Asked Questions

Which vector database is the fastest?

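Redis Vector has the lowest latency in this comparison (0.8 ms P50, 2.5 ms P99) thanks to its in-memory architecture, but it also uses the most memory (2.1 GB per 1M vectors). Qdrant, which is written in Rust, offers the best balance of speed and efficiency: the second-lowest latency (1.2 ms P50) and the smallest memory footprint (0.6 GB per 1M vectors). Redis Vector (12K QPS) and Qdrant (8.5K QPS) also lead on throughput.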
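
To put the memory figures in context, it helps to compare them with the size of the raw vectors alone. The sketch below assumes 4-byte float32 values and decimal gigabytes (1 GB = 10^9 bytes); neither is stated in the benchmark setup, so treat the ratios as rough.

```python
def raw_vector_gb(num_vectors, dims, bytes_per_value=4):
    """Size of the raw vectors alone, in decimal gigabytes (1 GB = 1e9 bytes).

    bytes_per_value=4 assumes float32 storage, which is an assumption here.
    """
    return num_vectors * dims * bytes_per_value / 1e9


raw = raw_vector_gb(1_000_000, 128)  # benchmark setup: 1M vectors, 128 dimensions
print(f"raw float32 vectors: {raw:.3f} GB")  # 0.512 GB

# Reported memory per 1M vectors, taken from the benchmark table.
for name, reported_gb in [("Qdrant", 0.6), ("Redis Vector", 2.1)]:
    print(f"{name}: {reported_gb / raw:.1f}x the raw vector size")
```

Under these assumptions the raw data is about 0.512 GB, so a reported footprint close to that figure implies little overhead beyond the vectors themselves.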

Which vector database has the best recall?

Most modern vector databases achieve 95-99% recall at k=10 with HNSW indexes. Qdrant leads at ~98.5%, followed closely by Zilliz Cloud at ~97.9%, Milvus at ~97.8%, and Weaviate at ~97.2%. For a given index, tuning for higher recall typically comes at the cost of increased latency.
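
Recall@10 is the share of the true 10 nearest neighbours that the index actually returns. A minimal sketch of the calculation, assuming you already have exact (brute-force) neighbour IDs as ground truth; the function names are illustrative:

```python
from typing import Sequence


def recall_at_k(retrieved: Sequence[int], ground_truth: Sequence[int], k: int = 10) -> float:
    """Fraction of the true top-k neighbours that appear in the top-k results."""
    true_top = set(ground_truth[:k])
    if not true_top:
        raise ValueError("ground_truth must contain at least one ID")
    hits = len(true_top.intersection(retrieved[:k]))
    return hits / len(true_top)


def mean_recall(results: Sequence[Sequence[int]],
                truths: Sequence[Sequence[int]], k: int = 10) -> float:
    """Average recall@k over a batch of queries."""
    scores = [recall_at_k(r, t, k) for r, t in zip(results, truths)]
    return sum(scores) / len(scores)


# One query: the index found 9 of the 10 true neighbours (ID 99 is a miss).
truth = list(range(1, 11))
found = [1, 2, 3, 4, 5, 6, 7, 8, 9, 99]
print(recall_at_k(found, truth))  # 0.9
```

Averaged over many queries, a result such as 0.985 corresponds to the 98.5% style of figure quoted above.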

How do vector database benchmarks compare for production use?

Benchmark numbers are a starting point: production performance depends on dataset size, dimensionality, hardware, and query patterns. For production workloads, focus on P99 latency (the tail latency that only the slowest 1% of queries exceed), QPS under load, and memory efficiency rather than P50 latency alone.
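
When you measure your own workload, P50 and P99 can be derived from recorded per-query latencies. The sketch below uses the nearest-rank definition of a percentile, which is one of several common conventions; the sample values are made up for illustration.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical latencies in ms: mostly fast, a slow tail, one outlier.
latencies_ms = [1.0] * 985 + [5.0] * 14 + [50.0]

print("P50:", percentile(latencies_ms, 50))  # 1.0
print("P99:", percentile(latencies_ms, 99))  # 5.0
print("max:", max(latencies_ms))             # 50.0
```

In this made-up sample the median hides the slow tail entirely, and P99 is still well below the single worst query, which is why the three numbers are worth tracking separately.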

What is a good QPS for a vector database?

For most applications, 1,000-5,000 QPS is sufficient. High-traffic applications may need 10,000+ QPS. Redis Vector (12K QPS) and Qdrant (8.5K QPS) lead in throughput, while pgvector (~1.8K QPS) is suitable for smaller workloads.
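
A rough way to turn a target QPS into a node count is to divide by the benchmarked per-node throughput and leave some headroom. This sketch assumes throughput scales linearly with replicas and uses a hypothetical 70% utilization ceiling; both are simplifications, and real clusters should be load-tested.

```python
import math


def replicas_needed(target_qps, per_node_qps, max_utilization=0.7):
    """Estimate replicas so each node stays at or below max_utilization.

    Assumes linear scaling across replicas (an assumption, not a measurement).
    """
    if not 0 < max_utilization <= 1:
        raise ValueError("max_utilization must be in (0, 1]")
    return math.ceil(target_qps / (per_node_qps * max_utilization))


# Per-node QPS figures taken from the benchmark table; target is 10,000 QPS.
for name, qps in [("Redis Vector", 12000), ("Qdrant", 8500), ("pgvector", 1800)]:
    print(name, replicas_needed(10_000, qps))
```

Because the benchmark used 128-dimensional vectors on 8 vCPU machines, per-node throughput on larger embeddings or different hardware will differ, so the input figure should come from your own measurements where possible.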