MetalANNS is a high-performance, GPU-native vector search engine engineered exclusively for Apple Silicon. By leveraging Metal compute shaders and a CAGRA-inspired graph architecture, it delivers sub-millisecond Approximate Nearest Neighbor Search (ANNS) directly on the device.
- Fast: 10x-20x higher query throughput than CPU-based HNSW by exploiting GPU parallelism.
- Unified Memory: Optimized for Apple’s UMA — zero unnecessary memory copies between CPU and GPU.
- Type-Safe State: Uses a generic type-state pattern in Swift so that misuse (e.g., searching an unbuilt index) is rejected at compile time instead of failing at runtime.
- Hybrid Search: First-class support for metadata filtering powered by an integrated SQL engine.
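The type-state idea from the list above can be illustrated in a few lines. This is a minimal sketch of the general pattern, not MetalANNS's actual implementation: `SketchIndex`, `Unbuilt`, `Ready`, `nearest(to:)`, and `squaredDistance` are hypothetical names invented for the example, and a brute-force scan stands in for the real graph search.

```swift
// Hypothetical marker types used only as compile-time tags.
// These are NOT MetalANNS's real declarations.
enum Unbuilt {}
enum Ready {}

// Squared Euclidean distance between two equal-length vectors.
func squaredDistance(_ a: [Float], _ b: [Float]) -> Float {
    var sum: Float = 0
    for (x, y) in zip(a, b) {
        let d = x - y
        sum += d * d
    }
    return sum
}

// The State parameter never holds a value; it only selects which methods exist.
struct SketchIndex<State> {
    fileprivate let vectors: [[Float]]

    fileprivate init(vectors: [[Float]]) {
        self.vectors = vectors
    }
}

extension SketchIndex where State == Unbuilt {
    init() {
        self.init(vectors: [])
    }

    // Building consumes the unbuilt index and returns a value of a different type.
    func build(vectors: [[Float]]) -> SketchIndex<Ready> {
        SketchIndex<Ready>(vectors: vectors)
    }
}

extension SketchIndex where State == Ready {
    // Brute-force scan standing in for a real graph search (assumption).
    func nearest(to query: [Float]) -> Int? {
        vectors.indices.min(by: { a, b in
            squaredDistance(vectors[a], query) < squaredDistance(vectors[b], query)
        })
    }
}

let empty = SketchIndex<Unbuilt>()
// empty.nearest(to: [0, 1])  // would not compile: `nearest` exists only when State == Ready
let ready = empty.build(vectors: [[0, 0], [1, 1], [5, 5]])
let closest = ready.nearest(to: [0.9, 1.2])
```

Because `nearest(to:)` is declared only in the `State == Ready` extension, the invalid call is a type error rather than a runtime check.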
MetalANNS is built for the Unified Memory Architecture of M-series and A-series chips. CPU-oriented algorithms such as HNSW traverse their graphs largely sequentially; MetalANNS instead applies the principles of CAGRA (CUDA ANNS GRAph-based), adapted for Metal, to perform massively parallel searches.
Benchmark: M3 Max (30-core GPU). MetalANNS constructs the graph in parallel using compute shaders.
MetalANNS maintains perfect recall at 10x the throughput of competitive CPU libraries.
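For readers who want to check a recall figure on their own data, recall@k is usually computed by comparing the approximate results against an exact brute-force top-k. The helper below is a minimal sketch under that assumption; `recall(approximate:exact:)` is a hypothetical name and not part of the MetalANNS API.

```swift
/// Fraction of the exact top-k neighbours that also appear in the approximate result.
/// Hypothetical helper for illustration; not provided by MetalANNS.
func recall<ID: Hashable>(approximate: [ID], exact: [ID]) -> Double {
    // Convention chosen for this sketch: an empty ground truth counts as full recall.
    guard !exact.isEmpty else { return 1.0 }
    let truth = Set(exact)
    let hits = Set(approximate).intersection(truth).count
    return Double(hits) / Double(truth.count)
}
```

A value of 1.0 means every exact neighbour was returned; averaging this over many queries gives the recall number reported in a benchmark.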
Designed for the modern Swift developer. Zero boilerplate, fully async/await native, and statically safe.
Initialize with a specific state. The compiler will prevent you from calling .search() on an unbuilt index.
import MetalANNS
let config = IndexConfiguration(degree: 32, metric: .cosine)
let index = VectorIndex<String, VectorIndexState.Unbuilt>(configuration: config)
Leverage the GPU to build the search graph from your embeddings in seconds.
let readyIndex = try await index.build(
    vectors: myEmbeddings, // [[Float]]
    ids: myDocumentIDs     // [String]
)
Combine vector similarity with SQL-like metadata filtering in a single pass.
// Use the elegant Query DSL
let results = try await readyIndex.search(query: queryVector, topK: 10) {
    QueryFilter.equals(Field<String>("category"), "research")
    QueryFilter.greaterThan(Field<Float>("relevance"), 0.85)
}
for hit in results {
    print("Found \(hit.id) with score: \(hit.score)")
}
Save your index to disk and load it instantly using memory-mapping, ideal for memory-constrained iOS devices.
try await readyIndex.save(to: fileURL)
// Instant load with zero memory overhead
let loadedIndex = try await VectorIndex<String, VectorIndexState.Ready>
    .loadReadOnly(from: fileURL, mode: .mmap)
Why choose MetalANNS over HNSW or FAISS?
| Feature | MetalANNS | HNSWLib (CPU) |
|---|---|---|
| Architecture | CAGRA (GPU Parallel) | HNSW (CPU Sequential) |
| Memory copies | Zero (UMA) | High (PCIe/Bus) |
| Concurrency | Swift 6 Actors | Mutex/Locks |
| Persistence | Zero-copy mmap | Full memory load |
| API Safety | Type-State Machine | Runtime checks |
Important
CAGRA vs. HNSW: HNSW builds a hierarchical graph that is difficult to parallelize during construction. MetalANNS uses a fixed-degree directed graph (CAGRA) which allows thousands of GPU threads to explore the search space simultaneously.
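To make the fixed-degree idea concrete, here is a single-threaded CPU sketch of a best-first (beam) traversal over such a graph. It is an illustration of the general technique only, not the MetalANNS Metal kernel: `FixedDegreeGraph`, `beamSearch`, `l2Squared`, and the `beamWidth` parameter are hypothetical names chosen for the example. On a GPU, the distance computations for a node's neighbours are the part that would run in parallel.

```swift
// Every node has exactly `degree` outgoing edges, so adjacency fits in one flat array.
// Hypothetical type for illustration; not part of the MetalANNS API.
struct FixedDegreeGraph {
    let degree: Int
    let vectors: [[Float]]
    // Neighbours of node i live at adjacency[i * degree ..< (i + 1) * degree].
    let adjacency: [Int]

    func neighbors(of node: Int) -> ArraySlice<Int> {
        adjacency[(node * degree)..<((node + 1) * degree)]
    }
}

struct Candidate {
    let node: Int
    let distance: Float
    var expanded = false
}

// Squared Euclidean distance between two equal-length vectors.
func l2Squared(_ a: [Float], _ b: [Float]) -> Float {
    var sum: Float = 0
    for (x, y) in zip(a, b) {
        let d = x - y
        sum += d * d
    }
    return sum
}

// Best-first search: repeatedly expand the closest unexpanded candidate,
// keeping only the `beamWidth` closest nodes seen so far.
func beamSearch(graph: FixedDegreeGraph, query: [Float],
                entry: Int, topK: Int, beamWidth: Int) -> [Int] {
    var visited: Set<Int> = [entry]
    var beam = [Candidate(node: entry,
                          distance: l2Squared(graph.vectors[entry], query))]

    while let i = beam.firstIndex(where: { !$0.expanded }) {
        beam[i].expanded = true
        let current = beam[i].node
        // Score each neighbour that has not been seen before.
        for n in graph.neighbors(of: current) where visited.insert(n).inserted {
            beam.append(Candidate(node: n,
                                  distance: l2Squared(graph.vectors[n], query)))
        }
        beam.sort { $0.distance < $1.distance }
        if beam.count > beamWidth {
            beam.removeLast(beam.count - beamWidth)
        }
    }
    return beam.prefix(topK).map { $0.node }
}
```

The fixed degree is what makes the layout GPU-friendly: each node's neighbour list sits at a predictable offset and has the same length, so work can be divided evenly across threads.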
The MetalANNS Crocodile represents our core philosophy:
- Low Latency: Attacks the search problem with predatory speed.
- Apple Ecosystem: Perfectly adapted to its habitat (Metal/Swift).
- Powerful Grip: High recall that never lets go of accuracy.
Add MetalANNS to your Package.swift:
dependencies: [
    .package(url: "https://github.com/christopherkarani/MetalANNS.git", from: "0.1.2")
]
MetalANNS is available under the MIT license. See LICENSE for more info.