This repository was archived by the owner on Apr 7, 2026. It is now read-only.
Description
Validate HNSW algorithm performance and memory usage with enterprise-scale datasets of 100,000+ vectors to ensure it can handle real-world workloads.
Phase
Phase 2: Large-Scale Stress Testing
Epic
Related to #202
Acceptance Criteria
Test Scenarios
Build Performance - Time to build HNSW index for large datasets
Memory Usage - Peak memory consumption during build and search
Search Accuracy - Precision/recall metrics with large datasets
Concurrent Operations - Multiple searches during large index builds
Persistence - Save/load times for large HNSW indexes
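The Search Accuracy scenario needs a ground truth to compare HNSW results against. The sketch below shows one way to compute recall@k from an exact brute-force search. `RecallMetrics`, `ExactTopK`, and `RecallAtK` are hypothetical names that do not come from the project, vectors are assumed to be `float[]` arrays identified by their index, and squared Euclidean distance is an assumption because the issue does not state which metric the index uses.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the project's API.
public static class RecallMetrics
{
    // Exact top-k by brute force: rank every vector by its distance to the query.
    // Assumption: squared Euclidean distance; swap in the metric the index really uses.
    public static int[] ExactTopK(IReadOnlyList<float[]> vectors, float[] query, int k)
    {
        return Enumerable.Range(0, vectors.Count)
            .OrderBy(i => SquaredDistance(vectors[i], query))
            .Take(k)
            .ToArray();
    }

    // Recall@k: fraction of the exact neighbours that the approximate search also returned.
    public static double RecallAtK(IEnumerable<int> approximateIds, IEnumerable<int> exactIds)
    {
        var truth = new HashSet<int>(exactIds);
        if (truth.Count == 0)
        {
            return 1.0; // nothing to find, so nothing was missed
        }

        int hits = approximateIds.Distinct().Count(id => truth.Contains(id));
        return (double)hits / truth.Count;
    }

    private static double SquaredDistance(float[] a, float[] b)
    {
        double sum = 0;
        for (int i = 0; i < a.Length; i++)
        {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }
}
```

For example, `RecallMetrics.RecallAtK(new[] { 1, 2, 3, 9 }, new[] { 1, 2, 3, 4 })` returns 0.75, because three of the four exact neighbours were found. Brute force over 100,000 vectors is slow, so a stress test would probably evaluate recall on a small sample of queries rather than on every vector.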
Test Structure
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;
using NUnit.Framework;

[Test]
[Category("Stress")]
[Explicit("Large dataset test - run manually")]
public async Task HNSW_Build_100KVectors_CompletesWithinTimeLimit()
{
    // Arrange
    const int VectorCount = 100_000;
    const int Dimensions = 384; // Common embedding dimension
    const int MaxBuildTimeMinutes = 10;

    var database = new VectorDatabase();
    var vectors = GenerateLargeTestDataset(VectorCount, Dimensions);

    using var memoryMonitor = new MemoryUsageMonitor();
    var stopwatch = Stopwatch.StartNew();

    // Act: the timed region covers both inserting the vectors and building the index
    foreach (var vector in vectors)
        database.Vectors.Add(vector);

    await database.RebuildSearchIndexAsync(SearchAlgorithm.HNSW);
    stopwatch.Stop();

    // Assert
    Assert.That(stopwatch.Elapsed, Is.LessThan(TimeSpan.FromMinutes(MaxBuildTimeMinutes)));
    Assert.That(memoryMonitor.PeakMemoryMB, Is.LessThan(4000)); // 4 GB limit
    Assert.That(database.Count, Is.EqualTo(VectorCount));

    // Verify search functionality
    var query = vectors.First();
    var results = database.Search(query, 10, SearchAlgorithm.HNSW);
    Assert.That(results.Count, Is.EqualTo(10));
}
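The test above relies on a `MemoryUsageMonitor` that the issue does not define. The following is a minimal sketch of how such a helper might work, assuming it samples the process working set on a timer and remembers the highest value seen. Only the class name, the `PeakMemoryMB` property, and the disposable usage come from the test; the sampling interval, the choice of working set over managed heap size, and all internals are assumptions.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical implementation; the project's real MemoryUsageMonitor may differ.
public sealed class MemoryUsageMonitor : IDisposable
{
    private readonly object _gate = new object();
    private readonly Process _process = Process.GetCurrentProcess();
    private readonly Timer _timer;
    private long _peakBytes;
    private bool _disposed;

    public MemoryUsageMonitor(int sampleIntervalMs = 100)
    {
        Sample(); // record a baseline immediately
        _timer = new Timer(_ => Sample(), null, sampleIntervalMs, sampleIntervalMs);
    }

    // Highest working set observed so far, in megabytes (1 MB = 1024 * 1024 bytes).
    public double PeakMemoryMB
    {
        get
        {
            Sample(); // take one more reading so short runs are not missed
            lock (_gate)
            {
                return _peakBytes / (1024.0 * 1024.0);
            }
        }
    }

    private void Sample()
    {
        lock (_gate)
        {
            if (_disposed)
            {
                return; // a timer callback may still fire after Dispose
            }

            _process.Refresh(); // Process caches its counters until refreshed
            _peakBytes = Math.Max(_peakBytes, _process.WorkingSet64);
        }
    }

    public void Dispose()
    {
        lock (_gate)
        {
            if (_disposed)
            {
                return;
            }
            _disposed = true;
        }

        _timer.Dispose();
        _process.Dispose();
    }
}
```

Sampling can miss short spikes between readings, so a tighter interval trades accuracy for overhead. Working set includes native allocations and runtime overhead, which makes it a conservative figure to hold against a 4 GB limit; a monitor based on `GC.GetTotalMemory` would report managed memory only.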
Performance Metrics
Build time per vector (target: < 1 ms average)
Memory efficiency (target: < 50 bytes of index overhead per vector)
Search latency with large indexes (target: < 100 ms for k=10)
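To see what the per-vector targets imply for the 100,000-vector, 384-dimension test, the sketch below scales them to whole-dataset budgets. `StressTestBudget` and its methods are hypothetical names, and the raw-data figure assumes 4-byte `float` components, which the issue does not specify.

```csharp
using System;

// Hypothetical helper that turns per-vector targets into dataset-level budgets.
public static class StressTestBudget
{
    // Total build time allowed if every vector meets the per-vector average.
    public static TimeSpan BuildTimeBudget(int vectorCount, double perVectorMs)
        => TimeSpan.FromMilliseconds(vectorCount * perVectorMs);

    // Total index overhead allowed on top of the raw vector data.
    public static long IndexOverheadBudgetBytes(int vectorCount, int overheadBytesPerVector)
        => (long)vectorCount * overheadBytesPerVector;

    // Size of the raw vector data itself.
    // Assumption: components are 4-byte floats unless stated otherwise.
    public static long RawVectorBytes(int vectorCount, int dimensions, int bytesPerComponent = sizeof(float))
        => (long)vectorCount * dimensions * bytesPerComponent;
}

// StressTestBudget.BuildTimeBudget(100_000, 1.0).TotalSeconds   -> 100
// StressTestBudget.IndexOverheadBudgetBytes(100_000, 50)        -> 5000000
// StressTestBudget.RawVectorBytes(100_000, 384)                 -> 153600000
```

Under these assumptions the 1 ms target corresponds to a build of under 100 seconds, well inside the 10-minute ceiling asserted in the test (which works out to 6 ms per vector), and the overhead target allows about 5 MB of index structure on top of roughly 154 MB of raw vector data.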