Test fixture repositories live in `~/dev/test-fixtures/` and are used to verify semfora-engine across multiple programming languages. All repos are cloned with `--depth 1`.
- Location: `~/dev/test-fixtures/tokio`
- Size on disk: 9.5 MB
- Index Results:
  - Files found: 808 | Processed: 808 | Errors: 0
  - Modules: 121 | Symbols: 7,857
- Exercises: Async patterns, trait bounds, macro expansions, unsafe code blocks, complex call graphs
- Location: `~/dev/test-fixtures/ts-vscode`
- Size on disk: 206 MB
- Index Results:
  - Files found: 8,288 | Processed: 8,283 | Errors: 5 (99.94% success)
  - Modules: 1,926 | Symbols: 89,692
- Exercises: TypeScript types, large TS codebase, extension architecture, editor patterns
- Location: `~/dev/test-fixtures/next.js`
- Size on disk: 334 MB
- Index Results:
  - Files found: 23,453 | Processed: 23,452 | Errors: 1 (99.996% success)
  - Modules: 10,647 | Symbols: 91,708
- Exercises: Monorepo structure, JSX/TSX, ES modules, React patterns, mixed JS/TS
- Location: `~/dev/test-fixtures/kubernetes`
- Size on disk: 384 MB
- Index Results:
  - Files found: 23,670 | Processed: 23,670 | Errors: 0
  - Modules: 4,307 | Symbols: 296,919
- Exercises: Go interfaces, struct embedding, goroutines, large-scale package organization
- Location: `~/dev/test-fixtures/c-linux`
- Size on disk: 2.0 GB
- Index Results:
  - Files found: 72,518 | Processed: 72,518 | Errors: 0
  - Modules: 0 | Symbols: 4,463,505
- Exercises: C macros, header dependencies, kernel patterns, massive scale
- Note: Indexing takes 40+ minutes and uses ~16 GB RAM. The module count is 0 because C has no module system, so symbols are extracted at file level.
- Location: `~/dev/test-fixtures/llvm-project`
- Size on disk: 2.7 GB
- Index Results:
  - Files found: 72,334 | Processed: 72,320 | Errors: 14 (99.98% success)
  - Modules: 8,273 | Symbols: 1,175,855
- Exercises: C++ templates, namespaces, header-heavy patterns, compiler infrastructure
- Note: Indexing takes 15-20 minutes and uses ~5 GB RAM. The 14 errors are likely caused by unusual preprocessor constructs.
- Location: `~/dev/test-fixtures/spring-boot`
- Size on disk: 110 MB
- Index Results:
  - Files found: 5,365 | Processed: 5,365 | Errors: 0
  - Modules: 1,683 | Symbols: 42,674
- Exercises: Java annotations, generics, inheritance hierarchies, Maven/Gradle patterns, dependency injection
- Location: `~/dev/test-fixtures/rails`
- Size on disk: 61 MB
- Index Results:
  - Files found: 472 | Processed: 472 | Errors: 0
  - Modules: 117 | Symbols: 966
- Exercises: Ruby metaprogramming, DSLs, dynamic method generation, module mixins
- Location: `~/dev/test-fixtures/mixed-nickel-rs`
- Size on disk: 880 KB
- Index Results:
  - Files found: 82 | Processed: 82 | Errors: 0
  - Modules: 17 | Symbols: 359
- Exercises: Small multi-language project, quick smoke tests, Rust web framework patterns
| Repo | Language | Files | Symbols | Errors | Disk |
|---|---|---|---|---|---|
| `mixed-nickel-rs` | Rust (mixed) | 82 | 359 | 0 | 880 KB |
| `rails` | Ruby | 472 | 966 | 0 | 61 MB |
| `tokio` | Rust | 808 | 7,857 | 0 | 9.5 MB |
| `spring-boot` | Java | 5,365 | 42,674 | 0 | 110 MB |
| `ts-vscode` | TypeScript | 8,288 | 89,692 | 5 | 206 MB |
| `next.js` | TypeScript/JS | 23,453 | 91,708 | 1 | 334 MB |
| `kubernetes` | Go | 23,670 | 296,919 | 0 | 384 MB |
| `c-linux` | C | 72,518 | 4,463,505 | 0 | 2.0 GB |
| `llvm-project` | C/C++ | 72,334 | 1,175,855 | 14 | 2.7 GB |
Total disk usage: ~5.8 GB (shallow clones)
Across all 9 repos, semfora-engine encountered 20 errors out of 206,990 files (99.99% success rate):
- next.js: 1 error (23,453 files)
- ts-vscode: 5 errors (8,288 files)
- llvm-project: 14 errors (72,334 files)
- All other repos: 0 errors
No crashes or panics observed during any indexing run.
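The aggregate figures can be recomputed from the per-repo counts. The following is a minimal Python sketch that uses the file and error counts from the summary table; the `FIXTURES` dict and the `success_rate` helper are illustrative names, not part of semfora-engine.

```python
# Per-repo (files found, errors), copied from the summary table in this document.
FIXTURES = {
    "mixed-nickel-rs": (82, 0),
    "rails": (472, 0),
    "tokio": (808, 0),
    "spring-boot": (5_365, 0),
    "ts-vscode": (8_288, 5),
    "next.js": (23_453, 1),
    "kubernetes": (23_670, 0),
    "c-linux": (72_518, 0),
    "llvm-project": (72_334, 14),
}


def success_rate(files: int, errors: int) -> float:
    """Percentage of files that were indexed without an error."""
    return 100.0 * (files - errors) / files


total_files = sum(files for files, _ in FIXTURES.values())
total_errors = sum(errors for _, errors in FIXTURES.values())

print(
    f"{total_errors} errors / {total_files:,} files "
    f"({success_rate(total_files, total_errors):.2f}% success)"
)
```

Rerunning this after updating any per-repo entry keeps the totals and the per-repo percentages consistent with each other.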
For rapid verification, use the smallest repos:
- `mixed-nickel-rs` (82 files): Rust, under 1 second
- `rails` (472 files): Ruby
- `tokio` (808 files): Rust
Test changes against at least 2-3 repos from different language families:
- `tokio` (Rust) + `kubernetes` (Go) + `spring-boot` (Java): good default set
For performance and scale verification:
- `kubernetes` (296K symbols): large Go project
- `c-linux` (4.4M symbols): extreme scale, 40+ min indexing, ~16 GB RAM
- `llvm-project` (1.1M symbols): large C/C++, 15-20 min indexing, ~5 GB RAM
Indexes are stored in `~/.cache/semfora/` (hashed by project path), not inside the repos. To generate the index for a fixture, run:

    cd ~/dev/test-fixtures/<repo-name>
    semfora-engine index generate

To add a new fixture repo:

- Clone to `~/dev/test-fixtures/` with `--depth 1`
- Run `semfora-engine index generate` and capture output
- Update this file with actual stats from the indexing run
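The clone-and-index steps could be scripted. The sketch below is a hedged illustration, not an official tool: the helper names and `FIXTURE_ROOT` are hypothetical, and it assumes `git` and `semfora-engine` are on `PATH`. Because this document does not say which stream the indexer writes its stats to, the sketch captures both stdout and stderr.

```python
import subprocess
from pathlib import Path

# Fixture directory used throughout this document.
FIXTURE_ROOT = Path.home() / "dev" / "test-fixtures"


def clone_command(repo_url: str, name: str) -> list[str]:
    """Build the shallow-clone command for a new fixture."""
    return ["git", "clone", "--depth", "1", repo_url, str(FIXTURE_ROOT / name)]


def index_command() -> list[str]:
    """Build the indexing command documented in this file."""
    return ["semfora-engine", "index", "generate"]


def add_fixture(repo_url: str, name: str) -> str:
    """Clone a fixture, index it, and return the captured indexer output."""
    subprocess.run(clone_command(repo_url, name), check=True)
    result = subprocess.run(
        index_command(),
        cwd=FIXTURE_ROOT / name,  # the indexer is run from inside the repo
        capture_output=True,
        text=True,
        check=True,
    )
    # Assumption: stats may appear on either stream, so return both.
    return result.stdout + result.stderr
```

The returned text still has to be copied into this file by hand; the sketch does not parse it, since the output format is not described here.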
| Language | Repos | Status |
|---|---|---|
| Rust | tokio, mixed-nickel-rs | ✅ Indexed, 0 errors |
| TypeScript/JS | ts-vscode, next.js | ✅ Indexed, 6 errors total |
| Go | kubernetes | ✅ Indexed, 0 errors |
| C | c-linux | ✅ Indexed, 0 errors |
| C/C++ | llvm-project | ✅ Indexed, 14 errors |
| Java | spring-boot | ✅ Indexed, 0 errors |
| Ruby | rails | ✅ Indexed, 0 errors |
| Python | (use existing pytorch fixture) | — |
Before submitting PRs, developers should verify:
- Changes tested against at least 2-3 repos from different language families
- No new indexing errors introduced
- Symbol counts remain stable (±1% acceptable variance)
- No crashes or panics during indexing
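Parts of this checklist can be checked mechanically. The following is a minimal sketch, assuming the symbol and error counts from a fresh indexing run have already been collected by hand; `BASELINE`, `MAX_DRIFT`, and `check_run` are hypothetical names, and the baseline values are the ones recorded in this document for the default repo set.

```python
# repo -> (language family, baseline symbols, baseline errors), from this document.
BASELINE = {
    "tokio": ("Rust", 7_857, 0),
    "kubernetes": ("Go", 296_919, 0),
    "spring-boot": ("Java", 42_674, 0),
}

MAX_DRIFT = 0.01  # ±1% acceptable variance in symbol count


def symbol_drift(baseline: int, observed: int) -> float:
    """Relative change in symbol count against the baseline."""
    return abs(observed - baseline) / baseline


def check_run(observed: dict[str, tuple[int, int]]) -> list[str]:
    """Check a run given repo -> (symbols, errors); empty result means pass."""
    problems = []
    families = {BASELINE[repo][0] for repo in observed}
    if len(families) < 2:
        problems.append("fewer than 2 language families tested")
    for repo, (symbols, errors) in observed.items():
        _, base_symbols, base_errors = BASELINE[repo]
        if errors > base_errors:
            problems.append(f"{repo}: {errors - base_errors} new indexing error(s)")
        drift = symbol_drift(base_symbols, symbols)
        if drift > MAX_DRIFT:
            problems.append(f"{repo}: symbol count drifted {drift:.2%}")
    return problems
```

For example, `check_run({"tokio": (7_860, 0), "kubernetes": (296_919, 0)})` returns an empty list. Crashes and panics are not covered and still need to be watched for during the run itself.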