Replace 8 PowerShell validation scripts with compiled C repo_validator (39x faster) #353

Draft
mattdurak wants to merge 6 commits into master from feature/compiled-repo-validator

@mattdurak
Collaborator

Summary

Replaces 8 of 10 PowerShell repo validation scripts with a single compiled C executable (repo_validator), reducing validation time from ~70s to ~1.8s on the ebs repo (39x faster).

Motivation

The repo_validation/ framework runs validation scripts as CMake custom targets. With PowerShell:

  • Each script re-traverses the entire repo independently (10x redundant I/O)
  • ~500ms-1s startup overhead per PowerShell process
  • Sequential execution of all scripts

Changes

New: repo_validation/src/ — Compiled C validator

  • repo_validator.h — Common types, check interface
  • main.c — CLI parsing (--repo-root, --exclude-folders, --fix, --check <name>)
  • file_walker.c — Single cross-platform directory traversal (Windows + Linux)
  • 8 check implementations in checks/:
    • check_no_tabs.c, check_file_endings.c, check_requirements_naming.c, check_srs_uniqueness.c
    • check_enable_mocks.c, check_no_vld_include.c, check_no_backticks_in_srs.c, check_test_spec_tags.c
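
The layout above (one walker, many checks) can be sketched as a table of per-file callbacks that all run against a file that was read once. This is only an illustration: `FILE_VIEW`, `CHECK`, `run_checks_on_file`, and the two `*_sketch` checks are hypothetical names, and the real interface in repo_validator.h may look different. The CRLF check assumes the rule is "the file must end with CRLF".

```c
#include <stddef.h>

/* Hypothetical view of one file, loaded once by the walker. */
typedef struct FILE_VIEW_TAG
{
    const char* path;
    const char* content;
    size_t length;
} FILE_VIEW;

/* Hypothetical check callback: returns the number of violations found. */
typedef int (*CHECK_FILE_FUNC)(const FILE_VIEW* file);

typedef struct CHECK_TAG
{
    const char* name;
    CHECK_FILE_FUNC check_file;
} CHECK;

/* Counts tab characters in the file. */
int check_no_tabs_sketch(const FILE_VIEW* file)
{
    int result = 0;
    size_t i;
    for (i = 0; i < file->length; i++)
    {
        if (file->content[i] == '\t')
        {
            result++;
        }
        else
        {
            /* do nothing */
        }
    }
    return result;
}

/* Assumption: a file is valid only when its last two bytes are CR LF. */
int check_crlf_ending_sketch(const FILE_VIEW* file)
{
    int result;
    if ((file->length < 2) ||
        (file->content[file->length - 2] != '\r') ||
        (file->content[file->length - 1] != '\n'))
    {
        result = 1;
    }
    else
    {
        result = 0;
    }
    return result;
}

/* Runs every registered check against a single, already-loaded file. */
int run_checks_on_file(const CHECK* checks, size_t check_count, const FILE_VIEW* file)
{
    int result = 0;
    size_t i;
    for (i = 0; i < check_count; i++)
    {
        result += checks[i].check_file(file);
    }
    return result;
}
```

With this shape, adding a check means adding one table entry; the traversal and file read are paid once regardless of how many checks are active.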

Deleted: 8 PowerShell scripts (2,010 lines)

Only validate_aaa_comments.ps1 and validate_srs_consistency.ps1 remain (Phase 3 candidates — these require a lightweight C tokenizer).

Updated

  • repo_validation/CMakeLists.txt — Builds repo_validator, routes migrated checks to C tool, remaining 2 to PowerShell
  • 8 test CMakeLists.txt files — Updated to invoke repo_validator instead of PowerShell

Benchmark (ebs repo: 6,612 source files, 18,356 TEST_FUNCTIONs)

| Metric | PowerShell | C tool | Speedup |
|---|---|---|---|
| 8 migrated checks | ~70s | 1.8s | 39x |
| Slowest single check (srs_uniqueness) | 16.3s | 0.6s | 27x |

Testing

  • All 8 test suites pass (clean + fix modes)
  • All 8 detection tests correctly identify violations
  • Remaining 2 PowerShell-based tests still pass
  • Tested against ebs repo with 24,345 SRS tags and 18,356 TEST_FUNCTIONs

mattdurak and others added 5 commits March 13, 2026 11:36
Introduce a single C executable (repo_validator) that replaces PowerShell
validation scripts with a compiled tool for significantly faster execution.

Core components:
- repo_validator.h: Common types, check interface, file classification
- main.c: CLI parsing, check registration, orchestration
- file_walker.c: Cross-platform directory traversal (Windows + Linux)
- CMakeLists.txt: Build configuration for the executable

The tool supports:
- --repo-root, --exclude-folders, --fix, --check <name> flags
- Single directory traversal shared across all active checks
- PowerShell-style argument compatibility (-RepoRoot, -Fix, etc.)
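
One way the PowerShell-style compatibility could work is a small alias table that rewrites legacy names to the canonical flags before normal parsing. The sketch below is an assumption, not the code in main.c: `ARG_ALIAS` and `normalize_arg_sketch` are hypothetical, only the two aliases named above are mapped, and matching is exact even though PowerShell parameter names are case-insensitive.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical alias entry: legacy PowerShell-style name -> canonical flag. */
typedef struct ARG_ALIAS_TAG
{
    const char* powershell_name;
    const char* canonical_name;
} ARG_ALIAS;

/* Only the aliases mentioned in the commit message; others are unknown. */
static const ARG_ALIAS arg_aliases[] =
{
    { "-RepoRoot", "--repo-root" },
    { "-Fix", "--fix" }
};

/* Returns the canonical flag for a known alias, or the argument unchanged. */
const char* normalize_arg_sketch(const char* arg)
{
    const char* result = arg;
    size_t i;
    for (i = 0; i < sizeof(arg_aliases) / sizeof(arg_aliases[0]); i++)
    {
        if (strcmp(arg, arg_aliases[i].powershell_name) == 0)
        {
            result = arg_aliases[i].canonical_name;
        }
        else
        {
            /* do nothing */
        }
    }
    return result;
}
```

Normalizing up front keeps the rest of the parser unaware of the legacy spellings, so existing CMake invocations can keep passing the old names.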

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ng, srs_uniqueness

Simple validators migrated from PowerShell to C:
- check_no_tabs: Detects tab characters, replaces with 4 spaces in fix mode
- check_file_endings: Validates CRLF line endings, appends a missing CRLF in fix mode
- check_requirements_naming: Enforces _requirements.md suffix in devdoc
- check_srs_uniqueness: Detects duplicate SRS tags across markdown files
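
As an illustration of the fix mode described for check_no_tabs, the sketch below expands each tab to 4 spaces into a new buffer. `expand_tabs_sketch` is a hypothetical helper written for this example; how the real check rewrites files is not shown in this PR text.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated, NUL-terminated copy of content with every tab
   replaced by 4 spaces, or NULL if allocation fails. The caller frees it. */
char* expand_tabs_sketch(const char* content, size_t length, size_t* out_length)
{
    char* result;
    size_t tab_count = 0;
    size_t i;

    /* First pass: count tabs so the output can be sized exactly. */
    for (i = 0; i < length; i++)
    {
        if (content[i] == '\t')
        {
            tab_count++;
        }
        else
        {
            /* do nothing */
        }
    }

    /* Each tab grows by 3 bytes (1 byte becomes 4); +1 for the terminator. */
    result = malloc(length + (tab_count * 3) + 1);
    if (result == NULL)
    {
        *out_length = 0;
    }
    else
    {
        size_t out = 0;
        for (i = 0; i < length; i++)
        {
            if (content[i] == '\t')
            {
                (void)memcpy(result + out, "    ", 4);
                out += 4;
            }
            else
            {
                result[out] = content[i];
                out++;
            }
        }
        result[out] = '\0';
        *out_length = out;
    }
    return result;
}
```

Counting first avoids repeated reallocation, which matters when the same pass runs over thousands of files.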

Benchmark on ebs repo (6,612 files):
  PowerShell (4 scripts sequential): 25.9s
  C tool (single traversal):          1.5s  (17x faster)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ks, test_spec_tags

Moderate-complexity validators migrated from PowerShell to C:
- check_enable_mocks: Flags deprecated #define/#undef ENABLE_MOCKS,
  replaces with include pattern, respects // force exemption
- check_no_vld_include: Detects #include vld.h, removes standalone
  includes and #ifdef USE_VLD blocks in fix mode
- check_no_backticks_in_srs: Strips markdown backticks from SRS text
- check_test_spec_tags: Validates TEST_FUNCTION() has preceding
  Tests_*_DD_DDD spec tags, supports // no-srs exemption

Uses pre-built line index for O(1) line access (vs O(n) rescan).
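
A line index of this kind can be sketched as an array of byte offsets, built in one pass, so that fetching line N is an array lookup instead of a rescan from the start of the file. The names `LINE_INDEX`, `line_index_build`, and `line_index_get` are hypothetical and the real data structure may differ. In this sketch lines are zero-based and the reported length includes the line terminator.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical index: line_starts[n] is the byte offset where line n begins. */
typedef struct LINE_INDEX_TAG
{
    size_t* line_starts;
    size_t line_count;
} LINE_INDEX;

/* Builds the index in one pass over the content. Returns 0 on success. */
int line_index_build(const char* content, size_t length, LINE_INDEX* index)
{
    int result;
    size_t count = 1; /* even an empty file has one (empty) line */
    size_t i;

    for (i = 0; i < length; i++)
    {
        if (content[i] == '\n')
        {
            count++;
        }
        else
        {
            /* do nothing */
        }
    }

    index->line_starts = malloc(count * sizeof(size_t));
    if (index->line_starts == NULL)
    {
        index->line_count = 0;
        result = 1;
    }
    else
    {
        size_t line = 0;
        index->line_starts[line] = 0;
        line++;
        for (i = 0; i < length; i++)
        {
            if (content[i] == '\n')
            {
                index->line_starts[line] = i + 1;
                line++;
            }
            else
            {
                /* do nothing */
            }
        }
        index->line_count = count;
        result = 0;
    }
    return result;
}

/* O(1) lookup: returns a pointer to the start of the zero-based line and
   stores its length (terminator included), or NULL if out of range. */
const char* line_index_get(const LINE_INDEX* index, const char* content, size_t length, size_t line, size_t* line_length)
{
    const char* result;
    if (line >= index->line_count)
    {
        *line_length = 0;
        result = NULL;
    }
    else
    {
        size_t start = index->line_starts[line];
        size_t end = (line + 1 < index->line_count) ? index->line_starts[line + 1] : length;
        *line_length = end - start;
        result = content + start;
    }
    return result;
}
```

This is presumably most useful for checks that look at neighboring lines, such as walking backward from a TEST_FUNCTION() to its preceding spec tags.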

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
These scripts have been replaced by the compiled repo_validator:
- validate_no_tabs.ps1
- validate_file_endings.ps1
- validate_requirements_naming.ps1
- validate_srs_uniqueness.ps1
- validate_enable_mocks_pattern.ps1
- validate_no_vld_include.ps1
- validate_no_backticks_in_srs.ps1
- validate_test_spec_tags.ps1

Only 2 complex scripts remain (Phase 3 candidates):
- validate_aaa_comments.ps1
- validate_srs_consistency.ps1

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- repo_validation/CMakeLists.txt: Build repo_validator exe when
  run_repo_validation=ON, route 8 migrated checks to C tool,
  remaining 2 checks still use PowerShell
- All 8 test CMakeLists.txt updated to invoke repo_validator
  instead of powershell scripts
- Tests use DEPENDS repo_validator for build ordering

All 8 test suites pass (clean + fix), all 8 detection tests
correctly identify violations.

Benchmark (ebs repo, 6,612 files, 18,356 TEST_FUNCTIONs):
  8 PowerShell scripts (sequential): ~70s
  Compiled C tool (single traversal):  1.8s  (39x faster)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Enforce project coding standards across all C source files:
- Single return statement per function (no early returns)
- Every if has an explicit else (with /* do nothing */ where needed)
- Error path in if, success path in else
- (void) cast on all ignored return values (printf, memcpy, fwrite, etc.)
- Allman brace style throughout
- result variable pattern for function return values
- Consistent if/else error handling chains
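
For readers unfamiliar with these conventions, a function written in this style might look like the sketch below. `validate_requirements_name_sketch` is a hypothetical example loosely modeled on the _requirements.md suffix rule; it is not taken from the PR's sources.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Returns 0 when file_name ends with "_requirements.md", nonzero otherwise. */
int validate_requirements_name_sketch(const char* file_name)
{
    int result; /* result variable pattern: assigned on every path */
    static const char suffix[] = "_requirements.md";
    const size_t suffix_length = sizeof(suffix) - 1;

    if (file_name == NULL)
    {
        /* error path in the if; ignored return value cast to (void) */
        (void)fprintf(stderr, "file_name is NULL\n");
        result = 1;
    }
    else
    {
        size_t name_length = strlen(file_name);
        if ((name_length < suffix_length) ||
            (strcmp(file_name + name_length - suffix_length, suffix) != 0))
        {
            (void)fprintf(stderr, "%s does not end with %s\n", file_name, suffix);
            result = 1;
        }
        else
        {
            /* success path in the else */
            result = 0;
        }
    }

    return result; /* single return statement */
}
```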

No behavioral changes - only style/convention fixes.
All 8 test suites verified passing.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>