Goal: Simulate events and validate pipeline behavior with comprehensive testing infrastructure
Status: Ready to begin (Phase 1 complete, Phase 1.5 fixtures ready)
Timeline: Estimated 3-5 development sessions
Core Pipeline Components
- ✓ TreeCapture - captures UI trees (mock implementation ready for OS APIs)
- ✓ TreeNormalizer - removes transient properties, maps alternative names
- ✓ NodeClassifier - categorizes nodes (interactive, static, container, etc.)
- ✓ NoiseFilters - removes irrelevant elements
- ✓ SignatureGenerator - creates SHA256 fingerprints (structural + content + full)
- ✓ TemplateLoader - loads YAML templates with validation
- ✓ TemplateValidator - validates template structure (screen_id, required_nodes, etc.)
- ✓ Matcher - compares trees to baselines with scoring algorithm
- ✓ DiffEngine - finds differences between trees
- ✓ TransitionChecker - validates state transitions
- ✓ DriftEvent - represents drift occurrences
- ✓ ImmutableLog - append-only logging
- ✓ HashChain - tamper-evident hash chaining
- ✓ EventWriter - writes events to disk
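The normalize → signature stages above compose roughly as follows. This is a minimal sketch, not the project's actual code: the `TRANSIENT_KEYS` set, the node dict shape, and the function names are assumptions.

```python
import hashlib
import json

# Assumed set of transient properties the normalizer strips (hypothetical).
TRANSIENT_KEYS = {"focused", "timestamp", "bounds"}

def normalize(node: dict) -> dict:
    """Drop transient properties and recurse into children."""
    clean = {k: v for k, v in node.items()
             if k not in TRANSIENT_KEYS and k != "children"}
    clean["children"] = [normalize(c) for c in node.get("children", [])]
    return clean

def structural_signature(tree: dict) -> str:
    """Deterministic SHA256 over the canonical JSON form of a normalized tree."""
    canonical = json.dumps(tree, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

tree = {
    "role": "window", "focused": True,
    "children": [{"role": "button", "name": "Send", "timestamp": 123}],
}
print(structural_signature(normalize(tree)))  # same input always yields the same hash
```

The key property is determinism: normalizing the same tree always yields the same fingerprint, which is what makes baseline matching possible.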
Test Infrastructure (Phase 1.5)
- ✓ Mock UI trees: DISCORD_CHAT_TREE, DOORDASH_OFFER_TREE, LOGIN_FORM_TREE, etc.
- ✓ Drift scenarios: MISSING_BUTTON_DRIFT, CONTENT_CHANGE_DRIFT, etc.
- ✓ Template fixtures with real signatures
- ✓ Integration test helpers: run_pipeline(), create_test_log(), etc.
CLI Skeleton
- ✓ cmd_simulate() - stubbed
- ✓ cmd_drift() - stubbed
- ✓ cmd_replay() - stubbed
- ✓ cmd_status() - stubbed
Functional CLI Commands
- Interactive simulation mode (sz simulate <tree.json>)
- Drift event viewer (sz drift --filter=layout)
- Log replay with timeline (sz replay --start=0 --end=100)
- System status dashboard (sz status)
End-to-End Pipeline Testing
- Automated test suite running full pipeline
- Validation of each pipeline stage
- Hash chain integrity verification
- Template matching accuracy tests
Mock Event Generation
- Simulate accessibility events without OS dependencies
- Generate event sequences for various scenarios
- Test state transitions and edge cases
Log Analysis Tools
- Parse and display log entries
- Verify hash chain integrity
- Filter and search drift events
- Export reports
Task 1: cmd_simulate
Priority: HIGH | Estimated Time: 1-2 hours
Objective: Run the full pipeline on mock trees and display results
Implementation Steps:
- Read JSON tree from file or use fixture
- Run through normalization → signature → matching → drift detection
- Display results in terminal with Rich formatting:
- Tree structure preview
- Normalization changes
- Signature (truncated)
- Best template match + score
- Detected drift events
- Log entry created
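One possible shape for the step-3 results display, sketched without Rich so it stands alone; the result keys (`signature`, `template`, `score`, `drift_events`) are hypothetical, not the pipeline's real output schema.

```python
def render_summary(result: dict) -> str:
    """Render a plain-text summary of one simulate run.

    `result` uses assumed keys: signature, template, score, drift_events.
    """
    lines = [
        f"Signature:   {result['signature'][:16]}...",   # truncated, per step 3
        f"Best match:  {result['template']} (score {result['score']:.2f})",
        f"Drift:       {len(result['drift_events'])} event(s)",
    ]
    return "\n".join(lines)

print(render_summary({
    "signature": "a3f1c9" + "0" * 58,
    "template": "discord_chat",
    "score": 0.94,
    "drift_events": [],
}))
```

In the real command this would be replaced by Rich panels/tables, but the same result dict can feed both.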
Files to Modify:
- interface/cli/commands.py - Implement cmd_simulate()
- interface/cli/display.py - Add Rich display utilities
- interface/cli/main.py - Wire simulate command with argparse
Dependencies: Rich library for terminal formatting
Test Cases:
- Simulate with DISCORD_CHAT_TREE → should match discord_chat template
- Simulate with MISSING_BUTTON_DRIFT → should detect layout drift
- Simulate with unknown tree → should return low match scores
Deliverable: python run.py simulate tests/fixtures/mock_trees.json --template=discord_chat
Task 2: cmd_drift
Priority: HIGH | Estimated Time: 1 hour
Objective: Display and filter drift events from immutable log
Implementation Steps:
- Load log file (default: logs/systemzero.log)
- Parse JSON entries
- Filter by drift type if specified
- Display in formatted table with columns:
- Index | Timestamp | Type | Severity | Details
- Support pagination for large logs
- Show summary statistics (counts by type)
Files to Modify:
- interface/cli/commands.py - Implement cmd_drift()
- interface/cli/display.py - Add table formatting
- core/logging/immutable_log.py - Add read_entries() method
Test Cases:
- View all drift events
- Filter by type: sz drift --filter=layout
- Filter by severity: sz drift --severity=critical
- Empty log handling
Deliverable: python run.py drift --filter=layout --severity=warning
Task 3: cmd_replay
Priority: MEDIUM | Estimated Time: 1-2 hours
Objective: Replay logged events with timeline navigation
Implementation Steps:
- Load log entries from file
- Display in chronological order with timeline
- For each entry show:
- Entry number / total
- Timestamp (relative and absolute)
- Event type and data
- Drift events if present
- Hash chain link (previous hash)
- Support range selection (--start, --end)
- Add interactive mode with arrow key navigation (optional for Phase 2)
Files to Modify:
- interface/cli/commands.py - Implement cmd_replay()
- interface/cli/display.py - Add timeline formatting
Test Cases:
- Replay all events: sz replay
- Replay range: sz replay --start=10 --end=20
- Replay single event: sz replay --entry=15
- Verify hash chain during replay
Deliverable: python run.py replay logs/systemzero.log --start=0 --end=50
Task 4: cmd_status
Priority: LOW | Estimated Time: 30 minutes
Objective: Display current system state and configuration
Implementation Steps:
- Show configuration summary:
- Log path and size
- Template count and locations
- Pipeline module status
- Show recent activity:
- Last N events
- Drift summary (counts by type)
- Log integrity status
- Format as dashboard with Rich panels
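A possible starting point for the configuration-summary portion; the default paths and the `*.yaml` glob are assumptions about the project layout.

```python
from pathlib import Path

def config_summary(log_path: str, template_dir: str) -> dict:
    """Collect the dashboard's configuration facts: log path/size, template count."""
    log = Path(log_path)
    templates = Path(template_dir)
    return {
        "log_path": str(log),
        "log_size_bytes": log.stat().st_size if log.exists() else 0,
        "template_count": len(list(templates.glob("*.yaml"))) if templates.exists() else 0,
    }

print(config_summary("logs/systemzero.log", "templates"))
```

A dict like this is easy to hand to a Rich panel later without coupling the data gathering to the display.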
Files to Modify:
- interface/cli/commands.py - Implement cmd_status()
- interface/cli/display.py - Add dashboard layout
Test Cases:
- Status with empty log
- Status with active log
- Status with templates loaded
Deliverable: python run.py status
Task 5: Automated Test Suite
Priority: HIGH | Estimated Time: 2-3 hours
Objective: Comprehensive pytest suite for pipeline validation
Implementation Areas:
- Test TreeNormalizer removes transient properties
- Test property mapping (label→name, title→name)
- Test NodeClassifier categorization
- Test NoiseFilters removes noise nodes
- Test signature determinism (same tree = same signature)
- Test TemplateLoader loads all YAML files
- Test TemplateValidator catches invalid templates
- Test template matching with mock trees
- Test scoring algorithm accuracy
- Test Matcher finds best match
- Test DiffEngine detects differences
- Test TransitionChecker validates transitions
- Test all drift scenarios from fixtures:
- MISSING_BUTTON_DRIFT → layout drift
- CONTENT_CHANGE_DRIFT → content drift
- INVALID_TRANSITION_DRIFT → sequence drift
- Test ImmutableLog append operation
- Test HashChain integrity
- Test hash verification
- Test EventWriter persistence
- Test log reading and parsing
- Test tampering detection
- Test full pipeline: capture → normalize → match → log
- Test with all fixture trees
- Test drift detection end-to-end
- Test log replay accuracy
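A representative test for the property-mapping case (label→name, title→name) might read like this; `map_names` is a stand-in for whatever TreeNormalizer actually exposes, not its real API.

```python
def map_names(node: dict) -> dict:
    """Map alternative name properties onto a canonical 'name' key (sketch)."""
    mapped = dict(node)
    for alias in ("label", "title"):
        if "name" not in mapped and alias in mapped:
            mapped["name"] = mapped.pop(alias)
    mapped["children"] = [map_names(c) for c in node.get("children", [])]
    return mapped

def test_label_maps_to_name():
    node = map_names({"role": "button", "label": "Send"})
    assert node["name"] == "Send" and "label" not in node

def test_existing_name_wins():
    node = map_names({"role": "button", "name": "Send", "title": "Submit"})
    assert node["name"] == "Send"

test_label_maps_to_name()
test_existing_name_wins()
print("name-mapping tests passed")
```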
Files to Modify:
- tests/test_normalization.py - Expand with real tests
- tests/test_baseline.py - Expand with real tests
- tests/test_drift.py - Expand with real tests
- tests/test_logging.py - Expand with real tests
- tests/test_integration.py - Create new file
Test Execution:
pytest tests/ -v # Run all tests
pytest tests/test_drift.py -v # Run specific module
pytest tests/ -k "signature" # Run tests matching keyword
pytest tests/ --cov=core       # With coverage report

Deliverable: 50+ passing tests with >80% coverage
Task 6: Mock Event Generation
Priority: MEDIUM | Estimated Time: 1-2 hours
Objective: Generate realistic accessibility event sequences for testing
Implementation Steps:
- Create event generator in tests/fixtures/event_generator.py:
  - Generate window focus events
- Generate element click events
- Generate text input events
- Generate screen transition events
- Create event sequences for scenarios:
- Normal user flow (login → chat → logout)
- Drift injection (normal → manipulative upsell)
- State transition tests
- Integrate with AccessibilityListener mock mode
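A deterministic EventGenerator sketch covering the first three event types above; the class shape, event fields, and timing model are hypothetical and may differ from the real fixture.

```python
import itertools

class EventGenerator:
    """Produce accessibility-style event dicts with a monotonically increasing seq."""

    def __init__(self, start_ts: float = 0.0, tick: float = 0.5):
        self._seq = itertools.count()   # deterministic sequence numbers
        self._ts = start_ts
        self._tick = tick               # fixed inter-event gap keeps tests reproducible

    def _event(self, event_type: str, **data) -> dict:
        self._ts += self._tick
        return {"seq": next(self._seq), "ts": self._ts, "type": event_type, **data}

    def window_focus(self, window: str) -> dict:
        return self._event("window_focus", target=window)

    def element_click(self, element: str) -> dict:
        return self._event("element_click", target=element)

    def text_input(self, element: str, text: str) -> dict:
        return self._event("text_input", target=element, text=text)

gen = EventGenerator()
sequence = [gen.window_focus("discord"),
            gen.element_click("message_box"),
            gen.text_input("message_box", "hello")]
print([e["type"] for e in sequence])  # ['window_focus', 'element_click', 'text_input']
```

Keeping the generator fully deterministic (no wall clock, no randomness unless seeded) makes the "generate 100 events → all have proper structure" test repeatable.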
Files to Create:
- tests/fixtures/event_generator.py - EventGenerator class
- tests/fixtures/event_sequences.py - Pre-built sequences
Test Cases:
- Generate 100 events → all have proper structure
- Generate with drift injection → drift detected
- Generate invalid transition → TransitionChecker catches
Deliverable: Reusable event generator for all testing scenarios
Task 7: Hash Chain Validation
Priority: HIGH | Estimated Time: 1 hour
Objective: Ensure immutable log tamper-evidence works correctly
Implementation Steps:
- Create log with multiple entries
- Verify each entry links to previous
- Test tampering detection:
- Modify log entry → verify_integrity() fails
- Delete log entry → missing link detected
- Reorder entries → chain breaks
- Test genesis hash (first entry)
- Test hash chain reconstruction
Files to Modify:
- tests/test_logging.py - Add chain validation tests
- core/logging/immutable_log.py - Add verify_integrity() method
Test Cases:
- Valid chain with 100 entries → passes
- Modified entry at position 50 → fails at 51
- Missing entry → gap detected
- Reordered entries → hash mismatch
Deliverable: Bulletproof tamper detection system
Task 8: CLI Integration
Priority: HIGH | Estimated Time: 1 hour
Objective: Wire all CLI commands with proper argument parsing
Implementation Steps:
- Update interface/cli/main.py with argparse:

```python
parser = argparse.ArgumentParser(prog='sz')
subparsers = parser.add_subparsers()

# simulate command
sim_parser = subparsers.add_parser('simulate')
sim_parser.add_argument('tree_path')
sim_parser.add_argument('--template', '-t')

# drift command
drift_parser = subparsers.add_parser('drift')
drift_parser.add_argument('--filter',
                          choices=['layout', 'content', 'sequence', 'manipulative'])
drift_parser.add_argument('--severity',
                          choices=['info', 'warning', 'critical'])

# ... etc
```
- Add command aliases (sz = systemzero)
- Add --help documentation for all commands
- Add --version flag
Files to Modify:
- interface/cli/main.py - Implement full argparse
- run.py - Support direct command invocation
Test Cases:
- python run.py --help → shows all commands
- python run.py simulate --help → shows simulate usage
- Invalid command → helpful error message
Deliverable: Professional CLI with full argument support
Phase 2 is complete when:
- ✅ CLI Commands Work
  - sz simulate <tree> runs full pipeline and displays results
  - sz drift displays and filters drift events from log
  - sz replay replays log entries with timeline
  - sz status shows system dashboard
- ✅ Test Suite Passes
  - 50+ pytest tests passing
  - >80% code coverage on core modules
  - All drift scenarios detected correctly
  - Hash chain validation works
- ✅ Mock Event System Works
  - Can generate realistic event sequences
  - Events properly trigger pipeline
  - Drift injection works as expected
- ✅ Documentation Complete
  - All commands documented with --help
  - Test fixtures documented
  - CHANGELOG updated with Phase 2 changes
- ✅ Validation Complete
  - End-to-end pipeline tested with all fixtures
  - Log integrity verified
  - Template matching accuracy validated
  - No critical bugs remaining
Dependencies:
pip install rich pytest pytest-cov pyyaml
- pytest - test framework
- pytest-cov - coverage reporting
- Rich - terminal formatting
- GitHub Actions - CI/CD (optional for Phase 2)
Low risk:
- CLI command implementation (straightforward)
- Test suite expansion (infrastructure exists)
- Display formatting (Rich library well-documented)
Higher risk:
- Mock event generation (need realistic sequences)
- Hash chain validation edge cases
- Performance with large logs
Mitigations:
- Start with simple event sequences, expand gradually
- Test hash chain with various tampering scenarios
- Add pagination/streaming for large log files
- Immediate: Start with Task 1 (cmd_simulate) - highest value, enables testing
- Then: Task 5 (test suite) - validates implementation quality
- Then: Task 2 & 3 (drift/replay viewers) - completes CLI
- Finally: Tasks 4, 6, 7, 8 - polish and validation
Estimated Total Time: 10-15 hours of development work
Phase 2: Mock Pipeline + Test Harness
[ ] Task 1: cmd_simulate implementation
[ ] Read JSON tree files
[ ] Run full pipeline
[ ] Display results with Rich
[ ] Test with all fixture trees
[ ] Task 2: cmd_drift implementation
[ ] Load and parse log files
[ ] Filter by type and severity
[ ] Display in formatted table
[ ] Show summary statistics
[ ] Task 3: cmd_replay implementation
[ ] Load log entries
[ ] Display with timeline
[ ] Support range selection
[ ] Verify hash chain during replay
[ ] Task 4: cmd_status implementation
[ ] Show configuration summary
[ ] Display recent activity
[ ] Format as dashboard
[ ] Task 5: Automated test suite
[ ] Normalization tests
[ ] Baseline tests
[ ] Drift detection tests
[ ] Logging tests
[ ] Integration tests
[ ] Achieve >80% coverage
[ ] Task 6: Mock event generation
[ ] Create EventGenerator
[ ] Build event sequences
[ ] Integrate with listener
[ ] Task 7: Hash chain validation
[ ] Implement verify_integrity()
[ ] Test tampering detection
[ ] Test chain reconstruction
[ ] Task 8: CLI integration
[ ] Full argparse implementation
[ ] Command aliases
[ ] Help documentation
[ ] Version flag
[ ] Documentation
[ ] Update CHANGELOG
[ ] Document CLI usage
[ ] Document test fixtures
[ ] Update ROADMAP
[ ] Final validation
[ ] All tests passing
[ ] End-to-end pipeline verified
[ ] No critical bugs
[ ] Ready for Phase 3

Phase 2 Goal: Transform System//Zero from scaffolding into a fully testable, validated, and interactive tool for drift detection.