E2E agentic testing pipeline for CC version validation #4

@buildoak

Summary

Tier 2 testing: a semi-automated pipeline that spins up a real wet claude session in tmux, sends a test prompt, validates the full proxy -> statusline -> compression stack, and reports results.

Pipeline

  1. Open tmux session wet-e2e-test
  2. Start wet claude inside it
  3. Inject minimal test prompt via session-ctl
  4. Poll /_wet/status until session_requests > 0 (timeout 60s)
  5. Validate: requests >= 2, input_tokens > 0, context_window > 0, statusline renders
  6. Optional: run wet compress on the single tool result to validate compression
  7. Collect results, tear down session
  8. Report: pass/fail with diagnostic details
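A minimal sketch of the session lifecycle, the poll and the validation (steps 1, 2, 4, 5 and 7), not the actual implementation. The helper names (`start_session`, `teardown_session`, `poll_until`, `validate_status`) and the `WET_STATUS_URL` variable are hypothetical. The wiring in the trailing comment assumes that `/_wet/status` returns JSON and that `jq` is installed, neither of which is confirmed here. Step 3 is left out because the session-ctl interface is not specified in this issue.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of wet.

SESSION="wet-e2e-test"

# Steps 1-2: start a detached tmux session and launch wet claude in it.
start_session() {
  tmux new-session -d -s "$SESSION"
  tmux send-keys -t "$SESSION" "wet claude" Enter
}

# Step 7: tear the session down; ignore the error if it is already gone.
teardown_session() {
  tmux kill-session -t "$SESSION" 2>/dev/null || true
}

# poll_until TIMEOUT_SECS INTERVAL_SECS CMD [ARGS...]
# Re-runs CMD until it exits 0 or TIMEOUT_SECS have elapsed (step 4).
poll_until() {
  local timeout="$1" interval="$2" deadline
  shift 2
  deadline=$(( $(date +%s) + $timeout ))
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$(date +%s)" -ge "$deadline" ]; then
      return 1
    fi
    sleep "$interval"
  done
}

# validate_status REQUESTS INPUT_TOKENS CONTEXT_WINDOW
# Applies the step 5 thresholds and prints one line per failed check.
# Expects integer arguments. The statusline check is not covered here.
validate_status() {
  local requests="$1" input_tokens="$2" context_window="$3" failed=0
  if [ "$requests" -lt 2 ]; then
    echo "FAIL: requests=$requests (expected >= 2)"
    failed=1
  fi
  if [ "$input_tokens" -le 0 ]; then
    echo "FAIL: input_tokens=$input_tokens (expected > 0)"
    failed=1
  fi
  if [ "$context_window" -le 0 ]; then
    echo "FAIL: context_window=$context_window (expected > 0)"
    failed=1
  fi
  return "$failed"
}

# Example wiring (assumptions: JSON response, jq installed, and
# WET_STATUS_URL pointing at the proxy's /_wet/status endpoint):
#   has_requests() {
#     [ "$(curl -fsS "$WET_STATUS_URL" | jq -r '.session_requests // 0')" -gt 0 ]
#   }
#   trap teardown_session EXIT
#   start_session
#   poll_until 60 2 has_requests || echo "FAIL: no requests within 60s"
```

Keeping the thresholds in a function that takes plain integers means the pass/fail logic can be exercised without a live session; only the wiring needs the real endpoint.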

Trigger modes

  • Manual: run wet check --e2e, or the user triggers a run via TG
  • Automatic: fired by the cron watchdog when static checks detect a new CC version
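One way the automatic trigger could decide that a new CC version has appeared is to compare the current version string against the last one seen. This is a sketch under assumptions: `version_changed` and the state file path are hypothetical, and it assumes `claude --version` prints a stable version string.

```shell
#!/bin/sh
# Sketch only: version_changed and the state file are hypothetical.

# version_changed CURRENT_VERSION STATE_FILE
# Returns 0 and records CURRENT_VERSION when it differs from the version
# stored in STATE_FILE, or when nothing has been stored yet.
# Returns 1 when the version is unchanged.
version_changed() {
  local current="$1" state_file="$2" previous=""
  if [ -f "$state_file" ]; then
    previous=$(cat "$state_file")
  fi
  if [ "$current" = "$previous" ]; then
    return 1
  fi
  printf '%s\n' "$current" > "$state_file"
  return 0
}

# Example cron entry point (assumed path and version command):
#   if version_changed "$(claude --version)" "$HOME/.wet/last-cc-version"; then
#     wet check --e2e
#   fi
```

Note that this records the new version before the e2e run starts, so a run that fails to launch is not retried on the next cron tick. Recording the version only after a completed run would be the alternative.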

Cost

~1 minimal API prompt per run (~500 input tokens). Negligible even if run daily.

Semi-manual aspect

Pipeline runs and produces a report. User reviews. No auto-remediation.

Implementation

  • scripts/e2e-test.sh or wet check --e2e subcommand
  • Depends on session-ctl for prompt injection
  • Results are appended to ~/.wet/compat-checks.log, with an optional TG notification
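Appending a result to the log could look like the sketch below. The `log_result` helper and the line format (UTC timestamp, `e2e`, status, details) are assumptions for illustration; the issue does not define a log format.

```shell
#!/bin/sh
# Sketch only: log_result and its line format are hypothetical.

# log_result LOG_FILE STATUS DETAILS
# Appends one timestamped line, creating the parent directory if needed.
log_result() {
  local log_file="$1" status="$2" details="$3"
  mkdir -p "$(dirname "$log_file")"
  printf '%s e2e %s %s\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$status" "$details" >> "$log_file"
}

# Example:
#   log_result "$HOME/.wet/compat-checks.log" PASS "requests=2 input_tokens=512"
```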

Priority

Phase 2: build after static checks (wet check) are solid.
