Add reporting, retry lifecycle, and dry-run functionality #231

Draft: timkpaine wants to merge 1 commit into main from tkp/rep
@timkpaine

Member

Introduce a structured reporting layer that captures evaluation metadata,
timing, topology, retries, and failures without consuming result payloads,
mirroring the existing evaluator/policy architecture.

- ReportingPolicy shared core with span/run ContextVars for nested,
  thread/async-local isolation
- ReportEvent model plus NoOp/InMemory/Logging/Composite/UI reporters and
  a bounded UI polling buffer
- Tracing/metrics/alerts policies and OpenTelemetry tracing/metrics
  integration (exposed via otel/full/develop/test extras)
- Structural reporting models and a <Vendor><Signal>Reporting{Evaluator,Model}
  taxonomy with placeholder vendor classes
- Refactor LoggingEvaluator onto LoggingPolicy to share formatting and
  enable LoggingModel while preserving the existing import path and log output
- Retry lifecycle events now carry run_id and child depth via
  current_span_depth(); reporter failures are isolated on reporting/retry paths
- DryRunEvaluator with context-local planning guard; synthetic mode is
  non-transparent so results are not cached under real-run keys; node_key
  strips the dry-run evaluator layer while preserving non-evaluator options
  so it matches cache_key() for the logical node
- ReportingStateStore preserves terminal outcomes while allowing retry streams
  to progress
- Docs: reporting workflow, reporter options, OpenTelemetry install, reserved
  run/graph phases, extra payload keys, and dry-run synthetic-result warnings
- Tests across utils/evaluators/models covering success/error flows, dry-run
  override recursion, cache composition, concurrent dry-run reuse, node-key
  semantics, retry event nesting, reporter failure isolation, and state folding
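
The span/run ContextVar and reporter-isolation bullets above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: only the name `current_span_depth()` comes from the description, while `Event`, `emit`, `span`, and the two ContextVars are hypothetical stand-ins that assume depth is tracked as a context-local counter.

```python
import contextvars
import uuid
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Callable, Iterator, List, Optional

# Hypothetical context-local state; the PR's actual variable names may differ.
_run_id: contextvars.ContextVar[Optional[str]] = contextvars.ContextVar("run_id", default=None)
_span_depth: contextvars.ContextVar[int] = contextvars.ContextVar("span_depth", default=0)


def current_span_depth() -> int:
    """Depth of the innermost open span in this thread/task (0 means no span)."""
    return _span_depth.get()


@dataclass(frozen=True)
class Event:
    """Illustrative event: metadata only, never the result payload."""
    name: str
    run_id: Optional[str]
    depth: int


Reporter = Callable[[Event], None]


def emit(reporters: List[Reporter], name: str, child: bool = False) -> None:
    """Send an event to every reporter; a failing reporter cannot break the run."""
    depth = current_span_depth() + (1 if child else 0)
    event = Event(name=name, run_id=_run_id.get(), depth=depth)
    for reporter in reporters:
        try:
            reporter(event)
        except Exception:
            # Isolate reporter failures so evaluation and retries can proceed.
            pass


@contextmanager
def span(reporters: List[Reporter], name: str) -> Iterator[None]:
    """Open a nested span; the tokens restore the previous state on exit."""
    run_token = _run_id.set(_run_id.get() or uuid.uuid4().hex)
    depth_token = _span_depth.set(_span_depth.get() + 1)
    try:
        emit(reporters, f"{name}:start")
        yield
        emit(reporters, f"{name}:end")
    finally:
        _span_depth.reset(depth_token)
        _run_id.reset(run_token)


seen: List[Event] = []


def broken(event: Event) -> None:
    raise RuntimeError("reporter down")


reporters: List[Reporter] = [broken, seen.append]
with span(reporters, "outer"):
    with span(reporters, "inner"):
        # A retry event is reported as a child of the current span.
        emit(reporters, "retry", child=True)
```

In this sketch the `broken` reporter raises on every event, yet `seen` still receives all five events with a shared `run_id`, and the depth returns to 0 once the outer span exits. Because ContextVars are copied per thread and per asyncio task, concurrent evaluations would not see each other's depth or run id.
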
@github-actions

github-actions Bot commented Jun 8, 2026

Contributor

Test Results

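1 file ±0, 1 suite ±0, duration 2m 38s ⏱️ (-1s)

| | Total | Passed ✅ | Skipped 💤 | Failed ❌ |
|---|---|---|---|---|
| Tests | 962 (+83) | 960 (+83) | 2 (±0) | 0 (±0) |
| Runs | 968 (+83) | 966 (+83) | 2 (±0) | 0 (±0) |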

Results for commit fe2c974. ± Comparison against base commit 0662c2a.

♻️ This comment has been updated with latest results.

@codecov

codecov Bot commented Jun 8, 2026


Codecov Report

❌ Patch coverage is 96.73540% with 38 lines in your changes missing coverage. Please review.
✅ Project coverage is 94.40%. Comparing base (0662c2a) to head (fe2c974).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ccflow/utils/reporting.py | 93.57% | 17 Missing and 8 partials ⚠️ |
| ccflow/evaluators/reporting.py | 91.46% | 5 Missing and 2 partials ⚠️ |
| ccflow/tests/evaluators/test_reporting.py | 98.07% | 4 Missing ⚠️ |
| ccflow/tests/models/test_reporting.py | 97.01% | 2 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #231      +/-   ##
==========================================
+ Coverage   94.19%   94.40%   +0.21%     
==========================================
  Files         150      156       +6     
  Lines       12094    13196    +1102     
  Branches      665      706      +41     
==========================================
+ Hits        11392    12458    +1066     
- Misses        570      596      +26     
- Partials      132      142      +10     

