feat: add eval progress tracking by joshuatowner · Pull Request #5858 · aws/sagemaker-python-sdk

joshuatowner · 2026-05-14T21:07:16Z

Add eval progress tracking

Replaces the Rich Live spinner in the terminal wait() path with plain-text status prints that stream correctly to agents and piped processes.

Problem

The eval wait() terminal path used a Rich Live spinner that:

Outputs ANSI escape sequences agents can't parse
Uses transient=True so output disappears after completion
Only shows "Current status: Executing" with no step details

Changes

Remove Rich Live spinner from terminal wait() path
Print pipeline name and execution ARN at start
Print step transitions with ✓/⋯, durations, and job ARNs on each poll
Print "Running... Xs" elapsed timer for in-progress steps
Print S3 output path on success
Print failed step details with log group and CloudWatch link on failure
Use flush=True on every print for agent compatibility
Derive the CloudWatch log group from the job ARN type (training/processing/transform)
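
A minimal sketch of how the log group could be derived from a job ARN, for illustration only. The function name `log_group_for_job_arn` and the `_LOG_GROUPS` table are hypothetical and are not taken from this PR's diff; the log group names are the standard SageMaker ones, and the ARN resource types (`training-job`, `processing-job`, `transform-job`) are assumed to map one-to-one onto them.

```python
# Hypothetical lookup table: ARN resource type -> CloudWatch log group.
_LOG_GROUPS = {
    "training-job": "/aws/sagemaker/TrainingJobs",
    "processing-job": "/aws/sagemaker/ProcessingJobs",
    "transform-job": "/aws/sagemaker/TransformJobs",
}


def log_group_for_job_arn(job_arn):
    """Return the CloudWatch log group for a SageMaker job ARN, or None."""
    # ARN layout: arn:partition:service:region:account:resource-type/resource-name
    parts = job_arn.split(":", 5)
    if len(parts) != 6 or parts[2] != "sagemaker":
        return None
    resource_type = parts[5].split("/", 1)[0]
    # Unknown resource types (for example a pipeline ARN) yield None.
    return _LOG_GROUPS.get(resource_type)
```

Returning None for unrecognized ARNs lets the caller skip the log group line rather than print a wrong one.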

Jupyter path is unchanged.

Example output

Evaluation started: eval-meta-5190c33c
Pipeline: SagemakerEvaluation-LLMAJEvaluation-565d3249
Execution ARN: arn:aws:sagemaker:us-west-2:099324990371:pipeline/.../execution/rstzamudixmz

--------------------------------------

Status Transitions:
  ✓ CreateEvaluationAction: Succeeded (3.0s)
  ⋯ EvaluateCustomInferenceModel: Executing (Running... 17s)
    Job ARN: arn:aws:sagemaker:us-west-2:099324990371:training-job/CustomInference-rstzamudixmz-6ZP4YbszZc

Status: Executing (Elapsed: 21.5s)
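
The step lines above could be produced by something like the following sketch. The helper names `format_step_line` and `print_step` are hypothetical, not the PR's actual functions. It assumes that only the `Executing` status shows the running timer and that every status other than `Succeeded` uses the ⋯ marker, since the description only mentions ✓ and ⋯.

```python
def format_step_line(name, status, seconds):
    """Format one step line in the style of the example output."""
    marker = "✓" if status == "Succeeded" else "⋯"
    if status == "Executing":
        # In-progress steps show a whole-second running timer.
        detail = f"Running... {int(seconds)}s"
    else:
        # Finished steps show their duration with one decimal place.
        detail = f"{seconds:.1f}s"
    return f"  {marker} {name}: {status} ({detail})"


def print_step(name, status, seconds, job_arn=None):
    """Print a step line, plus its job ARN when one is known."""
    # flush=True so piped processes and agents see each line immediately.
    print(format_step_line(name, status, seconds), flush=True)
    if job_arn:
        print(f"    Job ARN: {job_arn}", flush=True)
```

Keeping the formatting in a pure function separate from the printing makes the output easy to unit test without capturing stdout.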

On failure:

Failed
Failure reason: Step failure: One or multiple steps failed.

Failed step: EvaluateCustomInferenceModel
Failure reason: ClientError: No S3 objects found under S3 URL...
Job ARN: arn:aws:sagemaker:us-west-2:099324990371:training-job/CustomInference-xyz
Log group: /aws/sagemaker/TrainingJobs
Log stream prefix: CustomInference-xyz
CloudWatch Logs: https://us-west-2.console.aws.amazon.com/cloudwatch/...
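
As a rough sketch, the failed-step block could be assembled as below. The function name `failure_lines` is hypothetical, and the log stream prefix is assumed to be the job name, which is the part of the job ARN after the final `/` (this matches the example above). The CloudWatch link is left out because its URL is truncated in the example.

```python
def failure_lines(step_name, reason, job_arn, log_group):
    """Build the failed-step detail lines shown after a failure."""
    # Assumption: the log stream prefix equals the job name from the ARN.
    job_name = job_arn.rsplit("/", 1)[-1]
    return [
        f"Failed step: {step_name}",
        f"Failure reason: {reason}",
        f"Job ARN: {job_arn}",
        f"Log group: {log_group}",
        f"Log stream prefix: {job_name}",
    ]


def print_failure(step_name, reason, job_arn, log_group):
    """Print the failure details, flushing each line for piped consumers."""
    for line in failure_lines(step_name, reason, job_arn, log_group):
        print(line, flush=True)
```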

Testing

382 existing unit tests pass (0 regressions)
4 new tests covering start, step transitions, success, and failure output
Manually validated with real LLM-as-Judge eval jobs

joshuatowner had a problem deploying to manual-approval May 14, 2026 21:07 — with GitHub Actions Error

joshuatowner changed the title ~~feat: replace eval Rich spinner with plain print observability (match…~~ feat: replace eval Rich spinner with rich terminal output matching training May 14, 2026

joshuatowner force-pushed the eval-observability branch from 42a3f42 to 4d20106 Compare May 15, 2026 06:40

joshuatowner had a problem deploying to manual-approval May 15, 2026 06:41 — with GitHub Actions Error

feat: replace eval Rich spinner with plain print observability (matching training format)

436bdfe

joshuatowner force-pushed the eval-observability branch from 4d20106 to 436bdfe Compare May 15, 2026 06:53

joshuatowner had a problem deploying to manual-approval May 15, 2026 06:53 — with GitHub Actions Error

joshuatowner changed the title ~~feat: replace eval Rich spinner with rich terminal output matching training~~ feat: add eval progress tracking May 15, 2026

Merge branch 'master' into eval-observability

d701a67

joshuatowner had a problem deploying to manual-approval May 18, 2026 07:39 — with GitHub Actions Error

joshuatowner marked this pull request as ready for review May 18, 2026 07:39

joshuatowner had a problem deploying to manual-approval May 18, 2026 07:39 — with GitHub Actions Error

Merge branch 'master' into eval-observability

d554b48

joshuatowner had a problem deploying to manual-approval May 18, 2026 23:47 — with GitHub Actions Failure

joshuatowner temporarily deployed to manual-approval May 18, 2026 23:48 — with GitHub Actions Inactive

jam-jee approved these changes May 19, 2026

jam-jee merged commit 9c9c50d into aws:master May 19, 2026
19 of 31 checks passed

jam-jee merged 3 commits into aws:master from joshuatowner:eval-observability
