2 changes: 1 addition & 1 deletion README.ko.md
@@ -75,7 +75,7 @@ Runtime optionally reads the Forge `agent_manifest.json`, and the existing Lab-comp

This feature is the first Runtime-side contract toward the reliable edge agent runtime direction. It records `agent_id`, `task_id`, `agent_type`, priority, latency budget, queue wait, fallback usage, and telemetry context, but it does not change the existing top-level compare/report fields of `result.json`.

Runtime result JSON also records `runtime_health_snapshot`, `runtime_error_classification`, `runtime_events`, and `runtime_operation_summary` as additive evidence. The health snapshot now records backend availability, latency budget/deadline observation, and tegrastats evidence availability together with `health_reason`, and runtime events are recorded as a lifecycle trace with a sequential `event_index`. `runtime_operation_summary` is a compact index for Lab/Orchestrator/AIGuard handoff that records `risk_labels`, `evidence_gaps`, retryability, and a conservative `recommended_action` while keeping `decision_owner: lab`, `scheduler_owner: orchestrator`, and `production_cancellation: false`. `runtime_telemetry.coverage` records expected / observed / missing telemetry fields and states `comparability_owner: edgeenv` and `missing_telemetry_is_failure: false`. `--timeout-ms` is an option that records the latency timeout observation threshold; it does not imply production request cancellation. When execution ends as `skipped`, Runtime records `runtime_execution_skipped`, `retryable: true`, and `retry_hint: check_backend_availability` so that Lab/Orchestrator can interpret it as failure-handling evidence.
Runtime result JSON also records `runtime_health_snapshot`, `runtime_error_classification`, `runtime_events`, and `runtime_operation_summary` as additive evidence. The health snapshot now records backend availability, latency budget/deadline observation, and tegrastats evidence availability together with `health_reason`, and runtime events are recorded as a lifecycle trace with a sequential `event_index`. `runtime_operation_summary` is a compact index for Lab/Orchestrator/AIGuard handoff that records `risk_labels`, `evidence_gaps`, retryability, and a conservative `recommended_action` while keeping `decision_owner: lab`, `scheduler_owner: orchestrator`, and `production_cancellation: false`. `runtime_telemetry.coverage` records expected / observed / missing telemetry fields and states `comparability_owner: edgeenv` and `missing_telemetry_is_failure: false`. `runtime_telemetry.history_seed` keeps `registry_owner: edgeenv`, `decision_owner: lab`, and `production_monitoring: false`, and provides a single-result replay point that can be handed off to EdgeEnv telemetry history accumulation. `--timeout-ms` is an option that records the latency timeout observation threshold; it does not imply production request cancellation. When execution ends as `skipped`, Runtime records `runtime_execution_skipped`, `retryable: true`, and `retry_hint: check_backend_availability` so that Lab/Orchestrator can interpret it as failure-handling evidence.

Example:

3 changes: 2 additions & 1 deletion README.md
@@ -488,7 +488,7 @@ This is the first bridge toward the reliable edge agent runtime direction. It re
Runtime result JSON also includes additive operation evidence blocks:

- `runtime_health_snapshot`: execution health, backend/device context, backend availability, run count, latency/FPS summary, latency-budget/deadline observation, tegrastats evidence availability, and explicit timeout observation status. `--timeout-ms` records an observation threshold; it does not claim production request cancellation.
- `runtime_telemetry`: single-result telemetry seed for Runtime Intelligence history/replay. It records timestamp, execution sequence id, latency rolling seed values, power mode, tegrastats-derived resource evidence when available, operation flags, and explicit `missing_fields` for telemetry that the current device/run did not provide. This is local-first evidence, not a production monitoring stream.
- `runtime_telemetry`: single-result telemetry seed for Runtime Intelligence history/replay. It records timestamp, execution sequence id, latency rolling seed values, power mode, tegrastats-derived resource evidence when available, operation flags, and explicit `missing_fields` for telemetry that the current device/run did not provide. The additive `history_seed` block packages the same single-result evidence as a one-point replay seed for EdgeEnv telemetry history accumulation. This is local-first evidence, not a production monitoring stream.
- `runtime_error_classification`: structured success/error category, severity, retryability, retry hint, observed mean latency, and timeout budget for downstream report context. Skipped execution is recorded as `runtime_execution_skipped` with `retry_hint: check_backend_availability` so Lab/Orchestrator can explain runtime failure handling without treating Runtime as a worker daemon.
- `runtime_events`: compact indexed lifecycle event log for configuration, benchmark completion, error classification, optional agent context, telemetry recording, operation summary, and tegrastats parsing.
- `runtime_operation_summary`: compact handoff index for Lab/Orchestrator/AIGuard with `health_reason`, `risk_labels`, `evidence_gaps`, retryability, and a conservative `recommended_action`. It keeps `decision_owner: lab`, `scheduler_owner: orchestrator`, and `production_cancellation: false`.
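A downstream consumer may want to confirm the ownership boundary before trusting the summary. The following is a minimal sketch, assuming only the three boundary fields named in the last bullet; `check_operation_summary` is a hypothetical helper and is not part of Runtime.

```python
# Boundary values as stated in the README bullet above.
# The helper itself is hypothetical and only illustrates a consumer-side guard.
EXPECTED_BOUNDARY = {
    "decision_owner": "lab",
    "scheduler_owner": "orchestrator",
    "production_cancellation": False,
}


def check_operation_summary(result: dict) -> list:
    """Return human-readable boundary violations found in a Runtime result dict."""
    summary = result.get("runtime_operation_summary")
    if summary is None:
        return ["runtime_operation_summary is missing"]
    problems = []
    for field, expected in EXPECTED_BOUNDARY.items():
        actual = summary.get(field)
        # Compare the type as well, so that 0 or "false" does not pass as False.
        if type(actual) is not type(expected) or actual != expected:
            problems.append(f"{field}: expected {expected!r}, got {actual!r}")
    return problems
```

An empty list means the summary respects the stated boundary; anything else can be surfaced in a report without Runtime making the deployment decision.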
@@ -501,6 +501,7 @@ Runtime Intelligence boundary:
- `collection_mode` starts as `single_result_export`; EdgeEnv owns telemetry history accumulation and comparability-first regression.
- Missing device telemetry remains explicit in `missing_fields` instead of being fabricated.
- `runtime_telemetry.coverage` records expected / observed / missing telemetry fields, with `comparability_owner: edgeenv` and `missing_telemetry_is_failure: false`.
- `runtime_telemetry.history_seed` uses `inferedge-runtime-telemetry-history-seed-v1`, keeps `registry_owner: edgeenv`, `decision_owner: lab`, `production_monitoring: false`, and exposes a single replay point that EdgeEnv can later accumulate into a local telemetry history.
- Runtime exports telemetry evidence only. AIGuard may turn it into deterministic anomaly evidence, and Lab remains the deployment decision owner.
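To illustrate how a single replay point could later be accumulated into a local history, here is a rough sketch of a consumer-side accumulator. It is not EdgeEnv's implementation: `registry_key` and `accumulate` are hypothetical names, and the in-memory dict stands in for whatever store EdgeEnv actually uses. Only the `history_seed` field names come from this change.

```python
def registry_key(seed: dict) -> tuple:
    """Build a grouping key from the fields the seed recommends, when present."""
    source = seed["source_result"]
    fields = seed["recommended_registry_key_fields"]
    # `run_config` is recommended as a key field but is not carried inside
    # `source_result`, so this sketch only uses the fields that are present.
    return tuple((name, source[name]) for name in fields if name in source)


def accumulate(history: dict, seed: dict) -> None:
    """Append the seed's replay points to the history bucket for its key."""
    if not seed.get("replay_ready", False):
        return
    history.setdefault(registry_key(seed), []).extend(seed["points"])
```

Calling `accumulate` once per exported result groups points by compare key, backend key, device, precision, and power mode, which keeps history accumulation on the consumer side rather than turning Runtime into a telemetry store.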

The committed fixture
1 change: 1 addition & 0 deletions docs/agent_runtime_result_contract.md
@@ -257,6 +257,7 @@ When provided, Runtime appends:
- `runtime_operation_summary` is an additive handoff index for Lab/Orchestrator/AIGuard. It repeats the health reason, retryability, risk labels, evidence gaps, and a conservative `recommended_action` without making the deployment decision itself.
- `runtime_operation_summary.decision_owner` must remain `lab`, and `scheduler_owner` must remain `orchestrator`.
- `runtime_operation_summary.production_cancellation` is always `false`; Runtime records observations only.
- `runtime_telemetry.history_seed` is an additive `inferedge-runtime-telemetry-history-seed-v1` block for EdgeEnv telemetry history/replay. It keeps `registry_owner: edgeenv`, `decision_owner: lab`, `production_monitoring: false`, and one single-result telemetry point so downstream tools can accumulate history without Runtime becoming a telemetry store.
- Runtime does not claim production request cancellation. `--timeout-ms` is an observation threshold: if a successful benchmark mean latency exceeds the configured threshold, Runtime records `timeout_observed: true`, `runtime_error_classification.category: runtime_timeout_observed`, and `retryable: true` for downstream reliability reporting.
- If execution is skipped because Runtime cannot complete the configured benchmark, Runtime records `runtime_error_classification.category: runtime_execution_skipped`, `severity: warning`, `retryable: true`, and `retry_hint: check_backend_availability`. This is failure-handling evidence for Lab/Orchestrator reporting, not a production worker retry loop.
- Without `--timeout-ms`, results record `timeout_policy: not_configured`, `timeout_budget_ms: null`, and `timeout_observed: false`.
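The timeout and skip rules in the last three bullets can be summarised as a small decision function. This is a sketch under stated assumptions, not Runtime's code: `classify_timeout` is a hypothetical name, the `"skipped"` status string is taken from the README wording, any other status is treated as a completed benchmark, and the output is flattened into one dict even though the real result spreads these fields across several blocks. The contract does not state the category or `timeout_policy` value for a configured threshold that is not exceeded, so the sketch leaves them out.

```python
from typing import Optional


def classify_timeout(status: str, mean_ms: Optional[float], timeout_ms: Optional[float]) -> dict:
    """Apply the observation-only timeout rules to one benchmark outcome."""
    if status == "skipped":
        # Failure-handling evidence for reporting, not a retry loop.
        return {
            "category": "runtime_execution_skipped",
            "severity": "warning",
            "retryable": True,
            "retry_hint": "check_backend_availability",
        }
    out = {"timeout_budget_ms": timeout_ms, "timeout_observed": False}
    if timeout_ms is None:
        # No --timeout-ms was given.
        out["timeout_policy"] = "not_configured"
        return out
    if mean_ms is not None and mean_ms > timeout_ms:
        # The threshold is only observed; nothing is cancelled.
        out.update(
            timeout_observed=True,
            category="runtime_timeout_observed",
            retryable=True,
        )
    return out
```

Note that the comparison uses the mean latency of a benchmark that already finished, which is why the contract describes `--timeout-ms` as an observation threshold rather than request cancellation.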
22 changes: 22 additions & 0 deletions scripts/smoke_default.sh
@@ -106,6 +106,22 @@ assert "queue_depth" in coverage["expected_fields"], coverage
assert "queue_depth" in coverage["missing_fields"], coverage
assert "telemetry_timestamp" in coverage["observed_fields"], coverage
assert coverage["missing_fields"] == telemetry["missing_fields"], coverage
history_seed = telemetry["history_seed"]
assert history_seed["schema_version"] == "inferedge-runtime-telemetry-history-seed-v1", history_seed
assert history_seed["registry_owner"] == "edgeenv", history_seed
assert history_seed["decision_owner"] == "lab", history_seed
assert history_seed["source_telemetry_schema_version"] == telemetry["schema_version"], history_seed
assert history_seed["production_monitoring"] is False, history_seed
assert history_seed["missing_telemetry_is_failure"] is False, history_seed
assert history_seed["replay_ready"] is True, history_seed
assert "compare_key" in history_seed["recommended_registry_key_fields"], history_seed
assert "latency.mean_ms" in history_seed["time_series_fields"], history_seed
assert history_seed["source_result"]["compare_key"] == data["compare_key"], history_seed
assert history_seed["source_result"]["backend_key"] == data["backend_key"], history_seed
assert history_seed["points"][0]["telemetry_timestamp"] == telemetry["telemetry_timestamp"], history_seed
assert history_seed["points"][0]["execution_sequence_id"] == telemetry["execution_sequence_id"], history_seed
assert history_seed["points"][0]["mean_ms"] == telemetry["latency"]["mean_ms"], history_seed
assert history_seed["points"][0]["timeout_observed"] == telemetry["operation"]["timeout_observed"], history_seed
assert events["runtime_telemetry_recorded"]["observed_field_count"] == coverage["observed_field_count"]
assert events["runtime_telemetry_recorded"]["missing_field_count"] == coverage["missing_field_count"]
assert events["runtime_telemetry_recorded"]["schema"] == "inferedge-runtime-telemetry-v1"
@@ -178,6 +194,12 @@ coverage = telemetry["coverage"]
assert coverage["schema_version"] == "inferedge-runtime-telemetry-coverage-v1", coverage
assert coverage["comparability_owner"] == "edgeenv", coverage
assert coverage["missing_fields"] == telemetry["missing_fields"], coverage
history_seed = telemetry["history_seed"]
assert history_seed["registry_owner"] == "edgeenv", history_seed
assert history_seed["decision_owner"] == "lab", history_seed
assert history_seed["source_result"]["compare_key"] == data["compare_key"], history_seed
assert history_seed["points"][0]["p99_ms"] == telemetry["latency"]["p99_ms"], history_seed
assert history_seed["points"][0]["deadline_missed"] == telemetry["operation"]["deadline_missed"], history_seed
assert "runtime_telemetry_recorded" in events, events
assert data["extra"]["agent_manifest_recorded"] is True
PY
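Both smoke checks above repeat the same family of assertions, so one possible consolidation is a table-driven helper. The sketch below is hypothetical (`POINT_MIRRORS`, `lookup`, and `history_seed_problems` do not exist in the repository) and assumes `data` is the parsed result JSON and `telemetry` is its telemetry block, as in the script.

```python
# Paths into the telemetry block that each field of points[0] is expected to
# mirror, taken from the assertions in the smoke script.
POINT_MIRRORS = {
    "telemetry_timestamp": ("telemetry_timestamp",),
    "execution_sequence_id": ("execution_sequence_id",),
    "mean_ms": ("latency", "mean_ms"),
    "p99_ms": ("latency", "p99_ms"),
    "timeout_observed": ("operation", "timeout_observed"),
    "deadline_missed": ("operation", "deadline_missed"),
}


def lookup(obj, path):
    """Follow a tuple of keys into nested dicts."""
    for key in path:
        obj = obj[key]
    return obj


def history_seed_problems(data: dict, telemetry: dict) -> list:
    """Collect every mismatch instead of stopping at the first failed assert."""
    seed = telemetry["history_seed"]
    problems = []
    if seed["registry_owner"] != "edgeenv":
        problems.append("registry_owner")
    if seed["decision_owner"] != "lab":
        problems.append("decision_owner")
    if seed["production_monitoring"] is not False:
        problems.append("production_monitoring")
    for name in ("compare_key", "backend_key"):
        if seed["source_result"][name] != data[name]:
            problems.append(f"source_result.{name}")
    point = seed["points"][0]
    for field, path in POINT_MIRRORS.items():
        if point[field] != lookup(telemetry, path):
            problems.append(f"points[0].{field}")
    return problems
```

Returning a list rather than asserting makes it easier to report all contract drift in one run; the script could still end with `assert not problems, problems`.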
127 changes: 127 additions & 0 deletions src/result_writer.cpp
@@ -560,6 +560,122 @@ void write_runtime_telemetry_coverage_json(
        << indent << "}";
}

// Writes the additive `history_seed` block: a one-point replay seed built from a single result.
void write_runtime_telemetry_history_seed_json(
    std::ostream& output,
    const RuntimeConfig& config,
    const EngineMetadata& engine_metadata,
    const BenchmarkResult& benchmark_result,
    const TegrastatsSummary& tegrastats_summary,
    const std::string& timestamp,
    int indent_spaces) {
    const std::string indent(static_cast<std::size_t>(indent_spaces), ' ');
    // Resource values are only written when tegrastats output was parsed; otherwise they are null.
    const bool has_tegrastats = tegrastats_summary.status == "parsed";
    const std::string precision = config.manifest_precision.empty() ? "fp32" : config.manifest_precision;
    output
        << "{\n"
        << indent << " \"schema_version\": \"inferedge-runtime-telemetry-history-seed-v1\",\n"
        << indent << " \"evidence_role\": \"runtime_telemetry_history_seed\",\n"
        << indent << " \"registry_owner\": \"edgeenv\",\n"
        << indent << " \"decision_owner\": \"lab\",\n"
        << indent << " \"source_result_schema_version\": \"inferedge-runtime-result-v1\",\n"
        << indent << " \"source_telemetry_schema_version\": \"inferedge-runtime-telemetry-v1\",\n"
        << indent << " \"replay_scope\": \"single_result_to_history\",\n"
        << indent << " \"replay_ready\": true,\n"
        << indent << " \"production_monitoring\": false,\n"
        << indent << " \"missing_telemetry_is_failure\": false,\n"
        << indent << " \"source_result\": {\n"
        << indent << " \"compare_key\": " << json_string(make_compare_key(config)) << ",\n"
        << indent << " \"backend_key\": " << json_string(make_backend_key(engine_metadata, config)) << ",\n"
        << indent << " \"engine_backend\": " << json_string(engine_metadata.backend) << ",\n"
        << indent << " \"device\": " << json_string(config.device) << ",\n"
        << indent << " \"precision\": " << json_string(precision) << ",\n"
        << indent << " \"power_mode\": " << json_string(config.power_mode) << "\n"
        << indent << " },\n"
        << indent << " \"recommended_registry_key_fields\": ";
    write_string_array_json(output, {
        "compare_key",
        "backend_key",
        "device",
        "precision",
        "power_mode",
        "run_config",
    });
    output
        << ",\n"
        << indent << " \"time_series_fields\": ";
    write_string_array_json(output, {
        "telemetry_timestamp",
        "execution_sequence_id",
        "latency.mean_ms",
        "latency.p95_ms",
        "latency.p99_ms",
        "latency.fps",
        "latency.inference_interval_ms",
        "latency.rolling_latency_mean_ms",
        "latency.rolling_latency_std_ms",
        "resource.ram_used_mb",
        "resource.max_temperature_c",
        "resource.vdd_in_mw_avg",
        "operation.queue_depth",
        "operation.runtime_uptime_sec",
        "operation.timeout_observed",
        "operation.latency_budget_exceeded",
        "operation.deadline_missed",
    });
    // A single replay point; the interval and rolling values are seeded from this one result.
    output
        << ",\n"
        << indent << " \"points\": [\n"
        << indent << " {\n"
        << indent << " \"execution_sequence_id\": 0,\n"
        << indent << " \"telemetry_timestamp\": " << json_string(timestamp) << ",\n"
        << indent << " \"mean_ms\": " << benchmark_result.mean_ms << ",\n"
        << indent << " \"p95_ms\": " << benchmark_result.p95_ms << ",\n"
        << indent << " \"p99_ms\": " << benchmark_result.p99_ms << ",\n"
        << indent << " \"fps\": " << benchmark_result.fps << ",\n"
        << indent << " \"inference_interval_ms\": " << benchmark_result.mean_ms << ",\n"
        << indent << " \"rolling_latency_mean_ms\": " << benchmark_result.mean_ms << ",\n"
        << indent << " \"rolling_latency_std_ms\": " << benchmark_result.std_ms << ",\n"
        << indent << " \"ram_used_mb\": ";
    if (has_tegrastats) {
        output << tegrastats_summary.ram_used_mb_max;
    } else {
        output << "null";
    }
    output
        << ",\n"
        << indent << " \"max_temperature_c\": ";
    if (has_tegrastats) {
        output << tegrastats_summary.max_temp_c;
    } else {
        output << "null";
    }
    output
        << ",\n"
        << indent << " \"vdd_in_mw_avg\": ";
    if (has_tegrastats) {
        output << tegrastats_summary.vdd_in_mw_avg;
    } else {
        output << "null";
    }
    // queue_depth and runtime_uptime_sec are always written as null by this writer.
    output
        << ",\n"
        << indent << " \"queue_depth\": null,\n"
        << indent << " \"runtime_uptime_sec\": null,\n"
        << indent << " \"timeout_observed\": "
        << (timeout_observed(config, benchmark_result) ? "true" : "false") << ",\n"
        << indent << " \"latency_budget_exceeded\": "
        << (latency_budget_exceeded(config, benchmark_result) ? "true" : "false") << ",\n"
        << indent << " \"deadline_missed\": "
        << (should_mark_deadline_missed(config, benchmark_result) ? "true" : "false") << ",\n"
        << indent << " \"power_mode\": " << json_string(config.power_mode) << ",\n"
        << indent << " \"telemetry_source\": "
        << json_string(has_tegrastats ? "tegrastats" : "not_available") << ",\n"
        << indent << " \"tegrastats_status\": " << json_string(tegrastats_summary.status) << "\n"
        << indent << " }\n"
        << indent << " ]\n"
        << indent << "}";
}

std::string runtime_operation_recommended_action(
    const RuntimeConfig& config,
    const EngineMetadata& engine_metadata,
@@ -788,6 +904,17 @@ void write_runtime_telemetry_json(
        << ",\n"
        << indent << " \"coverage\": ";
    write_runtime_telemetry_coverage_json(output, tegrastats_summary, indent_spaces + 2);
    output
        << ",\n"
        << indent << " \"history_seed\": ";
    write_runtime_telemetry_history_seed_json(
        output,
        config,
        engine_metadata,
        benchmark_result,
        tegrastats_summary,
        timestamp,
        indent_spaces + 2);
    output
        << ",\n"
        << indent << " \"production_monitoring\": false\n"
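For readers who find the streamed C++ hard to scan, the null handling of the replay point can be mirrored in a few lines of Python. This is only an illustrative sketch: `build_point` is a hypothetical function, the input dicts stand in for `BenchmarkResult` and `TegrastatsSummary`, and several point fields are omitted for brevity.

```python
import json


def build_point(benchmark: dict, tegrastats: dict, timestamp: str) -> dict:
    """Mirror the writer's rule: resource values exist only when tegrastats was parsed."""
    has_tegrastats = tegrastats.get("status") == "parsed"

    def resource(name):
        # None serialises to JSON null, matching the "null" literal in the C++ writer.
        return tegrastats[name] if has_tegrastats else None

    return {
        "execution_sequence_id": 0,
        "telemetry_timestamp": timestamp,
        "mean_ms": benchmark["mean_ms"],
        "p99_ms": benchmark["p99_ms"],
        "ram_used_mb": resource("ram_used_mb_max"),
        "max_temperature_c": resource("max_temp_c"),
        "vdd_in_mw_avg": resource("vdd_in_mw_avg"),
        "queue_depth": None,
        "runtime_uptime_sec": None,
        "telemetry_source": "tegrastats" if has_tegrastats else "not_available",
        "tegrastats_status": tegrastats.get("status"),
    }


def point_to_json(point: dict) -> str:
    """Serialise the point; missing telemetry stays explicit as null."""
    return json.dumps(point, indent=2)
```

Missing device telemetry is therefore written as an explicit `null` rather than a fabricated number, which matches the `missing_telemetry_is_failure: false` stance of the seed.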
33 changes: 33 additions & 0 deletions tests/test_agent_runtime_result_contract.py
@@ -203,6 +203,39 @@ def test_runtime_output_records_optional_agent_block_when_manifest_is_provided(s
            coverage["missing_field_count"],
            len(coverage["missing_fields"]),
        )
        history_seed = telemetry["history_seed"]
        self.assertEqual(
            history_seed["schema_version"],
            "inferedge-runtime-telemetry-history-seed-v1",
        )
        self.assertEqual(history_seed["registry_owner"], "edgeenv")
        self.assertEqual(history_seed["decision_owner"], "lab")
        self.assertEqual(
            history_seed["source_telemetry_schema_version"],
            telemetry["schema_version"],
        )
        self.assertFalse(history_seed["production_monitoring"])
        self.assertFalse(history_seed["missing_telemetry_is_failure"])
        self.assertTrue(history_seed["replay_ready"])
        self.assertIn("compare_key", history_seed["recommended_registry_key_fields"])
        self.assertIn("latency.mean_ms", history_seed["time_series_fields"])
        self.assertEqual(
            history_seed["source_result"]["compare_key"],
            result["compare_key"],
        )
        self.assertEqual(
            history_seed["source_result"]["backend_key"],
            result["backend_key"],
        )
        self.assertEqual(history_seed["source_result"]["precision"], result["precision"])
        self.assertEqual(history_seed["source_result"]["power_mode"], result["run_config"]["power_mode"])
        point = history_seed["points"][0]
        self.assertEqual(point["execution_sequence_id"], telemetry["execution_sequence_id"])
        self.assertEqual(point["telemetry_timestamp"], telemetry["telemetry_timestamp"])
        self.assertEqual(point["mean_ms"], telemetry["latency"]["mean_ms"])
        self.assertEqual(point["p99_ms"], telemetry["latency"]["p99_ms"])
        self.assertEqual(point["timeout_observed"], telemetry["operation"]["timeout_observed"])
        self.assertEqual(point["deadline_missed"], telemetry["operation"]["deadline_missed"])

        extra = result["extra"]
        self.assertTrue(extra["agent_manifest_recorded"])