gwonxhj · hyeokjun32 · May 24, 2026 · May 24, 2026
diff --git a/README.ko.md b/README.ko.md
@@ -75,7 +75,7 @@ Runtime은 Forge `agent_manifest.json`을 선택적으로 읽어 기존 Lab-comp
 
 This feature is the first Runtime-side contract toward the reliable edge agent runtime direction. It records `agent_id`, `task_id`, `agent_type`, priority, latency budget, queue wait, fallback usage, and telemetry context, but it does not change the existing top-level compare/report fields of `result.json`.
 
-Runtime result JSON also records `runtime_health_snapshot`, `runtime_error_classification`, `runtime_events`, and `runtime_operation_summary` as additive evidence. The health snapshot now records backend availability, latency budget/deadline observation, and tegrastats evidence availability together with `health_reason`, and runtime events are recorded as a lifecycle trace with a sequential `event_index`. `runtime_operation_summary` is a compact index for Lab/Orchestrator/AIGuard handoff that records `risk_labels`, `evidence_gaps`, retryability, and a conservative `recommended_action` while keeping `decision_owner: lab`, `scheduler_owner: orchestrator`, and `production_cancellation: false`. `runtime_telemetry.coverage` records expected / observed / missing telemetry fields and states `comparability_owner: edgeenv` and `missing_telemetry_is_failure: false`. `--timeout-ms` is an option that records the latency timeout observation threshold; it does not imply production request cancellation. When execution ends as `skipped`, Runtime records `runtime_execution_skipped`, `retryable: true`, and `retry_hint: check_backend_availability` so that Lab/Orchestrator can interpret it as failure handling evidence.
+Runtime result JSON also records `runtime_health_snapshot`, `runtime_error_classification`, `runtime_events`, and `runtime_operation_summary` as additive evidence. The health snapshot now records backend availability, latency budget/deadline observation, and tegrastats evidence availability together with `health_reason`, and runtime events are recorded as a lifecycle trace with a sequential `event_index`. `runtime_operation_summary` is a compact index for Lab/Orchestrator/AIGuard handoff that records `risk_labels`, `evidence_gaps`, retryability, and a conservative `recommended_action` while keeping `decision_owner: lab`, `scheduler_owner: orchestrator`, and `production_cancellation: false`. `runtime_telemetry.coverage` records expected / observed / missing telemetry fields and states `comparability_owner: edgeenv` and `missing_telemetry_is_failure: false`. `runtime_telemetry.history_seed` keeps `registry_owner: edgeenv`, `decision_owner: lab`, and `production_monitoring: false`, and provides a single-result replay point that can be handed off to EdgeEnv telemetry history accumulation. `--timeout-ms` is an option that records the latency timeout observation threshold; it does not imply production request cancellation. When execution ends as `skipped`, Runtime records `runtime_execution_skipped`, `retryable: true`, and `retry_hint: check_backend_availability` so that Lab/Orchestrator can interpret it as failure handling evidence.
 
 Example:
 

diff --git a/README.md b/README.md
@@ -488,7 +488,7 @@ This is the first bridge toward the reliable edge agent runtime direction. It re
 Runtime result JSON also includes additive operation evidence blocks:
 
 - `runtime_health_snapshot`: execution health, backend/device context, backend availability, run count, latency/FPS summary, latency-budget/deadline observation, tegrastats evidence availability, and explicit timeout observation status. `--timeout-ms` records an observation threshold; it does not claim production request cancellation.
-- `runtime_telemetry`: single-result telemetry seed for Runtime Intelligence history/replay. It records timestamp, execution sequence id, latency rolling seed values, power mode, tegrastats-derived resource evidence when available, operation flags, and explicit `missing_fields` for telemetry that the current device/run did not provide. This is local-first evidence, not a production monitoring stream.
+- `runtime_telemetry`: single-result telemetry seed for Runtime Intelligence history/replay. It records timestamp, execution sequence id, latency rolling seed values, power mode, tegrastats-derived resource evidence when available, operation flags, and explicit `missing_fields` for telemetry that the current device/run did not provide. The additive `history_seed` block packages the same single-result evidence as a one-point replay seed for EdgeEnv telemetry history accumulation. This is local-first evidence, not a production monitoring stream.
 - `runtime_error_classification`: structured success/error category, severity, retryability, retry hint, observed mean latency, and timeout budget for downstream report context. Skipped execution is recorded as `runtime_execution_skipped` with `retry_hint: check_backend_availability` so Lab/Orchestrator can explain runtime failure handling without treating Runtime as a worker daemon.
 - `runtime_events`: compact indexed lifecycle event log for configuration, benchmark completion, error classification, optional agent context, telemetry recording, operation summary, and tegrastats parsing.
 - `runtime_operation_summary`: compact handoff index for Lab/Orchestrator/AIGuard with `health_reason`, `risk_labels`, `evidence_gaps`, retryability, and a conservative `recommended_action`. It keeps `decision_owner: lab`, `scheduler_owner: orchestrator`, and `production_cancellation: false`.
@@ -501,6 +501,7 @@ Runtime Intelligence boundary:
 - `collection_mode` starts as `single_result_export`; EdgeEnv owns telemetry history accumulation and comparability-first regression.
 - Missing device telemetry remains explicit in `missing_fields` instead of being fabricated.
 - `runtime_telemetry.coverage` records expected / observed / missing telemetry fields, with `comparability_owner: edgeenv` and `missing_telemetry_is_failure: false`.
+- `runtime_telemetry.history_seed` uses `inferedge-runtime-telemetry-history-seed-v1`, keeps `registry_owner: edgeenv`, `decision_owner: lab`, `production_monitoring: false`, and exposes a single replay point that EdgeEnv can later accumulate into a local telemetry history.
 - Runtime exports telemetry evidence only. AIGuard may turn it into deterministic anomaly evidence, and Lab remains the deployment decision owner.
 
 The committed fixture
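
The README change says the `history_seed` block exposes a single replay point that EdgeEnv can later accumulate into a local telemetry history. As a rough sketch of what that downstream accumulation might look like, the Python below groups points by the fields named in `recommended_registry_key_fields`. The helpers `registry_key`, `accumulate`, and `make_seed`, and every sample value, are hypothetical and not part of Runtime or EdgeEnv; only the field names come from this change.

```python
def registry_key(seed):
    """Build a hashable key from the recommended fields that source_result carries."""
    source = seed["source_result"]
    # "run_config" is recommended as a key field but is not inside source_result,
    # so this sketch simply skips fields it cannot find there.
    return tuple(
        (field, source[field])
        for field in seed["recommended_registry_key_fields"]
        if field in source
    )


def accumulate(history, seed):
    """Append the seed's points to the history bucket selected by its registry key."""
    if seed["production_monitoring"] is not False:
        raise ValueError("history_seed must keep production_monitoring: false")
    history.setdefault(registry_key(seed), []).extend(seed["points"])
    return history


def make_seed(mean_ms):
    # Illustrative values only; a real seed comes from runtime_telemetry.history_seed.
    return {
        "production_monitoring": False,
        "recommended_registry_key_fields": [
            "compare_key", "backend_key", "device",
            "precision", "power_mode", "run_config",
        ],
        "source_result": {
            "compare_key": "example-compare-key",
            "backend_key": "example-backend-key",
            "engine_backend": "example-backend",
            "device": "example-device",
            "precision": "fp32",
            "power_mode": "example-mode",
        },
        "points": [{"execution_sequence_id": 0, "mean_ms": mean_ms}],
    }


history = {}
accumulate(history, make_seed(4.2))
accumulate(history, make_seed(4.6))
```

Two seeds with the same `source_result` land in the same bucket, so `history` ends up with one key holding two points. How EdgeEnv actually keys or stores its registry is not specified in this change.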

diff --git a/docs/agent_runtime_result_contract.md b/docs/agent_runtime_result_contract.md
@@ -257,6 +257,7 @@ When provided, Runtime appends:
 - `runtime_operation_summary` is an additive handoff index for Lab/Orchestrator/AIGuard. It repeats the health reason, retryability, risk labels, evidence gaps, and a conservative `recommended_action` without making the deployment decision itself.
 - `runtime_operation_summary.decision_owner` must remain `lab`, and `scheduler_owner` must remain `orchestrator`.
 - `runtime_operation_summary.production_cancellation` is always `false`; Runtime records observations only.
+- `runtime_telemetry.history_seed` is an additive `inferedge-runtime-telemetry-history-seed-v1` block for EdgeEnv telemetry history/replay. It keeps `registry_owner: edgeenv`, `decision_owner: lab`, `production_monitoring: false`, and one single-result telemetry point so downstream tools can accumulate history without Runtime becoming a telemetry store.
 - Runtime does not claim production request cancellation. `--timeout-ms` is an observation threshold: if a successful benchmark mean latency exceeds the configured threshold, Runtime records `timeout_observed: true`, `runtime_error_classification.category: runtime_timeout_observed`, and `retryable: true` for downstream reliability reporting.
 - If execution is skipped because Runtime cannot complete the configured benchmark, Runtime records `runtime_error_classification.category: runtime_execution_skipped`, `severity: warning`, `retryable: true`, and `retry_hint: check_backend_availability`. This is failure-handling evidence for Lab/Orchestrator reporting, not a production worker retry loop.
 - Without `--timeout-ms`, results record `timeout_policy: not_configured`, `timeout_budget_ms: null`, and `timeout_observed: false`.
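
The timeout rules in the contract bullets above can be read as a small decision function. The Python sketch below is one possible reading, not Runtime's implementation: the function name `timeout_observation` is hypothetical, "exceeds" is taken to mean strictly greater than, and echoing the configured threshold back as `timeout_budget_ms` is an assumption. The `timeout_policy` value for a configured threshold is not stated in the contract text, so the sketch leaves it out.

```python
from typing import Optional


def timeout_observation(mean_ms: float, timeout_ms: Optional[float]) -> dict:
    """Return the documented timeout fields for one successful benchmark result."""
    if timeout_ms is None:
        # Documented behavior when --timeout-ms is not given.
        return {
            "timeout_policy": "not_configured",
            "timeout_budget_ms": None,
            "timeout_observed": False,
        }
    observed = mean_ms > timeout_ms  # assumption: strict comparison
    fields = {
        "timeout_budget_ms": timeout_ms,  # assumption: budget mirrors the flag value
        "timeout_observed": observed,
    }
    if observed:
        # Observation only: nothing is cancelled, the result is just labeled.
        fields["category"] = "runtime_timeout_observed"
        fields["retryable"] = True
    return fields


not_configured = timeout_observation(12.0, None)
exceeded = timeout_observation(12.0, 10.0)
within_budget = timeout_observation(8.0, 10.0)
```

In all three cases the function only labels a finished result, which matches the contract's point that `--timeout-ms` is an observation threshold rather than a cancellation mechanism.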

diff --git a/scripts/smoke_default.sh b/scripts/smoke_default.sh
@@ -106,6 +106,22 @@ assert "queue_depth" in coverage["expected_fields"], coverage
 assert "queue_depth" in coverage["missing_fields"], coverage
 assert "telemetry_timestamp" in coverage["observed_fields"], coverage
 assert coverage["missing_fields"] == telemetry["missing_fields"], coverage
+history_seed = telemetry["history_seed"]
+assert history_seed["schema_version"] == "inferedge-runtime-telemetry-history-seed-v1", history_seed
+assert history_seed["registry_owner"] == "edgeenv", history_seed
+assert history_seed["decision_owner"] == "lab", history_seed
+assert history_seed["source_telemetry_schema_version"] == telemetry["schema_version"], history_seed
+assert history_seed["production_monitoring"] is False, history_seed
+assert history_seed["missing_telemetry_is_failure"] is False, history_seed
+assert history_seed["replay_ready"] is True, history_seed
+assert "compare_key" in history_seed["recommended_registry_key_fields"], history_seed
+assert "latency.mean_ms" in history_seed["time_series_fields"], history_seed
+assert history_seed["source_result"]["compare_key"] == data["compare_key"], history_seed
+assert history_seed["source_result"]["backend_key"] == data["backend_key"], history_seed
+assert history_seed["points"][0]["telemetry_timestamp"] == telemetry["telemetry_timestamp"], history_seed
+assert history_seed["points"][0]["execution_sequence_id"] == telemetry["execution_sequence_id"], history_seed
+assert history_seed["points"][0]["mean_ms"] == telemetry["latency"]["mean_ms"], history_seed
+assert history_seed["points"][0]["timeout_observed"] == telemetry["operation"]["timeout_observed"], history_seed
 assert events["runtime_telemetry_recorded"]["observed_field_count"] == coverage["observed_field_count"]
 assert events["runtime_telemetry_recorded"]["missing_field_count"] == coverage["missing_field_count"]
 assert events["runtime_telemetry_recorded"]["schema"] == "inferedge-runtime-telemetry-v1"
@@ -178,6 +194,12 @@ coverage = telemetry["coverage"]
 assert coverage["schema_version"] == "inferedge-runtime-telemetry-coverage-v1", coverage
 assert coverage["comparability_owner"] == "edgeenv", coverage
 assert coverage["missing_fields"] == telemetry["missing_fields"], coverage
+history_seed = telemetry["history_seed"]
+assert history_seed["registry_owner"] == "edgeenv", history_seed
+assert history_seed["decision_owner"] == "lab", history_seed
+assert history_seed["source_result"]["compare_key"] == data["compare_key"], history_seed
+assert history_seed["points"][0]["p99_ms"] == telemetry["latency"]["p99_ms"], history_seed
+assert history_seed["points"][0]["deadline_missed"] == telemetry["operation"]["deadline_missed"], history_seed
 assert "runtime_telemetry_recorded" in events, events
 assert data["extra"]["agent_manifest_recorded"] is True
 PY
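
Both smoke hunks repeat the same idea: the first entry in `history_seed["points"]` must agree with the parent `runtime_telemetry` block. A consumer that wants the same check outside the smoke script could fold those assertions into one helper, sketched below. The name `seed_mismatches`, the mapping table, and the sample values are hypothetical; the field pairs are the ones the smoke assertions compare.

```python
# Point field -> path of the matching value inside runtime_telemetry.
POINT_TO_TELEMETRY = {
    "telemetry_timestamp": ("telemetry_timestamp",),
    "execution_sequence_id": ("execution_sequence_id",),
    "mean_ms": ("latency", "mean_ms"),
    "p99_ms": ("latency", "p99_ms"),
    "timeout_observed": ("operation", "timeout_observed"),
    "deadline_missed": ("operation", "deadline_missed"),
}


def seed_mismatches(telemetry):
    """Return the point fields that disagree with the parent telemetry block."""
    point = telemetry["history_seed"]["points"][0]
    mismatches = []
    for field, path in POINT_TO_TELEMETRY.items():
        value = telemetry
        for part in path:
            value = value[part]
        if point[field] != value:
            mismatches.append(field)
    return mismatches


# Illustrative telemetry block; the values are made up for the example.
sample = {
    "telemetry_timestamp": "2026-05-24T00:00:00Z",
    "execution_sequence_id": 0,
    "latency": {"mean_ms": 4.2, "p99_ms": 5.1},
    "operation": {"timeout_observed": False, "deadline_missed": False},
    "history_seed": {
        "points": [
            {
                "telemetry_timestamp": "2026-05-24T00:00:00Z",
                "execution_sequence_id": 0,
                "mean_ms": 4.2,
                "p99_ms": 5.1,
                "timeout_observed": False,
                "deadline_missed": False,
            }
        ]
    },
}

consistent = seed_mismatches(sample)
```

Returning the list of mismatching fields, rather than asserting on the first failure, lets a report name every inconsistent field at once.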

diff --git a/src/result_writer.cpp b/src/result_writer.cpp
@@ -560,6 +560,122 @@ void write_runtime_telemetry_coverage_json(
         << indent << "}";
 }
 
+void write_runtime_telemetry_history_seed_json(
+    std::ostream& output,
+    const RuntimeConfig& config,
+    const EngineMetadata& engine_metadata,
+    const BenchmarkResult& benchmark_result,
+    const TegrastatsSummary& tegrastats_summary,
+    const std::string& timestamp,
+    int indent_spaces) {
+    const std::string indent(static_cast<std::size_t>(indent_spaces), ' ');
+    const bool has_tegrastats = tegrastats_summary.status == "parsed";
+    const std::string precision = config.manifest_precision.empty() ? "fp32" : config.manifest_precision;
+    output
+        << "{\n"
+        << indent << "  \"schema_version\": \"inferedge-runtime-telemetry-history-seed-v1\",\n"
+        << indent << "  \"evidence_role\": \"runtime_telemetry_history_seed\",\n"
+        << indent << "  \"registry_owner\": \"edgeenv\",\n"
+        << indent << "  \"decision_owner\": \"lab\",\n"
+        << indent << "  \"source_result_schema_version\": \"inferedge-runtime-result-v1\",\n"
+        << indent << "  \"source_telemetry_schema_version\": \"inferedge-runtime-telemetry-v1\",\n"
+        << indent << "  \"replay_scope\": \"single_result_to_history\",\n"
+        << indent << "  \"replay_ready\": true,\n"
+        << indent << "  \"production_monitoring\": false,\n"
+        << indent << "  \"missing_telemetry_is_failure\": false,\n"
+        << indent << "  \"source_result\": {\n"
+        << indent << "    \"compare_key\": " << json_string(make_compare_key(config)) << ",\n"
+        << indent << "    \"backend_key\": " << json_string(make_backend_key(engine_metadata, config)) << ",\n"
+        << indent << "    \"engine_backend\": " << json_string(engine_metadata.backend) << ",\n"
+        << indent << "    \"device\": " << json_string(config.device) << ",\n"
+        << indent << "    \"precision\": " << json_string(precision) << ",\n"
+        << indent << "    \"power_mode\": " << json_string(config.power_mode) << "\n"
+        << indent << "  },\n"
+        << indent << "  \"recommended_registry_key_fields\": ";
+    write_string_array_json(output, {
+        "compare_key",
+        "backend_key",
+        "device",
+        "precision",
+        "power_mode",
+        "run_config",
+    });
+    output
+        << ",\n"
+        << indent << "  \"time_series_fields\": ";
+    write_string_array_json(output, {
+        "telemetry_timestamp",
+        "execution_sequence_id",
+        "latency.mean_ms",
+        "latency.p95_ms",
+        "latency.p99_ms",
+        "latency.fps",
+        "latency.inference_interval_ms",
+        "latency.rolling_latency_mean_ms",
+        "latency.rolling_latency_std_ms",
+        "resource.ram_used_mb",
+        "resource.max_temperature_c",
+        "resource.vdd_in_mw_avg",
+        "operation.queue_depth",
+        "operation.runtime_uptime_sec",
+        "operation.timeout_observed",
+        "operation.latency_budget_exceeded",
+        "operation.deadline_missed",
+    });
+    output
+        << ",\n"
+        << indent << "  \"points\": [\n"
+        << indent << "    {\n"
+        << indent << "      \"execution_sequence_id\": 0,\n"
+        << indent << "      \"telemetry_timestamp\": " << json_string(timestamp) << ",\n"
+        << indent << "      \"mean_ms\": " << benchmark_result.mean_ms << ",\n"
+        << indent << "      \"p95_ms\": " << benchmark_result.p95_ms << ",\n"
+        << indent << "      \"p99_ms\": " << benchmark_result.p99_ms << ",\n"
+        << indent << "      \"fps\": " << benchmark_result.fps << ",\n"
+        << indent << "      \"inference_interval_ms\": " << benchmark_result.mean_ms << ",\n"
+        << indent << "      \"rolling_latency_mean_ms\": " << benchmark_result.mean_ms << ",\n"
+        << indent << "      \"rolling_latency_std_ms\": " << benchmark_result.std_ms << ",\n"
+        << indent << "      \"ram_used_mb\": ";
+    if (has_tegrastats) {
+        output << tegrastats_summary.ram_used_mb_max;
+    } else {
+        output << "null";
+    }
+    output
+        << ",\n"
+        << indent << "      \"max_temperature_c\": ";
+    if (has_tegrastats) {
+        output << tegrastats_summary.max_temp_c;
+    } else {
+        output << "null";
+    }
+    output
+        << ",\n"
+        << indent << "      \"vdd_in_mw_avg\": ";
+    if (has_tegrastats) {
+        output << tegrastats_summary.vdd_in_mw_avg;
+    } else {
+        output << "null";
+    }
+    output
+        << ",\n"
+        << indent << "      \"queue_depth\": null,\n"
+        << indent << "      \"runtime_uptime_sec\": null,\n"
+        << indent << "      \"timeout_observed\": "
+        << (timeout_observed(config, benchmark_result) ? "true" : "false") << ",\n"
+        << indent << "      \"latency_budget_exceeded\": "
+        << (latency_budget_exceeded(config, benchmark_result) ? "true" : "false") << ",\n"
+        << indent << "      \"deadline_missed\": "
+        << (should_mark_deadline_missed(config, benchmark_result) ? "true" : "false") << ",\n"
+        << indent << "      \"power_mode\": " << json_string(config.power_mode) << ",\n"
+        << indent << "      \"telemetry_source\": "
+        << json_string(has_tegrastats ? "tegrastats" : "not_available") << ",\n"
+        << indent << "      \"tegrastats_status\": " << json_string(tegrastats_summary.status) << "\n"
+        << indent << "    }\n"
+        << indent << "  ]\n"
+        << indent << "}";
+}
+
 std::string runtime_operation_recommended_action(
     const RuntimeConfig& config,
     const EngineMetadata& engine_metadata,
@@ -788,6 +904,17 @@ void write_runtime_telemetry_json(
         << ",\n"
         << indent << "  \"coverage\": ";
     write_runtime_telemetry_coverage_json(output, tegrastats_summary, indent_spaces + 2);
+    output
+        << ",\n"
+        << indent << "  \"history_seed\": ";
+    write_runtime_telemetry_history_seed_json(
+        output,
+        config,
+        engine_metadata,
+        benchmark_result,
+        tegrastats_summary,
+        timestamp,
+        indent_spaces + 2);
     output
         << ",\n"
         << indent << "  \"production_monitoring\": false\n"

diff --git a/tests/test_agent_runtime_result_contract.py b/tests/test_agent_runtime_result_contract.py
@@ -203,6 +203,39 @@ def test_runtime_output_records_optional_agent_block_when_manifest_is_provided(s
             coverage["missing_field_count"],
             len(coverage["missing_fields"]),
         )
+        history_seed = telemetry["history_seed"]
+        self.assertEqual(
+            history_seed["schema_version"],
+            "inferedge-runtime-telemetry-history-seed-v1",
+        )
+        self.assertEqual(history_seed["registry_owner"], "edgeenv")
+        self.assertEqual(history_seed["decision_owner"], "lab")
+        self.assertEqual(
+            history_seed["source_telemetry_schema_version"],
+            telemetry["schema_version"],
+        )
+        self.assertFalse(history_seed["production_monitoring"])
+        self.assertFalse(history_seed["missing_telemetry_is_failure"])
+        self.assertTrue(history_seed["replay_ready"])
+        self.assertIn("compare_key", history_seed["recommended_registry_key_fields"])
+        self.assertIn("latency.mean_ms", history_seed["time_series_fields"])
+        self.assertEqual(
+            history_seed["source_result"]["compare_key"],
+            result["compare_key"],
+        )
+        self.assertEqual(
+            history_seed["source_result"]["backend_key"],
+            result["backend_key"],
+        )
+        self.assertEqual(history_seed["source_result"]["precision"], result["precision"])
+        self.assertEqual(history_seed["source_result"]["power_mode"], result["run_config"]["power_mode"])
+        point = history_seed["points"][0]
+        self.assertEqual(point["execution_sequence_id"], telemetry["execution_sequence_id"])
+        self.assertEqual(point["telemetry_timestamp"], telemetry["telemetry_timestamp"])
+        self.assertEqual(point["mean_ms"], telemetry["latency"]["mean_ms"])
+        self.assertEqual(point["p99_ms"], telemetry["latency"]["p99_ms"])
+        self.assertEqual(point["timeout_observed"], telemetry["operation"]["timeout_observed"])
+        self.assertEqual(point["deadline_missed"], telemetry["operation"]["deadline_missed"])
 
         extra = result["extra"]
         self.assertTrue(extra["agent_manifest_recorded"])