test(agentic): add 46 state management and orchestrator tests — all passed by Mog9 · Pull Request #2 · tensormux/Tensorpath

Mog9 · 2026-06-13T06:40:09Z

Continues the agentic loop test coverage from #1. This PR adds tests for the state management layer (agent_state.py) and the main orchestrator loop (agent_runner.py), plus the remaining 5 tests for tool dispatch (agent_tools.py).

46 new tests across 3 files — all passing.

test_agent_tools.py (5 new tests)

#	Test	What it validates
1	`test_unknown_tool_returns_error`	Unknown tool name returns error ToolResult
2	`test_tool_exception_returns_error`	Handler raising exception is caught, returns error
3	`test_run_verify_delegates_to_verifier`	Calls `verify_candidate` with correct args
4	`test_run_benchmark_delegates_to_benchmarker`	Calls `benchmark_candidate`
5	`test_read_candidate_file_rejects_bad_path`	Path traversal in read returns error
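
For reviewers who have not read agent_tools.py, the behaviours in the table above can be sketched as follows. This is a minimal illustration, not the code under test: `ToolResult`'s fields, `dispatch_tool`, `resolve_candidate_path`, and the handler signatures are all assumed names.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Callable, Dict


@dataclass
class ToolResult:
    # Hypothetical shape; the real ToolResult may carry different fields.
    ok: bool
    content: str


def dispatch_tool(name: str, args: Dict[str, Any],
                  handlers: Dict[str, Callable[..., Any]]) -> ToolResult:
    """Look up a handler by name and never let its exception escape."""
    handler = handlers.get(name)
    if handler is None:
        return ToolResult(ok=False, content=f"unknown tool: {name}")
    try:
        return ToolResult(ok=True, content=str(handler(**args)))
    except Exception as exc:  # any handler failure becomes an error result
        return ToolResult(ok=False, content=f"{type(exc).__name__}: {exc}")


def resolve_candidate_path(root: Path, relative: str) -> Path:
    """Resolve `relative` under `root`, rejecting paths that escape it."""
    root = root.resolve()
    target = (root / relative).resolve()
    if target != root and root not in target.parents:
        raise ValueError(f"path escapes candidate directory: {relative}")
    return target
```

With this shape, a read handler that calls `resolve_candidate_path` and raises `ValueError` on `../` input is converted into an error `ToolResult` by `dispatch_tool`, which is the combination tests 2 and 5 describe.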

test_agent_state.py (16 new tests)

#	Test	What it validates
1	`test_init_state_creates_file`	`agent_state.json` written to disk
2	`test_init_state_truncates_transcript`	Old transcript cleared on fresh start
3	`test_init_state_defaults`	Status=PENDING, iteration=0, cost=0.0
4	`test_save_and_load_roundtrip`	State survives JSON serialization
5	`test_load_returns_none_when_missing`	No file returns `None`
6	`test_load_returns_none_on_corrupt_json`	Malformed file returns `None`
7	`test_request_abort_sets_flag`	`abort_requested` becomes `True`
8	`test_request_abort_only_when_pending_or_running`	Returns `False` for terminal statuses
9	`test_request_abort_returns_false_when_no_state`	No state file returns `False`
10	`test_append_transcript_adds_timestamp`	Each line has `"at"` key
11	`test_append_transcript_multiple_lines`	Multiple appends produce multiple lines
12	`test_read_transcript_empty`	No file returns `[]`
13	`test_read_transcript_skips_blank_lines`	Blank lines ignored
14	`test_read_transcript_skips_invalid_json`	Malformed lines ignored
15	`test_state_path_format`	Path is `repo_root / artifact_dir / "agent_state.json"`
16	`test_transcript_path_format`	Path is `repo_root / artifact_dir / "agent_transcript.jsonl"`
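
The state-layer contract these tests describe could look roughly like the sketch below. The function names and signatures are assumptions chosen to mirror the test names; only the file names, the `"at"` key, the `abort_requested` flag, and the PENDING/RUNNING statuses come from the tables above.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import List, Optional


def save_state(path: Path, state: dict) -> None:
    path.write_text(json.dumps(state), encoding="utf-8")


def load_state(path: Path) -> Optional[dict]:
    """Return the stored state, or None if the file is missing or corrupt."""
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return None


def request_abort(path: Path) -> bool:
    """Set abort_requested only while the run is PENDING or RUNNING."""
    state = load_state(path)
    if state is None or state.get("status") not in ("PENDING", "RUNNING"):
        return False
    state["abort_requested"] = True
    save_state(path, state)
    return True


def append_transcript(path: Path, entry: dict) -> None:
    """Append one JSON line, stamped with an "at" timestamp."""
    line = {**entry, "at": datetime.now(timezone.utc).isoformat()}
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(line) + "\n")


def read_transcript(path: Path) -> List[dict]:
    """Read all valid JSON lines, skipping blank and malformed ones."""
    if not path.exists():
        return []
    entries = []
    for raw in path.read_text(encoding="utf-8").splitlines():
        if not raw.strip():
            continue
        try:
            entries.append(json.loads(raw))
        except json.JSONDecodeError:
            continue
    return entries
```

Returning `None` or `[]` instead of raising keeps callers such as a status endpoint simple, at the cost of hiding why a file could not be read.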

test_agent_runner.py (25 new tests)

Cost tracking (6 tests)

#	Test	What it validates
1	`test_cost_from_usage_all_zeros`	Zero tokens = $0.00
2	`test_cost_from_usage_input_only`	1M input tokens = $5.00
3	`test_cost_from_usage_output_only`	1M output tokens = $25.00
4	`test_cost_from_usage_cache_write`	1M cache_write tokens = $6.25
5	`test_cost_from_usage_cache_read`	1M cache_read tokens = $0.50
6	`test_cost_from_usage_combined`	Mixed tokens = correct sum
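
The per-million-token prices in the table imply a simple linear cost function. The sketch below uses those four rates; the parameter names and the dictionary layout are assumptions, and the real `_cost_from_usage`-style helper may take a usage object instead.

```python
# USD per one million tokens, taken from the expectations in the table above.
RATES_PER_MTOK = {
    "input": 5.00,
    "output": 25.00,
    "cache_write": 6.25,
    "cache_read": 0.50,
}


def cost_from_usage(input_tokens: int = 0, output_tokens: int = 0,
                    cache_write_tokens: int = 0,
                    cache_read_tokens: int = 0) -> float:
    """Sum each token count times its per-million-token rate."""
    counts = {
        "input": input_tokens,
        "output": output_tokens,
        "cache_write": cache_write_tokens,
        "cache_read": cache_read_tokens,
    }
    return sum(counts[k] * RATES_PER_MTOK[k] / 1_000_000 for k in counts)
```

As a worked example under these rates, 200,000 input, 100,000 output, 400,000 cache-write, and 1,000,000 cache-read tokens cost 1.00 + 2.50 + 2.50 + 0.50 = $6.50.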

System prompt (3 tests)

#	Test	What it validates
7	`test_build_system_prompt_has_two_blocks`	Rules block + cached skill bundle block
8	`test_build_system_prompt_cache_control`	Second block has `cache_control: ephemeral`
9	`test_build_system_prompt_includes_task_details`	Op, GPU, dtype, shape in rules text
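
The two-block layout checked here matches the list form of the `system` parameter in the Anthropic Messages API, where a block can carry `cache_control`. A rough sketch, with an assumed signature and placeholder rules wording (the real prompt text is not shown in this PR):

```python
from typing import Any, Dict, List


def build_system_prompt(op: str, gpu: str, dtype: str, shape: str,
                        skill_bundle: str) -> List[Dict[str, Any]]:
    """Return [rules block, cached skill-bundle block]."""
    # Placeholder wording; only the presence of op/GPU/dtype/shape is tested.
    rules = (
        f"You are optimizing the {op} kernel on {gpu}. "
        f"dtype={dtype}, shape={shape}."
    )
    return [
        # Small, task-specific block: changes per run, so it is not cached.
        {"type": "text", "text": rules},
        # Large, stable block: marked ephemeral so the API can cache it.
        {
            "type": "text",
            "text": skill_bundle,
            "cache_control": {"type": "ephemeral"},
        },
    ]
```

Keeping the task details and the cached bundle in separate blocks lets the task-specific text vary per run while the stable bundle stays byte-identical.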

Loop termination (7 tests)

#	Test	What it validates
10	`test_loop_errored_on_missing_api_key`	No ANTHROPIC_API_KEY = ERRORED immediately
11	`test_loop_errored_on_api_refusal`	stop_reason="refusal" = ERRORED
12	`test_loop_breaks_on_end_turn_without_tools`	end_turn with no tool_use = breaks
13	`test_loop_rejected_on_give_up`	Agent calls give_up = REJECTED
14	`test_loop_rejected_on_max_iterations`	Loop exhausts iterations = REJECTED
15	`test_loop_rejected_on_cost_cap`	Cost exceeds cap = REJECTED
16	`test_loop_errored_on_api_error`	APIStatusError = ERRORED
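
The seven exits in the table above can be pictured as one loop with a status attached to each exit. This is only a sketch: `run_loop`, the `call_model` callable, and the dict-shaped turn are hypothetical stand-ins for the real client, the broad `except` stands in for `APIStatusError`, and the order of the checks is a guess. The table does not say which status an `end_turn` break leaves behind, so the sketch returns `None` there.

```python
import os
from typing import Any, Callable, Dict, Mapping, Optional, Tuple


def run_loop(call_model: Callable[[], Dict[str, Any]],
             max_iterations: int,
             cost_cap_usd: float,
             env: Optional[Mapping[str, str]] = None
             ) -> Tuple[Optional[str], str]:
    """Return (final_status, reason) for the first exit condition hit."""
    env = os.environ if env is None else env
    if not env.get("ANTHROPIC_API_KEY"):
        return "ERRORED", "missing_api_key"

    cost_usd = 0.0
    for _ in range(max_iterations):
        try:
            turn = call_model()
        except Exception:  # stand-in for the SDK's APIStatusError
            return "ERRORED", "api_error"

        cost_usd += turn.get("cost_usd", 0.0)
        tool_calls = turn.get("tool_calls", [])

        if turn["stop_reason"] == "refusal":
            return "ERRORED", "refusal"
        if "give_up" in tool_calls:
            return "REJECTED", "give_up"
        if turn["stop_reason"] == "end_turn" and not tool_calls:
            return None, "end_turn"  # final status not stated in this PR
        if cost_usd > cost_cap_usd:
            return "REJECTED", "cost_cap"

    return "REJECTED", "max_iterations"
```

Passing `env` and `call_model` as arguments is what makes each exit testable without network access or a real API key.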

State transitions (5 tests)

#	Test	What it validates
17	`test_iteration_counter_increments`	State.iteration matches loop count
18	`test_cost_accumulates_across_turns`	cost_usd grows with each API call
19	`test_token_counts_accumulate`	All four token counters grow correctly
20	`test_verify_result_tracked_in_state`	last_verify_passed/reason updated
21	`test_benchmark_result_tracked_in_state`	last_benchmark_passed/speedup updated
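
A sketch of the per-turn bookkeeping these tests exercise. `iteration`, `cost_usd`, and `last_verify_passed` appear in the table; the token counter names, `last_verify_reason`, `last_benchmark_speedup`, and the `record_*` helpers are assumed for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class AgentState:
    iteration: int = 0
    cost_usd: float = 0.0
    # Four token counters (names assumed to match the cost categories).
    input_tokens: int = 0
    output_tokens: int = 0
    cache_write_tokens: int = 0
    cache_read_tokens: int = 0
    last_verify_passed: Optional[bool] = None
    last_verify_reason: Optional[str] = None
    last_benchmark_passed: Optional[bool] = None
    last_benchmark_speedup: Optional[float] = None


def record_turn(state: AgentState, usage: Dict[str, int],
                turn_cost_usd: float) -> None:
    """Advance the iteration counter and accumulate cost and tokens."""
    state.iteration += 1
    state.cost_usd += turn_cost_usd
    for name in ("input_tokens", "output_tokens",
                 "cache_write_tokens", "cache_read_tokens"):
        setattr(state, name, getattr(state, name) + usage.get(name, 0))


def record_verify(state: AgentState, passed: bool, reason: str) -> None:
    """Overwrite the last verify outcome; only the latest is kept."""
    state.last_verify_passed = passed
    state.last_verify_reason = reason
```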

Edge cases (4 tests)

#	Test	What it validates
22	`test_loop_continues_when_verify_fails`	Verify fails = no promotion, loop continues
23	`test_loop_continues_when_benchmark_fails`	Benchmark fails = no promotion, loop continues
24	`test_promotion_failure_does_not_crash`	promote_candidate raises ValueError = handled gracefully
25	`test_transcript_written_each_turn`	transcript_lines increments each turn
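
The promotion gating in tests 22 to 24 amounts to the decision sketched below. `maybe_promote` and its return strings are hypothetical; the only details taken from the table are that failed verify or benchmark results block promotion and that a `ValueError` from `promote_candidate` must not crash the loop.

```python
from typing import Callable


def maybe_promote(verify_passed: bool, benchmark_passed: bool,
                  promote_candidate: Callable[[], None]) -> str:
    """Promote only when both gates pass; never propagate ValueError."""
    if not verify_passed:
        return "verify_failed"      # no promotion, caller keeps looping
    if not benchmark_passed:
        return "benchmark_failed"   # no promotion, caller keeps looping
    try:
        promote_candidate()
    except ValueError:
        return "promotion_failed"   # handled here instead of crashing
    return "promoted"
```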

Results

test_agent_tools.py: 15 passed (10 from #1 + 5 new)
test_agent_state.py: 16 passed (new file)
test_agent_runner.py: 25 passed (new file)
Full suite: 149 passed, 1 skipped

Coverage summary

Module	Tests before #1	Tests after this PR
`agent_tools.py`	0	15
`agent_state.py`	0	16
`agent_runner.py`	0	25
Total agentic	0	56

test(agentic): add 46 state management and orchestrator tests — all p…

2b27b07

Mog9 wants to merge 1 commit into tensormux:main from Mog9:add-agentic-loop-state-runner-tests
