[Improvement] Enforce send_message tool use; evaluate dropped_text_retries necessity

## Summary

Two related questions about how pyclaudir ensures outbound messages actually reach the user:

1. **How can we enforce that Luna always calls `send_message`** (or `reply_to_message`) instead of producing bare text content blocks that silently disappear?
2. **Is `dropped_text_retries` still necessary** once enforcement is in place?

---

## Problem

Claude Code turns can end with:
- A `send_message` / `reply_to_message` tool call → message delivered to user ✓
- A plain text content block → silently dropped, user sees nothing ✗

The current workaround is `dropped_text_retries`: when the harness detects a turn ended with text but no outbound tool call, it retries the turn. This is a recovery mechanism, not prevention.

Issues with relying on retries:
- Burns tokens on a second (and possibly third) turn for something that should have worked the first time
- The retry may still produce text if the model is confused about its role
- Adds latency
- Doesn't surface the failure clearly in logs

---

## Option A — `tool_choice` enforcement (API-level)

The Anthropic API supports a `tool_choice` parameter that forces the model to call a tool, either any available tool or one named tool:

```python
# Force any tool call (model picks which one)
tool_choice={"type": "any"}

# Force a specific tool
tool_choice={"type": "tool", "name": "send_message"}
```

**Tradeoff:** Forcing `send_message` specifically would break turns where the model legitimately needs to call `read_memory`, `query_db`, or other tools before sending. Forcing `{"type": "any"}` only guarantees that *some* tool is called, not that a message is eventually sent.

**Possible hybrid:** Use `tool_choice: any` only on the *final* turn after tool calls are complete — but this requires the harness to know when a turn is "final," which it currently doesn't.
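The hybrid can be pictured as a small helper that picks `tool_choice` per request. This is only a sketch under assumptions: `pick_tool_choice` and the `expecting_final_reply` flag are hypothetical, and the harness currently has no signal it could use to set that flag. Only the three `tool_choice` shapes (`auto`, `any`, `tool`) come from the Anthropic API.

```python
from typing import Optional


def pick_tool_choice(expecting_final_reply: bool,
                     forced_tool: Optional[str] = None) -> dict:
    """Return a tool_choice value for the next request (hypothetical helper)."""
    if forced_tool is not None:
        # Force one named tool, e.g. "send_message".
        return {"type": "tool", "name": forced_tool}
    if expecting_final_reply:
        # Force some tool call; the model still picks which one.
        return {"type": "any"}
    # Default: the model may call a tool or answer in plain text.
    return {"type": "auto"}


print(pick_tool_choice(expecting_final_reply=False))
print(pick_tool_choice(expecting_final_reply=True))
print(pick_tool_choice(expecting_final_reply=True, forced_tool="send_message"))
```

Whether any of these values can actually be passed through Claude Code is the first open question listed under "Questions to resolve".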

---

## Option B — System prompt enforcement

Add an explicit rule to `system.md`:

> If you produce a text content block instead of `send_message`, the user sees nothing. Always deliver via `send_message` or `reply_to_message`.

This is already partially in the system prompt ("If you produce a text content block instead of `send_message`, the user sees nothing") but the model still occasionally drifts.

**Improvement:** Add an end-of-turn self-check to the prompt: before ending the turn, the model confirms that `send_message` or `reply_to_message` was called if a reply was warranted. This is behavioral, not structural, so it can reduce drift but cannot guarantee delivery.

---

## Option C — Harness-level post-turn check (keep retries, improve detection)

Instead of retrying blindly, make the harness smarter:

```python
# Tools whose effect is visible to the user. Whether reactions and photos
# should count is still open (see "Questions to resolve").
OUTBOUND_TOOLS = {"send_message", "reply_to_message", "send_photo", "add_reaction"}


def check_turn_completion(turn_result):
    has_outbound = any(
        call.tool_name in OUTBOUND_TOOLS for call in turn_result.tool_calls
    )
    dropped_text = (turn_result.text_content or "").strip()

    if dropped_text and not has_outbound:
        # Inject the dropped text back as a system note and retry once.
        # retry_with_context is assumed to be provided by the harness.
        return retry_with_context(
            "Your previous turn produced text but no send_message call. "
            f"The text was: {dropped_text[:500]}. Please call send_message now."
        )

    return turn_result
```

This makes retries targeted (inject the lost text) rather than blind (just re-run the turn).
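A self-contained variant of the same idea, with a retry cap and log lines so that failures are visible, might look like the sketch below. The `TurnResult` and `ToolCall` shapes, the `retry` callback, and the logger name are all assumptions made for illustration, not pyclaudir's actual types.

```python
import logging
from dataclasses import dataclass, field
from typing import Callable, List

log = logging.getLogger("harness.turns")  # hypothetical logger name

OUTBOUND_TOOLS = frozenset(
    {"send_message", "reply_to_message", "send_photo", "add_reaction"}
)


@dataclass
class ToolCall:
    tool_name: str


@dataclass
class TurnResult:
    text_content: str = ""
    tool_calls: List[ToolCall] = field(default_factory=list)


def delivered(turn: TurnResult) -> bool:
    """True if the turn made at least one user-visible tool call."""
    return any(c.tool_name in OUTBOUND_TOOLS for c in turn.tool_calls)


def run_with_targeted_retry(first: TurnResult,
                            retry: Callable[[str], TurnResult],
                            max_retries: int = 1) -> TurnResult:
    """Retry a turn that dropped text, feeding the lost text back each time."""
    turn = first
    for attempt in range(1, max_retries + 1):
        dropped = turn.text_content.strip()
        if delivered(turn) or not dropped:
            return turn
        log.warning("dropped text (attempt %d): %r", attempt, dropped[:80])
        turn = retry(
            "Your previous turn produced text but no send_message call. "
            f"The text was: {dropped[:500]}. Please call send_message now."
        )
    if not delivered(turn) and turn.text_content.strip():
        log.error("text still dropped after %d retries", max_retries)
    return turn


# Usage with a stand-in retry function that behaves correctly on the retry.
def fake_retry(note: str) -> TurnResult:
    return TurnResult(tool_calls=[ToolCall("send_message")])


result = run_with_targeted_retry(TurnResult(text_content="Hi!"), fake_retry)
print(delivered(result))
```

Capping retries and logging each dropped turn addresses two of the issues listed under "Problem": unbounded token burn and failures that do not show up in logs.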

---

## Questions to resolve

1. Does Claude Code's API support `tool_choice` passthrough from the harness, or does CC manage tool calling internally?
2. Is `dropped_text_retries` currently catching real failures at a meaningful rate, or is it rarely triggered?
3. Should `add_reaction`, `send_photo`, and `edit_message` count as valid outbound calls (the user does see something), or do only `send_message` / `reply_to_message` count?
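Question 2 could be answered by computing a dropped-text rate over logged turns. The sketch below assumes each turn record exposes two boolean fields, `had_text` and `has_outbound`. Both field names and the sample records are made up for illustration and are not taken from pyclaudir's logs.

```python
from typing import Iterable, Mapping


def dropped_text_rate(turns: Iterable[Mapping[str, bool]]) -> float:
    """Fraction of turns that ended with text but no outbound tool call."""
    total = 0
    dropped = 0
    for turn in turns:
        total += 1
        if turn["had_text"] and not turn["has_outbound"]:
            dropped += 1
    return dropped / total if total else 0.0


# Toy records, not real measurements.
sample = [
    {"had_text": False, "has_outbound": True},
    {"had_text": True, "has_outbound": False},
    {"had_text": True, "has_outbound": True},
    {"had_text": False, "has_outbound": True},
]
print(dropped_text_rate(sample))  # 0.25
```

Changing which tools set `has_outbound` is also a cheap way to see how much the answer to question 3 affects the measured rate.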

---

## Recommendation (tentative)

- **Short term:** Keep `dropped_text_retries` but make retries targeted (Option C) — inject the lost text, retry once with context.
- **Medium term:** If CC supports `tool_choice: any`, enable it to guarantee that at least one tool call fires per turn.
- **Long term:** Evaluate whether the `dropped_text_retries` rate drops to near zero after the targeted retry improvement; if it does, remove the mechanism.

## Reported by

Rustam, 2026-05-10.
