feat(providers): 1h system-prompt cache + thinking-effort lever by oratis · Pull Request #210 · oratis/LISA

oratis · 2026-07-02T05:22:39Z

Model-appropriate tuning drawn from the OpenClaw research, keeping only what applies to LISA's default model, claude-sonnet-4-6. The plan and the pros/cons debate are in docs/PLAN_MODEL_TUNING_v1.0.md.

Dropped (verified against the claude-api reference): 1M context is already native on Sonnet 4.6; fast-mode, task-budgets, and service_tier are gated to Opus 4.8/4.7 or Sonnet 5 and return a 400 on Sonnet 4.6.

A. 1h prompt caching on the stable system prefix (soul + skills + memory). The prefix stays warm across think-time gaps in a bursty personal session instead of being re-written cold every 5 min. The conversational tail stays at 5 min; LISA_CACHE_TTL=5m opts back to the 5-min TTL for heavy-continuous use. Verified end-to-end via the relay: ephemeral_1h_input_tokens=3603 written on turn 1, cache_read_input_tokens=3603 on turn 2.
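A minimal sketch of how the TTL choice in (A) could be applied to the system prefix. The names `resolveCacheTtl`, `buildSystemBlocks`, and the `SystemBlock` type are hypothetical and not taken from this PR; only the `LISA_CACHE_TTL` variable and the 1h/5m values come from the description. The `cache_control: { type: "ephemeral", ttl }` shape follows the Anthropic Messages API as I understand it, and the environment is passed in as a plain object so the sketch has no Node dependency.

```typescript
// Hypothetical types mirroring the shape of a Messages API system block.
type CacheTtl = "5m" | "1h";

interface SystemBlock {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral"; ttl: CacheTtl };
}

// Assumption: any value other than "5m" (including unset) keeps the new 1h default.
function resolveCacheTtl(env: Record<string, string | undefined>): CacheTtl {
  return env.LISA_CACHE_TTL === "5m" ? "5m" : "1h";
}

// Joins the stable prefix parts (soul, skills, memory) into one cached block.
// Empty parts are skipped so a missing section does not change the cache key
// by introducing stray separators.
function buildSystemBlocks(
  prefixParts: string[],
  env: Record<string, string | undefined>,
): SystemBlock[] {
  const text = prefixParts.filter((part) => part.trim().length > 0).join("\n\n");
  return [
    {
      type: "text",
      text,
      cache_control: { type: "ephemeral", ttl: resolveCacheTtl(env) },
    },
  ];
}
```

Placing the long-lived breakpoint on the system block and leaving the conversational tail at 5 min keeps the longer TTL ahead of the shorter one in the request, which is the ordering the caching API expects as far as I know.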

B. Thinking effort (output_config.effort, GA on Sonnet 4.6) threaded provider→agent→subagent. Off by default globally, which keeps the API default "high"; dispatched subagents default to "low" for cheap parallel work; LISA_EFFORT overrides globally for power users.
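The precedence described in (B) can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: `resolveEffort`, `withEffort`, `Role`, and `MessageRequest` are hypothetical names, the `"medium"` level is assumed to exist alongside the `"low"` and `"high"` values named above, and `LISA_EFFORT` is assumed to win over the subagent default because the description says it overrides globally.

```typescript
// Assumption: "medium" is a valid level; the PR text only names "low" and "high".
type Effort = "low" | "medium" | "high";
type Role = "main" | "subagent";

// Hypothetical request shape; only output_config.effort comes from the PR text.
interface MessageRequest {
  model: string;
  max_tokens: number;
  output_config?: { effort: Effort };
}

// Accepts only known levels; anything else is treated as "not set".
function parseEffort(raw: string | undefined): Effort | undefined {
  return raw === "low" || raw === "medium" || raw === "high" ? raw : undefined;
}

// Precedence: LISA_EFFORT override, then the subagent default ("low"),
// then undefined, meaning the field is omitted and the API default applies.
function resolveEffort(
  role: Role,
  env: Record<string, string | undefined>,
): Effort | undefined {
  const override = parseEffort(env.LISA_EFFORT);
  if (override !== undefined) return override;
  return role === "subagent" ? "low" : undefined;
}

// Adds output_config only when an effort was resolved, so the default-off
// path sends exactly the same request body as before.
function withEffort(request: MessageRequest, effort: Effort | undefined): MessageRequest {
  return effort === undefined ? request : { ...request, output_config: { effort } };
}
```

Omitting the field rather than sending `"high"` explicitly is what keeps the main-agent request unchanged when no override is set.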

Verified: tsc clean · 64 provider + 4 subagent tests pass · live /chat unchanged · already rebuilt and running on the local backend.

🤖 Generated with Claude Code

…ropic)

Model-appropriate tuning borrowed from OpenClaw's research, kept only where it applies to LISA's default model (claude-sonnet-4-6). See docs/PLAN_MODEL_TUNING_v1.0.md for the plan + pros/cons debate. Most OpenClaw knobs were dropped: 1M context is already native on Sonnet 4.6, and fast-mode/task-budgets/service_tier are gated to Opus 4.8/4.7 or Sonnet 5 (400 on Sonnet 4.6).

A. Extended (1h) prompt caching on the stable system prefix (soul+skills+memory) so it stays warm across think-time gaps in a bursty personal session instead of a cold re-write every 5 min. Conversational tail stays 5-min. LISA_CACHE_TTL=5m opts back for heavy-continuous use. Verified end-to-end via the relay: ephemeral_1h_input_tokens written on turn 1, cache_read on turn 2.

B. Optional thinking effort (output_config.effort, GA on Sonnet 4.6) threaded provider→agent→subagent. Default-off globally (keeps API default "high"); dispatched subagents default to "low" (cheap parallel work); LISA_EFFORT overrides globally for power users.

Verified: tsc clean; 64 provider + 4 subagent tests pass; live /chat unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

oratis merged commit 5a112aa into main Jul 2, 2026
1 check passed

oratis mentioned this pull request Jul 2, 2026

feat(voice): ElevenLabs ASR + line-style composer icons #174

Merged

oratis merged 1 commit into main from feat/model-tuning
