@MichaelAnders (Contributor)

Summary

  • Add SUGGESTION_MODE_MODEL env var to skip suggestion mode LLM calls entirely or redirect them to a lighter model
  • Add hallucination guard to drop tool calls when no tools were sent to the model

Problem

Each user message triggers three concurrent runAgentLoop calls (main, suggestion, and others). Suggestion mode calls are useful, but they cost real LLM execution time (with large local models via Ollama, 30-90 seconds of GPU time per call) and, on hosted backends, can incur billable token usage. That compute could instead be serving the main request.

Changes

  • src/orchestrator/index.js: Added detectSuggestionMode(), which scans the last user message for the [SUGGESTION MODE: marker used by Claude Code's CLI and tags requests with _requestMode ("suggestion" vs. "main") in sanitizePayload (a sketch follows this list). When SUGGESTION_MODE_MODEL=none, suggestion requests return an empty response immediately without calling the LLM; when set to a model name, they use that lighter model instead of the default, freeing GPU for the main request
  • src/orchestrator/index.js: Added a hallucination guard that drops tool calls when no tools were sent to the model; some models (e.g. Llama 3.1) hallucinate tool_call blocks from conversation history even when the request had zero tool definitions (sketched after the Testing section)
  • src/config/index.js: Added SUGGESTION_MODE_MODEL config (default: "default")
  • src/clients/databricks.js: Minor adjustment to pass the suggestion mode model config through
  • .env.example: Documented the new env var with usage examples
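
A minimal sketch of the detection and tagging path, assuming an OpenAI-style messages array; the exact message shape and the surrounding sanitizePayload body are illustrative assumptions, not the actual implementation:

// Sketch only: message shape and helper wiring are assumptions.
function detectSuggestionMode(messages) {
  // Scan the last user message for the marker Claude Code's CLI adds.
  const lastUser = [...messages].reverse().find((m) => m.role === 'user');
  if (!lastUser) return false;
  // Content may be a plain string or an array of content blocks.
  const text = typeof lastUser.content === 'string'
    ? lastUser.content
    : lastUser.content.map((block) => block.text ?? '').join('\n');
  return text.includes('[SUGGESTION MODE:');
}

function sanitizePayload(payload) {
  // ...existing sanitization...
  payload._requestMode = detectSuggestionMode(payload.messages)
    ? 'suggestion'
    : 'main';
  return payload;
}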

Configuration

# Skip suggestion mode entirely
SUGGESTION_MODE_MODEL=none

# Redirect to a lighter model
SUGGESTION_MODE_MODEL=llama3.2:1b

# Default behavior (unchanged)
SUGGESTION_MODE_MODEL=default
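
For reference, the skip/override branch in the agent loop looks roughly like the sketch below; config.suggestionModeModel and the empty-response shape are assumptions for illustration:

// Hedged sketch: the real runAgentLoop does more; only the suggestion
// mode branch is shown, and the empty-response shape is an assumption.
async function runAgentLoop(payload, config) {
  if (payload._requestMode === 'suggestion') {
    const override = config.suggestionModeModel; // SUGGESTION_MODE_MODEL
    if (override === 'none') {
      // Skip the LLM call entirely; the caller gets an empty completion.
      return { role: 'assistant', content: '' };
    }
    if (override && override !== 'default') {
      // Redirect to the lighter model; 'default' (or unset) changes nothing.
      payload.model = override;
    }
  }
  // ...existing loop: call the model, handle tool calls, stream results...
}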

Testing

  • SUGGESTION_MODE_MODEL=none: suggestion calls return instantly (0ms)
  • SUGGESTION_MODE_MODEL=llama3.2:1b: redirects to the lighter model correctly
  • Default/unset: behavior unchanged
  • Main agent loop unaffected in all cases
  • Hallucination guard tested with Llama 3.1 no-tool requests (sketched below)
  • npm run test:unit passes with no regressions
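
The guard itself reduces to a small post-processing step. This sketch assumes an OpenAI-style chat-completions response; the field names (choices, tool_calls, finish_reason) are assumptions about the shapes this codebase handles:

// Sketch only: drops tool calls when the request offered no tools.
function stripHallucinatedToolCalls(response, requestTools) {
  const hadTools = Array.isArray(requestTools) && requestTools.length > 0;
  if (hadTools) return response; // tools were offered, so tool calls are legitimate
  for (const choice of response.choices ?? []) {
    if (choice.message?.tool_calls?.length) {
      // Models like Llama 3.1 can invent tool_call blocks from history
      // even when the request carried zero tool definitions.
      delete choice.message.tool_calls;
      choice.finish_reason = 'stop';
    }
  }
  return response;
}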
