[Bug] Context usage indicator always shows <1% with BYOK models via LiteLLM proxy #857

@Rubyj

Description

When using BYOK custom models routed through a LiteLLM proxy (v1.81.15), the context usage indicator in the CLI status bar always shows <1% regardless of actual context consumption. The /cost command also shows Input: 0 tokens (related to #157).

Environment

  • Droid CLI: v0.81.0
  • macOS (arm64)
  • BYOK config: provider: "anthropic" pointing to a LiteLLM proxy backed by Azure/Bedrock
  • LiteLLM version: 1.81.15
  • Models affected: All custom BYOK models (both Anthropic and OpenAI provider types)
  • showTokenUsageIndicator: true in settings.json

Steps to Reproduce

  1. Configure a BYOK model with provider: "anthropic" pointing to a LiteLLM proxy
  2. Start a droid session using the custom model
  3. Run several prompts that consume meaningful context
  4. Observe the context usage indicator -- always shows <1%
  5. Run /cost -- input tokens show as 0

Investigation: The proxy IS returning correct usage data

I tested the LiteLLM proxy directly with curl and confirmed all token usage fields are correctly populated in every response path:

Non-streaming:

"usage": {"input_tokens": 16, "cache_creation_input_tokens": 0,
          "cache_read_input_tokens": 0, "output_tokens": 8, "total_tokens": 24}

Streaming (message_start event):

"usage": {"input_tokens": 16, "cache_creation_input_tokens": 0,
          "cache_read_input_tokens": 0, "output_tokens": 1}

Streaming (message_delta event):

"usage": {"output_tokens": 8}

Streaming (message_stop event):

"usage": {"input_tokens": 16, "output_tokens": 8}

Streaming with extended thinking: Also returns correct input_tokens and output_tokens.

All Anthropic SSE event types (message_start, message_delta, message_stop) contain properly structured usage data. The proxy is fully compliant with the Anthropic streaming format.
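For reference, a client consuming this stream is expected to take `input_tokens` from `message_start` and the final `output_tokens` from `message_delta`/`message_stop`. A minimal sketch of that accumulation, using the event shapes captured above (this is illustrative, not Droid's actual code):

```python
# Minimal sketch of accumulating Anthropic streaming usage.
# Event shapes follow the proxy captures above.

def accumulate_usage(events):
    """Fold a sequence of (event_type, payload) pairs into a usage total."""
    totals = {"input_tokens": 0, "output_tokens": 0}
    for event_type, payload in events:
        # message_start nests usage under "message"; later events carry it top-level
        usage = payload.get("usage") or payload.get("message", {}).get("usage", {})
        if event_type == "message_start":
            totals["input_tokens"] = usage.get("input_tokens", 0)
        if "output_tokens" in usage:
            # message_delta/message_stop carry the running output count
            totals["output_tokens"] = usage["output_tokens"]
    return totals

# Events as captured from the LiteLLM proxy above:
events = [
    ("message_start", {"message": {"usage": {"input_tokens": 16, "output_tokens": 1}}}),
    ("message_delta", {"usage": {"output_tokens": 8}}),
    ("message_stop", {"usage": {"input_tokens": 16, "output_tokens": 8}}),
]
print(accumulate_usage(events))  # {'input_tokens': 16, 'output_tokens': 8}
```

If Droid skips the `message_start` usage for BYOK providers, `input_tokens` would stay 0 exactly as observed in `/cost`.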

Likely Root Causes

  1. Context window size unknown for custom models: The proxy's /v1/models endpoint returns minimal metadata (no max_tokens or context_window field). Droid likely computes context % as input_tokens / context_window_size, and without knowing the context window for a custom model, it may default to 0 or an extremely large value, yielding <1%.
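To illustrate hypothesis 1: if the percentage is computed as `input_tokens / context_window` and an unknown window falls back to a very large sentinel, the indicator will always render as <1%. The fallback value and function name below are assumptions for illustration, not Droid's actual code:

```python
# Illustrative only: shows how an unknown context window could pin the
# indicator at <1%. The sentinel default is an assumption.

UNKNOWN_CONTEXT_FALLBACK = 10**9  # hypothetical huge default

def context_percent(input_tokens, context_window=None):
    window = context_window or UNKNOWN_CONTEXT_FALLBACK
    return 100 * input_tokens / window

print(context_percent(50_000, 200_000))  # 25.0 -- correct with a known window
print(context_percent(50_000))           # 0.005 -- renders as "<1%"
```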

  2. Token accumulation not working for BYOK streaming: Even though usage data is present in streaming events, Droid may not be parsing/accumulating input_tokens from the message_start event for BYOK provider models (per [Bug] /cost only shows partial token counts when using BYOK models #157).

Expected Behavior

  • Context usage indicator should reflect actual token consumption as a percentage of the model's context window
  • For BYOK models where context window is unknown, Droid could either:
    • Allow users to specify context_window in the BYOK config
    • Use a sensible default based on the model name (e.g., claude-* → 200K)
    • Fall back to showing raw token count instead of a percentage
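The name-based default plus raw-count fallback could look something like the sketch below. The mapping and limits are assumptions for illustration; real values should come from the BYOK config or provider documentation:

```python
# Hypothetical fallback: infer a context window from the model name,
# otherwise report a raw token count instead of a percentage.

DEFAULT_WINDOWS = {"claude-": 200_000}  # assumed name -> window mapping

def format_context_usage(model, input_tokens, context_window=None):
    window = context_window or next(
        (w for prefix, w in DEFAULT_WINDOWS.items() if model.startswith(prefix)),
        None,
    )
    if window:
        return f"{100 * input_tokens / window:.0f}%"
    return f"{input_tokens} tokens"  # no window known: show raw count

print(format_context_usage("claude-sonnet-4", 50_000))  # "25%"
print(format_context_usage("my-custom-model", 50_000))  # "50000 tokens"
```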
