
Commit 9c338df

Merge pull request #50 from VoynichLabs/feature/model-profile-switching
config(llm): reprioritize models and switch to gemini-2.5-flash-lite across profiles
2 parents 6d09af4 + 2bfdf6a commit 9c338df

30 files changed: 983 additions & 177 deletions

docker-compose.yml

Lines changed: 19 additions & 0 deletions
@@ -151,6 +151,9 @@ services:
     volumes:
       - ./.env:/app/.env:ro
       - ./llm_config.json:/app/llm_config.json:ro
+      - ./llm_config.premium.json:/app/llm_config.premium.json:ro
+      - ./llm_config.frontier.json:/app/llm_config.frontier.json:ro
+      - ./llm_config.custom.json:/app/llm_config.custom.json:ro
       - ./run:/app/run
     restart: unless-stopped
     develop:
@@ -166,6 +169,15 @@ services:
       - action: sync
         path: ./llm_config.json
         target: /app/llm_config.json
+      - action: sync
+        path: ./llm_config.premium.json
+        target: /app/llm_config.premium.json
+      - action: sync
+        path: ./llm_config.frontier.json
+        target: /app/llm_config.frontier.json
+      - action: sync
+        path: ./llm_config.custom.json
+        target: /app/llm_config.custom.json
       - action: sync
         path: ./.env
         target: /app/.env
@@ -196,6 +208,13 @@ services:
       PLANEXE_FRONTEND_MULTIUSER_ADMIN_PASSWORD: ${PLANEXE_FRONTEND_MULTIUSER_ADMIN_PASSWORD:-admin}
     ports:
       - "${PLANEXE_FRONTEND_MULTIUSER_PORT:-5001}:5000"
+    volumes:
+      - ./.env:/app/.env:ro
+      - ./llm_config.json:/app/llm_config.json:ro
+      - ./llm_config.premium.json:/app/llm_config.premium.json:ro
+      - ./llm_config.frontier.json:/app/llm_config.frontier.json:ro
+      - ./llm_config.custom.json:/app/llm_config.custom.json:ro
+      - ./run:/app/run
     healthcheck:
       test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:5000/healthcheck').read()"]
       interval: 10s
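The compose changes above only mount all four profile files into the container; which file is actually read is decided at runtime by the selected profile, with a fallback to `llm_config.json` when the profile's file is missing or invalid. A minimal sketch of that selection behavior (the function name and mapping dict here are hypothetical; the real logic lives in the worker code):

```python
import json
from pathlib import Path

# Hypothetical helper; mirrors the documented profile -> file mapping.
PROFILE_TO_FILENAME = {
    "baseline": "llm_config.json",
    "premium": "llm_config.premium.json",
    "frontier": "llm_config.frontier.json",
    "custom": "llm_config.custom.json",
}

def resolve_llm_config(profile: str, root: Path) -> Path:
    """Return the profile's config file path, falling back to
    llm_config.json when the file is missing or not valid JSON."""
    fallback = root / "llm_config.json"
    candidate = root / PROFILE_TO_FILENAME.get(profile, "llm_config.json")
    try:
        json.loads(candidate.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing file or broken JSON: safe fallback to baseline.
        return fallback
    return candidate
```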

docs/llm_config.md

Lines changed: 66 additions & 42 deletions
@@ -1,16 +1,72 @@
 ---
-title: LLM config (llm_config.json)
+title: LLM config profiles
 ---
 
-# LLM config (llm_config.json)
+# LLM config profiles
 
-This file defines which LLM providers and models PlanExe can use. Each top‑level key is a model id used in the UI and pipeline.
+PlanExe supports **4 model profiles**:
 
-`llm_config.json` lives in the PlanExe repo root and is read at runtime. Environment variables are substituted from `.env`.
+- `baseline`
+- `premium`
+- `frontier`
+- `custom`
+
+Each profile maps to a separate config file:
+
+- `baseline` → `llm_config.json`
+- `premium` → `llm_config.premium.json`
+- `frontier` → `llm_config.frontier.json`
+- `custom` → `llm_config.custom.json` (or `PLANEXE_LLM_CONFIG_CUSTOM_FILENAME`)
+
+If the selected profile file is missing or invalid, PlanExe safely falls back to `llm_config.json`.
+
+---
+
+## How profile selection works
+
+### Runtime env var
+
+Set:
+
+- `PLANEXE_MODEL_PROFILE=baseline|premium|frontier|custom`
+
+This is passed end-to-end in worker execution paths (frontend/API/task parameters → worker pipeline).
+
+### Request/task parameter
+
+Task producers (web frontend, MCP) can include:
+
+- `model_profile`
+
+Invalid values are normalized to `baseline`.
+
+---
+
+## Strict filename validation
+
+Config filenames are strictly validated:
+
+- must be a **filename only** (no `/`, `\`, absolute path)
+- must match: `llm_config*.json`
+
+This prevents path traversal and unsafe file selection.
+
+Legacy override `PLANEXE_LLM_CONFIG_NAME` is still supported for backward compatibility, but profile-based selection is preferred.
+
+---
+
+## Provider-priority ordering per profile
+
+Within each profile config file, priority is defined per model entry:
+
+- lower `priority` value = tried first
+- higher `priority` value = fallback order
+
+`auto` mode uses this profile-specific priority ordering.
 
 ---
 
-## File structure
+## File format (same for all profile files)
 
 ```json
 {
@@ -24,8 +80,6 @@ This file defines which LLM providers and models PlanExe can use. Each top‑lev
       "api_key": "${OPENROUTER_API_KEY}",
       "temperature": 0.1,
       "timeout": 60.0,
-      "is_function_calling_model": false,
-      "is_chat_model": true,
       "max_tokens": 8192,
       "max_retries": 5
     }
@@ -35,41 +89,11 @@ This file defines which LLM providers and models PlanExe can use. Each top‑lev
 
 ---
 
-## Top-level fields
+## Backward compatibility
 
-- **comment**: Plain‑text description for humans. Optional.
-- **priority**: Lower number = higher priority when `auto` is selected. Optional.
-- **luigi_workers**: Number of Luigi workers used for this model. Use `1` for local models (Ollama/LM Studio).
-- **class**: Provider class name (e.g., `OpenRouter`, `OpenAI`, `Ollama`, `LMStudio`, `OpenAILike`).
-- **arguments**: Provider‑specific settings passed to the LLM client.
-
----
-
-## Common arguments
-
-These keys are common across most providers:
-
-- **model** / **model_name**: Provider model identifier.
-- **api_key**: API key reference (usually `${ENV_VAR}`).
-- **base_url** / **api_base**: Override the provider base URL.
-- **temperature**: Controls randomness. Lower is more deterministic.
-- **timeout** / **request_timeout**: Max time per request in seconds.
-- **max_tokens** / **max_completion_tokens**: Output token limit (provider specific).
-- **max_retries**: Retry count on transient errors.
-- **is_function_calling_model**: Whether the model supports structured/tool output.
-- **is_chat_model**: Whether the model uses chat format.
-
----
-
-## Choosing values
-
-- Use **luigi_workers = 1** for local models (Ollama / LM Studio).
-- Use **luigi_workers > 1** for cloud models if you want parallel tasks.
-- Keep **timeout** higher for slower models.
-
----
+When no profile is provided, PlanExe defaults to:
 
-## Notes
+- `baseline`
+- `llm_config.json`
 
-- If `llm_config.json` is missing, PlanExe logs a warning and proceeds with defaults.
-- Changes to `llm_config.json` require a container restart (or rebuild if baked into the image).
+So existing deployments continue to work without changes.
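The docs above state two validation rules for config filenames: filename only (no path separators, no absolute path) and matching `llm_config*.json`. A minimal sketch of such a check, assuming a hypothetical helper name (the actual validator in the repo may differ):

```python
from fnmatch import fnmatch

def is_safe_llm_config_filename(name: str) -> bool:
    """Hypothetical sketch of the documented rules: accept only a bare
    filename matching llm_config*.json; reject anything with a path
    separator, which also rejects absolute and traversal paths."""
    if not name:
        return False
    # Reject path separators on both POSIX and Windows.
    if "/" in name or "\\" in name:
        return False
    # Must match the documented glob pattern.
    return fnmatch(name, "llm_config*.json")
```

Because `../llm_config.json` and `/etc/llm_config.json` both contain a separator, the traversal and absolute-path cases are covered by the separator check alone.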

frontend_multi_user/Dockerfile

Lines changed: 1 addition & 0 deletions
@@ -13,6 +13,7 @@ WORKDIR /app
 COPY worker_plan/worker_plan_api /app/worker_plan_api
 COPY database_api /app/database_api
 COPY frontend_multi_user /app/frontend_multi_user
+COPY llm_config*.json /app/
 
 # Install dependencies from frontend_multi_user pyproject
 RUN set -eux; \

frontend_multi_user/src/app.py

Lines changed: 49 additions & 2 deletions
@@ -53,6 +53,7 @@
 
 from worker_plan_api.planexe_dotenv import DotEnvKeyEnum, PlanExeDotEnv
 from worker_plan_api.planexe_config import PlanExeConfig
+from worker_plan_api.model_profile import ModelProfileEnum, normalize_model_profile
 
 RUN_DIR = "run"
 
@@ -131,6 +132,43 @@ def wrapper(*args, **kwargs):
         return view(*args, **kwargs)
     return wrapper
 
+
+def _profile_model_name_map() -> Dict[str, list[str]]:
+    profile_to_models: Dict[str, list[str]] = {}
+    for profile in ModelProfileEnum:
+        config = PlanExeConfig.load(model_profile_override=profile)
+        config_path = config.llm_config_json_path
+        if config_path is None:
+            profile_to_models[profile.value] = []
+            continue
+        try:
+            with config_path.open("r", encoding="utf-8") as fh:
+                model_map = json.load(fh)
+        except Exception:
+            profile_to_models[profile.value] = []
+            continue
+        if not isinstance(model_map, dict):
+            profile_to_models[profile.value] = []
+            continue
+
+        def sort_key(item: tuple[str, dict]) -> tuple[int, str]:
+            data = item[1] if isinstance(item[1], dict) else {}
+            priority = data.get("priority")
+            if not isinstance(priority, int):
+                priority = 999999
+            return priority, item[0]
+
+        names: list[str] = []
+        for model_id, model_data in sorted(model_map.items(), key=sort_key):
+            model_name = model_id
+            if isinstance(model_data, dict):
+                args = model_data.get("arguments")
+                if isinstance(args, dict) and isinstance(args.get("model"), str):
+                    model_name = args["model"]
+            names.append(model_name)
+        profile_to_models[profile.value] = names
+    return profile_to_models
+
 class MyFlaskApp:
     def __init__(self):
         logger.info(f"MyFlaskApp.__init__. Starting...")
@@ -1944,6 +1982,7 @@ def index():
                 nonce=nonce,
                 user_id=user_id,
                 example_prompts=example_prompts,
+                model_profile_models_json=json.dumps(_profile_model_name_map()),
             )
 
         @self.app.route('/healthcheck')
@@ -2483,6 +2522,12 @@ def run():
             if len(parameters) == 0:
                 parameters = None
 
+            # Normalize model profile to a known value with backward-compatible baseline default.
+            if not isinstance(parameters, dict):
+                parameters = {}
+            raw_profile = parameters.get("model_profile")
+            parameters["model_profile"] = normalize_model_profile(raw_profile).value
+
             # Get length of prompt_param in bytes and in characters
             prompt_param_bytes = len(prompt_param.encode('utf-8'))
             prompt_param_characters = len(prompt_param)
@@ -2584,8 +2629,10 @@ def create_plan():
            parameters.pop('user_id', None)
            parameters.pop('nonce', None)
            parameters.pop('redirect_to_plan', None)
-            if len(parameters) == 0:
-                parameters = None
+
+            # Normalize model profile to a known value with backward-compatible baseline default.
+            raw_profile = parameters.get("model_profile")
+            parameters["model_profile"] = normalize_model_profile(raw_profile).value
 
            prompt_param_bytes = len(prompt_param.encode('utf-8'))
            prompt_param_characters = len(prompt_param)
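The `sort_key` helper in `_profile_model_name_map` orders models by ascending integer `priority`, treats a missing or non-integer priority as `999999` (so those entries sort last), and breaks ties by model id. The same ordering, extracted into a standalone sketch (the function name here is hypothetical):

```python
def ordered_model_ids(model_map: dict) -> list[str]:
    """Order model ids the way the diff's sort_key does: ascending
    integer priority, missing/non-int priority last, ties by id."""
    def sort_key(item):
        data = item[1] if isinstance(item[1], dict) else {}
        priority = data.get("priority")
        if not isinstance(priority, int):
            priority = 999999  # entries without a valid priority sort last
        return priority, item[0]
    return [model_id for model_id, _ in sorted(model_map.items(), key=sort_key)]
```

This is the ordering the docs describe for `auto` mode: lower `priority` is tried first, higher values are fallbacks.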

frontend_multi_user/templates/demo_run.html

Lines changed: 2 additions & 1 deletion
@@ -211,6 +211,7 @@ <h1>Demo Run</h1>
   <input type="hidden" name="nonce" value="{{ nonce }}">
   <!-- Values are submitted only when enabled (not disabled) -->
   <input type="hidden" name="speed_vs_detail" id="form-speed-vs-detail" value="ping_llm">
+  <input type="hidden" name="model_profile" id="form-model-profile" value="baseline">
   <input type="hidden" name="developer" id="form-developer" value="true">
 </form>
 
@@ -282,7 +283,7 @@ <h1>Demo Run</h1>
 
   if (methodSelect.value === 'GET') {
     // GET method: build URL with query parameters
-    let url = `/run?prompt=${encodeURIComponent(promptValue)}&user_id={{ user_id }}&nonce={{ nonce }}&speed_vs_detail=${encodeURIComponent(speedVsDetailValue)}`;
+    let url = `/run?prompt=${encodeURIComponent(promptValue)}&user_id={{ user_id }}&nonce={{ nonce }}&speed_vs_detail=${encodeURIComponent(speedVsDetailValue)}&model_profile=baseline`;
     if (developerChecked) {
       url += '&developer';
     }

frontend_multi_user/templates/index.html

Lines changed: 43 additions & 0 deletions
@@ -468,6 +468,21 @@ <h2>Start a New Plan</h2>
 <form id="new-plan-form" method="POST" action="{{ url_for('create_plan') }}">
   <input type="hidden" name="csrf_token" value="{{ csrf_token() }}">
   <input type="hidden" name="speed_vs_detail" value="all_details_but_slow">
+  <label for="model-profile" style="display:block; margin-bottom:8px; font-size:0.9rem; color:#a0aec0;">Model profile</label>
+  <select id="model-profile" name="model_profile" style="margin-bottom:12px; width:100%; max-width:240px; padding:8px; border-radius:8px;">
+    <option value="baseline" selected>baseline (default balanced)</option>
+    <option value="premium">premium (higher-cost ordering)</option>
+    <option value="frontier">frontier (highest-capability ordering)</option>
+    <option value="custom">custom (your custom file)</option>
+  </select>
+  <div style="margin-bottom:12px; font-size:0.85rem; color:#6b7280; line-height:1.4;">
+    baseline -> <code>llm_config.json</code>,
+    premium -> <code>llm_config.premium.json</code>,
+    frontier -> <code>llm_config.frontier.json</code>,
+    custom -> <code>llm_config.custom.json</code> (or <code>PLANEXE_LLM_CONFIG_CUSTOM_FILENAME</code>).
+    The actual models are read from the selected file's priority order.
+  </div>
+  <div id="model-profile-models" style="margin-bottom:12px; font-size:0.85rem; color:#4b5563; line-height:1.4;"></div>
   <textarea name="prompt" id="plan-prompt" placeholder="Describe your project or idea in detail. The more context you provide, the better the plan will be." required></textarea>
   <div class="char-count" id="char-count">0 characters</div>
   <div class="new-plan-footer">
@@ -586,4 +601,32 @@ <h3>Avoid Surprises</h3>
   });
 </script>
 {% endif %}
+{% if user %}
+<script>
+  var profileToModels = {{ model_profile_models_json | safe }};
+  var profileSelect = document.getElementById('model-profile');
+  var profileModelsDiv = document.getElementById('model-profile-models');
+
+  function renderProfileModels() {
+    if (!profileSelect || !profileModelsDiv) {
+      return;
+    }
+    var profile = profileSelect.value || 'baseline';
+    var models = profileToModels[profile] || [];
+    if (models.length === 0) {
+      profileModelsDiv.innerHTML = '<strong>Models in ' + profile + ':</strong> none found';
+      return;
+    }
+    var lines = models.map(function(modelName) {
+      return '<li><code>' + modelName + '</code></li>';
+    }).join('');
+    profileModelsDiv.innerHTML = '<strong>Models in ' + profile + ':</strong><ul style="margin:6px 0 0 16px;">' + lines + '</ul>';
+  }
+
+  if (profileSelect) {
+    profileSelect.addEventListener('change', renderProfileModels);
+  }
+  renderProfileModels();
+</script>
+{% endif %}
 {% endblock %}

frontend_single_user/Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@ RUN pip install --no-cache-dir --upgrade pip \
 # Copy application code and supporting files
 COPY worker_plan/worker_plan_api /app/worker_plan_api
 COPY frontend_single_user /app/frontend_single_user
-COPY llm_config.json /app/
+COPY llm_config*.json /app/
 
 # Default location for generated plans
 RUN mkdir -p /app/run
