feat: add model selection, replace tinyllama with llama3.2 and qwen2.5
- Add multi-model support: users choose model via /newsession <number>
- Available models: Llama 3.2 (1B), Qwen 2.5 (1.5B instruct)
- Warn users on deprecated models and prompt to create new session
- Fix API Gateway 29s timeout: reduce Ollama timeout to 22s, add smart retry
- Fix Lambda deployment: include requests dependency in zip
- Update README with model management docs and available models table
- Update variables.tf default ollama_model to llama3.2:1b
Users select a model when creating a session via `/newsession <number>`. Sessions using removed models show a warning and prompt the user to create a new session.
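The number-to-model mapping can be sketched as a small registry (the names, model tags, and helpers below are illustrative, not the bot's actual code):

```python
# Hypothetical registry matching the commit description; tags are illustrative.
AVAILABLE_MODELS = {
    1: ("Llama 3.2 (1B)", "llama3.2:1b"),
    2: ("Qwen 2.5 (1.5B instruct)", "qwen2.5:1.5b-instruct"),
}

def format_model_list():
    """Render the numbered list shown to users in /newsession and /start."""
    return "\n".join(f"{n}. {name}" for n, (name, _tag) in AVAILABLE_MODELS.items())

def parse_newsession(arg):
    """Map the '<number>' argument of /newsession to an Ollama model tag,
    or None if it is not a valid choice (caller shows the model list)."""
    try:
        return AVAILABLE_MODELS[int(arg)][1]
    except (ValueError, KeyError):
        return None
```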
**Error Handling:**

- Connection timeouts (22s) with structured JSON error logging
- Smart retry: retries once only on fast connection errors (< 5s), not on timeouts
- HTTP status code validation (non-200 responses return a user-friendly error)
- Exception handling with stack traces logged to CloudWatch
- Graceful fallback: the bot remains functional even if Ollama is unreachable
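The timeout and retry policy above can be sketched roughly as follows, using `requests` (which the commit bundles into the Lambda zip); the function and constant names are illustrative, not the handler's actual code:

```python
import time
import requests

OLLAMA_TIMEOUT_S = 22  # stays under API Gateway's 29s hard limit
FAST_FAIL_S = 5        # only connection errors this fast are retried

def call_ollama(url, payload):
    """Retry once on fast connection errors; never retry timeouts."""
    for attempt in range(2):
        start = time.time()
        try:
            resp = requests.post(url, json=payload, timeout=OLLAMA_TIMEOUT_S)
            resp.raise_for_status()
            return resp.json()
        except requests.exceptions.ConnectionError:
            # A fast failure (e.g. instance still waking up) leaves budget
            # for one retry before the 29s gateway cutoff; a slow one does not.
            if attempt == 0 and time.time() - start < FAST_FAIL_S:
                continue
            raise
        except requests.exceptions.Timeout:
            raise  # a 22s timeout leaves no time to retry within the window
```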
The bot integrates with [Ollama](https://ollama.com), a self-hosted large language model server.

```bash
./scripts/manage-ollama.sh start    # Start instance, wait for Ollama API
./scripts/manage-ollama.sh stop     # Stop instance (syncs models to S3)
./scripts/manage-ollama.sh status   # Check instance and API health
./scripts/manage-ollama.sh ssh      # SSH into the instance
```
**Managing Models:**

To add a new model, SSH into the EC2 instance and pull it:

```bash
# SSH into the instance
./scripts/manage-ollama.sh ssh

# Pull a model (must set OLLAMA_HOST since Ollama binds to port 11435)
OLLAMA_HOST=http://127.0.0.1:11435 ollama pull llama3.2:1b
```

> **Note:** Ollama binds to `127.0.0.1:11435` (not the default 11434) because an nginx reverse proxy occupies port 11434 for API key authentication. Always set `OLLAMA_HOST=http://127.0.0.1:11435` when using the `ollama` CLI on the instance.
The `/start` greeting now lists the available models, and `/help` documents the new `/newsession` usage:

```python
resp = f"Hello! Your current model is {session['model_name']}.\n\nAvailable models:\n{format_model_list()}\n\nUse /newsession <number> to start a session with a specific model.\nChat away or use /help."
send_message(chat_id, resp)
return "start_or_hello"

if cmd == "/help":
    resp = f"""Commands:
/start or /hello - Greeting and session info
/newsession - Show available models
/newsession <number> - Start session with chosen model
/listsessions - List your sessions
/switch <number> - Switch to a session (e.g., /switch 1)
/history - Show recent messages in current session
```
When a session references a removed model, the bot warns the user and prompts for a new session:

```python
warn_msg = f"The model '{session_model}' is no longer available.\n\nPlease create a new session with an available model:\n{available_list}\n\nUsage: /newsession <number>"
if thinking_msg_id:
    edit_message(chat_id, thinking_msg_id, warn_msg)
else:
    send_message(chat_id, warn_msg)
return "model_unavailable"

# Check if a request is already being processed (within last 55 seconds)
```