@MichaelAnders
Contributor

Summary

  • Add TOPIC_DETECTION_MODEL env var to redirect topic detection to a lighter model

Problem

Topic detection requests use the same large model as the main request. For users running local models, this adds unnecessary GPU load for a simple classification task.

Changes

  • src/config/index.js: Added TOPIC_DETECTION_MODEL env var (default: "default"). When set to a model name, topic detection requests use that model instead of the main model. Wired into the config object and reloadConfig() for hot reload support.
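
A minimal sketch of that wiring, assuming the config module exposes a plain object plus a reloadConfig() function (the actual names and structure in src/config/index.js may differ):

```javascript
// Sketch only: field and function names here are assumptions,
// not the project's actual exports.
const config = {
  // Falls back to the "default" sentinel when the env var is unset.
  topicDetectionModel: process.env.TOPIC_DETECTION_MODEL || 'default',
};

function reloadConfig() {
  // Re-read the env var so a changed value is picked up without a restart.
  config.topicDetectionModel = process.env.TOPIC_DETECTION_MODEL || 'default';
}

module.exports = { config, reloadConfig };
```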

Configuration

# Use main model (default, unchanged behavior)
TOPIC_DETECTION_MODEL=default

# Redirect to a lighter model
TOPIC_DETECTION_MODEL=llama3.2:1b
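
The intended selection rule (the "default" sentinel preserves the old behavior) could be expressed like this; resolveTopicDetectionModel is a hypothetical helper for illustration, not a function from this PR:

```javascript
// Hypothetical helper: picks the model used for topic detection requests.
// An unset or "default" value falls back to the main model, so existing
// setups behave exactly as before.
function resolveTopicDetectionModel(config, mainModel) {
  const value = config.topicDetectionModel;
  return !value || value === 'default' ? mainModel : value;
}
```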

Testing

  • Default/unset: behavior unchanged
  • Set to model name: config correctly reads the value
  • Hot reload picks up changes without restart
  • npm run test:unit passes with no regressions

