Commit d5b07ff

Add compact feature
1 parent c2867d3 commit d5b07ff

7 files changed

Lines changed: 140 additions & 53 deletions

Lines changed: 139 additions & 0 deletions
@@ -0,0 +1,139 @@
---
title: Chat UI
description: A built-in chat interface for interactive conversations with your LLM agents, supporting rich media
---

## 📝 Rich Markdown & Syntax Highlighting

Full markdown rendering with syntax highlighting for popular programming languages:

<Screenshot src="/img/llms-syntax.webp" />

Code blocks include:
- Copy to clipboard on hover
- Language detection
- Line numbers
- Syntax highlighting
## Compact Feature

The **Compact** feature helps you manage long conversations by summarizing the current thread into a more concise version. You can then continue your conversation with the AI at a significantly lower token usage and cost, without losing the context of your discussion.

## When to use it

The **Compact** button appears automatically at the bottom of your thread when:

* The conversation has more than **10 messages**.
* **OR** you have used more than **40%** of the model's context limit.
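As a rough sketch, the visibility rule above amounts to a simple predicate (the function and parameter names here are illustrative, not the actual implementation):

```python
def should_show_compact(message_count: int, used_tokens: int, context_limit: int) -> bool:
    """Illustrative sketch of the Compact button's visibility rule:
    more than 10 messages, OR more than 40% of the model's context limit used."""
    return message_count > 10 or used_tokens > 0.4 * context_limit
```

Note that either condition alone is enough: a short thread with a few very large messages can still trigger the button via the context-limit check.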
<ScreenshotsGallery className="mb-8" gridClass="grid grid-cols-1 md:grid-cols-2 gap-4" images={{
    'Compact Button': '/img/compact-button.webp',
    'Compact Button Intensity': '/img/compact-intensity.webp',
}} />

## What it does

When activated, the Compact feature:

1. **Analyzes** your current conversation thread.
2. **Creates a new thread** with a summarized version of the chat history.
3. **Preserves key information** while discarding redundant or less important details.
4. **Targets 30%** of the original context size, giving you much more room to continue.

<Screenshot src="/img/compact-result.webp" />

<Info>Your original thread is preserved! Compact creates a *new* thread, so you can always go back to the full history if needed.</Info>
## Benefits

* **Save Costs**: Reduces the number of tokens sent to the LLM, lowering the cost per request
* **Extend Conversations**: Frees up context window space, preventing you from hitting the model's hard limit
* **Improve Focus**: Helps the AI focus on the current state of the conversation rather than getting distracted by old history
## Customizing Compact Behavior

The Compact feature is fully customizable through your [~/.llms/llms.json](https://github.com/ServiceStack/llms/blob/main/llms/llms.json) configuration file. You can modify the AI model used, the system prompt, and the user message template to tailor the compaction process to your needs.

### Configuration Location

Add a `compact` section to your [~/.llms/llms.json](https://github.com/ServiceStack/llms/blob/main/llms/llms.json) file under the `default` key:

```json
{
  "compact": {
    "model": "Gemini 2.5 Flash Lite",
    "messages": [
      { "role": "system", "content": "Your system prompt here..." },
      { "role": "user", "content": "Your user message template here..." }
    ]
  }
}
```
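For illustration, a client could read this section with a sketch like the following. The function name and the empty-dict fallback are assumptions, and since the prose above places `compact` under the `default` key while the snippet shows it at the top level, the sketch checks both locations:

```python
import json
from pathlib import Path

def load_compact_config(path: str = "~/.llms/llms.json") -> dict:
    """Sketch: read the `compact` section from an llms.json file.
    Checks both under the `default` key and at the top level (assumption),
    and falls back to an empty dict if the section is absent."""
    cfg = json.loads(Path(path).expanduser().read_text())
    return cfg.get("default", {}).get("compact") or cfg.get("compact", {})
```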
### Choosing a Model

You can specify any configured model for the compaction task. Fast, cost-effective models like **Gemini 2.5 Flash Lite** or **Claude 3.5 Haiku** are good choices since compaction is a straightforward summarization task.
### Template Placeholders

The user message template supports the following placeholders that get replaced with the actual thread data:

| Placeholder | Description |
|-------------|-------------|
| `{message_count}` | The total number of messages in the conversation being compacted |
| `{token_count}` | The approximate token count of the original conversation |
| `{target_tokens}` | The target token count for the compacted result (default: 30% of original) |
| `{messages_json}` | The full conversation history as a JSON array of message objects |
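The substitution itself can be sketched with plain string replacement (the function name and 30% default are illustrative; `.replace()` is used rather than `str.format()` because the JSON payload itself contains `{`/`}` characters that would break formatting):

```python
import json

def fill_template(template: str, messages: list, token_count: int, ratio: float = 0.3) -> str:
    """Sketch: substitute the documented placeholders with thread data.
    Plain .replace() avoids str.format() choking on braces in the JSON payload."""
    values = {
        "{message_count}": str(len(messages)),
        "{token_count}": str(token_count),
        "{target_tokens}": str(int(token_count * ratio)),  # default: 30% of original
        "{messages_json}": json.dumps(messages),
    }
    for placeholder, value in values.items():
        template = template.replace(placeholder, value)
    return template
```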
### Example User Message Template

```
Compact the following conversation while preserving all context needed to
continue it coherently. The conversation has {message_count} messages totaling
approximately {token_count} tokens. Target approximately {target_tokens} tokens.

<conversation>
{messages_json}
</conversation>

Return your response as a JSON object with a single "messages" key containing
the compacted array.
```
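Since the template asks the model for a JSON object with a single `messages` key, the reply can be parsed with a sketch like this (stripping an optional markdown code fence is an assumption about model behavior, not documented):

```python
import json

def parse_compacted(response_text: str) -> list:
    """Sketch: extract the compacted message array from the model's reply.
    Some models wrap JSON in a ```json fence, so strip that first (assumption)."""
    text = response_text.strip()
    if text.startswith("```"):
        text = text.split("\n", 1)[1]    # drop the opening fence line
        text = text.rsplit("```", 1)[0]  # drop the closing fence
    return json.loads(text)["messages"]
```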
### Customization Tips

- **Adjust the target ratio**: Modify the system prompt to request more or less aggressive compaction
- **Preserve specific content**: Add instructions to always keep certain types of information (code, URLs, decisions)
- **Change the output format**: Customize how the AI structures the compacted conversation
- **Use specialized models**: For technical conversations, you might prefer a model with stronger code understanding
## 🎭 Reasoning Support

Specialized rendering for reasoning models with thinking processes:

<Screenshot src="/img/llms-reasoning.webp" />

Shows:
- Thinking process (collapsed by default)
- Final response
- Clear separation between reasoning and output
## 📊 Token Metrics

See token usage for every message and conversation:

<Screenshot src="/img/llms-tokens-usage.webp" />

Displayed metrics:
- Per-message token count
- Thread total tokens
- Input vs output tokens
- Total cost
- Response time
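As a sketch of how those thread totals could be derived from per-message usage records (the `input_tokens`, `output_tokens`, and `cost` field names are illustrative, not the actual schema):

```python
def thread_totals(messages: list) -> dict:
    """Sketch: aggregate per-message usage into thread-level metrics
    like those shown in the UI. Field names are illustrative."""
    totals = {"input_tokens": 0, "output_tokens": 0, "cost": 0.0}
    for msg in messages:
        totals["input_tokens"] += msg.get("input_tokens", 0)
        totals["output_tokens"] += msg.get("output_tokens", 0)
        totals["cost"] += msg.get("cost", 0.0)
    totals["total_tokens"] = totals["input_tokens"] + totals["output_tokens"]
    return totals
```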
## ✏️ Edit & Redo

Edit previous messages or retry with different parameters:

- **Edit**: Modify user messages and rerun
- **Redo**: Regenerate AI responses
- Hover over messages to see options

β€Žcontent/docs/features/meta.jsonβ€Ž

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -4,6 +4,7 @@
44
"pages": [
55
"cli",
66
"web-ui",
7+
"chat-ui",
78
"analytics",
89
"core-tools",
910
"calculator-ui",

β€Žcontent/docs/features/web-ui.mdxβ€Ž

Lines changed: 0 additions & 53 deletions
Original file line numberDiff line numberDiff line change
@@ -29,18 +29,6 @@ Built-in dark mode support that respects your system preference or can be toggle
2929

3030
<Screenshot src="/img/llms-system-prompt-dark.webp" />
3131

32-
### πŸ“ Rich Markdown & Syntax Highlighting
33-
34-
Full markdown rendering with syntax highlighting for popular programming languages:
35-
36-
<Screenshot src="/img/llms-syntax.webp" />
37-
38-
Code blocks include:
39-
- Copy to clipboard on hover
40-
- Language detection
41-
- Line numbers
42-
- Syntax highlighting
43-
4432
### πŸ” Search History
4533

4634
Quickly find past conversations with built-in search:
@@ -93,47 +81,6 @@ Available parameters:
9381
- **Reasoning Effort**: For reasoning models
9482
- **Top Logprobs** (0-20): Token probability analysis
9583

96-
### 🎭 Reasoning Support
97-
98-
Specialized rendering for reasoning models with thinking processes:
99-
100-
<Screenshot src="/img/llms-reasoning.webp" />
101-
102-
Shows:
103-
- Thinking process (collapsed by default)
104-
- Final response
105-
- Clear separation between reasoning and output
106-
107-
### πŸ“Š Token Metrics
108-
109-
See token usage for every message and conversation:
110-
111-
<Screenshot src="/img/llms-tokens-usage.webp" />
112-
113-
Displayed metrics:
114-
- Per-message token count
115-
- Thread total tokens
116-
- Input vs output tokens
117-
- Total cost
118-
- Response time
119-
120-
### ✏️ Edit & Redo
121-
122-
Edit previous messages or retry with different parameters:
123-
124-
- **Edit**: Modify user messages and rerun
125-
- **Redo**: Regenerate AI responses
126-
- Hover over messages to see options
127-
128-
### πŸ’Ύ Export/Import
129-
130-
Backup and transfer your chat history:
131-
132-
- **Export**: Save all conversations to JSON
133-
- **Import**: Restore from backup
134-
- Hold `ALT` while clicking Export to include analytics
135-
- Transfer between browsers or instances
136-
13784
### πŸ”Œ Enable/Disable Providers
13885

13986
Manage which providers are active in real-time:
Binary image files added: 94.4 KB, 95.8 KB, 62.7 KB

β€Žpublic/img/llms-syntax.webpβ€Ž

43.6 KB
Loading

0 commit comments