Merged
record = collector.finalize("Home roasting is much cheaper.")

source_names = [source["name"] for source in record.data_sources]
assert "happymugcoffee.com" in source_names

source_names = [source["name"] for source in record.data_sources]
assert "happymugcoffee.com" in source_names
assert "burmancoffee.com" in source_names
e908cee to 7b6def7
All contributors have signed the CLA ✍️ ✅
Contributor
Author
I have read the CLA Document and I hereby sign the CLA
798ec74 to cdc23db
Contributor
Author
@tino097 This one is now ready for review, whenever you have capacity.
tino097 approved these changes on Apr 9, 2026
assistant_text_parts: list[str] = []
_max_auto_retries = 2
_retry_count = 0
self._active_explainability = ExplainabilityCollector(
Contributor
I see a similar initialization on line 145; why is this one slightly different?
Contributor
Author
This variable tracks the explainability for the current turn, so it's defaulted to None on initialization, and only updated when the turn begins.
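The lifecycle described in this reply can be sketched roughly as follows. This is a minimal illustration only: `TurnCollector`, `Session`, `begin_turn`, `record_sql`, and `end_turn` are hypothetical names, not the PR's actual classes or methods, and resetting the attribute to None when a turn ends is an assumption. Only the attribute name `_active_explainability` comes from the diff above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TurnCollector:
    """Hypothetical stand-in for the PR's ExplainabilityCollector."""
    user_message: str
    sql_queries: List[str] = field(default_factory=list)


class Session:
    """Hypothetical session object; only the per-turn attribute matters here."""

    def __init__(self) -> None:
        # No turn is in progress yet, so there is nothing to collect into.
        self._active_explainability: Optional[TurnCollector] = None

    def begin_turn(self, user_message: str) -> None:
        # A fresh collector is created each time a turn starts.
        self._active_explainability = TurnCollector(user_message)

    def record_sql(self, sql: str) -> None:
        # Outside of a turn the attribute is None, so the call is ignored.
        if self._active_explainability is not None:
            self._active_explainability.sql_queries.append(sql)

    def end_turn(self) -> Optional[TurnCollector]:
        # Hand back the finished collector and return to the idle state (assumed).
        finished = self._active_explainability
        self._active_explainability = None
        return finished
```

Keeping the attribute as None between turns lets any code path check whether a turn is active before recording, instead of writing into a stale collector from an earlier turn.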
cd733a2 to b0e8351
b0e8351 to bd3a32b
Adds Explainability
Summary
Anton can generate sophisticated answers to queries, drawing on data from many data sources. Many users will want to understand how these answers are sourced.
This PR adds explainability in accordance with the ticket requirements. As Anton works, it stores context under .anton/explainability/. These JSON objects will include these fields:
The most recent message will always be in latest.json, and all prior turns for the current conversation will be stored as turn-NNNN.json. Future conversations will overwrite these, so that only the most recent conversation is explainable. In future we could easily extend this to key turns by chat session, as is being done with memory.
Usage
Standard
To use explainability, ask a question that requires database queries, like:
Then use the new /explain command:
Auditing Data
If you'd like to see the data actually stored in the file, you can use this command:
The output will look like:
{
  "turn": 1,
  "created_at": "2026-04-08T23:03:47.011144+00:00",
  "user_message": "In the \"postgres-TPC-DS\" database, can you tell me what the total revenue per category is?",
  "answer_text": "Here's the **total revenue per category** across all store sales:\n\n| Category | Total Revenue |\n|---|---|\n| 🥇 Music | $533,555,300 |\n| 🥈 Shoes | $526,660,700 |\n| 🥉 Electronics | $513,248,600 |\n| Home | $512,377,100 |\n| Sports | $511,330,300 |\n| Women | $509,343,800 |\n| Men | $509,100,900 |\n| Children | $508,566,900 |\n| Books | $503,149,600 |\n| Jewelry | $497,373,400 |\n| *(uncategorized)* | $13,592,900 |\n\n**💰 Grand Total: ~$5.14 billion**\n\n---\n\n**Key observations:**\n\n- **Distribution is remarkably even** — the top category (Music at $534M) is only ~7% above the bottom named category (Jewelry at $497M). This is characteristic of TPC-DS benchmark data, which uses synthetic, balanced distributions.\n- **Music & Shoes lead** the pack, with Electronics close behind.\n- **Jewelry sits at the bottom** of named categories, trailing the leader by ~$36M (~7%).\n- There's a small **uncategorized bucket** (~$14M) from items with a `NULL` category — worth cleaning up if this were production data.\n\nWant me to break this down further — by year, store, or look at return rates and discounts by category?",
  "summary": "I queried postgres-tpc-ds with 1 SQL statement to gather the data behind this answer. \nI used the scratchpad to query total revenue per category from tpc-ds store_sales.",
  "data_sources": [
    {
      "name": "postgres-tpc-ds",
      "engine": null
    }
  ],
  "sql_queries": [
    {
      "datasource": "postgres-tpc-ds",
      "sql": "SELECT\n i.i_category AS category,\n SUM(ss.ss_sales_price::FLOAT * ss.ss_quantity::FLOAT) AS total_revenue\n FROM store_sales ss\n JOIN item i ON ss.ss_item_sk = i.i_item_sk\n GROUP BY i.i_category\n ORDER BY total_revenue DESC",
      "engine": null,
      "status": "ok",
      "error_message": null
    }
  ],
  "scratchpad_steps": [
    "Query total revenue per category from TPC-DS store_sales"
  ]
}
Tests
Run Tests
Known limitation:
Antontron’s UI currently offers explainability for the latest answer only, rather than footer actions on each individual response, because the desktop app does not yet have structured response rendering.
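For reviewers who want to reason about the on-disk layout from the Summary (latest.json plus turn-NNNN.json under .anton/explainability/, cleared when a new conversation starts), here is a rough, self-contained sketch. It is not the PR's implementation: `start_conversation`, `write_turn`, and `summarize_latest` are hypothetical helpers, and writing each turn to both its numbered file and latest.json is an assumption about how the two relate. The field names are taken from the sample output above.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def start_conversation(root: Path) -> None:
    """Remove records from the previous conversation (assumed behavior)."""
    root.mkdir(parents=True, exist_ok=True)
    for old in root.glob("*.json"):
        old.unlink()


def write_turn(root: Path, turn: int, record: dict) -> None:
    """Store one turn as turn-NNNN.json and mirror it to latest.json."""
    payload = {
        "turn": turn,
        "created_at": datetime.now(timezone.utc).isoformat(),
        **record,
    }
    text = json.dumps(payload, indent=2)
    (root / f"turn-{turn:04d}.json").write_text(text, encoding="utf-8")
    (root / "latest.json").write_text(text, encoding="utf-8")


def summarize_latest(root: Path) -> dict:
    """Read latest.json and return a compact audit summary."""
    data = json.loads((root / "latest.json").read_text(encoding="utf-8"))
    return {
        "turn": data["turn"],
        "sources": [s["name"] for s in data.get("data_sources", [])],
        "sql_count": len(data.get("sql_queries", [])),
    }


# Demo in a throwaway directory so no real .anton data is touched.
demo_root = Path(tempfile.mkdtemp()) / ".anton" / "explainability"
start_conversation(demo_root)
write_turn(
    demo_root,
    1,
    {
        "data_sources": [{"name": "postgres-tpc-ds", "engine": None}],
        "sql_queries": [
            {"datasource": "postgres-tpc-ds", "sql": "SELECT 1", "status": "ok"}
        ],
    },
)
print(summarize_latest(demo_root))
```

A helper like `summarize_latest` is handy for a quick audit when the full JSON is too long to read, since `answer_text` can run to many lines.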