
[STRC-926] Adds Explainability #82

Merged
pnewsam merged 1 commit into releases/v2.0.0 from feature/explainability-v1
Apr 9, 2026


Contributor

@pnewsam pnewsam commented Apr 7, 2026

Adds Explainability

Summary

Anton can generate sophisticated answers to queries by drawing on data from many data sources. Many users will want to understand how these answers are sourced.

This PR adds explainability in accordance with the ticket requirements. As Anton works, it stores context under .anton/explainability/. Each JSON record includes these fields:

  • answer text
  • summary
  • scratchpad steps
  • data sources, and
  • generated SQL (when available)

The most recent turn is always written to latest.json, and the turns of the current conversation are stored as turn-NNNN.json snapshots. A new conversation overwrites these files, so only the most recent conversation is explainable. In the future, this could easily be extended to key turns by chat session, as is being done with memory.

.anton/
  explainability/
    latest.json              # Always the most recent turn (overwritten each turn)
    turn-0001.json           # Per-turn snapshot (keyed by turn index, not session)
    turn-0002.json
    turn-NNNN.json
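
The layout above could be produced by something like the following. This is a minimal sketch, not the PR's implementation: the `write_turn_record` helper and its arguments are hypothetical, and it assumes each turn is written both to its own snapshot and to latest.json.

```python
import json
import tempfile
from pathlib import Path


def write_turn_record(root: Path, turn: int, record: dict) -> Path:
    """Write one turn as turn-NNNN.json and mirror it to latest.json.

    Hypothetical helper for illustration; the name and signature are assumptions.
    """
    out_dir = root / ".anton" / "explainability"
    out_dir.mkdir(parents=True, exist_ok=True)
    payload = json.dumps(record, indent=2, ensure_ascii=False)
    # Zero-pad the turn index to four digits to match turn-0001.json.
    turn_path = out_dir / f"turn-{turn:04d}.json"
    turn_path.write_text(payload, encoding="utf-8")
    # latest.json is overwritten on every turn.
    (out_dir / "latest.json").write_text(payload, encoding="utf-8")
    return turn_path


# Example: write two turns into a throwaway directory.
demo_root = Path(tempfile.mkdtemp())
write_turn_record(demo_root, 1, {"turn": 1, "summary": "first"})
second = write_turn_record(demo_root, 2, {"turn": 2, "summary": "second"})
```

After the second call, latest.json holds the same content as turn-0002.json, while turn-0001.json is left untouched.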

Usage

Standard

To use explainability, ask a question that requires database queries, like:

In the "postgres-TPC-DS" database, can you tell me what the total revenue per category is?

Then use the new /explain command:

you> /explain

Explain This Answer
Turn 1 • 2026-04-08T23:03:47.011144+00:00

Summary
I queried postgres-tpc-ds with 1 SQL statement to gather
the data behind this answer. I used the scratchpad to query
total revenue per category from tpc-ds store_sales.

Data Sources Used
  - postgres-tpc-ds

Generated SQL
  Query 1: postgres-tpc-ds
```sql
SELECT
    i.i_category AS category,
    SUM(ss.ss_sales_price::FLOAT * ss.ss_quantity::FLOAT) AS total_revenue
FROM store_sales ss
JOIN item i ON ss.ss_item_sk = i.i_item_sk
GROUP BY i.i_category
ORDER BY total_revenue DESC
```

Auditing Data

If you'd like to see the data actually stored in the file, you can use this command:

cat .anton/explainability/latest.json

The output will look like:

{
  "turn": 1,
  "created_at": "2026-04-08T23:03:47.011144+00:00",
  "user_message": "In the \"postgres-TPC-DS\" database, can you tell me what the total revenue per category is?",
  "answer_text": "Here's the **total revenue per category** across all store sales:\n\n| Category | Total Revenue |\n|---|---|\n| 🥇 Music | $533,555,300 |\n| 🥈 Shoes | $526,660,700 |\n| 🥉 Electronics | $513,248,600 |\n| Home | $512,377,100 |\n| Sports | $511,330,300 |\n| Women | $509,343,800 |\n| Men | $509,100,900 |\n| Children | $508,566,900 |\n| Books | $503,149,600 |\n| Jewelry | $497,373,400 |\n| *(uncategorized)* | $13,592,900 |\n\n**💰 Grand Total: ~$5.14 billion**\n\n---\n\n**Key observations:**\n\n- **Distribution is remarkably even** — the top category (Music at $534M) is only ~7% above the bottom named category (Jewelry at $497M). This is characteristic of TPC-DS benchmark data, which uses synthetic, balanced distributions.\n- **Music & Shoes lead** the pack, with Electronics close behind.\n- **Jewelry sits at the bottom** of named categories, trailing the leader by ~$36M (~7%).\n- There's a small **uncategorized bucket** (~$14M) from items with a `NULL` category — worth cleaning up if this were production data.\n\nWant me to break this down further — by year, store, or look at return rates and discounts by category?",
  "summary": "I queried postgres-tpc-ds with 1 SQL statement to gather the data behind this answer. I used the scratchpad to query total revenue per category from tpc-ds store_sales.",
  "data_sources": [
    {
      "name": "postgres-tpc-ds",
      "engine": null
    }
  ],
  "sql_queries": [
    {
      "datasource": "postgres-tpc-ds",
      "sql": "SELECT\n        i.i_category AS category,\n        SUM(ss.ss_sales_price::FLOAT * ss.ss_quantity::FLOAT) AS total_revenue\n    FROM store_sales ss\n    JOIN item i ON ss.ss_item_sk = i.i_item_sk\n    GROUP BY i.i_category\n    ORDER BY total_revenue DESC",
      "engine": null,
      "status": "ok",
      "error_message": null
    }
  ],
  "scratchpad_steps": [
    "Query total revenue per category from TPC-DS store_sales"
  ]
}
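
For auditing from a script rather than with cat, a record like the one above can be loaded and summarized in a few lines. This is a rough sketch: `load_latest` and `render_explanation` are hypothetical names, and the text layout only approximates the /explain output shown earlier. The field names come from the sample JSON.

```python
import json
from pathlib import Path


def load_latest(root: Path = Path(".")) -> dict:
    """Read .anton/explainability/latest.json relative to root."""
    path = root / ".anton" / "explainability" / "latest.json"
    return json.loads(path.read_text(encoding="utf-8"))


def render_explanation(record: dict) -> str:
    """Format a record as plain text (hypothetical renderer, not the PR's)."""
    lines = [
        "Explain This Answer",
        f"Turn {record['turn']} • {record['created_at']}",
        "",
        "Summary",
        record.get("summary", ""),
        "",
        "Data Sources Used",
    ]
    for source in record.get("data_sources", []):
        lines.append(f"  - {source['name']}")
    queries = record.get("sql_queries", [])
    if queries:  # SQL is only present when a query was generated
        lines += ["", "Generated SQL"]
        for index, query in enumerate(queries, start=1):
            lines.append(f"  Query {index}: {query['datasource']}")
            lines.append(query["sql"])
    return "\n".join(lines)


# A trimmed-down record using the same field names as the sample output.
sample = {
    "turn": 1,
    "created_at": "2026-04-08T23:03:47.011144+00:00",
    "summary": "I queried postgres-tpc-ds with 1 SQL statement.",
    "data_sources": [{"name": "postgres-tpc-ds", "engine": None}],
    "sql_queries": [{"datasource": "postgres-tpc-ds", "sql": "SELECT 1"}],
}
text = render_explanation(sample)
```

Calling `render_explanation(load_latest())` from the project root would do the same for the real file, assuming it exists.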

Tests

Run Tests

pytest /Users/paulnewsam/mindsdb/anton-monorepo/anton/tests/test_chat.py /Users/paulnewsam/mindsdb/anton-monorepo/anton/tests/test_explainability.py
python3 -m compileall /Users/paulnewsam/mindsdb/anton-monorepo/anton/anton

Known limitation:

Antontron’s UI currently offers explainability for the latest answer only, not per-response footer actions, because the desktop app does not yet have structured response rendering.

@pnewsam pnewsam changed the title from "Explainability v1" to "Adds explainability" Apr 7, 2026
record = collector.finalize("Home roasting is much cheaper.")

source_names = [source["name"] for source in record.data_sources]
assert "happymugcoffee.com" in source_names

source_names = [source["name"] for source in record.data_sources]
assert "happymugcoffee.com" in source_names
assert "burmancoffee.com" in source_names
@pnewsam pnewsam changed the title from "Adds explainability" to "Adds Explainability" Apr 7, 2026
@torrmal torrmal requested a review from tino097 April 8, 2026 03:00
@pnewsam pnewsam changed the base branch from main to releases/v2.0.0 April 8, 2026 19:07
@pnewsam pnewsam force-pushed the feature/explainability-v1 branch from e908cee to 7b6def7 Compare April 8, 2026 19:22

github-actions bot commented Apr 8, 2026

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

Contributor Author

pnewsam commented Apr 8, 2026

I have read the CLA Document and I hereby sign the CLA

github-actions bot added a commit that referenced this pull request Apr 8, 2026
@pnewsam pnewsam force-pushed the feature/explainability-v1 branch 2 times, most recently from 798ec74 to cdc23db Compare April 8, 2026 22:49
@pnewsam pnewsam marked this pull request as ready for review April 8, 2026 23:11
@pnewsam pnewsam changed the title from "Adds Explainability" to "[STRC-926] Adds Explainability" Apr 8, 2026
Contributor Author

pnewsam commented Apr 9, 2026

@tino097 This one is now ready for review, whenever you have capacity.

Contributor

@tino097 tino097 left a comment


Just a few nitpicks

assistant_text_parts: list[str] = []
_max_auto_retries = 2
_retry_count = 0
self._active_explainability = ExplainabilityCollector(
Contributor


I see a similar initialization on line 145. Why is this one slightly different?

Contributor Author


This variable tracks the explainability for the current turn, so it defaults to None on initialization and is only assigned when a turn begins.
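
That lifecycle can be sketched roughly as follows. The `TurnCollector` and `Session` classes are hypothetical stand-ins, not the PR's `ExplainabilityCollector` or chat session. Only the `_active_explainability` name and the list-of-dicts shape of `data_sources` are taken from the snippets in this PR.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TurnCollector:
    """Hypothetical stand-in for ExplainabilityCollector (assumed shape)."""

    turn: int
    data_sources: list[dict] = field(default_factory=list)

    def add_data_source(self, name: str) -> None:
        # Record each source once, preserving first-seen order.
        if all(source["name"] != name for source in self.data_sources):
            self.data_sources.append({"name": name})


class Session:
    """Hypothetical chat session holding the per-turn collector."""

    def __init__(self) -> None:
        self._turn_count = 0
        # No turn is in progress yet, so there is nothing to collect.
        self._active_explainability: Optional[TurnCollector] = None

    def begin_turn(self) -> TurnCollector:
        # A fresh collector per turn keeps one turn's context out of the next.
        self._turn_count += 1
        self._active_explainability = TurnCollector(turn=self._turn_count)
        return self._active_explainability


session = Session()
idle = session._active_explainability  # None before any turn starts
collector = session.begin_turn()
collector.add_data_source("postgres-tpc-ds")
collector.add_data_source("postgres-tpc-ds")  # duplicate is ignored
```

Creating the collector at turn start rather than in `__init__` means code that runs between turns sees None instead of stale data from the previous turn.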

@pnewsam pnewsam force-pushed the feature/explainability-v1 branch from cd733a2 to b0e8351 Compare April 9, 2026 16:42
@pnewsam pnewsam force-pushed the feature/explainability-v1 branch from b0e8351 to bd3a32b Compare April 9, 2026 16:47
@pnewsam pnewsam merged commit 7df4175 into releases/v2.0.0 Apr 9, 2026
1 check passed
@github-actions github-actions bot locked and limited conversation to collaborators Apr 9, 2026

3 participants