fix(mssql): expose real schemas in information_schema.TABLES and fix preview query (#12242) by Beandon13 · Pull Request #12429 · mindsdb/minds-platform

Beandon13 · 2026-05-06T13:03:04Z

Description

When a MSSQL Server datasource is connected to MindsDB, two related bugs
affected how table schemas were exposed:

Bug 1 — INFORMATION_SCHEMA.TABLES hides real SQL schemas

MindsDB's system_tables layer unconditionally overwrites TABLE_SCHEMA
with the datasource name (e.g. f_prod). For single-schema integrations
(MySQL, SQLite) this is fine, but MSSQL databases routinely contain multiple
user schemas (dbo, app, usr, …). After the overwrite, all real schema
information was lost, so SELECT TABLE_SCHEMA, TABLE_NAME FROM INFORMATION_SCHEMA.TABLES returned f_prod in TABLE_SCHEMA and a bare
table name instead of the fully-qualified dbo.MyTable.

Bug 2 — UI table preview generates an invalid two-part query

Because the schema was not embedded in TABLE_NAME, the table-preview tooltip
generated SELECT * FROM f_prod.MyTable instead of the correct three-part
form SELECT * FROM f_prod.dbo.MyTable, which fails at execution time.

Type of change

🐛 Bug fix (non-breaking change which fixes an issue)

Fix

get_tables() in the MSSQL handler now accepts an all: bool = False
parameter, matching the pattern already used by the Postgres and Databricks
handlers:

Caller	`all`	Behaviour
`INFORMATION_SCHEMA.TABLES` (system_tables layer)	`False` (default)	`table_name` returned as `<schema>.<table>` so the full three-part reference is preserved after the datasource-level `TABLE_SCHEMA` override
Explorer / tree API (`tree.py`)	`True`	Raw `table_schema` + plain `table_name` returned so the UI can group tables under schema nodes
`self.schema` configured	either	Schema filter applied; plain unqualified `table_name` returned (original behaviour)

get_columns() was extended with an optional schema_name parameter
(consistent with the Postgres handler) and now automatically extracts the
schema from a qualified <schema>.<table> name produced by get_tables(),
so column lookups remain precise in all modes.

Non-user system schemas (sys, guest, fixed database role schemas) are
filtered out in both all=True and all=False modes.

Verification Process

Test Location: tests/unit/handlers/test_mssql.py
Verification Steps:
1. Connect a MSSQL datasource with multiple schemas (e.g. dbo, app).
2. Run SELECT TABLE_SCHEMA, TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = '<datasource>' — TABLE_NAME now includes the schema prefix (dbo.Customers, app.Orders, …).
3. Run SELECT * FROM <datasource>.dbo.Customers LIMIT 10 — executes successfully.
4. Hover a table in the Explorer UI — the preview query reads SELECT * FROM <datasource>.<schema>.<table> LIMIT 100.

Checklist

My code follows the style guidelines (PEP 8) of MindsDB.
I have appropriately commented on my code, especially in complex areas.
Relevant unit tests are updated or added (3 new test_get_tables_* tests, 2 new test_get_columns_* tests).

…preview query (mindsdb#12242) MindsDB's information_schema layer always overwrites TABLE_SCHEMA with the datasource name, which previously caused the real SQL schema names (dbo, app, usr, etc.) to be completely invisible when querying INFORMATION_SCHEMA.TABLES against a MSSQL datasource. The UI table-preview tooltip generated an invalid two-part query (<datasource>.<table>) instead of the required three-part form (<datasource>.<schema>.<table>). Changes: * mssql_handler.get_tables() now accepts an `all` flag (matching the postgres / databricks pattern used by tree.py for the Explorer UI): - all=True (Explorer mode): returns raw table_schema + table_name columns so the UI can group tables under their schema nodes. - all=False (default, used by INFORMATION_SCHEMA.TABLES): qualifies table_name as "<schema>.<table>" so the full three-part name is preserved even after the system-level TABLE_SCHEMA override. Non-user schemas (sys, guest, fixed database roles) are filtered out in both modes. - self.schema configured: always filters to that single schema with plain unqualified names (original behavior preserved). * mssql_handler.get_columns() now accepts an optional schema_name parameter (consistent with the postgres handler) and automatically extracts the schema from a qualified "<schema>.<table>" table_name produced by get_tables(), so column lookups remain precise when no explicit schema is configured. * Tests: three new test methods cover get_tables(all=False), get_tables(all=True), get_tables with a configured schema, get_columns with a qualified table name, and get_columns with an explicit schema_name argument. Fixes mindsdb#12242 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

entelligence-ai-pr-reviews · 2026-05-06T13:06:58Z

EntelligenceAI PR Summary

Refactors MSSQL handler table and column retrieval to support multiple query modes and improved schema resolution.

get_tables() now accepts all: bool parameter splitting behavior into schema-filtered, Explorer UI (all=True), and INFORMATION_SCHEMA (all=False) modes
System schemas (sys, INFORMATION_SCHEMA, guest, fixed-role schemas) are explicitly excluded in all query paths
Default mode qualifies table names as <schema>.<table> for full three-part name resolution in MindsDB SQL
get_columns() gains optional schema_name parameter with priority-based schema resolution: explicit arg > embedded schema in table name > self.schema
Four new unit tests added covering all new get_tables and get_columns modes and behaviors

Confidence Score: 2/5 - Changes Needed

Not safe to merge — the refactored get_tables() method in mssql_handler.py introduces a SQL injection vulnerability by directly interpolating the unsanitized schema_name parameter into the query string via f" AND table_schema = '{effective_schema}'", a security risk that did not exist before this PR. While the PR achieves meaningful improvements — multi-mode schema resolution, explicit exclusion of system schemas, and better three-part name qualification — the injection vector on effective_schema must be addressed with parameterized queries or strict allowlist validation before this can be safely deployed. Any caller or downstream system that passes user-influenced schema names would be exposed to arbitrary SQL execution against the MSSQL backend.

Key Findings:

Direct f-string interpolation of effective_schema (derived from the schema_name parameter) into the SQL WHERE clause in mssql_handler.py creates a SQL injection vector — a value like '; DROP TABLE foo; -- would execute arbitrary SQL on the target MSSQL server.
The vulnerability is newly introduced by this PR; prior to this refactor, the schema filtering path did not accept external schema_name input in this way, making this a net regression in security posture despite the functional improvements.
The fix is straightforward — use parameterized query arguments (e.g., passing effective_schema as a bound parameter rather than string interpolation), but the change must be made before merge to prevent exploitation.

Files requiring special attention

mindsdb/integrations/handlers/mssql_handler/mssql_handler.py

entelligence-ai-pr-reviews

Refactors MSSQL handler table and column retrieval to support multiple query modes and improved schema resolution.

get_tables() now accepts all: bool parameter splitting behavior into schema-filtered, Explorer UI (all=True), and INFORMATION_SCHEMA (all=False) modes
System schemas (sys, INFORMATION_SCHEMA, guest, fixed-role schemas) are explicitly excluded in all query paths
Default mode qualifies table names as <schema>.<table> for full three-part name resolution in MindsDB SQL
get_columns() gains optional schema_name parameter with priority-based schema resolution: explicit arg > embedded schema in table name > self.schema
Four new unit tests added covering all new get_tables and get_columns modes and behaviors

entelligence-ai-pr-reviews · 2026-05-06T13:06:59Z

+        # Resolve schema: explicit arg > embedded in table_name > handler-level self.schema.
+        effective_schema = schema_name or self.schema
+        if effective_schema is None and "." in table_name:
+            # table_name was qualified by get_tables() as "<schema>.<table>"
+            parts = table_name.split(".", 1)
+            effective_schema, table_name = parts[0], parts[1]
+
        query = f"""
            SELECT
                COLUMN_NAME,


Correctness: The new schema_name parameter is directly interpolated into the SQL query (f" AND table_schema = '{effective_schema}'") without any sanitization, creating a SQL injection vector that didn't exist before — a caller passing schema_name = "'; DROP TABLE foo; --" would execute arbitrary SQL.

🤖 AI Agent Prompt for Cursor/Windsurf

📋 Copy this prompt to your AI coding assistant (Cursor, Windsurf, etc.) to get help fixing this issue

In file `mindsdb/integrations/handlers/mssql_handler/mssql_handler.py`, the `get_columns` method (around line 533-542) builds SQL by directly interpolating `effective_schema` (which comes from the new `schema_name` parameter or from splitting `table_name`) into the query string. Add input validation/sanitization for `effective_schema` before using it in the f-string, e.g. validate it matches `^[a-zA-Z0-9_]+$` and raise ValueError otherwise, to prevent SQL injection through the new `schema_name` parameter.

Addresses SQL injection risk flagged in review: effective_schema and table_name were interpolated directly into the WHERE clause without sanitization. Apply SQL string-literal escaping (doubling single-quotes) to safe_table_name and safe_schema before interpolation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Beandon13 · 2026-05-06T14:04:44Z

Good catch on the injection risk. Fixed in the latest push: both table_name and effective_schema are now passed through str.replace("'", "''") before interpolation — standard SQL string-literal escaping. These values come from the DB's own INFORMATION_SCHEMA or from the handler's configured self.schema, but agreed the defensive escaping is the right call regardless of the source.

entelligence-ai-pr-reviews Bot reviewed May 6, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(mssql): expose real schemas in information_schema.TABLES and fix preview query (#12242)#12429

fix(mssql): expose real schemas in information_schema.TABLES and fix preview query (#12242)#12429
Beandon13 wants to merge 2 commits into
mindsdb:mainfrom
Beandon13:fix/mindsdb-12242-mssql-schema

Beandon13 commented May 6, 2026

Uh oh!

entelligence-ai-pr-reviews Bot commented May 6, 2026

Uh oh!

entelligence-ai-pr-reviews Bot left a comment

Uh oh!

entelligence-ai-pr-reviews Bot May 6, 2026

Uh oh!

Beandon13 commented May 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Beandon13 commented May 6, 2026

Description

Type of change

Fix

Verification Process

Checklist

Uh oh!

entelligence-ai-pr-reviews Bot commented May 6, 2026

EntelligenceAI PR Summary

Confidence Score: 2/5 - Changes Needed

Uh oh!

entelligence-ai-pr-reviews Bot left a comment

Choose a reason for hiding this comment

Uh oh!

entelligence-ai-pr-reviews Bot May 6, 2026

Choose a reason for hiding this comment

Uh oh!

Beandon13 commented May 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant