
Databricks connector missing catalog #2177

@justin-gerolami

Description


Describe the bug
The Databricks connector attempts to list columns via the information schema; however, no catalog is ever specified in the configuration. This causes errors because the SQL query cannot resolve INFORMATION_SCHEMA.

In wren-engine-ibis/app/model/connector.py, the DatabricksConnector calls dbsql.connect() without a catalog parameter, and in wren-engine-ibis/app/model/metadata/databricks.py, the get_table_list() query references INFORMATION_SCHEMA.COLUMNS without a catalog prefix. In Databricks Unity Catalog environments with multiple catalogs, if no default catalog is set on the SQL warehouse, Spark cannot resolve INFORMATION_SCHEMA without a catalog qualifier.

The DatabricksTokenConnectionInfo model in app/model/__init__.py also has no catalog field, so there is no way for the user to specify one through the UI.
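A minimal sketch of what such a field could look like. This is illustrative only: the real DatabricksTokenConnectionInfo in app/model/__init__.py is a Pydantic model, so a plain dataclass is used here just to keep the example self-contained, and the field name `catalog` is a suggestion, not existing code.

```python
# Illustrative sketch only: the real DatabricksTokenConnectionInfo is a
# Pydantic model in app/model/__init__.py; a dataclass keeps this runnable
# without extra dependencies.
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatabricksTokenConnectionInfo:
    server_hostname: str
    http_path: str
    access_token: str
    # New, optional field: existing configurations without a catalog
    # keep working unchanged.
    catalog: Optional[str] = None


# Hypothetical values for demonstration.
info = DatabricksTokenConnectionInfo(
    server_hostname="dbc-example.cloud.databricks.com",
    http_path="/sql/1.0/warehouses/abc123",
    access_token="dapi-example-token",
    catalog="my_catalog",
)
print(info.catalog)
```

Making the field optional (defaulting to None) would preserve backward compatibility for warehouses that already have a default catalog configured.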

SELECT
  c.TABLE_CATALOG AS TABLE_CATALOG,
  c.TABLE_SCHEMA AS TABLE_SCHEMA,
  c.TABLE_NAME AS TABLE_NAME,
  c.COLUMN_NAME AS COLUMN_NAME,
  c.DATA_TYPE AS DATA_TYPE,
  c.IS_NULLABLE AS IS_NULLABLE,
  c.COMMENT AS COLUMN_COMMENT,
  t.COMMENT AS TABLE_COMMENT
FROM
  INFORMATION_SCHEMA.COLUMNS c
    JOIN INFORMATION_SCHEMA.TABLES t
      ON c.TABLE_SCHEMA = t.TABLE_SCHEMA
      AND c.TABLE_NAME = t.TABLE_NAME
      AND c.TABLE_CATALOG = t.TABLE_CATALOG
WHERE
  c.TABLE_SCHEMA NOT IN ('information_schema')
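Alternatively, the query above could be qualified at build time. A hedged sketch of that approach (the helper name and its placement are hypothetical, not part of the current codebase):

```python
from typing import Optional


def qualify_information_schema(sql: str, catalog: Optional[str]) -> str:
    """Prefix INFORMATION_SCHEMA references with a backtick-quoted catalog.

    If no catalog is supplied, the query is returned unchanged, preserving
    today's behavior for warehouses that have a default catalog set.
    """
    if not catalog:
        return sql
    return sql.replace("INFORMATION_SCHEMA.", f"`{catalog}`.INFORMATION_SCHEMA.")


print(qualify_information_schema("FROM INFORMATION_SCHEMA.COLUMNS c", "main"))
```

A naive string replace is enough for this fixed metadata query, though a real fix would more likely pass the catalog to the connection itself (see below in Expected behavior).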

To Reproduce
Steps to reproduce the behavior:

  1. Have a Databricks workspace with Unity Catalog enabled and multiple catalogs
  2. Deploy WrenAI via Docker Compose
  3. Open the Wren UI and configure a new Databricks data source with server hostname, HTTP path, and access token
  4. Submit the connection — the ibis-server calls get_table_list() which runs the above query
  5. Error: [TABLE_OR_VIEW_NOT_FOUND] The table or view INFORMATION_SCHEMA.COLUMNS cannot be found. SQLSTATE: 42P01

Expected behavior
The Databricks connection form should include a catalog field. The selected catalog should be either:

  1. Passed as the catalog parameter to dbsql.connect() in DatabricksConnector.__init__(), or
  2. Used to qualify the query as <catalog>.INFORMATION_SCHEMA.COLUMNS

This would align with how the Databricks SQL Connector for Python supports catalog selection: dbsql.connect(server_hostname=..., http_path=..., access_token=..., catalog="my_catalog")
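One way to wire this up is to build the connect() keyword arguments conditionally, so that omitting the catalog keeps the current behavior. The helper below is a hypothetical sketch (dbsql.connect() itself does accept a catalog keyword, per the databricks-sql-connector documentation):

```python
from typing import Optional


def build_connect_kwargs(
    server_hostname: str,
    http_path: str,
    access_token: str,
    catalog: Optional[str] = None,
) -> dict:
    """Assemble keyword arguments for databricks.sql.connect().

    The "catalog" key is only included when a catalog is set, so existing
    connections without one behave exactly as before.
    """
    kwargs = {
        "server_hostname": server_hostname,
        "http_path": http_path,
        "access_token": access_token,
    }
    if catalog:
        kwargs["catalog"] = catalog
    return kwargs


# The connector would then call: dbsql.connect(**build_connect_kwargs(...))
```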

Workaround
Manually patching connector.py inside the ibis-server container to add catalog="<name>" to the dbsql.connect() call resolves the issue.

Desktop (please complete the following information):

  • OS: macOS (Darwin 25.0.0, Apple Silicon)
  • Browser: Chrome

Wren AI Information

  • Version: 0.29.1 (latest as of 2025-11-28)
  • Wren Engine: 0.22.0
  • Wren AI Service: 0.29.0
  • Ibis Server: 0.22.0
  • Wren UI: 0.32.2
  • Bootstrap: 0.1.5

Relevant log output

ibis-server stack trace:

File "/app/app/model/metadata/databricks.py", line 61, in get_table_list
    response = self.connection.query(sql).to_pandas().to_dict(orient="records")
  File "/app/app/model/connector.py", line 627, in query
    cursor.execute(sql)
  ...
databricks.sql.exc.ServerOperationError: [TABLE_OR_VIEW_NOT_FOUND] The table or view `INFORMATION_SCHEMA`.`COLUMNS` cannot be found. Verify the spelling and correctness of the schema and catalog.
If you did not qualify the name with a schema, verify the current_schema() output, or qualify the name with the correct schema and catalog.
To tolerate the error on drop use DROP VIEW IF EXISTS or DROP TABLE IF EXISTS. SQLSTATE: 42P01; line 12 pos 16

Labels: bug