11 changes: 11 additions & 0 deletions databricks-skills/databricks-agent-bricks/SKILL.md
@@ -204,6 +204,17 @@ manage_mas(
- **[databricks-model-serving](../databricks-model-serving/SKILL.md)** - Deploy custom agent endpoints used as MAS agents
- **[databricks-vector-search](../databricks-vector-search/SKILL.md)** - Build vector indexes for RAG applications paired with KAs

## Common Issues

| Issue | Solution |
|-------|----------|
| **KA endpoint stuck in PROVISIONING** | Endpoints take 5-15 minutes to provision. Use `manage_ka(action="get", tile_id="...")` to poll status. If stuck >20 min, delete and recreate |
| **KA returns generic answers ignoring documents** | Ensure documents are indexed (check knowledge source status). Add specific instructions telling the KA to cite sources |
| **MAS routes all questions to one agent** | Agent descriptions are critical for routing. Make each description specific about what that agent handles vs. doesn't handle |
| **"Endpoint not found" when querying KA** | The endpoint name follows the pattern `ka-{tile_id_prefix}-endpoint` where prefix is the first segment of the tile_id before the first hyphen |
| **Examples not being added to KA** | Examples are queued when endpoint is not ONLINE yet. They are added automatically once the endpoint becomes ready |
| **Genie space returns no results** | Verify the warehouse is running and the tables in `table_identifiers` exist and are accessible to the current user |
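
The endpoint-naming rule in the table above is easy to get wrong; it can be sketched in a few lines (the helper name `ka_endpoint_name` is ours for illustration, not a tool API):

```python
def ka_endpoint_name(tile_id: str) -> str:
    """Derive a KA serving endpoint name from its tile_id.

    The prefix is the first segment of the tile_id before the first hyphen,
    per the pattern `ka-{tile_id_prefix}-endpoint` described above.
    """
    prefix = tile_id.split("-", 1)[0]
    return f"ka-{prefix}-endpoint"

print(ka_endpoint_name("abc123-def-456"))  # ka-abc123-endpoint
```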

## See Also

- `1-knowledge-assistants.md` - Detailed KA patterns and examples
11 changes: 11 additions & 0 deletions databricks-skills/databricks-aibi-dashboards/SKILL.md
@@ -843,6 +843,17 @@ result = create_or_update_dashboard(
print(result["url"])
```

## Common Issues

| Issue | Solution |
|-------|----------|
| **Dashboard API returns 404** | Verify the dashboard ID is correct. Use `list_lakeview_dashboards` to find valid IDs. Draft dashboards use a different endpoint than published ones |
| **SQL query works in editor but fails in dashboard** | Dashboard queries run as the dashboard owner. Ensure the owner has `SELECT` on all referenced tables and `USE CATALOG`/`USE SCHEMA` grants |
| **Chart shows no data despite valid query** | Field names in `query.fields[].name` must exactly match `encodings[].fieldName`. See Troubleshooting section below for details |
| **Widget layout overlaps or misaligned** | Positions use a 6-column grid. Ensure `x + width <= 6` for each widget. Heights are in grid units (1 unit ≈ 40px) |
| **Filter widget not filtering other widgets** | Filters use `associatedQueries` to link to datasets. Verify the query name and column name match exactly |
| **Published dashboard shows stale data** | Published dashboards use a schedule. Update the schedule or use `execute_sql` to refresh the underlying tables |
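
The grid-layout rule above (`x + width <= 6`) can be checked before submitting a dashboard definition; this is a standalone sketch, and the widget dict shape is a simplified stand-in for the real Lakeview layout JSON:

```python
GRID_COLUMNS = 6  # Lakeview dashboards use a 6-column grid

def find_overflowing_widgets(widgets):
    """Return names of widgets whose position overflows the grid."""
    return [
        w["name"] for w in widgets
        if w["x"] + w["width"] > GRID_COLUMNS
    ]

widgets = [
    {"name": "sales_chart", "x": 0, "width": 4},  # 0 + 4 = 4: fits
    {"name": "kpi_tile", "x": 4, "width": 3},     # 4 + 3 = 7: overflows
]
print(find_overflowing_widgets(widgets))  # ['kpi_tile']
```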

## Troubleshooting

### Widget shows "no selected fields to visualize"
11 changes: 11 additions & 0 deletions databricks-skills/databricks-config/SKILL.md
@@ -20,3 +20,14 @@ Use the `manage_workspace` MCP tool for all workspace operations. Do NOT edit `~
4. Present the result. For `status`/`switch`/`login`: show host, profile, username. For `list`: formatted table with the active profile marked.

> **Note:** The switch is session-scoped — it resets on MCP server restart. For permanent profile setup, use `databricks auth login -p <profile>` and update `~/.databrickscfg` with `cluster_id` or `serverless_compute_id = auto`.

## Common Issues

| Issue | Solution |
|-------|----------|
| **`manage_workspace` returns "no profiles found"** | Run `databricks auth login --host https://your-workspace.cloud.databricks.com` to create a profile in `~/.databrickscfg` |
| **Switch doesn't persist after restart** | Expected — switches are session-scoped. For permanent changes, set `DATABRICKS_HOST` / `DATABRICKS_TOKEN` env vars |
| **"Token expired" errors** | Re-authenticate with `databricks auth login`. OAuth tokens from `databricks auth login` auto-refresh; PATs do not |
| **Wrong workspace after switching** | Use `action="status"` to verify which workspace is active. The MCP server may have restarted, resetting the switch |
| **Multiple profiles for same host** | Use distinct profile names. The CLI picks the first matching host if no profile is specified |
| **`DATABRICKS_CONFIG_PROFILE` not respected** | Env vars override `~/.databrickscfg` defaults. Unset conflicting env vars: `DATABRICKS_HOST`, `DATABRICKS_TOKEN` |
13 changes: 13 additions & 0 deletions databricks-skills/databricks-dbsql/SKILL.md
@@ -298,3 +298,16 @@ Load these for detailed syntax, full parameter lists, and advanced patterns:
- **Define PK/FK constraints** on dimensional models for query optimization
- **Use `COLLATE UTF8_LCASE`** for user-facing string columns that need case-insensitive search
- **Use MCP tools** (`execute_sql`, `execute_sql_multi`) to test and validate all SQL before deploying

## Common Issues

| Issue | Solution |
|-------|----------|
| **`execute_sql` times out on large queries** | Add `LIMIT` during development. For production, use `execute_sql_multi` to break into smaller statements |
| **`ai_query` returns NULL or errors** | Ensure the Foundation Model API endpoint exists and is running. Check that the prompt column is not NULL. Use `ai_query('databricks-meta-llama-...', col)` with a valid model name |
| **Pipe syntax `\|>` not recognized** | Pipe syntax requires DBR 16.2+. Check your warehouse version. Use traditional `SELECT ... FROM ... WHERE` as fallback |
| **`COLLATE` errors on string comparisons** | `COLLATE` requires DBR 16.0+. Define collation at column creation: `name STRING COLLATE UTF8_LCASE` |
| **Materialized view refresh fails** | MVs require a SQL warehouse or DLT pipeline to refresh. They cannot be refreshed from an all-purpose cluster |
| **`MERGE INTO` performance is slow** | Add `CLUSTER BY` on the merge key columns. Ensure the target table has liquid clustering enabled |
| **`http_request` blocked or returns 403** | `http_request` requires allowlisting the target domain. Contact your workspace admin to configure network access |
| **Recursive CTE hits iteration limit** | Default max recursion depth is 100. Restructure the query to avoid deep recursion, or raise the limit if your runtime exposes a setting for it; note that `OPTION (MAXRECURSION n)` is SQL Server syntax and is not valid in Databricks SQL |
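
Several rows above gate features on a minimum DBR version (pipe syntax needs 16.2+, `COLLATE` needs 16.0+). Version strings must be compared numerically, not lexically; a small helper makes the pitfall concrete (the function name is ours, for illustration):

```python
def meets_min_dbr(dbr_version: str, minimum: tuple) -> bool:
    """Compare (major, minor) as integers so '16.10' correctly
    satisfies a 16.2 minimum, which a string comparison would not."""
    major, minor = (int(p) for p in dbr_version.split(".")[:2])
    return (major, minor) >= minimum

PIPE_SYNTAX_MIN = (16, 2)
print(meets_min_dbr("15.4", PIPE_SYNTAX_MIN))   # False
print(meets_min_dbr("16.2", PIPE_SYNTAX_MIN))   # True
print(meets_min_dbr("16.10", PIPE_SYNTAX_MIN))  # True, despite "16.10" < "16.2" as strings
```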
10 changes: 10 additions & 0 deletions databricks-skills/databricks-docs/SKILL.md
@@ -55,6 +55,16 @@ The llms.txt file is organized by category:
2. Read the specific docs to understand the feature
3. Determine which skill/tools apply, then use them

## Common Issues

| Issue | Solution |
|-------|----------|
| **llms.txt is too large to process** | Don't fetch the entire file. Search for keywords in the URL index first, then fetch only the specific documentation pages you need |
| **Documentation page returns 404** | Databricks docs URLs change when features are renamed. Search llms.txt for the feature name to find the current URL |
| **Docs show different API than what works** | Check the DBR/runtime version. Many features require specific minimum versions (e.g., pipe syntax needs DBR 16.2+) |
| **Can't find docs for a preview feature** | Preview features may only be documented in release notes. Search for the feature name in the release notes page |
| **Conflicting information between docs pages** | Prefer the more specific page (e.g., feature-specific guide over general overview). Check the page's last-updated date |
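
The first row's advice, search the index rather than fetching the whole file, amounts to a line filter; a minimal sketch, with a made-up three-line index standing in for the real llms.txt:

```python
def search_index(llms_txt: str, keyword: str) -> list[str]:
    """Return only the index lines mentioning the keyword, so you
    fetch a handful of doc pages instead of the entire file."""
    kw = keyword.lower()
    return [line for line in llms_txt.splitlines() if kw in line.lower()]

index = """\
- [Delta Lake](https://docs.databricks.com/delta/index.html)
- [Structured Streaming](https://docs.databricks.com/structured-streaming/index.html)
- [Unity Catalog](https://docs.databricks.com/data-governance/unity-catalog/index.html)"""

print(search_index(index, "streaming"))  # only the Structured Streaming line
```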

## Related Skills

- **[databricks-python-sdk](../databricks-python-sdk/SKILL.md)** - SDK patterns for programmatic Databricks access
12 changes: 12 additions & 0 deletions databricks-skills/databricks-mlflow-evaluation/SKILL.md
@@ -139,6 +139,18 @@ For automatically improving a registered system prompt using `optimize_prompts()

See `GOTCHAS.md` for the complete list.

## Common Issues

| Issue | Solution |
|-------|----------|
| **`mlflow.evaluate()` vs `mlflow.genai.evaluate()`** | Use `mlflow.genai.evaluate()` for GenAI agents. The older `mlflow.evaluate()` has a different API and doesn't support GenAI scorers |
| **`predict_fn` receives dict instead of kwargs** | The predict function receives `**unpacked` keyword arguments, not a single dict. Define it as `def predict(query, context=None)` not `def predict(inputs)` |
| **Scorer returns wrong type** | `@scorer` functions must return a `Score` object: `Score(value=0.8, rationale="...")`. Don't return raw floats or strings |
| **Dataset `inputs` format error** | Inputs must be nested: `{"inputs": {"query": "..."}}` not `{"query": "..."}`. Each row's `inputs` dict is unpacked as kwargs to `predict_fn` |
| **Built-in scorer fails with "no guidelines"** | `Guidelines` scorer requires a `guidelines` parameter. Pass it as: `Guidelines(name="helpful", guidelines="The response should be helpful")` |
| **Evaluation runs but scores are all None** | Check that your scorer handles the response format correctly. If `predict_fn` returns a dict, the scorer receives that dict as `output` |
| **MemAlign requires human labels** | MemAlign calibrates judge prompts from domain expert feedback. You need at least 20-50 labeled examples for meaningful alignment |
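
The `predict_fn` and dataset-format rows describe one calling convention: each row nests arguments under `"inputs"`, and the harness unpacks that dict as keyword arguments. A minimal pure-Python illustration of the convention (not the full `mlflow.genai.evaluate()` API):

```python
def predict(query, context=None):
    """Signature matches the unpacked kwargs, not a single dict argument."""
    return {"response": f"answered: {query}"}

# Each row nests its arguments under "inputs"; the evaluation harness
# effectively calls predict(**row["inputs"]).
dataset = [
    {"inputs": {"query": "What is MLflow?"}},
    {"inputs": {"query": "What is a scorer?", "context": "docs"}},
]
for row in dataset:
    print(predict(**row["inputs"]))
```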

## Related Skills

- **[databricks-docs](../databricks-docs/SKILL.md)** - General Databricks documentation reference
12 changes: 12 additions & 0 deletions databricks-skills/databricks-python-sdk/SKILL.md
@@ -614,6 +614,18 @@ If I'm unsure about a method, I should:
| Secrets | https://databricks-sdk-py.readthedocs.io/en/latest/workspace/workspace/secrets.html |
| DBUtils | https://databricks-sdk-py.readthedocs.io/en/latest/dbutils.html |

## Common Issues

| Issue | Solution |
|-------|----------|
| **`ValueError: default auth` on `WorkspaceClient()`** | No valid credentials found. Run `databricks auth login --host <url>` or set `DATABRICKS_HOST` + `DATABRICKS_TOKEN` env vars |
| **`PermissionDenied` on API calls** | The authenticated user/SP lacks permissions. Check grants with `w.grants.get()` or ask a workspace admin |
| **SDK method signature changed** | The SDK is actively developed. Pin your version in `requirements.txt`. Check the [changelog](https://github.com/databricks/databricks-sdk-py/releases) for breaking changes |
| **`w.jobs.list()` is very slow** | The workspace may have thousands of jobs. Use `w.jobs.list(name="prefix")` to filter, or add `limit=N` |
| **Databricks Connect `SparkSession` fails** | Ensure `databricks-connect` version matches your DBR version. Use `serverless_compute_id="auto"` for serverless |
| **`ImportError` for SDK service classes** | Import from the correct submodule: `from databricks.sdk.service.workspace import ImportFormat` not `from databricks.sdk import ImportFormat` |
| **OAuth token refresh fails** | Re-run `databricks auth login`. If using a service principal, check that the client secret hasn't expired |
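
One defensive pattern against the "SDK method signature changed" row: introspect the installed signature and drop keyword arguments it no longer (or does not yet) accept. This is our own sketch, not an SDK facility, and the `list_jobs` stand-in below is hypothetical:

```python
import inspect

def call_compat(fn, **kwargs):
    """Drop kwargs the installed function doesn't accept, instead of
    raising TypeError after an SDK signature change (defensive sketch)."""
    params = inspect.signature(fn).parameters
    accepted = {k: v for k, v in kwargs.items() if k in params}
    return fn(**accepted)

def list_jobs(name=None):  # stand-in for w.jobs.list in an older SDK
    return f"jobs matching {name!r}"

print(call_compat(list_jobs, name="etl", limit=25))  # 'limit' silently dropped
```

Pinning the SDK version in `requirements.txt` remains the more robust fix; this pattern only softens the failure mode.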

## Related Skills

- **[databricks-config](../databricks-config/SKILL.md)** - profile and authentication setup
13 changes: 13 additions & 0 deletions databricks-skills/databricks-spark-structured-streaming/SKILL.md
@@ -63,3 +63,16 @@ df.writeStream \
- [ ] Exactly-once verified (txnVersion/txnAppId)
- [ ] Watermark configured for stateful operations
- [ ] Left joins for stream-static (not inner)

## Common Issues

| Issue | Solution |
|-------|----------|
| **Checkpoint corruption after schema change** | Checkpoints are tied to the query plan. Schema changes require a new checkpoint location. Back up the old checkpoint before changing |
| **OOM on stateful operations** | Enable RocksDB state store: `spark.conf.set("spark.sql.streaming.stateStore.providerClass", "com.databricks.sql.streaming.state.RocksDBStateStoreProvider")` |
| **`availableNow` trigger processes no data** | Ensure the source has new data since the last checkpoint. Check that the checkpoint path is correct and accessible |
| **Stream-static join returns stale data** | The static side is re-read each micro-batch unless it is cached. Avoid `.cache()` on the static DataFrame, or re-read/refresh it inside `foreachBatch` |
| **`foreachBatch` MERGE has duplicates** | Make the write idempotent: deduplicate the source batch on the merge key before `deltaTable.merge(...)`, and for plain appends set `.option("txnAppId", appId).option("txnVersion", batchId)` on the DataFrame writer |
| **Auto Loader `cloudFiles` schema inference fails** | Set `cloudFiles.schemaLocation` to a persistent path. For schema evolution, use `cloudFiles.schemaEvolutionMode = "addNewColumns"` |
| **Watermark delay too aggressive** | Late data arriving after the watermark is dropped silently. Set watermark delay >= max expected lateness of your data |
| **Streaming query silently stops** | Check the Spark UI for exceptions. Add a `StreamingQueryListener` or monitor `query.lastProgress` for null batches |
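
The txnAppId/txnVersion idempotency contract from the checklist and table above boils down to "skip a (app, batch) pair you have already committed". A pure-Python simulation of that contract, with no Spark or Delta dependency (class name is ours):

```python
class IdempotentSink:
    """Sketch of idempotent foreachBatch writes: a replayed
    (app_id, batch_id) pair is skipped instead of re-applied."""

    def __init__(self):
        self.committed = {}  # app_id -> highest batch_id applied
        self.rows = []

    def write_batch(self, app_id, batch_id, rows):
        if self.committed.get(app_id, -1) >= batch_id:
            return False  # replay after restart: already applied, skip
        self.rows.extend(rows)
        self.committed[app_id] = batch_id
        return True

sink = IdempotentSink()
sink.write_batch("etl", 0, ["a", "b"])
sink.write_batch("etl", 0, ["a", "b"])  # replayed micro-batch: ignored
print(sink.rows)  # ['a', 'b'], not duplicated
```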
12 changes: 12 additions & 0 deletions databricks-skills/databricks-unity-catalog/SKILL.md
@@ -113,6 +113,18 @@ mcp__databricks__execute_sql(
- **[databricks-synthetic-data-gen](../databricks-synthetic-data-gen/SKILL.md)** - for generating data stored in Unity Catalog Volumes
- **[databricks-aibi-dashboards](../databricks-aibi-dashboards/SKILL.md)** - for building dashboards on top of Unity Catalog data

## Common Issues

| Issue | Solution |
|-------|----------|
| **`PERMISSION_DENIED` on system tables** | System tables require explicit grants: `GRANT USE CATALOG ON CATALOG system TO group`, then `GRANT USE SCHEMA` and `GRANT SELECT` on the specific schema |
| **System table query is slow** | Always filter by date: `WHERE event_date >= current_date() - 7`. System tables can have billions of rows |
| **`GRANT` fails with "not owner"** | Only the object owner or metastore admin can grant permissions. Use `SHOW GRANTS ON <object>` to check current ownership |
| **Table not visible after creation** | Check that `USE CATALOG` and `USE SCHEMA` grants exist for the user/group. Three-level namespace requires grants at each level |
| **Tags not appearing on table** | Tags are set via `ALTER TABLE ... SET TAGS`. Verify with `SELECT * FROM system.information_schema.table_tags` |
| **External location permission denied** | The storage credential must have access to the cloud path. Check `SHOW EXTERNAL LOCATIONS` and verify IAM/SAS permissions |
| **Delta Sharing recipient can't access share** | Verify the recipient's activation link was used. Check `SHOW GRANTS ON SHARE` and ensure tables are added to the share |
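
The first row's three-grant sequence for system tables is mechanical enough to generate; a small sketch that emits the statements for a given schema and principal (quoting and exact grant set may need adjusting for your workspace):

```python
def system_table_grants(schema: str, principal: str) -> list[str]:
    """Build the grants a principal needs to read system.<schema> tables:
    USE CATALOG on the catalog, then USE SCHEMA and SELECT on the schema."""
    return [
        f"GRANT USE CATALOG ON CATALOG system TO `{principal}`",
        f"GRANT USE SCHEMA ON SCHEMA system.{schema} TO `{principal}`",
        f"GRANT SELECT ON SCHEMA system.{schema} TO `{principal}`",
    ]

for stmt in system_table_grants("billing", "data_engineers"):
    print(stmt)
```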

## Resources

- [Unity Catalog System Tables](https://docs.databricks.com/administration-guide/system-tables/)
10 changes: 10 additions & 0 deletions databricks-skills/spark-python-data-source/SKILL.md
@@ -136,6 +136,16 @@ Implement a batch writer for Snowflake with staged uploads
Write a data source for REST API with OAuth2 authentication and pagination
```

## Common Issues

| Issue | Solution |
|-------|----------|
| **`DataSource.schema()` returns wrong types** | Spark types must match exactly. Use `StructType([StructField("col", StringType())])` — don't return Python dicts |
| **Data source not found after registration** | Ensure `spark.dataSource.register(MyDataSource)` is called before `spark.read.format("my_source")`. The name comes from `MyDataSource.name()` |
| **Serialization error in `read()`** | The `DataSourceReader.read()` method runs on executors. Don't reference SparkSession or driver-only objects inside it |
| **Streaming source never triggers new batches** | `latestOffset()` must return a new offset when new data is available. If it returns the same offset, Spark skips the batch |
| **Schema evolution not supported** | Python data sources have a fixed schema from `schema()`. To handle schema changes, return a superset schema and fill missing fields with NULL |
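
The `latestOffset()` row describes a contract rather than an API call: Spark only plans a new micro-batch when the returned offset differs from the last one. A dependency-free sketch of that contract (the class and method names are ours, not the actual `DataSourceStreamReader` interface):

```python
class CounterOffsets:
    """Sketch of the latestOffset contract: return a *new* offset only
    when new data exists; returning the same offset skips the batch."""

    def __init__(self):
        self.available = 0  # rows the external system has produced

    def latest_offset(self):
        return {"offset": self.available}

src = CounterOffsets()
first = src.latest_offset()
src.available += 5            # new data arrives upstream
second = src.latest_offset()
print(first != second)  # True, so Spark would plan a new micro-batch
```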

## Related

- databricks-testing: Test data sources on Databricks clusters
8 changes: 2 additions & 6 deletions databricks-tools-core/databricks_tools_core/auth.py
@@ -160,9 +160,7 @@ def get_workspace_client() -> WorkspaceClient:
# Cross-workspace: explicit token overrides env OAuth so tool operations
# target the caller-specified workspace instead of the app's own workspace
if force and host and token:
return tag_client(
WorkspaceClient(host=host, token=token, auth_type="pat", **product_kwargs)
)
return tag_client(WorkspaceClient(host=host, token=token, auth_type="pat", **product_kwargs))

# In Databricks Apps (OAuth credentials in env), explicitly use OAuth M2M.
# Setting auth_type="oauth-m2m" prevents the SDK from also reading
@@ -185,9 +183,7 @@ def get_workspace_client() -> WorkspaceClient:

# Development mode: use explicit token if provided
if host and token:
return tag_client(
WorkspaceClient(host=host, token=token, auth_type="pat", **product_kwargs)
)
return tag_client(WorkspaceClient(host=host, token=token, auth_type="pat", **product_kwargs))

if host:
return tag_client(WorkspaceClient(host=host, **product_kwargs))