11 changes: 11 additions & 0 deletions databricks-skills/databricks-agent-bricks/SKILL.md
@@ -204,6 +204,17 @@ manage_mas(
- **[databricks-model-serving](../databricks-model-serving/SKILL.md)** - Deploy custom agent endpoints used as MAS agents
- **[databricks-vector-search](../databricks-vector-search/SKILL.md)** - Build vector indexes for RAG applications paired with KAs

## Common Issues

| Issue | Solution |
|-------|----------|
| **KA endpoint stuck in PROVISIONING** | Endpoints take 5-15 minutes to provision. Use `manage_ka(action="get", tile_id="...")` to poll status. If stuck >20 min, delete and recreate |
| **KA returns generic answers ignoring documents** | Ensure documents are indexed (check knowledge source status). Add specific instructions telling the KA to cite sources |
| **MAS routes all questions to one agent** | Agent descriptions are critical for routing. Make each description specific about what that agent handles vs. doesn't handle |
| **"Endpoint not found" when querying KA** | The endpoint name follows the pattern `ka-{tile_id_prefix}-endpoint` where prefix is the first segment of the tile_id before the first hyphen |
| **Examples not being added to KA** | Examples are queued when endpoint is not ONLINE yet. They are added automatically once the endpoint becomes ready |
| **Genie space returns no results** | Verify the warehouse is running and the tables in `table_identifiers` exist and are accessible to the current user |
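
The endpoint-naming rule in the table above is easy to get wrong; it can be sketched in a few lines (the helper name `ka_endpoint_name` is ours for illustration, not a tool API):

```python
def ka_endpoint_name(tile_id: str) -> str:
    """Derive a KA serving endpoint name from its tile_id.

    The prefix is the first segment of the tile_id before the first hyphen,
    per the pattern `ka-{tile_id_prefix}-endpoint` described above.
    """
    prefix = tile_id.split("-", 1)[0]
    return f"ka-{prefix}-endpoint"

print(ka_endpoint_name("abc123-def-456"))  # ka-abc123-endpoint
```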

## See Also

- `1-knowledge-assistants.md` - Detailed KA patterns and examples
11 changes: 11 additions & 0 deletions databricks-skills/databricks-aibi-dashboards/SKILL.md
@@ -843,6 +843,17 @@ result = create_or_update_dashboard(
print(result["url"])
```

## Common Issues

| Issue | Solution |
|-------|----------|
| **Dashboard API returns 404** | Verify the dashboard ID is correct. Use `list_lakeview_dashboards` to find valid IDs. Draft dashboards use a different endpoint than published ones |
| **SQL query works in editor but fails in dashboard** | Dashboard queries run as the dashboard owner. Ensure the owner has `SELECT` on all referenced tables and `USE CATALOG`/`USE SCHEMA` grants |
| **Chart shows no data despite valid query** | Field names in `query.fields[].name` must exactly match `encodings[].fieldName`. See Troubleshooting section below for details |
| **Widget layout overlaps or misaligned** | Positions use a 6-column grid. Ensure `x + width <= 6` for each widget. Heights are in grid units (1 unit ≈ 40px) |
| **Filter widget not filtering other widgets** | Filters use `associatedQueries` to link to datasets. Verify the query name and column name match exactly |
| **Published dashboard shows stale data** | Published dashboards use a schedule. Update the schedule or use `execute_sql` to refresh the underlying tables |
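
The grid-layout rule above (`x + width <= 6`) can be checked before submitting a dashboard definition; this is a standalone sketch, and the widget dict shape is a simplified stand-in for the real Lakeview layout JSON:

```python
GRID_COLUMNS = 6  # Lakeview dashboards use a 6-column grid

def find_overflowing_widgets(widgets):
    """Return names of widgets whose position overflows the grid."""
    return [
        w["name"] for w in widgets
        if w["x"] + w["width"] > GRID_COLUMNS
    ]

widgets = [
    {"name": "sales_chart", "x": 0, "width": 4},  # 0 + 4 = 4: fits
    {"name": "kpi_tile", "x": 4, "width": 3},     # 4 + 3 = 7: overflows
]
print(find_overflowing_widgets(widgets))  # ['kpi_tile']
```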

## Troubleshooting

### Widget shows "no selected fields to visualize"
11 changes: 11 additions & 0 deletions databricks-skills/databricks-config/SKILL.md
@@ -20,3 +20,14 @@ Use the `manage_workspace` MCP tool for all workspace operations. Do NOT edit `~
4. Present the result. For `status`/`switch`/`login`: show host, profile, username. For `list`: formatted table with the active profile marked.

> **Note:** The switch is session-scoped — it resets on MCP server restart. For permanent profile setup, use `databricks auth login -p <profile>` and update `~/.databrickscfg` with `cluster_id` or `serverless_compute_id = auto`.

## Common Issues

| Issue | Solution |
|-------|----------|
| **`manage_workspace` returns "no profiles found"** | Run `databricks auth login --host https://your-workspace.cloud.databricks.com` to create a profile in `~/.databrickscfg` |
| **Switch doesn't persist after restart** | Expected — switches are session-scoped. For permanent changes, set `DATABRICKS_HOST` / `DATABRICKS_TOKEN` env vars |
| **"Token expired" errors** | Re-authenticate with `databricks auth login`. OAuth tokens from `databricks auth login` auto-refresh; PATs do not |
| **Wrong workspace after switching** | Use `action="status"` to verify which workspace is active. The MCP server may have restarted, resetting the switch |
| **Multiple profiles for same host** | Use distinct profile names. The CLI picks the first matching host if no profile is specified |
| **`DATABRICKS_CONFIG_PROFILE` not respected** | Env vars override `~/.databrickscfg` defaults. Unset conflicting env vars: `DATABRICKS_HOST`, `DATABRICKS_TOKEN` |
13 changes: 13 additions & 0 deletions databricks-skills/databricks-dbsql/SKILL.md
@@ -298,3 +298,16 @@ Load these for detailed syntax, full parameter lists, and advanced patterns:
- **Define PK/FK constraints** on dimensional models for query optimization
- **Use `COLLATE UTF8_LCASE`** for user-facing string columns that need case-insensitive search
- **Use MCP tools** (`execute_sql`, `execute_sql_multi`) to test and validate all SQL before deploying

## Common Issues

| Issue | Solution |
|-------|----------|
| **`execute_sql` times out on large queries** | Add `LIMIT` during development. For production, use `execute_sql_multi` to break into smaller statements |
| **`ai_query` returns NULL or errors** | Ensure the Foundation Model API endpoint exists and is running. Check that the prompt column is not NULL. Use `ai_query('databricks-meta-llama-...', col)` with a valid model name |
| **Pipe syntax `\|>` not recognized** | Pipe syntax requires DBR 16.2+. Check your warehouse version. Use traditional `SELECT ... FROM ... WHERE` as fallback |
| **`COLLATE` errors on string comparisons** | `COLLATE` requires DBR 16.0+. Define collation at column creation: `name STRING COLLATE UTF8_LCASE` |
| **Materialized view refresh fails** | MVs require a SQL warehouse or DLT pipeline to refresh. They cannot be refreshed from an all-purpose cluster |
| **`MERGE INTO` performance is slow** | Add `CLUSTER BY` on the merge key columns. Ensure the target table has liquid clustering enabled |
| **`http_request` blocked or returns 403** | `http_request` requires allowlisting the target domain. Contact your workspace admin to configure network access |
| **Recursive CTE hits iteration limit** | Default max recursion depth is 100. Restructure the query to avoid deep recursion, or raise the limit if your runtime exposes a setting for it; note that `OPTION (MAXRECURSION n)` is SQL Server syntax and is not valid in Databricks SQL |
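
Several rows above gate features on a minimum DBR version (pipe syntax needs 16.2+, `COLLATE` needs 16.0+). Version strings must be compared numerically, not lexically; a small helper makes the pitfall concrete (the function name is ours, for illustration):

```python
def meets_min_dbr(dbr_version: str, minimum: tuple) -> bool:
    """Compare (major, minor) as integers so '16.10' correctly
    satisfies a 16.2 minimum, which a string comparison would not."""
    major, minor = (int(p) for p in dbr_version.split(".")[:2])
    return (major, minor) >= minimum

PIPE_SYNTAX_MIN = (16, 2)
print(meets_min_dbr("15.4", PIPE_SYNTAX_MIN))   # False
print(meets_min_dbr("16.2", PIPE_SYNTAX_MIN))   # True
print(meets_min_dbr("16.10", PIPE_SYNTAX_MIN))  # True, despite "16.10" < "16.2" as strings
```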
10 changes: 10 additions & 0 deletions databricks-skills/databricks-docs/SKILL.md
@@ -55,6 +55,16 @@ The llms.txt file is organized by category:
2. Read the specific docs to understand the feature
3. Determine which skill/tools apply, then use them

## Common Issues

| Issue | Solution |
|-------|----------|
| **llms.txt is too large to process** | Don't fetch the entire file. Search for keywords in the URL index first, then fetch only the specific documentation pages you need |
| **Documentation page returns 404** | Databricks docs URLs change when features are renamed. Search llms.txt for the feature name to find the current URL |
| **Docs show different API than what works** | Check the DBR/runtime version. Many features require specific minimum versions (e.g., pipe syntax needs DBR 16.2+) |
| **Can't find docs for a preview feature** | Preview features may only be documented in release notes. Search for the feature name in the release notes page |
| **Conflicting information between docs pages** | Prefer the more specific page (e.g., feature-specific guide over general overview). Check the page's last-updated date |
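
The first row's advice, search the index rather than fetching the whole file, amounts to a line filter; a minimal sketch, with a made-up three-line index standing in for the real llms.txt:

```python
def search_index(llms_txt: str, keyword: str) -> list[str]:
    """Return only the index lines mentioning the keyword, so you
    fetch a handful of doc pages instead of the entire file."""
    kw = keyword.lower()
    return [line for line in llms_txt.splitlines() if kw in line.lower()]

index = """\
- [Delta Lake](https://docs.databricks.com/delta/index.html)
- [Structured Streaming](https://docs.databricks.com/structured-streaming/index.html)
- [Unity Catalog](https://docs.databricks.com/data-governance/unity-catalog/index.html)"""

print(search_index(index, "streaming"))  # only the Structured Streaming line
```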

## Related Skills

- **[databricks-python-sdk](../databricks-python-sdk/SKILL.md)** - SDK patterns for programmatic Databricks access
12 changes: 12 additions & 0 deletions databricks-skills/databricks-mlflow-evaluation/SKILL.md
@@ -139,6 +139,18 @@ For automatically improving a registered system prompt using `optimize_prompts()

See `GOTCHAS.md` for the complete list.

## Common Issues

| Issue | Solution |
|-------|----------|
| **`mlflow.evaluate()` vs `mlflow.genai.evaluate()`** | Use `mlflow.genai.evaluate()` for GenAI agents. The older `mlflow.evaluate()` has a different API and doesn't support GenAI scorers |
| **`predict_fn` receives dict instead of kwargs** | The predict function receives `**unpacked` keyword arguments, not a single dict. Define it as `def predict(query, context=None)` not `def predict(inputs)` |
| **Scorer returns wrong type** | `@scorer` functions must return a `Score` object: `Score(value=0.8, rationale="...")`. Don't return raw floats or strings |
| **Dataset `inputs` format error** | Inputs must be nested: `{"inputs": {"query": "..."}}` not `{"query": "..."}`. Each row's `inputs` dict is unpacked as kwargs to `predict_fn` |
| **Built-in scorer fails with "no guidelines"** | `Guidelines` scorer requires a `guidelines` parameter. Pass it as: `Guidelines(name="helpful", guidelines="The response should be helpful")` |
| **Evaluation runs but scores are all None** | Check that your scorer handles the response format correctly. If `predict_fn` returns a dict, the scorer receives that dict as `output` |
| **MemAlign requires human labels** | MemAlign calibrates judge prompts from domain expert feedback. You need at least 20-50 labeled examples for meaningful alignment |
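
The `predict_fn` and dataset-format rows describe one calling convention: each row nests arguments under `"inputs"`, and the harness unpacks that dict as keyword arguments. A minimal pure-Python illustration of the convention (not the full `mlflow.genai.evaluate()` API):

```python
def predict(query, context=None):
    """Signature matches the unpacked kwargs, not a single dict argument."""
    return {"response": f"answered: {query}"}

# Each row nests its arguments under "inputs"; the evaluation harness
# effectively calls predict(**row["inputs"]).
dataset = [
    {"inputs": {"query": "What is MLflow?"}},
    {"inputs": {"query": "What is a scorer?", "context": "docs"}},
]
for row in dataset:
    print(predict(**row["inputs"]))
```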

## Related Skills

- **[databricks-docs](../databricks-docs/SKILL.md)** - General Databricks documentation reference
12 changes: 12 additions & 0 deletions databricks-skills/databricks-python-sdk/SKILL.md
@@ -614,6 +614,18 @@ If I'm unsure about a method, I should:
| Secrets | https://databricks-sdk-py.readthedocs.io/en/latest/workspace/workspace/secrets.html |
| DBUtils | https://databricks-sdk-py.readthedocs.io/en/latest/dbutils.html |

## Common Issues

| Issue | Solution |
|-------|----------|
| **`ValueError: default auth` on `WorkspaceClient()`** | No valid credentials found. Run `databricks auth login --host <url>` or set `DATABRICKS_HOST` + `DATABRICKS_TOKEN` env vars |
| **`PermissionDenied` on API calls** | The authenticated user/SP lacks permissions. Check grants with `w.grants.get()` or ask a workspace admin |
| **SDK method signature changed** | The SDK is actively developed. Pin your version in `requirements.txt`. Check the [changelog](https://github.com/databricks/databricks-sdk-py/releases) for breaking changes |
| **`w.jobs.list()` is very slow** | The workspace may have thousands of jobs. Use `w.jobs.list(name="prefix")` to filter, or add `limit=N` |
| **Databricks Connect `SparkSession` fails** | Ensure `databricks-connect` version matches your DBR version. Use `serverless_compute_id="auto"` for serverless |
| **`ImportError` for SDK service classes** | Import from the correct submodule: `from databricks.sdk.service.workspace import ImportFormat` not `from databricks.sdk import ImportFormat` |
| **OAuth token refresh fails** | Re-run `databricks auth login`. If using a service principal, check that the client secret hasn't expired |
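
One defensive pattern against the "SDK method signature changed" row: introspect the installed signature and drop keyword arguments it no longer (or does not yet) accept. This is our own sketch, not an SDK facility, and the `list_jobs` stand-in below is hypothetical:

```python
import inspect

def call_compat(fn, **kwargs):
    """Drop kwargs the installed function doesn't accept, instead of
    raising TypeError after an SDK signature change (defensive sketch)."""
    params = inspect.signature(fn).parameters
    accepted = {k: v for k, v in kwargs.items() if k in params}
    return fn(**accepted)

def list_jobs(name=None):  # stand-in for w.jobs.list in an older SDK
    return f"jobs matching {name!r}"

print(call_compat(list_jobs, name="etl", limit=25))  # 'limit' silently dropped
```

Pinning the SDK version in `requirements.txt` remains the more robust fix; this pattern only softens the failure mode.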

## Related Skills

- **[databricks-config](../databricks-config/SKILL.md)** - profile and authentication setup
13 changes: 13 additions & 0 deletions databricks-skills/databricks-spark-structured-streaming/SKILL.md
@@ -63,3 +63,16 @@ df.writeStream \
- [ ] Exactly-once verified (txnVersion/txnAppId)
- [ ] Watermark configured for stateful operations
- [ ] Left joins for stream-static (not inner)

## Common Issues

| Issue | Solution |
|-------|----------|
| **Checkpoint corruption after schema change** | Checkpoints are tied to the query plan. Schema changes require a new checkpoint location. Back up the old checkpoint before changing |
| **OOM on stateful operations** | Enable RocksDB state store: `spark.conf.set("spark.sql.streaming.stateStore.providerClass", "com.databricks.sql.streaming.state.RocksDBStateStoreProvider")` |
| **`availableNow` trigger processes no data** | Ensure the source has new data since the last checkpoint. Check that the checkpoint path is correct and accessible |
| **Stream-static join returns stale data** | The static side is re-read each micro-batch unless it is cached. Avoid `.cache()` on the static DataFrame, or re-read/refresh it inside `foreachBatch` |
| **`foreachBatch` MERGE has duplicates** | Make the write idempotent: deduplicate the source batch on the merge key before `deltaTable.merge(...)`, and for plain appends set `.option("txnAppId", appId).option("txnVersion", batchId)` on the DataFrame writer |
| **Auto Loader `cloudFiles` schema inference fails** | Set `cloudFiles.schemaLocation` to a persistent path. For schema evolution, use `cloudFiles.schemaEvolutionMode = "addNewColumns"` |
| **Watermark delay too aggressive** | Late data arriving after the watermark is dropped silently. Set watermark delay >= max expected lateness of your data |
| **Streaming query silently stops** | Check the Spark UI for exceptions. Add a `StreamingQueryListener` or monitor `query.lastProgress` for null batches |
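
The txnAppId/txnVersion idempotency contract from the checklist and table above boils down to "skip a (app, batch) pair you have already committed". A pure-Python simulation of that contract, with no Spark or Delta dependency (class name is ours):

```python
class IdempotentSink:
    """Sketch of idempotent foreachBatch writes: a replayed
    (app_id, batch_id) pair is skipped instead of re-applied."""

    def __init__(self):
        self.committed = {}  # app_id -> highest batch_id applied
        self.rows = []

    def write_batch(self, app_id, batch_id, rows):
        if self.committed.get(app_id, -1) >= batch_id:
            return False  # replay after restart: already applied, skip
        self.rows.extend(rows)
        self.committed[app_id] = batch_id
        return True

sink = IdempotentSink()
sink.write_batch("etl", 0, ["a", "b"])
sink.write_batch("etl", 0, ["a", "b"])  # replayed micro-batch: ignored
print(sink.rows)  # ['a', 'b'], not duplicated
```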
12 changes: 12 additions & 0 deletions databricks-skills/databricks-unity-catalog/SKILL.md
@@ -113,6 +113,18 @@ mcp__databricks__execute_sql(
- **[databricks-synthetic-data-gen](../databricks-synthetic-data-gen/SKILL.md)** - for generating data stored in Unity Catalog Volumes
- **[databricks-aibi-dashboards](../databricks-aibi-dashboards/SKILL.md)** - for building dashboards on top of Unity Catalog data

## Common Issues

| Issue | Solution |
|-------|----------|
| **`PERMISSION_DENIED` on system tables** | System tables require explicit grants: `GRANT USE CATALOG ON CATALOG system TO group`, then `GRANT USE SCHEMA` and `GRANT SELECT` on the specific schema |
| **System table query is slow** | Always filter by date: `WHERE event_date >= current_date() - 7`. System tables can have billions of rows |
| **`GRANT` fails with "not owner"** | Only the object owner or metastore admin can grant permissions. Use `SHOW GRANTS ON <object>` to check current ownership |
| **Table not visible after creation** | Check that `USE CATALOG` and `USE SCHEMA` grants exist for the user/group. Three-level namespace requires grants at each level |
| **Tags not appearing on table** | Tags are set via `ALTER TABLE ... SET TAGS`. Verify with `SELECT * FROM system.information_schema.table_tags` |
| **External location permission denied** | The storage credential must have access to the cloud path. Check `SHOW EXTERNAL LOCATIONS` and verify IAM/SAS permissions |
| **Delta Sharing recipient can't access share** | Verify the recipient's activation link was used. Check `SHOW GRANTS ON SHARE` and ensure tables are added to the share |
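
The first row's three-grant sequence for system tables is mechanical enough to generate; a small sketch that emits the statements for a given schema and principal (quoting and exact grant set may need adjusting for your workspace):

```python
def system_table_grants(schema: str, principal: str) -> list[str]:
    """Build the grants a principal needs to read system.<schema> tables:
    USE CATALOG on the catalog, then USE SCHEMA and SELECT on the schema."""
    return [
        f"GRANT USE CATALOG ON CATALOG system TO `{principal}`",
        f"GRANT USE SCHEMA ON SCHEMA system.{schema} TO `{principal}`",
        f"GRANT SELECT ON SCHEMA system.{schema} TO `{principal}`",
    ]

for stmt in system_table_grants("billing", "data_engineers"):
    print(stmt)
```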

## Resources

- [Unity Catalog System Tables](https://docs.databricks.com/administration-guide/system-tables/)
10 changes: 10 additions & 0 deletions databricks-skills/spark-python-data-source/SKILL.md
@@ -136,6 +136,16 @@ Implement a batch writer for Snowflake with staged uploads
Write a data source for REST API with OAuth2 authentication and pagination
```

## Common Issues

| Issue | Solution |
|-------|----------|
| **`DataSource.schema()` returns wrong types** | Spark types must match exactly. Use `StructType([StructField("col", StringType())])` — don't return Python dicts |
| **Data source not found after registration** | Ensure `spark.dataSource.register(MyDataSource)` is called before `spark.read.format("my_source")`. The name comes from `MyDataSource.name()` |
| **Serialization error in `read()`** | The `DataSourceReader.read()` method runs on executors. Don't reference SparkSession or driver-only objects inside it |
| **Streaming source never triggers new batches** | `latestOffset()` must return a new offset when new data is available. If it returns the same offset, Spark skips the batch |
| **Schema evolution not supported** | Python data sources have a fixed schema from `schema()`. To handle schema changes, return a superset schema and fill missing fields with NULL |
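
The `latestOffset()` row describes a contract rather than an API call: Spark only plans a new micro-batch when the returned offset differs from the last one. A dependency-free sketch of that contract (the class and method names are ours, not the actual `DataSourceStreamReader` interface):

```python
class CounterOffsets:
    """Sketch of the latestOffset contract: return a *new* offset only
    when new data exists; returning the same offset skips the batch."""

    def __init__(self):
        self.available = 0  # rows the external system has produced

    def latest_offset(self):
        return {"offset": self.available}

src = CounterOffsets()
first = src.latest_offset()
src.available += 5            # new data arrives upstream
second = src.latest_offset()
print(first != second)  # True, so Spark would plan a new micro-batch
```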

## Related

- databricks-testing: Test data sources on Databricks clusters
8 changes: 2 additions & 6 deletions databricks-tools-core/databricks_tools_core/auth.py
@@ -160,9 +160,7 @@ def get_workspace_client() -> WorkspaceClient:
# Cross-workspace: explicit token overrides env OAuth so tool operations
# target the caller-specified workspace instead of the app's own workspace
if force and host and token:
return tag_client(
WorkspaceClient(host=host, token=token, auth_type="pat", **product_kwargs)
)
return tag_client(WorkspaceClient(host=host, token=token, auth_type="pat", **product_kwargs))

# In Databricks Apps (OAuth credentials in env), explicitly use OAuth M2M.
# Setting auth_type="oauth-m2m" prevents the SDK from also reading
@@ -185,9 +183,7 @@ def get_workspace_client() -> WorkspaceClient:

# Development mode: use explicit token if provided
if host and token:
return tag_client(
WorkspaceClient(host=host, token=token, auth_type="pat", **product_kwargs)
)
return tag_client(WorkspaceClient(host=host, token=token, auth_type="pat", **product_kwargs))

if host:
return tag_client(WorkspaceClient(host=host, **product_kwargs))