Commit abdbc50

Refactor create_or_update_genie function for improved clarity and functionality
- Streamline the logic for handling serialized_space, ensuring clearer separation of update and create operations.
- Enhance error handling for non-existent Genie spaces when updating by ID.
- Update documentation to reflect changes in the usage of ask_genie and ask_genie_followup tools, clarifying their purposes and examples.
- Revise examples in conversation and SKILL documentation to align with the new function structure and naming conventions.
1 parent d1d16a0 commit abdbc50

4 files changed

Lines changed: 157 additions & 95 deletions

File tree

databricks-mcp-server/databricks_mcp_server/tools/genie.py

Lines changed: 65 additions & 65 deletions
Original file line numberDiff line numberDiff line change
@@ -107,82 +107,82 @@ def create_or_update_genie(
107107

108108
operation = "created"
109109

110-
# When serialized_space is provided, use the public genie/spaces API
111-
if serialized_space:
112-
if space_id:
113-
# Update existing space with serialized config
114-
manager.genie_update_with_serialized_space(
115-
space_id=space_id,
116-
serialized_space=serialized_space,
117-
title=display_name,
118-
description=description,
119-
warehouse_id=warehouse_id,
120-
)
121-
operation = "updated"
122-
else:
123-
# Check if exists by name, then create or update
124-
existing = manager.genie_find_by_name(display_name)
125-
if existing:
126-
operation = "updated"
127-
space_id = existing.space_id
110+
# When serialized_space is provided, use the public genie/spaces API
111+
if serialized_space:
112+
if space_id:
113+
# Update existing space with serialized config
128114
manager.genie_update_with_serialized_space(
129115
space_id=space_id,
130116
serialized_space=serialized_space,
131117
title=display_name,
132118
description=description,
133119
warehouse_id=warehouse_id,
134120
)
135-
else:
136-
result = manager.genie_import(
137-
warehouse_id=warehouse_id,
138-
serialized_space=serialized_space,
139-
title=display_name,
140-
description=description,
141-
)
142-
space_id = result.get("space_id", "")
143-
else:
144-
if space_id:
145-
# Update existing space by ID
146-
existing = manager.genie_get(space_id)
147-
if existing:
148121
operation = "updated"
149-
manager.genie_update(
150-
space_id=space_id,
151-
display_name=display_name,
152-
description=description,
153-
warehouse_id=warehouse_id,
154-
table_identifiers=table_identifiers,
155-
sample_questions=sample_questions,
156-
)
157122
else:
158-
return {"error": f"Genie space {space_id} not found"}
123+
# Check if exists by name, then create or update
124+
existing = manager.genie_find_by_name(display_name)
125+
if existing:
126+
operation = "updated"
127+
space_id = existing.space_id
128+
manager.genie_update_with_serialized_space(
129+
space_id=space_id,
130+
serialized_space=serialized_space,
131+
title=display_name,
132+
description=description,
133+
warehouse_id=warehouse_id,
134+
)
135+
else:
136+
result = manager.genie_import(
137+
warehouse_id=warehouse_id,
138+
serialized_space=serialized_space,
139+
title=display_name,
140+
description=description,
141+
)
142+
space_id = result.get("space_id", "")
159143
else:
160-
# Check if exists by name first
161-
existing = manager.genie_find_by_name(display_name)
162-
if existing:
163-
operation = "updated"
164-
manager.genie_update(
165-
space_id=existing.space_id,
166-
display_name=display_name,
167-
description=description,
168-
warehouse_id=warehouse_id,
169-
table_identifiers=table_identifiers,
170-
sample_questions=sample_questions,
171-
)
172-
space_id = existing.space_id
144+
if space_id:
145+
# Update existing space by ID
146+
existing = manager.genie_get(space_id)
147+
if existing:
148+
operation = "updated"
149+
manager.genie_update(
150+
space_id=space_id,
151+
display_name=display_name,
152+
description=description,
153+
warehouse_id=warehouse_id,
154+
table_identifiers=table_identifiers,
155+
sample_questions=sample_questions,
156+
)
157+
else:
158+
return {"error": f"Genie space {space_id} not found"}
173159
else:
174-
# Create new
175-
result = manager.genie_create(
176-
display_name=display_name,
177-
warehouse_id=warehouse_id,
178-
table_identifiers=table_identifiers,
179-
description=description,
180-
)
181-
space_id = result.get("space_id", "")
182-
183-
# Add sample questions if provided
184-
if sample_questions and space_id:
185-
manager.genie_add_sample_questions_batch(space_id, sample_questions)
160+
# Check if exists by name first
161+
existing = manager.genie_find_by_name(display_name)
162+
if existing:
163+
operation = "updated"
164+
manager.genie_update(
165+
space_id=existing.space_id,
166+
display_name=display_name,
167+
description=description,
168+
warehouse_id=warehouse_id,
169+
table_identifiers=table_identifiers,
170+
sample_questions=sample_questions,
171+
)
172+
space_id = existing.space_id
173+
else:
174+
# Create new
175+
result = manager.genie_create(
176+
display_name=display_name,
177+
warehouse_id=warehouse_id,
178+
table_identifiers=table_identifiers,
179+
description=description,
180+
)
181+
space_id = result.get("space_id", "")
182+
183+
# Add sample questions if provided
184+
if sample_questions and space_id:
185+
manager.genie_add_sample_questions_batch(space_id, sample_questions)
186186

187187
response = {
188188
"space_id": space_id,

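The branches in this hunk can be summarized as a small decision function. This is a minimal sketch, not the repository's code: `plan_genie_operation` and its boolean flags are hypothetical, and the nesting is inferred from the diff, whose indentation is not preserved in this view. It only names the `manager` method that each branch appears to reach.

```python
from typing import Optional


def plan_genie_operation(
    space_id: Optional[str],
    serialized_space: Optional[str],
    found_by_id: bool = False,
    found_by_name: bool = False,
) -> str:
    """Name the manager call that each branch of the hunk appears to reach.

    Hypothetical helper: the flags stand in for the results of
    manager.genie_get(space_id) and manager.genie_find_by_name(display_name).
    """
    if serialized_space:
        # Serialized path: update when an ID is given or a name match exists.
        if space_id or found_by_name:
            return "genie_update_with_serialized_space"
        return "genie_import"
    if space_id:
        # Plain path with an explicit ID: a missing space is reported as an error.
        return "genie_update" if found_by_id else "error: space not found"
    # Plain path without an ID: update a name match, otherwise create.
    return "genie_update" if found_by_name else "genie_create"


if __name__ == "__main__":
    print(plan_genie_operation("01abc", None, found_by_id=False))
```

Keeping the decision separate from the side effects like this makes each of the six outcomes easy to check in isolation.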
databricks-skills/databricks-genie/SKILL.md

Lines changed: 81 additions & 20 deletions
@@ -33,7 +33,7 @@ Use this skill when:
3333
|------|---------|
3434
| `list_genie` | List all Genie Spaces accessible to you |
3535
| `create_or_update_genie` | Create or update a Genie Space (supports `serialized_space`) |
36-
| `get_genie` | Get space details (by ID and support `include_serialized_space` parameter) or list all spaces (no ID) |
36+
| `get_genie` | Get Genie Space details (supports `include_serialized_space`) |
3737
| `delete_genie` | Delete a Genie Space |
3838
| `export_genie` | Export a Genie Space with full serialized configuration |
3939
| `import_genie` | Import / clone a Genie Space from a serialized payload |
@@ -42,7 +42,8 @@ Use this skill when:
4242

4343
| Tool | Purpose |
4444
|------|---------|
45-
| `ask_genie` | Ask a question or follow-up (`conversation_id` optional) |
45+
| `ask_genie` | Ask a Genie Space a question and get back SQL and results |
46+
| `ask_genie_followup` | Ask a follow-up question in an existing conversation |
4647

4748
### Supporting Tools
4849

@@ -113,28 +114,92 @@ import_genie(
113114

114115
#### Example: Migrating Genie Spaces from Prod to Dev
115116

116-
When migrating Genie Spaces between environments (e.g., from a `prod` target to a `dev` target defined in your `databricks.yml`), you must update the catalog references within the serialized space.
117+
When migrating Genie Spaces between environments (e.g., from a `prod` target to a `dev` target defined in your `databricks.yml`), you must update the catalog references within the serialized space.
117118

118119
**Note:** Genie Space migration assumes that the underlying data assets (schemas and tables) remain structurally identical across environments. The migration of the actual catalogs, schemas, or tables themselves is outside the scope of Genie Space migration skills.
119120

120-
For instance, if your production tables reside in the `healthverity_claims_sample_patient_dataset` catalog, but your development tables are in `healthverity_claims_sample_patient_dataset_dev`, you can perform a string replacement on the exported configuration before importing it into the target workspace:
121+
##### The Challenge: MCP Servers Are Workspace-Scoped
122+
123+
Each Databricks MCP server instance connects to exactly one workspace (set via `DATABRICKS_CONFIG_PROFILE` at startup). This means a single MCP server cannot export from PROD and import into DEV in the same session — you need two server instances.
124+
125+
##### Recommended Setup: Dual MCP Server Profiles
126+
127+
Configure two Databricks MCP server entries in your IDE's MCP config (e.g. `~/.cursor/mcp.json`), one per workspace:
128+
129+
```json
130+
"databricks-prod": {
131+
"command": "/path/to/.venv/bin/python",
132+
"args": ["/path/to/databricks-mcp-server/run_server.py"],
133+
"env": { "DATABRICKS_CONFIG_PROFILE": "prod" }
134+
},
135+
"databricks-dev": {
136+
"command": "/path/to/.venv/bin/python",
137+
"args": ["/path/to/databricks-mcp-server/run_server.py"],
138+
"env": { "DATABRICKS_CONFIG_PROFILE": "dev" }
139+
}
140+
```
141+
142+
Both servers run simultaneously after one IDE reload. This lets you call `export_genie` against `databricks-prod` and `import_genie` against `databricks-dev` within the same conversation — no further reloads needed.
143+
144+
> **Tip:** The Databricks CLI profiles (`prod`, `dev`) referenced above must be defined in `~/.databrickscfg`. Both token-based and OAuth (`auth_type = databricks-cli`) profiles are supported.
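Before reloading the IDE, it may help to confirm that both profiles are actually defined. The following is a minimal sketch, assuming `~/.databrickscfg` uses the usual INI layout with one `[section]` per profile; `missing_profiles` and the sample hosts are hypothetical and not part of the MCP server.

```python
import configparser
from typing import Iterable, List


def missing_profiles(cfg_text: str, required: Iterable[str]) -> List[str]:
    """Return the required profile names that have no [section] in cfg_text."""
    # interpolation=None keeps '%' characters in tokens or URLs from raising.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    defined = set(parser.sections())
    return [name for name in required if name not in defined]


# Illustrative config text; the hosts and token are placeholders.
SAMPLE_CFG = """
[prod]
host = https://prod.example.com
token = dummy-token

[dev]
host = https://dev.example.com
auth_type = databricks-cli
"""

if __name__ == "__main__":
    print(missing_profiles(SAMPLE_CFG, ["prod", "dev", "staging"]))
```

To check a real setup, pass the contents of the config file instead of `SAMPLE_CFG`, for example `Path.home().joinpath(".databrickscfg").read_text()`.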
145+
146+
##### Full Migration Workflow
147+
148+
**Step 1 — Export from PROD** using the `databricks-prod` MCP server:
121149

122150
```python
123-
# 1. Export the Genie Space from the production workspace
151+
# Call export_genie via the prod-scoped MCP server
124152
exported = export_genie(space_id="<prod_space_id>")
153+
# exported["serialized_space"] contains the full config
154+
# exported["warehouse_id"] is the PROD warehouse — do NOT reuse it for DEV
155+
```
156+
157+
**Step 2 — Find the DEV warehouse ID:**
158+
159+
```python
160+
# Call list_warehouses via the dev-scoped MCP server
161+
list_warehouses() # note the warehouse_id for the DEV workspace
162+
```
125163

126-
# 2. Remap the catalog name for the development environment
164+
**Step 3 — Remap the catalog and import into DEV** using the `databricks-dev` MCP server:
165+
166+
```python
167+
# Catalog name differs between environments — replace ALL occurrences.
168+
# serialized_space embeds the catalog in table identifiers, SQL FROM clauses,
169+
# join specs, and filter snippets, so a single string replace covers everything.
127170
dev_serialized_space = exported["serialized_space"].replace(
128-
"healthverity_claims_sample_patient_dataset",
129-
"healthverity_claims_sample_patient_dataset_dev"
171+
"my_prod_catalog",
172+
"my_dev_catalog"
130173
)
131174

132-
# 3. Import the modified space into the dev workspace
133-
import_genie(
175+
# Call import_genie via the dev-scoped MCP server
176+
result = import_genie(
134177
warehouse_id="<dev_warehouse_id>",
135178
serialized_space=dev_serialized_space,
136-
title="HealthVerity Claims (Dev)"
179+
title="My Space"
137180
)
181+
# result["space_id"] is the new DEV space ID
182+
```
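A plain `.replace` also rewrites any longer identifier that merely contains the source catalog name. Where that matters, a boundary-aware variant can be used instead. This is a sketch under stated assumptions: `remap_catalog` is a hypothetical helper, the sample payload is invented for illustration, and catalog names are assumed to consist of letters, digits, and underscores.

```python
import re


def remap_catalog(serialized_space: str, src: str, dst: str) -> str:
    """Replace `src` only where it appears as a whole identifier."""
    # The lookarounds reject matches glued to other identifier characters,
    # so `my_prod_catalog_archive` is left untouched.
    pattern = re.compile(rf"(?<![A-Za-z0-9_]){re.escape(src)}(?![A-Za-z0-9_])")
    # A function replacement keeps `dst` literal (no backslash processing).
    return pattern.sub(lambda _match: dst, serialized_space)


# Invented payload, only to exercise the helper.
sample = (
    '{"tables": ["my_prod_catalog.sales.orders"], '
    '"sql": "SELECT * FROM my_prod_catalog.sales.orders o '
    'JOIN my_prod_catalog_archive.sales.orders a ON o.id = a.id"}'
)
remapped = remap_catalog(sample, "my_prod_catalog", "my_dev_catalog")

if __name__ == "__main__":
    print(remapped)
```

If no other catalog shares the source name as a prefix or suffix, the single `.replace` call shown in Step 3 is equivalent and simpler.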
183+
184+
**Step 4 — Update `databricks.yml`** with the new DEV space IDs so they are tracked in the bundle:
185+
186+
```yaml
187+
targets:
188+
dev:
189+
variables:
190+
genie_space_ids: "<new_dev_space_id_1>,<new_dev_space_id_2>,<new_dev_space_id_3>"
191+
```
192+
193+
**Step 5 — Save exports locally** for version control and future re-migrations:
194+
195+
```json
196+
// genie_exports/MySpace.json
197+
{
198+
"space_id": "<prod_space_id>",
199+
"title": "MySpace",
200+
"warehouse_id": "<prod_warehouse_id>",
201+
"serialized_space": "{ ... }"
202+
}
138203
```
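Writing the export to disk could look like the sketch below. It assumes the exported payload is a dict carrying the same keys as the JSON file above (`space_id`, `title`, `warehouse_id`, `serialized_space`); `save_export` is a hypothetical helper, not one of the MCP tools.

```python
import json
from pathlib import Path
from typing import Any, Dict


def save_export(exported: Dict[str, Any], directory: Path) -> Path:
    """Write one exported space to <directory>/<title>.json and return the path."""
    directory.mkdir(parents=True, exist_ok=True)
    # Assumption: the title is safe to use as a file name.
    path = directory / f"{exported['title']}.json"
    path.write_text(json.dumps(exported, indent=2), encoding="utf-8")
    return path


if __name__ == "__main__":
    demo = {
        "space_id": "<prod_space_id>",
        "title": "MySpace",
        "warehouse_id": "<prod_warehouse_id>",
        "serialized_space": "{ ... }",
    }
    print(save_export(demo, Path("genie_exports")))
```

Storing `serialized_space` as the original string, rather than re-parsing it, keeps the saved file byte-for-byte reusable as input to a later import.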
139204

140205
## Workflow
@@ -162,8 +227,8 @@ Before creating a Genie Space:
162227
### Creating Tables
163228

164229
Use these skills in sequence:
165-
1. `databricks-synthetic-data-gen` - Generate raw parquet files
166-
2. `databricks-spark-declarative-pipelines` - Create bronze/silver/gold tables
230+
1. `synthetic-data-generation` - Generate raw parquet files
231+
2. `spark-declarative-pipelines` - Create bronze/silver/gold tables
167232

168233
## Common Issues
169234

@@ -176,10 +241,6 @@ Use these skills in sequence:
176241
| **`import_genie` fails with permission error** | Ensure you have CREATE privileges in the target workspace folder |
177242
| **Tables not found after migration** | Catalog name was not remapped — replace the source catalog name in `serialized_space` before calling `import_genie` |
178243
| **Catalog name appears in SQL queries too** | `serialized_space` embeds catalog in table identifiers, SQL FROM clauses, join specs, and filters — a single `.replace(src, tgt)` on the whole string covers all occurrences |
179-
180-
## Related Skills
181-
182-
- **[databricks-agent-bricks](../databricks-agent-bricks/SKILL.md)** - Use Genie Spaces as agents inside Supervisor Agents
183-
- **[databricks-synthetic-data-gen](../databricks-synthetic-data-gen/SKILL.md)** - Generate raw parquet data to populate tables for Genie
184-
- **[databricks-spark-declarative-pipelines](../databricks-spark-declarative-pipelines/SKILL.md)** - Build bronze/silver/gold tables consumed by Genie Spaces
185-
- **[databricks-unity-catalog](../databricks-unity-catalog/SKILL.md)** - Manage the catalogs, schemas, and tables Genie queries
244+
| **`export_genie` / `import_genie` land in the wrong workspace** | Each MCP server is workspace-scoped. Set up two named MCP server entries (one per profile) in your IDE's MCP config instead of switching a single server's profile mid-session |
245+
| **MCP server doesn't pick up profile change** | The MCP process reads `DATABRICKS_CONFIG_PROFILE` once at startup — editing the config file requires an IDE reload to take effect |
246+
| **`import_genie` fails with JSON parse error** | The `serialized_space` string may contain multi-line SQL arrays with `\n` escape sequences; flatten SQL arrays to single-line strings before passing the payload to `import_genie` to avoid double-escaping issues |
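
For the JSON parse error row above, a quick local check can locate the problem before calling `import_genie`. This is a minimal sketch, assuming `serialized_space` is a JSON document held in a string; `serialized_space_error` is a hypothetical helper.

```python
import json
from typing import Optional


def serialized_space_error(serialized_space: str) -> Optional[str]:
    """Return None if the string parses as JSON, else a short error location."""
    try:
        json.loads(serialized_space)
    except json.JSONDecodeError as exc:
        return f"line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return None


if __name__ == "__main__":
    # A raw newline inside a JSON string is rejected by the parser ...
    print(serialized_space_error('{"sql": "SELECT 1\nFROM t"}'))
    # ... while the escaped form parses cleanly.
    print(serialized_space_error('{"sql": "SELECT 1\\nFROM t"}'))
```

The check only confirms that the payload is well-formed JSON; it does not validate the Genie-specific structure inside it.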

databricks-skills/databricks-genie/conversation.md

Lines changed: 9 additions & 8 deletions
@@ -31,7 +31,8 @@ The `ask_genie` tool allows you to programmatically send questions to a Genie Sp
3131

3232
| Tool | Purpose |
3333
|------|---------|
34-
| `ask_genie` | Ask a question or follow-up (`conversation_id` optional) |
34+
| `ask_genie` | Ask a question and start a new conversation |
35+
| `ask_genie_followup` | Ask a follow-up in an existing conversation |
3536

3637
## Basic Usage
3738

@@ -70,10 +71,10 @@ result = ask_genie(
7071
)
7172

7273
# Follow-up (uses context from first question)
73-
ask_genie(
74+
ask_genie_followup(
7475
space_id="01abc123...",
75-
question="Break that down by region",
76-
conversation_id=result["conversation_id"]
76+
conversation_id=result["conversation_id"],
77+
question="Break that down by region"
7778
)
7879
```
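With two separate tools, the caller has to remember the `conversation_id` between turns. One way to do that is sketched below. `GenieConversation` is a hypothetical wrapper, and the two `fake_*` functions are stand-ins for the real MCP tools so the example runs on its own; only the argument names are taken from the examples above.

```python
from typing import Any, Callable, Dict, Optional

AskFn = Callable[..., Dict[str, Any]]


class GenieConversation:
    """Hypothetical wrapper that remembers the conversation_id between turns."""

    def __init__(self, space_id: str, ask_genie: AskFn, ask_genie_followup: AskFn):
        self.space_id = space_id
        self._ask = ask_genie
        self._followup = ask_genie_followup
        self.conversation_id: Optional[str] = None

    def ask(self, question: str) -> Dict[str, Any]:
        if self.conversation_id is None:
            # First turn: start a new conversation and keep its ID.
            result = self._ask(space_id=self.space_id, question=question)
            self.conversation_id = result.get("conversation_id")
        else:
            # Later turns: continue the stored conversation.
            result = self._followup(
                space_id=self.space_id,
                conversation_id=self.conversation_id,
                question=question,
            )
        return result


# Stand-ins for the real tools, for illustration only.
def fake_ask_genie(space_id: str, question: str) -> Dict[str, Any]:
    return {"conversation_id": "conv-1", "tool": "ask_genie"}


def fake_ask_genie_followup(
    space_id: str, conversation_id: str, question: str
) -> Dict[str, Any]:
    return {"conversation_id": conversation_id, "tool": "ask_genie_followup"}


chat = GenieConversation("01abc123...", fake_ask_genie, fake_ask_genie_followup)
first = chat.ask("What were total sales by month this year?")
second = chat.ask("Which month had the highest growth?")
```

For an unrelated question, creating a fresh `GenieConversation` starts a new conversation instead of reusing stale context.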
7980

@@ -162,9 +163,9 @@ User: "Use my analytics Genie to explore sales trends"
162163
Claude:
163164
1. ask_genie(space_id, "What were total sales by month this year?")
164165
2. User: "Which month had the highest growth?"
165-
3. ask_genie(space_id, "Which month had the highest growth?", conversation_id=conv_id)
166+
3. ask_genie_followup(space_id, conv_id, "Which month had the highest growth?")
166167
4. User: "What products drove that growth?"
167-
5. ask_genie(space_id, "What products drove that growth?", conversation_id=conv_id)
168+
5. ask_genie_followup(space_id, conv_id, "What products drove that growth?")
168169
```
169170

170171
## Best Practices
@@ -180,8 +181,8 @@ result2 = ask_genie(space_id, "How many employees do we have?") # New conversat
180181

181182
# Good: Follow-up for related question
182183
result1 = ask_genie(space_id, "What were sales last month?")
183-
result2 = ask_genie(space_id, "Break that down by product",
184-
conversation_id=result1["conversation_id"]) # Related follow-up
184+
result2 = ask_genie_followup(space_id, result1["conversation_id"],
185+
"Break that down by product") # Related follow-up
185186
```
186187

187188
### Handle Clarification Requests

databricks-skills/databricks-genie/spaces.md

Lines changed: 2 additions & 2 deletions
@@ -297,10 +297,10 @@ create_or_update_genie(
297297

298298
## Example End-to-End Workflow
299299

300-
1. **Generate synthetic data** using `databricks-synthetic-data-gen` skill:
300+
1. **Generate synthetic data** using `synthetic-data-generation` skill:
301301
- Creates parquet files in `/Volumes/catalog/schema/raw_data/`
302302

303-
2. **Create tables** using `databricks-spark-declarative-pipelines` skill:
303+
2. **Create tables** using `spark-declarative-pipelines` skill:
304304
- Creates `catalog.schema.bronze_*` → `catalog.schema.silver_*` → `catalog.schema.gold_*`
305305

306306
3. **Inspect the tables**:
