feat: add metadata comments guidance to datagen skill (#292)
* feat: add table/column comments guidance to synthetic data gen skill
Adds documentation for DDL-first and post-write approaches to set
table and column comments when writing Delta tables to Unity Catalog.
* feat: add PySpark StructField metadata approach for column comments
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
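The first commit bullet mentions a post-write approach alongside DDL-first, but the diff below only shows the DDL-first path. A minimal sketch of the post-write idea, assuming a live SparkSession named `spark` and using a hypothetical helper and placeholder table/comment values, could look like:

```python
# Sketch of the post-write approach: write the Delta table first, then attach
# a table comment and per-column comments with SQL. build_comment_sql is a
# hypothetical helper; the table name and comment texts are placeholders.
def build_comment_sql(table: str, table_comment: str, column_comments: dict) -> list:
    """Return SQL statements that attach a table comment and column comments."""
    stmts = [f"COMMENT ON TABLE {table} IS '{table_comment}'"]
    for column, text in column_comments.items():
        stmts.append(f"ALTER TABLE {table} ALTER COLUMN {column} COMMENT '{text}'")
    return stmts

statements = build_comment_sql(
    "main.demo.customers",
    "Synthetic customer profiles for the demo scenario",
    {"customer_id": "Surrogate key", "email": "Synthetic contact email"},
)
# for stmt in statements:
#     spark.sql(stmt)  # run against a live SparkSession after the write
print(statements[0])
```

The `COMMENT ON TABLE` and `ALTER TABLE ... ALTER COLUMN ... COMMENT` statements are standard Databricks SQL; the helper only builds the strings so the pattern is visible without a cluster.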
````diff
-spark.sql(f"CREATE SCHEMA IF NOT EXISTS {CATALOG}.{SCHEMA}")
+spark.sql(f"CREATE SCHEMA IF NOT EXISTS {CATALOG}.{SCHEMA} COMMENT 'Synthetic data for demo scenario'")
 spark.sql(f"CREATE VOLUME IF NOT EXISTS {CATALOG}.{SCHEMA}.raw_data")
 ```
 
-**Important:** Do NOT create catalogs - assume they already exist. Only create schema and volume.
+**Important:** Do NOT create catalogs - assume they already exist. Only create schema and volume. Always add a `COMMENT` to schemas describing the dataset purpose.
 
 ---
````
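Once a schema is created with a `COMMENT`, the comment is visible through `DESCRIBE SCHEMA`. A small sketch, with placeholder `CATALOG`/`SCHEMA` values and the actual `spark.sql` call left commented out since it needs a live session:

```python
# Hypothetical check: after creating the schema with a COMMENT, the comment
# appears in the DESCRIBE SCHEMA output. CATALOG/SCHEMA are placeholders.
CATALOG, SCHEMA = "main", "demo"
describe_sql = f"DESCRIBE SCHEMA EXTENDED {CATALOG}.{SCHEMA}"
# spark.sql(describe_sql).show()  # the "Comment" row shows the schema comment
print(describe_sql)
```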
````diff
@@ -126,6 +126,62 @@ customers_df.write \
 - Skip the SDP bronze/silver/gold pipeline
 - Direct SQL analytics
 
+### Adding Table and Column Comments
+
+Always add comments to Delta tables for discoverability in Unity Catalog. Prefer the DDL-first approach: define the table with comments, then insert data.
+
+**DDL-first (preferred):**
+```python
+# Create table with inline column comments and table comment
+spark.sql(f"""
+CREATE TABLE IF NOT EXISTS {CATALOG}.{SCHEMA}.customers (
````

**Note:** Column/table comments only apply to Delta tables in Unity Catalog. Parquet/JSON/CSV files written to volumes do not support metadata comments.