|
| 1 | +# Pydantic 2.11.9 Compatibility Fix |
| 2 | + |
| 3 | +## 🎯 Problem Analysis |
| 4 | + |
| 5 | +**Current version**: Pydantic 2.11.9 |
| 6 | + |
| 7 | +**Root cause**: |
| 8 | +1. The JSON Schema uses a nested `oneOf` definition (string | array) |
| 9 | +2. datamodel-code-generator generates a RootModel for it |
| 10 | +3. `RootModel` in Pydantic 2.x **does not support** `model_config['extra']` |
| 11 | + |
| 12 | +**Failing example**: |
| 13 | +```python |
| 14 | +# Code generated by datamodel-code-generator |
| 15 | +class Database(RootModel[Union[str, Dict[str, Union[str, List[str]]]]]): |
| 16 | +    model_config = ConfigDict(extra="forbid")  # ❌ RootModel does not support this |
| 17 | +    root: Union[str, Dict[str, Union[str, List[str]]]] |
| 18 | +``` |
| 19 | + |
| 20 | +## ✅ Solutions |
| 21 | + |
| 22 | +### Option 1: Simplify the Schema (recommended, usable immediately) ⭐ |
| 23 | + |
| 24 | +**Core idea**: Remove the nested `oneOf` and accept only string owners, so that no RootModel is generated |
| 25 | + |
| 26 | +#### Changes |
| 27 | + |
| 28 | +**File to replace**: `openmetadata-spec/src/main/resources/json/schema/type/ownerConfig.json` |
| 29 | + |
| 30 | +**Key changes**: |
| 31 | + |
| 32 | +```json |
| 33 | +// Before (produces a RootModel): |
| 34 | +"database": { |
| 35 | +  "oneOf": [ |
| 36 | +    { "type": "string" }, |
| 37 | +    { |
| 38 | +      "type": "object", |
| 39 | +      "additionalProperties": { |
| 40 | +        "oneOf": [  // ← the nested oneOf produces a RootModel |
| 41 | +          { "type": "string" }, |
| 42 | +          { "type": "array", "items": { "type": "string" } } |
| 43 | +        ] |
| 44 | +      } |
| 45 | +    } |
| 46 | +  ] |
| 47 | +} |
| 48 | + |
| 49 | +// After (avoids the RootModel): |
| 50 | +"database": { |
| 51 | +  "anyOf": [  // ← use anyOf |
| 52 | +    { "type": "string" }, |
| 53 | +    { |
| 54 | +      "type": "object", |
| 55 | +      "additionalProperties": { |
| 56 | +        "type": "string"  // ← strings only; the array branch is removed |
| 57 | +      } |
| 58 | +    } |
| 59 | +  ] |
| 60 | +} |
| 61 | +``` |
| 62 | + |
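| 63 | +**Pros**: |
| 64 | +- ✅ No RootModel is generated |
| 65 | +- ✅ Fully compatible with Pydantic 2.11.9 |
| 66 | +- ✅ Generates a plain Union type |
| 67 | +- ✅ Usable immediately, with no extra configuration |
| 68 | + |
| 69 | +**Cons**: |
| 70 | +- ⚠️ Multiple owners in array form (e.g. `["alice", "bob"]`) are not supported for now |
| 71 | +- ⚠️ Each entry can be assigned only a single owner (as a string) |
| 72 | + |
| 73 | +**Generated Pydantic model**: |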
| 74 | +```python |
| 75 | +from typing import Union, Dict, Optional |
| 76 | +from pydantic import BaseModel, Field |
| 77 | + |
| 78 | +class OwnerConfig(BaseModel): |
| 79 | +    default: Optional[str] = Field(None, description="...") |
| 80 | +    database: Optional[Union[str, Dict[str, str]]] = Field(None)  # ✅ plain Union |
| 81 | +    databaseSchema: Optional[Union[str, Dict[str, str]]] = Field(None) |
| 82 | +    table: Optional[Union[str, Dict[str, str]]] = Field(None) |
| 83 | +    enableInheritance: Optional[bool] = Field(True) |
| 84 | +``` |
| 85 | + |
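As a rough illustration of what the simplified field type `Optional[Union[str, Dict[str, str]]]` accepts, the check below mirrors that shape using only the standard library. The function name `is_valid_owner_field` is hypothetical and is not part of the generated code; it is a sketch of the accepted shapes, not a replacement for Pydantic validation.

```python
from typing import Any


def is_valid_owner_field(value: Any) -> bool:
    """Return True if value fits Optional[Union[str, Dict[str, str]]]."""
    # None (field omitted) and a plain string owner are both accepted.
    if value is None or isinstance(value, str):
        return True
    # A mapping is accepted only when every key and every value is a string.
    if isinstance(value, dict):
        return all(
            isinstance(key, str) and isinstance(owner, str)
            for key, owner in value.items()
        )
    # Anything else (lists, numbers, ...) is outside the simplified schema.
    return False


print(is_valid_owner_field("database-admin"))
print(is_valid_owner_field({"sales_db": "sales-team"}))
print(is_valid_owner_field({"sales_db": ["alice", "bob"]}))
```

The third call shows the case the simplified schema gives up: a list of owners as a mapping value no longer matches.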
| 86 | +#### Implementation steps |
| 87 | + |
| 88 | +```bash |
| 89 | +cd ~/workspaces/OpenMetadata |
| 90 | + |
| 91 | +# 1. Back up the original file |
| 92 | +cp openmetadata-spec/src/main/resources/json/schema/type/ownerConfig.json \ |
| 93 | +  openmetadata-spec/src/main/resources/json/schema/type/ownerConfig.json.bak |
| 94 | + |
| 95 | +# 2. Use the optimized schema (already created) |
| 96 | +cp /workspace/ownerConfig_optimized.json \ |
| 97 | +  openmetadata-spec/src/main/resources/json/schema/type/ownerConfig.json |
| 98 | + |
| 99 | +# 3. Regenerate the Pydantic models |
| 100 | +cd openmetadata-spec |
| 101 | +mvn clean install |
| 102 | + |
| 103 | +# 4. Reinstall ingestion |
| 104 | +cd ../ingestion |
| 105 | +pip install -e . --force-reinstall --no-deps |
| 106 | + |
| 107 | +# 5. Verify |
| 108 | +python3 -c "from metadata.generated.schema.type import ownerConfig; print('✅ Success')" |
| 109 | + |
| 110 | +# 6. Test |
| 111 | +cd .. |
| 112 | +metadata ingest -c ingestion/tests/unit/metadata/ingestion/owner_config_tests/test-01-basic-configuration.yaml |
| 113 | +``` |
| 114 | + |
| 115 | +### Option 2: Keep using the auto-fix script (temporary) |
| 116 | + |
| 117 | +If you prefer not to modify the schema, you can keep using the automatic fix: |
| 118 | + |
| 119 | +```bash |
| 120 | +# Use the existing fix logic |
| 121 | +cd ~/workspaces/OpenMetadata |
| 122 | +python3 scripts/datamodel_generation.py |
| 123 | + |
| 124 | +# scripts/datamodel_generation.py already includes the RootModel auto-fix |
| 125 | +``` |
| 126 | + |
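| 127 | +### Option 3: Support arrays in the future (long-term) |
| 128 | + |
| 129 | +If multiple owners (array form) are needed in the future, one of the following is required: |
| 130 | + |
| 131 | +1. **A more complex schema definition** (using a discriminator) |
| 132 | +2. **A custom validator** that handles it in Python code |
| 133 | +3. **Waiting for datamodel-code-generator to improve** |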
| 134 | + |
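To make the custom-validator route more concrete, here is a minimal sketch of the normalization such a validator might perform: accept either a single owner string or a list of owner strings and always return a list. The function name `normalize_owners` is hypothetical, and how it would be wired into the generated models (for example through a Pydantic validator hook) is an assumption not covered by this note.

```python
from typing import List, Union


def normalize_owners(value: Union[str, List[str]]) -> List[str]:
    """Normalize a single owner or a list of owners to a list of owner names."""
    # A single owner name becomes a one-element list.
    if isinstance(value, str):
        return [value]
    # A list is accepted only if every element is a string; return a copy.
    if isinstance(value, list) and all(isinstance(item, str) for item in value):
        return list(value)
    raise TypeError(f"owner must be a string or a list of strings, got {value!r}")


print(normalize_owners("alice"))
print(normalize_owners(["alice", "bob"]))
```

Keeping the normalization in plain Python means the schema itself can stay simple while downstream code always works with a list.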
| 135 | +## 📋 Configuration Comparison |
| 136 | + |
| 137 | +### Configurations supported after simplification |
| 138 | + |
| 139 | +```yaml |
| 140 | +ownerConfig: |
| 141 | +  default: "data-platform-team" |
| 142 | + |
| 143 | +  # ✅ Supported: string form (use instead of the mapping below, not alongside it) |
| 144 | +  # database: "database-admin" |
| 145 | + |
| 146 | +  # ✅ Supported: dictionary mapping (a single string value per key) |
| 147 | +  database: |
| 148 | +    "sales_db": "sales-team" |
| 149 | +    "finance_db": "finance-team" |
| 150 | + |
| 151 | +  databaseSchema: |
| 152 | +    "sales_db.public": "public-team" |
| 153 | +    "finance_db.accounting": "accounting-team" |
| 154 | + |
| 155 | +  table: |
| 156 | +    "sales_db.public.orders": "order-team" |
| 157 | +    "finance_db.accounting.revenue": "revenue-team" |
| 158 | + |
| 159 | +  enableInheritance: true |
| 160 | +``` |
| 161 | + |
| 162 | +### Configurations no longer supported |
| 163 | + |
| 164 | +```yaml |
| 165 | +ownerConfig: |
| 166 | +  # ❌ Not supported: array form (multiple owners) |
| 167 | +  database: |
| 168 | +    "sales_db": ["alice", "bob", "charlie"]  # ❌ raises an error |
| 169 | + |
| 170 | +  table: |
| 171 | +    "orders": ["user1", "user2"]  # ❌ raises an error |
| 172 | +``` |
| 173 | + |
| 174 | +**Workaround**: If an entry needs multiple owners, pick one primary owner: |
| 175 | +```yaml |
| 176 | +# From: |
| 177 | +database: |
| 178 | +  "sales_db": ["alice", "bob"] |
| 179 | + |
| 180 | +# To: |
| 181 | +database: |
| 182 | +  "sales_db": "alice"  # pick the primary owner |
| 183 | +``` |
| 184 | +
|
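For configurations with many array-valued entries, the same rewrite could be scripted. The sketch below assumes the configuration has already been loaded into a plain `dict` and that the first element of each list is the primary owner; both are assumptions, and the helper name `keep_primary_owner` is hypothetical. It returns the migrated mapping together with the owners that were dropped, so the loss can be reviewed.

```python
from typing import Any, Dict, List, Tuple

# The three ownerConfig levels that accept a name-to-owner mapping.
OWNER_LEVELS = ("database", "databaseSchema", "table")


def keep_primary_owner(owner_config: Dict[str, Any]) -> Tuple[Dict[str, Any], List[str]]:
    """Replace list-valued owners with their first element and report the rest."""
    migrated = dict(owner_config)
    dropped: List[str] = []
    for level in OWNER_LEVELS:
        mapping = owner_config.get(level)
        if not isinstance(mapping, dict):
            continue  # absent, or already in plain string form
        new_mapping: Dict[str, Any] = {}
        for name, owners in mapping.items():
            if isinstance(owners, list):
                if not owners:
                    raise ValueError(f"{level}.{name} has an empty owner list")
                # Assumption: the first listed owner is the primary one.
                new_mapping[name] = owners[0]
                dropped.extend(f"{level}.{name}: {owner}" for owner in owners[1:])
            else:
                new_mapping[name] = owners
        migrated[level] = new_mapping
    return migrated, dropped


config = {"database": {"sales_db": ["alice", "bob"]}, "table": {"orders": "user1"}}
migrated, dropped = keep_primary_owner(config)
print(migrated)
print(dropped)
```

Reviewing the `dropped` list before committing the migrated file helps confirm that the owner kept for each entry really is the intended primary owner.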
| 185 | +## 🔧 Test Configuration Updates |
| 186 | + |
| 187 | +Because the simplified schema accepts only a single owner per entry, the test configurations need to be updated: |
| 188 | + |
| 189 | +### Tests 1-2, 5-6: no changes needed ✅ |
| 190 | +These tests already use single strings and are compatible with the new schema |
| 191 | + |
| 192 | +### Test 3: Multiple Users → switch to a single owner |
| 193 | + |
| 194 | +```yaml |
| 195 | +# File: test-03-multiple-users.yaml |
| 196 | + |
| 197 | +# Before: |
| 198 | +ownerConfig: |
| 199 | +  database: |
| 200 | +    "finance_db": ["alice", "bob"] |
| 201 | +  table: |
| 202 | +    "finance_db.accounting.revenue": ["charlie", "david", "emma"] |
| 203 | +    "finance_db.accounting.expenses": ["frank"] |
| 204 | + |
| 205 | +# After: |
| 206 | +ownerConfig: |
| 207 | +  database: |
| 208 | +    "finance_db": "alice"  # ✅ single owner |
| 209 | +  table: |
| 210 | +    "finance_db.accounting.revenue": "charlie"  # ✅ |
| 211 | +    "finance_db.accounting.expenses": "frank"  # ✅ |
| 212 | +``` |
| 213 | + |
| 214 | +### Test 4: Validation → simplify the validation scenario |
| 215 | + |
| 216 | +```yaml |
| 217 | +# File: test-04-validation-errors.yaml |
| 218 | + |
| 219 | +# Before: |
| 220 | +ownerConfig: |
| 221 | +  database: |
| 222 | +    "finance_db": ["finance-team", "audit-team", "compliance-team"] |
| 223 | +  table: |
| 224 | +    "finance_db.accounting.revenue": ["alice", "bob", "finance-team"] |
| 225 | + |
| 226 | +# After (tests other validation scenarios): |
| 227 | +ownerConfig: |
| 228 | +  database: |
| 229 | +    "finance_db": "finance-team"  # ✅ single team |
| 230 | +  table: |
| 231 | +    "finance_db.accounting.revenue": "alice"  # ✅ |
| 232 | +    "finance_db.accounting.budgets": "nonexistent-team"  # tests a nonexistent owner |
| 233 | +``` |
| 234 | + |
| 235 | +### Test 7: Partial Success → change the test strategy |
| 236 | + |
| 237 | +```yaml |
| 238 | +# File: test-07-partial-success.yaml |
| 239 | + |
| 240 | +# Before: |
| 241 | +ownerConfig: |
| 242 | +  table: |
| 243 | +    "finance_db.accounting.revenue": ["alice", "nonexistent-user-1", "bob"] |
| 244 | + |
| 245 | +# After (tests a single nonexistent owner): |
| 246 | +ownerConfig: |
| 247 | +  table: |
| 248 | +    "finance_db.accounting.revenue": "alice"  # ✅ existing owner |
| 249 | +    "finance_db.accounting.budgets": "nonexistent-user-1"  # ✅ tests a nonexistent owner |
| 250 | +``` |
| 251 | + |
| 252 | +### Test 8: Complex Mixed → simplify the configuration |
| 253 | + |
| 254 | +```yaml |
| 255 | +# File: test-08-complex-mixed.yaml |
| 256 | + |
| 257 | +# Before: |
| 258 | +ownerConfig: |
| 259 | +  database: |
| 260 | +    "marketing_db": ["marketing-user-1", "marketing-user-2"] |
| 261 | +  databaseSchema: |
| 262 | +    "finance_db.accounting": ["alice", "bob"] |
| 263 | +  table: |
| 264 | +    "finance_db.accounting.revenue": ["charlie", "david", "emma"] |
| 265 | + |
| 266 | +# After: |
| 267 | +ownerConfig: |
| 268 | +  database: |
| 269 | +    "marketing_db": "marketing-user-1"  # ✅ |
| 270 | +  databaseSchema: |
| 271 | +    "finance_db.accounting": "alice"  # ✅ |
| 272 | +  table: |
| 273 | +    "finance_db.accounting.revenue": "charlie"  # ✅ |
| 274 | +``` |
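| 275 | + |
| 276 | +## 📊 Option Comparison |
| 277 | + |
| 278 | +| Option | Pros | Cons | Rating | |
| 279 | +|------|------|------|--------| |
| 280 | +| **Option 1: Simplify the schema** | Resolves the issue at the source, no fix script needed | No array support | ⭐⭐⭐⭐⭐ | |
| 281 | +| **Option 2: Auto-fix** | Keeps the original schema, supports arrays | The fix must be applied on every generation | ⭐⭐⭐ | |
| 282 | +| **Option 3: Wait for improvements** | Full support | Timeline uncertain | ⭐ | |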
| 283 | + |
| 284 | +## ✅ Recommended Implementation |
| 285 | + |
| 286 | +**Run now** (Option 1): |
| 287 | + |
| 288 | +```bash |
| 289 | +# 1. Use the simplified schema |
| 290 | +cp /workspace/ownerConfig_optimized.json \ |
| 291 | +  ~/workspaces/OpenMetadata/openmetadata-spec/src/main/resources/json/schema/type/ownerConfig.json |
| 292 | + |
| 293 | +# 2. Regenerate |
| 294 | +cd ~/workspaces/OpenMetadata/openmetadata-spec |
| 295 | +mvn clean install |
| 296 | + |
| 297 | +# 3. Reinstall |
| 298 | +cd ../ingestion |
| 299 | +pip install -e . --force-reinstall --no-deps |
| 300 | + |
| 301 | +# 4. Verify |
| 302 | +python3 -c "from metadata.generated.schema.type import ownerConfig; print('✅ Success')" |
| 303 | + |
| 304 | +# 5. Run the test |
| 305 | +cd .. |
| 306 | +metadata ingest -c ingestion/tests/unit/metadata/ingestion/owner_config_tests/test-05-inheritance-enabled.yaml |
| 307 | +``` |
| 308 | + |
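| 309 | +## 🎯 Summary |
| 310 | + |
| 311 | +**For Pydantic 2.11.9**: |
| 312 | +- ✅ Option 1 (simplify the schema) is the cleanest solution |
| 313 | +- ✅ Fully compatible, with no extra fix script needed |
| 314 | +- ✅ Code generation is stable and reliable |
| 315 | +- ⚠️ Array support is sacrificed for now (a single owner is enough for most scenarios) |
| 316 | + |
| 317 | +**If array support is needed later**: |
| 318 | +- It can be implemented at the Python level (using a validator) |
| 319 | +- Or with a more complex discriminated union schema |
| 320 | +- Or by waiting for datamodel-code-generator to improve |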