Pangolin supports advanced branching and merging capabilities for data catalogs, similar to Git but specialized for data lake table formats like Iceberg. This document explains the merge logic, conflict detection, and resolution strategies.
Pangolin's Enhanced Merge Conflict Resolution system provides intelligent conflict detection and resolution when merging branches. Unlike simple merge operations that fail on any conflict, this system:
- Detects conflicts automatically during merge operations
- Categorizes conflicts by type (schema, deletion, metadata, data overlap)
- Tracks merge operations with full lifecycle management
- Enables manual resolution with multiple strategies
- Supports automatic resolution for non-critical conflicts
Pangolin uses a 3-Way Merge algorithm when a common ancestor (base commit) can be detected. If no common ancestor is found (e.g., disjoint histories), it falls back to a 2-way merge (Source overwrites Target).
When you initiate a merge from Feature Branch to Main, Pangolin:
- Traverses the commit history of both branches.
- Finds the most recent common ancestor (Base Commit).
- Calculates changes:
  - Source Changes: Base -> Source Head
  - Target Changes: Base -> Target Head
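The steps above can be sketched in a few lines of Python. This is a minimal illustration, not Pangolin's implementation: the commit graph is assumed to be a simple mapping from commit id to a single parent, and the names `ancestors`, `find_base`, and `diff` are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical commit graph: commit id -> parent id (None for a root commit).
# A single-parent history is an assumption made to keep the sketch short.
Parents = Dict[str, Optional[str]]


def ancestors(parents: Parents, head: str) -> List[str]:
    """Return head followed by all of its ancestors, newest first."""
    chain: List[str] = []
    current: Optional[str] = head
    while current is not None:
        chain.append(current)
        current = parents[current]
    return chain


def find_base(parents: Parents, source_head: str, target_head: str) -> Optional[str]:
    """Return the most recent commit reachable from both heads, or None."""
    target_history = set(ancestors(parents, target_head))
    for commit in ancestors(parents, source_head):
        if commit in target_history:
            return commit
    return None  # disjoint histories: the caller would fall back to a 2-way merge


def diff(base_assets: Dict[str, str], head_assets: Dict[str, str]) -> Dict[str, str]:
    """Label every asset that differs between the base and a branch head."""
    changes: Dict[str, str] = {}
    for name in base_assets.keys() | head_assets.keys():
        if name not in head_assets:
            changes[name] = "deleted"
        elif name not in base_assets:
            changes[name] = "created"
        elif base_assets[name] != head_assets[name]:
            changes[name] = "modified"
    return changes
```

Calling `diff` once with the source head and once with the target head yields the two change sets that conflict detection compares.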
Pangolin detects conflicts by comparing modifications to assets in Source and Target relative to the Base.
| Scenario | Source | Target | Result |
|---|---|---|---|
| No Change | Same as Base | Same as Base | No Change |
| Fast-Forward | Modified | Same as Base | Update to Source |
| Incoming | Same as Base | Modified | Keep Target |
| Conflict | Modified (A) | Modified (B) | CONFLICT |
| Deletion | Deleted | Modified | CONFLICT (Deletion Conflict) |
| Resurrection | Created | Created | CONFLICT (if different content) |
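The table can be read as a small decision function. The sketch below is an illustration under stated assumptions: each asset version is represented by a content fingerprint string, `None` means the asset does not exist on that side, and identical changes on both sides are treated as non-conflicting (the table only states this for the "Created" row). The function name `classify` and the result labels are hypothetical.

```python
from typing import Optional


def classify(base: Optional[str], source: Optional[str], target: Optional[str]) -> str:
    """Classify one asset given its fingerprint in base, source, and target."""
    if source == target:
        # Both sides agree: either nothing changed or both made the same change.
        # Treating identical changes as non-conflicting is an assumption.
        return "no_change"
    if source == base:
        # Only the target changed.
        return "keep_target"
    if target == base:
        # Only the source changed.
        return "update_to_source"
    if source is None or target is None:
        # One side deleted the asset while the other modified it.
        return "deletion_conflict"
    # Both sides changed (or created) the asset with different content.
    return "conflict"
```

For example, `classify("v1", None, "v2")` corresponds to the Deletion row, and `classify(None, "a", "b")` to the Resurrection row.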
Schema conflicts occur when the same table has different schemas in the source and target branches.
- Example: Both branches added columns independently or changed column types.
- Resolution: Requires manual review to merge schema changes properly.
Deletion conflicts occur when an asset is deleted in one branch but modified in another.
- Example: Source deleted temp_table while Target added data to it.
- Resolution: Choose to keep the asset (with its modifications) or delete it.
Metadata conflicts occur when asset metadata (tags, description, properties) differs between branches.
- Resolution: Can often be auto-resolved for non-critical properties.
Data overlap conflicts occur when the same partitions of a table are modified in both branches.
- Resolution: Requires manual review to determine which version to keep.
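To make the four categories concrete, here is a rough Python sketch that labels a conflicting asset. Everything about the data model is an assumption for illustration: the `AssetVersion` class, its fields, and the idea that one asset can carry several conflict types at once are not taken from Pangolin. The function expects that both sides have already been found to differ from the base.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class AssetVersion:
    """Hypothetical view of one asset on one branch."""
    schema: Dict[str, str]                                     # column name -> type
    metadata: Dict[str, str] = field(default_factory=dict)     # tags, description, properties
    touched_partitions: Set[str] = field(default_factory=set)  # partitions changed since base


def conflict_types(source: Optional[AssetVersion], target: Optional[AssetVersion]) -> List[str]:
    """Return the conflict categories for an asset changed on both branches.

    None means the asset was deleted on that side.
    """
    if source is None and target is None:
        return []  # deleted on both sides: nothing to reconcile
    if source is None or target is None:
        return ["deletion"]
    found: List[str] = []
    if source.schema != target.schema:
        found.append("schema")
    if source.metadata != target.metadata:
        found.append("metadata")
    if source.touched_partitions & target.touched_partitions:
        found.append("data_overlap")
    return found
```

A caller could then route `metadata` results to automatic resolution and hold `schema`, `deletion`, and `data_overlap` results for manual review, in line with the guidance above.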
Initiate a merge:
POST /api/v1/branches/merge
{
  "catalog": "production",
  "source_branch": "feature-branch",
  "target_branch": "main"
}
Response (Conflicts Detected):
{
  "status": "conflicted",
  "operation_id": "550e8400-e29b...",
  "conflicts": 3,
  "message": "Merge has 3 conflicts that need resolution"
}
List the conflicts for the operation:
GET /api/v1/merge-operations/{operation_id}/conflicts
Resolve a conflict:
POST /api/v1/conflicts/{conflict_id}/resolve
{
  "strategy": "TakeSource",
  "resolved_value": null
}
Resolution Strategies:
- TakeSource: Use the source branch version.
- TakeTarget: Use the target branch version.
- Manual: Provide custom resolution (e.g., a merged schema) in resolved_value.
- AutoMerge: Automatically merge non-conflicting changes.
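A client might validate the resolution payload before sending it. The helper below is a sketch only: `build_resolution` is a hypothetical name, and the rule that `Manual` must come with a non-null `resolved_value` is an assumption inferred from the strategy descriptions, not a documented server check.

```python
import json
from typing import Any, Optional

# Strategy names as listed in this document.
STRATEGIES = {"TakeSource", "TakeTarget", "Manual", "AutoMerge"}


def build_resolution(strategy: str, resolved_value: Optional[Any] = None) -> str:
    """Return the JSON body for POST /api/v1/conflicts/{conflict_id}/resolve."""
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy}")
    if strategy == "Manual" and resolved_value is None:
        # Assumption: a manual resolution is meaningless without a value.
        raise ValueError("Manual resolution requires resolved_value")
    return json.dumps({"strategy": strategy, "resolved_value": resolved_value})
```

`build_resolution("TakeSource")` produces the same body as the example request, with `resolved_value` serialized as `null`.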
After all conflicts are resolved:
POST /api/v1/merge-operations/{operation_id}/complete
If you decide not to proceed:
POST /api/v1/merge-operations/{operation_id}/abort
A merge operation moves through the following states:
Pending → Conflicted → Resolving → Ready → Completed
↓ ↑
└─────────────→ Aborted ←──────────────────┘
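The lifecycle can be modeled as a small state machine. The sketch below assumes that the forward path is strictly linear and that any non-terminal state may move to Aborted; the diagram does not make clear exactly which states allow an abort, so treat that rule as a guess. The names `MergeState` and `can_transition` are hypothetical.

```python
from enum import Enum


class MergeState(Enum):
    PENDING = "Pending"
    CONFLICTED = "Conflicted"
    RESOLVING = "Resolving"
    READY = "Ready"
    COMPLETED = "Completed"
    ABORTED = "Aborted"


# Forward path taken from the lifecycle diagram.
FORWARD = {
    MergeState.PENDING: MergeState.CONFLICTED,
    MergeState.CONFLICTED: MergeState.RESOLVING,
    MergeState.RESOLVING: MergeState.READY,
    MergeState.READY: MergeState.COMPLETED,
}

TERMINAL = {MergeState.COMPLETED, MergeState.ABORTED}


def can_transition(current: MergeState, new: MergeState) -> bool:
    """Return True if moving from current to new is allowed in this sketch."""
    if current in TERMINAL:
        return False  # finished operations never change state
    if new is MergeState.ABORTED:
        return True   # assumption: abort is allowed from any non-terminal state
    return FORWARD.get(current) is new
```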
List merge operations for a catalog:
GET /api/v1/catalogs/{catalog_name}/merge-operations
Get a single merge operation:
GET /api/v1/merge-operations/{operation_id}
Best practices:
- Merge Frequently: Minimize divergence between branches.
- Review Thoroughly: Schema and deletion conflicts can be destructive.
- Test After Merge: Always validate data and schema after a complex merge.
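The endpoints in this document can be chained into a single workflow. The following sketch shows one possible shape for a client. It makes several assumptions that the document does not confirm: the conflicts endpoint is taken to return a JSON array of objects with an `id` field, the transport is an injected `send(method, path, body)` callable rather than a real HTTP client, and `run_merge` and `choose` are hypothetical names.

```python
from typing import Any, Callable, Optional

# Hypothetical transport: performs the HTTP call and returns the parsed JSON body.
Send = Callable[[str, str, Optional[dict]], Any]
# Hypothetical policy: returns a strategy name for a conflict, or None to abort.
Choose = Callable[[dict], Optional[str]]


def run_merge(send: Send, catalog: str, source: str, target: str, choose: Choose) -> str:
    """Initiate a merge, resolve each conflict, then complete or abort."""
    result = send("POST", "/api/v1/branches/merge", {
        "catalog": catalog,
        "source_branch": source,
        "target_branch": target,
    })
    if result.get("status") != "conflicted":
        # No conflicts reported; pass the server's status through unchanged.
        return str(result.get("status", "unknown"))

    op_id = result["operation_id"]
    # Assumption: the response is a list of conflict objects with an "id" field.
    conflicts = send("GET", f"/api/v1/merge-operations/{op_id}/conflicts", None)
    for conflict in conflicts:
        strategy = choose(conflict)
        if strategy is None:
            send("POST", f"/api/v1/merge-operations/{op_id}/abort", None)
            return "aborted"
        send("POST", f"/api/v1/conflicts/{conflict['id']}/resolve",
             {"strategy": strategy, "resolved_value": None})

    send("POST", f"/api/v1/merge-operations/{op_id}/complete", None)
    return "completed"
```

Injecting `send` keeps the sketch independent of any particular HTTP library and makes the flow easy to exercise against a stub before pointing it at a real catalog.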