Depends on the filesystem catalog issue.
A filesystem catalog uses version-hint.text as the latest-version pointer. Crashes between writing vN.metadata.json and updating version-hint.text can leave the pointer trailing the actual latest committed metadata. Recovery should:
- List
metadata/v*.metadata.json
- Parse each, identify the highest version with a consistent snapshot lineage
- Optionally rewrite
version-hint.text to point at it (or surface the discrepancy and let the caller decide)
Spark/Java avoid this entirely because the fix is implicit on next read (HadoopTableOperations rescans), but a C++ embedded user committing in a long-lived process needs an explicit recovery entry point.
API sketch
class FilesystemCatalog : public Catalog {
Result<int64_t> RepairLatestPointer(const TableIdentifier& id);
};
Depends on the filesystem catalog issue.
A filesystem catalog uses
version-hint.textas the latest-version pointer. Crashes between writingvN.metadata.jsonand updatingversion-hint.textcan leave the pointer trailing the actual latest committed metadata. Recovery should:metadata/v*.metadata.jsonversion-hint.textto point at it (or surface the discrepancy and let the caller decide)Spark/Java avoid this entirely because the fix is implicit on next read (
HadoopTableOperationsrescans), but a C++ embedded user committing in a long-lived process needs an explicit recovery entry point.API sketch