Summary
We're using umya-spreadsheet (v2.3.3) in office2pdf to parse XLSX files. When processing a corpus of ~2,800 real-world XLSX files (sourced from LibreOffice and Apache POI test suites), 15 files cause panics due to unwrap() calls in the library. We've worked around this with catch_unwind, but the panics should ideally be handled gracefully.
Related to #271 (panic reduction proposal) and #301 (FileNotFound on .xlsm).
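For reference, the workaround has roughly the following shape. This is a minimal sketch, not the actual office2pdf code: `parse_guarded` is a hypothetical helper name, and the closure stands in for whatever call reads the workbook.

```rust
use std::panic::{self, UnwindSafe};

/// Runs a parser closure and converts any panic into `Err(String)`.
/// `parse` is a placeholder for the real XLSX read call (assumption).
pub fn parse_guarded<T, F>(parse: F) -> Result<T, String>
where
    F: FnOnce() -> T + UnwindSafe,
{
    panic::catch_unwind(parse).map_err(|payload| {
        // Panic payloads are usually `&str` or `String`; fall back otherwise.
        if let Some(s) = payload.downcast_ref::<&str>() {
            (*s).to_string()
        } else if let Some(s) = payload.downcast_ref::<String>() {
            s.clone()
        } else {
            "unknown panic payload".to_string()
        }
    })
}
```

This only works when the binary is built with unwinding panics; with `panic = "abort"` the process terminates instead, which is one reason a `Result`-based API would be preferable.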
Panicking files (15 files)
FileNotFound panics (7 files)
unwrap() on missing zip entries for hyperlinks/charts:
chart_hyperlink.xlsx, hyperlink.xlsx, tdf130959.xlsx, tdf115192.xlsx (LibreOffice)
47504.xlsx, bug63189.xlsx, ConditionalFormattingSamples.xlsx (Apache POI)
The stack trace points to a zip entry lookup by relationship ID where the referenced file doesn't exist in the archive (e.g., external hyperlinks).
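One possible shape for a non-panicking lookup is sketched below. Everything here is hypothetical and not taken from umya-spreadsheet: `Relationship`, `PartError`, and `resolve_part` are illustrative names, and a `HashMap` stands in for the zip archive. The sketch assumes relationships marked `TargetMode="External"` are skipped rather than looked up, and that a missing internal part becomes an `Err`.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum PartError {
    MissingPart(String),
}

/// Hypothetical relationship record; not a umya-spreadsheet type.
pub struct Relationship {
    pub target: String,
    /// True when the .rels entry has TargetMode="External".
    pub external: bool,
}

/// Looks up the part a relationship points to.
/// `archive` is a stand-in for a zip archive: entry name -> bytes.
pub fn resolve_part<'a>(
    archive: &'a HashMap<String, Vec<u8>>,
    rel: &Relationship,
) -> Result<Option<&'a [u8]>, PartError> {
    if rel.external {
        // External targets (e.g. URLs) are not stored in the package.
        return Ok(None);
    }
    archive
        .get(&rel.target)
        .map(|bytes| Some(bytes.as_slice()))
        .ok_or_else(|| PartError::MissingPart(rel.target.clone()))
}
```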
ParseFloatError / ParseIntError panics (3 files)
unwrap() on str::parse::<f64>() or str::parse::<u32>():
check-boolean.xlsx — Boolean cell value parsed as float
functions-excel-2010.xlsx — Integer overflow parsing cell reference
FormulaEvalTestData_Copy.xlsx — Integer overflow parsing cell reference
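For these cases, propagating the standard library's parse errors with `?` would be enough. The sketch below uses hypothetical names (`CellParseError`, `parse_number`, `parse_row`) and does not mirror the library's internal structure; it only shows that non-numeric text and out-of-range integers yield `Err` rather than a panic.

```rust
use std::num::{ParseFloatError, ParseIntError};

/// Hypothetical error type wrapping the two std parse errors.
#[derive(Debug)]
pub enum CellParseError {
    Float(ParseFloatError),
    Int(ParseIntError),
}

impl From<ParseFloatError> for CellParseError {
    fn from(e: ParseFloatError) -> Self {
        CellParseError::Float(e)
    }
}

impl From<ParseIntError> for CellParseError {
    fn from(e: ParseIntError) -> Self {
        CellParseError::Int(e)
    }
}

/// Parses a numeric cell value; non-numeric text yields `Err`.
pub fn parse_number(raw: &str) -> Result<f64, CellParseError> {
    Ok(raw.trim().parse::<f64>()?)
}

/// Parses the row part of a cell reference; overflow yields `Err`.
pub fn parse_row(raw: &str) -> Result<u32, CellParseError> {
    Ok(raw.trim().parse::<u32>()?)
}
```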
unwrap() on None panics (3 files)
Expected XML elements/attributes missing:
tdf100709.xlsx, 64450.xlsx, sample-beta.xlsx
dataBar end element panics (2 files)
Conditional formatting parser panics via panic!():
tdf162948.xlsx, NewStyleConditionalFormattings.xlsx
Error: Could not find dataBar end element
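A reader loop can report the same condition as an error value instead. The following is a sketch under stated assumptions: `Event`, `ParseError`, and `read_data_bar` are hypothetical and only model a stream of XML events; they are not the library's reader types.

```rust
/// Simplified XML event model (assumption, for illustration only).
#[derive(Debug, PartialEq)]
pub enum Event<'a> {
    Start(&'a str),
    End(&'a str),
    Eof,
}

#[derive(Debug, PartialEq)]
pub enum ParseError {
    MissingEndElement(&'static str),
}

/// Consumes events up to and including the closing `dataBar` tag and
/// returns the number of start elements seen before it.
pub fn read_data_bar<'a, I>(events: &mut I) -> Result<usize, ParseError>
where
    I: Iterator<Item = Event<'a>>,
{
    let mut children = 0;
    for event in events {
        match event {
            Event::End("dataBar") => return Ok(children),
            Event::Start(_) => children += 1,
            Event::Eof => break,
            Event::End(_) => {}
        }
    }
    // Input ended without a closing tag: report it instead of panicking.
    Err(ParseError::MissingEndElement("dataBar"))
}
```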
Expected behavior
All of these should return Err instead of panicking. Library consumers should be able to handle failures without catch_unwind.
Environment
- umya-spreadsheet v2.3.3
- office2pdf v0.3.0
- Rust 1.85+
- Files sourced from LibreOffice and Apache POI test suites (publicly available)