Summary
`scitex.io.save(df, "foo.parquet")` (and `.feather`) prints an "Unsupported file format" warning and writes nothing, even though standalone `scitex-io` registers both extensions and `scitex_io.get_saver('.parquet')` returns the valid `_save_parquet` function.
The umbrella's `save()` dispatcher (in `scitex/io/_save.py`) is not consulting the same registry that `list_formats()` / `get_saver()` consult.
Reproducer
```python
import os
import tempfile

import pandas as pd
import scitex as stx

# Standalone registry says: parquet is supported, get_saver returns a function
print('.parquet listed:', '.parquet' in stx.io.list_formats()['save']['builtin'])  # True
print(stx.io.get_saver('.parquet'))  # <function _save_parquet at ...>

# Umbrella save() does not dispatch to it
with tempfile.TemporaryDirectory() as td:
    p = os.path.join(td, "x.parquet")
    stx.io.save(pd.DataFrame({'a': [1.0]}), p, verbose=True)
    # WARN: Unsupported file format. .../x.parquet was not saved.
    assert not os.path.exists(p)

# Feather: same pattern
with tempfile.TemporaryDirectory() as td:
    p = os.path.join(td, "x.feather")
    stx.io.save(pd.DataFrame({'a': [1.0]}), p, verbose=True)
    assert not os.path.exists(p)
```
Likely root cause
`scitex/io/_save.py` carries its own extension→saver mapping (used by the umbrella's `save()` dispatcher), which has not been updated to match `scitex_io`'s registry after the standalonization. Three sources of truth now exist:
- `scitex_io.list_formats()['save']['builtin']`: includes `.parquet`, `.feather` ✓
- `scitex_io.get_saver('.parquet')`: returns `_save_parquet` ✓
- Umbrella `_save.py` dispatcher: extension unknown, falls through to "Unsupported" ✗
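Drift of this kind could be caught by a regression test that compares the extensions the registry advertises with those the dispatcher handles. The sketch below uses hard-coded, purely illustrative extension lists and a hypothetical helper, `missing_from_dispatcher`. In a real test the first list would come from `list_formats()['save']['builtin']` and the second from whatever mapping `_save.py` actually uses, which is not known here.

```python
from typing import Iterable, Set


def missing_from_dispatcher(
    registry_exts: Iterable[str], dispatcher_exts: Iterable[str]
) -> Set[str]:
    """Return extensions the registry lists but the dispatcher cannot handle."""
    return {e.lower() for e in registry_exts} - {e.lower() for e in dispatcher_exts}


# Illustrative values only, not the actual contents of either mapping.
registry_exts = [".csv", ".pkl", ".parquet", ".feather"]
dispatcher_exts = [".csv", ".pkl"]

drift = missing_from_dispatcher(registry_exts, dispatcher_exts)
print(sorted(drift))  # ['.feather', '.parquet']
```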
Suggested fix
Have `scitex.io.save()` delegate dispatch entirely to `scitex_io.get_saver(ext)`:
```python
ext = _os.path.splitext(spath)[1].lower()
saver = scitex_io.get_saver(ext)
if saver is None:
    print(f"WARN: Unsupported file format. {spath} was not saved.")
    return
saver(obj, spath_final, **kwargs)
```
This makes the umbrella delegate to the single source of truth and prevents future drift.
Impact
Hit while building a feature-extraction pipeline that needed parquet for ~4 000 small DataFrames per patient cohort. Pivoted to `.pkl` as a workaround. The silent dispatch failure is dangerous because a subsequent `stx.io.load(<path>)` raises `FileNotFoundError` far from the actual write-site failure.
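Until the dispatcher is fixed, a caller-side guard can turn the no-op into an immediate error at the write site. The following is only a sketch: `checked_save` is a hypothetical helper, not part of scitex, and it takes the save function as an argument so that the example runs without scitex installed. `_noop_save` is a stand-in that mimics the reported behaviour.

```python
import os
import tempfile
from typing import Any, Callable


def checked_save(save_fn: Callable[..., Any], obj: Any, path: str, **kwargs: Any) -> None:
    """Call save_fn, then fail right away if no file exists at path."""
    save_fn(obj, path, **kwargs)
    if not os.path.exists(path):
        raise IOError(f"{path} was not written; the saver may not support this format")


def _noop_save(obj: Any, path: str, **kwargs: Any) -> None:
    """Stand-in that mimics the reported behaviour: warn and write nothing."""
    print(f"WARN: Unsupported file format. {path} was not saved.")


with tempfile.TemporaryDirectory() as td:
    try:
        checked_save(_noop_save, {"a": [1.0]}, os.path.join(td, "x.parquet"))
        failed_at_write_site = False
    except IOError:
        failed_at_write_site = True
```

With scitex this would presumably be called as `checked_save(stx.io.save, df, path)`. Note that the existence check cannot detect a skipped write when a stale file is already present at `path`.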
Cross-ref
Originally filed (and closed as misdirected) at scitex-io#25 and scitex-io#26.