LindiH5pyDataset maintains a dictionary that maps URLs to the LindiRemfile objects that were used to open external array datasets. These open files are never closed. For a reason I don't understand, when a simple script defines a function whose return type is a Tuple of lindi.LindiH5pyFile, lindi.LindiH5pyGroup, or lindi.LindiH5pyDataset, we get a segmentation fault when Python cleans up after execution. This does not happen if the return type is simply lindi.LindiH5pyFile. It also does not happen if no data is sliced from the lindi.LindiH5pyDataset, in which case no LindiRemfile object is opened.
I guess:
- Wrapping lindi.LindiH5pyFile in Tuple defers resolution of lindi.LindiH5pyFile to a later time
- lindi.LindiH5pyFile imports LindiH5pyDataset
- Importing LindiH5pyDataset initializes a module-level variable _external_hdf5_clients
- When cleaning up Python execution, the LindiRemfile that is stored in LindiH5pyDataset gets closed and deleted from one imported LindiH5pyDataset but not the other? I'm not sure...
MWE:
from typing import Tuple
import lindi
def do_nothing() -> Tuple[lindi.LindiH5pyFile]:
    pass
rfs = "https://dandi-api-staging-dandisets.s3.amazonaws.com/blobs/7f0/aa4/7f0aa474-4169-42f8-a895-ada0af4072c7"
client = lindi.LindiH5pyFile.from_lindi_file(url_or_path=rfs)
print(client["acquisition"]["ElectricalSeries"]["data"][0,0])
# the following code prevents the segmentation fault during Python execution clean up
# the external array link is where the external array for client["acquisition"]["ElectricalSeries"]["data"] is located
ext_array_link = "https://api.dandiarchive.org/api/assets/df0e074e-3509-4b03-908e-2a1303072707/download/"
client["acquisition"]["ElectricalSeries"]["data"]._get_external_hdf5_client(ext_array_link).close()
Related to NeurodataWithoutBorders/nwb-benchmarks#136