It seems that there are some edge cases when serializing scalars and 0-dimensional numpy arrays.
This distinction has always confused me, so for precision I quote numpy's documentation:
"An array scalar is an instance of the types/classes float32, float64, etc., whereas a 0-dimensional array is an ndarray instance containing precisely one array scalar."
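To make the distinction concrete, here is a minimal numpy-only illustration (no json_tricks involved):

```python
import numpy as np

# An array scalar vs. a 0-dimensional array: both print as 123,
# but one is an np.generic subclass and the other is an ndarray.
scalar = np.uint32(123)                 # array scalar
zero_d = np.zeros((), dtype='uint32')   # 0-dimensional ndarray
zero_d[...] = 123

print(type(scalar).__name__)      # uint32
print(type(zero_d).__name__)      # ndarray
print(zero_d.shape, zero_d.ndim)  # () 0
```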
So I added the following test to the file tests/test_np.py to see if things serialize correctly:
from numpy import zeros
from json_tricks import dumps, loads

def test_nd_array_shape_empty():
    to_dump = zeros((), dtype='uint32')
    to_dump[...] = 123
    the_dumps = dumps(to_dump)
    the_double_dumps = dumps(loads(dumps(to_dump)))
    assert the_dumps == the_double_dumps
_________________________________ test_nd_array_shape_empty __________________________________

    def test_nd_array_shape_empty():
        to_dump = zeros((), dtype='uint32')
        to_dump[...] = 123
        the_dumps = dumps(to_dump)
        the_double_dumps = dumps(loads(dumps(to_dump)))
>       assert the_dumps == the_double_dumps
E       assert '{"__ndarray_... "shape": []}' == '123'
E         - 123
E         + {"__ndarray__": 123, "dtype": "uint32", "shape": []}

tests/test_np.py:197: AssertionError
After round-tripping, the 0-dimensional shape is not preserved: the array is downcast to a numpy scalar.
This strange behavior leads the test suite to use encode_scalars_inplace as a workaround for the warning that "scalars cannot be reliably encoded", and then to rely on the same automatic-downcast behavior to recover the original structure:
https://github.com/mverleg/pyjson_tricks/blob/master/tests/test_np.py#L170
However, this recovery would be incorrect if the user mixes 0-dimensional numpy arrays with genuine scalars.
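One way the decoder could preserve the distinction is to honor the stored "shape" key when rebuilding the value. A minimal sketch follows; the payload dict layout matches the serialized form shown in the traceback above, but the reshape-based decode step is my assumption, not pyjson_tricks's actual decoder:

```python
import numpy as np

# Hypothetical decode step: rebuild from the serialized dict, keeping the
# 0-dimensional shape instead of downcasting to an array scalar.
payload = {"__ndarray__": 123, "dtype": "uint32", "shape": []}
restored = np.asarray(payload["__ndarray__"],
                      dtype=payload["dtype"]).reshape(payload["shape"])

print(restored.shape, restored.ndim)  # () 0
```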
Now, I originally came here to address a pretty specific use case of ours: serializing numpy datetime64 scalars. I tried to promote them to 0-dimensional arrays; however, this breaks two assumptions made in pyjson_tricks:
- numpy.datetime64 objects are not handled as the numpy.generic instances the encoder expects
- numpy.datetime64 constructors cannot be obtained with the getattr used in the decoder, since their dtype names carry units (annoying, I know; dates are complicated)
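The unit problem in the second point can be seen directly: the dtype name of a datetime64 value includes its unit, so it is not a valid attribute name on the numpy module. The hasattr lookup below mimics, as an assumption, what a getattr-based decoder would attempt:

```python
import numpy as np

stamp = np.datetime64('2020-01-01T00:00:00')
print(stamp.dtype.name)               # 'datetime64[s]' -- the unit is part of the name
print(hasattr(np, stamp.dtype.name))  # False: no such attribute on the numpy module
print(hasattr(np, 'uint32'))          # True: plain dtype names do resolve
```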
My proposal unfortunately breaks anybody using encode_scalars_inplace, such as the test suite itself.