From 1899ec4171f6b151f11a81248543912d95b4d9a4 Mon Sep 17 00:00:00 2001
From: Francesco Fiusco
Date: Mon, 27 Jan 2025 17:05:28 +0100
Subject: [PATCH] Added npz example and credited Aalto

---
 content/scientific-data.rst |  2 ++
 content/stack.rst           | 46 ++++++++++++++++++++++++++++++-------
 2 files changed, 40 insertions(+), 8 deletions(-)

diff --git a/content/scientific-data.rst b/content/scientific-data.rst
index 209260e..952c725 100644
--- a/content/scientific-data.rst
+++ b/content/scientific-data.rst
@@ -226,6 +226,8 @@ An overview of common data formats
    - 🟨 : Ok / depends on a case
    - ❌ : Bad
 
+   Adapted from Aalto University's `Python for scientific computing `__.
+
 Some of these formats (e.g. JSON and CSV) are saved as text files (ASCII), thus they are
 human-readable. This makes them easier to visually check them (e.g. for format errors) and are
 supported out of the box by many tools. However, they tend to be slower during I/O and
diff --git a/content/stack.rst b/content/stack.rst
index 3083010..8adb2c3 100644
--- a/content/stack.rst
+++ b/content/stack.rst
@@ -359,14 +359,44 @@ Views and copies of arrays
 I/O with NumPy
 ^^^^^^^^^^^^^^
 
-- Numpy provides functions for reading data from file and for writing data
-  into the files
-- Simple text files
-
-  - :meth:`numpy.loadtxt`
-  - :meth:`numpy.savetxt`
-  - Data in regular column layout
-  - Can deal with comments and different column delimiters
+NumPy provides functions for reading arrays from and writing arrays to files, in both
+text (ASCII) formats such as CSV and the binary npy/npz formats:
+
+.. tabs::
+
+   .. tab:: CSV
+
+      The ``numpy.savetxt()`` and ``numpy.loadtxt()`` functions write and read text files.
+      They use a regular column layout and can deal with different delimiters,
+      header lines and number formats.
+
+      .. code-block:: python
+
+         a = np.array([1, 2, 3, 4])
+         np.savetxt("my_array.csv", a)
+         b = np.loadtxt("my_array.csv")
+         np.array_equal(a, b)
+         # True
+
+   .. tab:: Binary
+
+      The npy format is a binary format used to dump arrays of any
+      shape. Several arrays can be saved into a single npz file, which is
+      simply a zipped collection of different npy files. All the arrays to
+      be saved into an npz file can be passed as keyword arguments to the ``numpy.savez()``
+      function. The data can then be recovered using the ``numpy.load()`` function,
+      which returns a dictionary-like object in which each key points to one of the arrays:
+
+      .. code-block:: python
+
+         a = np.array([1, 2, 3, 4])
+         b = np.array([5, 6, 7, 8])
+
+         np.savez("my_arrays.npz", array_1=a, array_2=b)
+         data = np.load("my_arrays.npz")
+         np.array_equal(data['array_1'], a)
+         np.array_equal(data['array_2'], b)
+         # Both are True
 
 Random numbers