Merged
102 changes: 80 additions & 22 deletions README.md
@@ -1,6 +1,6 @@
## nt2.py

Python package for visualization and post-processing of the [`Entity`](https://github.com/entity-toolkit/entity) simulation data. For usage, please refer to the [documentation](https://entity-toolkit.github.io/wiki/getting-started/vis/#nt2py). The package is distributed via [`PyPI`](https://pypi.org/project/nt2py/):
Python package for visualization and post-processing of the [`Entity`](https://github.com/entity-toolkit/entity) simulation data. For usage, please refer to the [documentation](https://entity-toolkit.github.io/wiki/content/2-howto/2-vis/#nt2py). The package is distributed via [`PyPI`](https://pypi.org/project/nt2py/):

```sh
pip install nt2py
@@ -14,25 +14,56 @@ Simply pass the location to the data when initializing the main `Data` object:
import nt2

data = nt2.Data("path/to/data")
# example:
# data = nt2.Data("path/to/shock")
```

The data is stored in specialized containers which can be accessed via corresponding attributes:

```python
data.fields # < xr.Dataset
data.particles # < dict[int : xr.Dataset]
data.particles # < special object which returns a pd.DataFrame when .load() is called
data.spectra # < xr.Dataset
```

> If using a Jupyter notebook, you can quickly preview the loaded metadata by running a cell containing just `data` (or, in regular Python, by calling `print(data)`).

> Note that, by default, `hdf5` support is disabled in `nt2py` (i.e., only the `ADIOS2` format is supported). To enable it, install the package with `pip install "nt2py[hdf5]"` instead of plain `pip install nt2py`.

#### Examples
#### Accessing the data

Plot a field (in cartesian space) at a specific time (or output step):
Fields and spectra are stored as lazily loaded `xarray` datasets (a collection of equal-sized arrays with shared axis coordinates). You may access the coordinates in each dimension using `.coords`:

```python
data.fields.coords
data.spectra.coords
```

Individual arrays can be requested as attributes, e.g., `data.fields.Ex`. One can also slice/select via the coordinates, i.e.,

```python
data.fields.sel(t=5, method="nearest")
```

accesses all the fields at time `t=5` (with `method="nearest"`, the closest available output time to `5` is taken). You may also access by index along each coordinate:

```python
data.fields.isel(x=-1)
```

accesses all the fields in the last position along the `x` coordinate.

Note that none of these operations load the actual data into memory; the data is only loaded when explicitly requested (e.g., when plotting, or when calling `.values` or `.load()`).
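The same lazy pattern can be illustrated with a self-contained toy dataset (plain `xarray` backed by `dask`, standing in for `data.fields`; not Entity data):

```python
import dask.array as da
import xarray as xr

# toy dask-backed dataset standing in for data.fields
fields = xr.Dataset(
    {"Ex": (("t", "x"), da.zeros((4, 8), chunks=(1, 8)))},
    coords={"t": [0.0, 1.0, 2.0, 3.0], "x": list(range(8))},
)

lazy = fields.Ex.isel(x=-1)  # still lazy: nothing computed yet
loaded = lazy.values         # accessing .values triggers the actual computation
```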

Particles are stored in a special lazy container which behaves much like `xarray`; you can still make selections using specific queries. For instance,

```python
data.particles.sel(sp=[1, 2, 4]).isel(t=-1)
```

selects all the particles of species 1, 2, and 4 at the last timestep. The data itself is loaded by calling the `.load()` method, which returns a plain `pandas` dataframe.
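Since `.load()` returns an ordinary `pandas` dataframe, all the usual dataframe operations apply afterwards; for instance (with a toy frame in place of real particle data — column names here are illustrative):

```python
import pandas as pd

# toy stand-in for the dataframe returned by .load()
prtls = pd.DataFrame({
    "sp": [1, 1, 2, 4],
    "x":  [0.1, 0.5, 0.9, 1.3],
    "ux": [0.0, -0.2, 0.4, 1.1],
})

# e.g., mean x-velocity per species
mean_ux = prtls.groupby("sp")["ux"].mean()
```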

#### Plotting

Plot a field (in Cartesian coordinates) at a specific time (or output step):

```python
data.fields.Ex.sel(t=10.0, method="nearest").plot() # time ~ 10
@@ -78,37 +109,43 @@ data.fields\
You can also create a movie of a single field quantity (which may be a custom expression):

```python
(data.fields.Ex * data.fields.Bx).sel(x=slice(None, 0.2)).movie.plot(name="ExBx", vmin=-0.01, vmax=0.01, cmap="BrBG")
(data.fields.Ex * data.fields.Bx).sel(x=slice(None, 0.2)).movie.plot(name="ExBx")
```

For particles, one can also make 2D phase-space plots:

```python
data.particles[1].sel(t=1.0, method="nearest").particles.phaseplot(x="x", y="uy", xnbins=100, ynbins=200, xlims=(0, 100), cmap="inferno")
data.particles.sel(sp=1).sel(t=1.0, method="nearest").phase_plot(
x_quantity=lambda f: f.x,
y_quantity=lambda f: f.ux,
xy_bins=(np.linspace(0, 60, 100), np.linspace(-2, 2, 100)),
)
```

or a spectrum plot:

```python
data.particles.sel(sp=[1, 2]).sel(t=1.0, method="nearest").spectrum_plot()
```

You may also combine different quantities and plots (e.g., fields & particles) to produce a more customized movie:

```python
def plot(t, data):
fig, ax = mpl.pyplot.subplots()
fig, ax = plt.subplots()
data.fields.Ex.sel(t=t, method="nearest").sel(x=slice(None, 0.2)).plot(
ax=ax, vmin=-0.001, vmax=0.001, cmap="BrBG"
)
for sp in range(1, 3):
ax.scatter(
data.particles[sp].sel(t=t, method="nearest").x,
data.particles[sp].sel(t=t, method="nearest").y,
c="r" if sp == 1 else "b",
)
prtls = data.particles.sel(t=t, method="nearest").load()
# color each particle by its species (element-wise, since prtls.sp is a Series)
ax.scatter(prtls.x, prtls.y, c=np.where(prtls.sp == 1, "r", "b"))
ax.set_aspect(1)
data.makeMovie(plot)
```

You may also access the movie-making functionality directly in case you want to use it for other things:

```python
import nt2.export as nt2e
import nt2.plotters.export as nt2e

def plot(t):
...
@@ -127,16 +164,35 @@ nt2e.makeFramesAndMovie(
)
```

### Dashboard
#### Raw readers

Support for the dask dashboard is still in beta, but you can access it by first launching the dashboard client:
If you want to access the raw data without `nt2py`'s `xarray`/`dask` lazy-loading, you can use the readers directly. For example, for the `ADIOS2` output format:

```python
import nt2
nt2.Dashboard()
import nt2.readers.adios2 as nt2a

# define a reader
reader = nt2a.Reader()

# get all the valid steps for particles
valid_steps = reader.GetValidSteps("path/to/sim", "particles")

# get all variable names which have prefix "p" at the first valid step
variable_names = reader.ReadCategoryNamesAtTimestep(
"path/to/sim", "particles", "p", valid_steps[0]
)

# convert the variable set into a list and take the first element
variable = list(variable_names)[0]

# read the actual array from the file
reader.ReadArrayAtTimestep(
"path/to/sim", "particles", variable, valid_steps[0]
)
```

This will output the port where the dashboard server is running, e.g., `Dashboard: http://127.0.0.1:8787/status`. Click on it (or enter in your browser) to open the dashboard.
There are many more functions available within the reader. For `hdf5`, you can simply change the import to `nt2.readers.hdf5`, and the rest should remain the same.


### CLI

@@ -170,7 +226,7 @@ nt2 plot myrun/mysimulation --fields "E.*;B.*" --sel "x=slice(-5, None); z=0.5"

1. Lazy loading and parallel processing of the simulation data with [`dask`](https://dask.org/).
2. Context-aware data manipulation with [`xarray`](http://xarray.pydata.org/en/stable/).
3. Parallel plotting and movie generation with [`multiprocessing`](https://docs.python.org/3/library/multiprocessing.html) and [`ffmpeg`](https://ffmpeg.org/).
3. Parallel plotting and movie generation with [`loky`](https://pypi.org/project/loky/) and [`ffmpeg`](https://ffmpeg.org/).
4. Command-line interface, the `nt2` command, for quick plotting (both movies and snapshots).

### Testing
@@ -188,3 +244,5 @@ There are unit tests included with the code which also require downloading test
- [x] Ghost cells support
- [x] Usage examples
- [ ] Parse the log file with timings
- [x] Raw reader
- [x] 3.14-compatible parallel output
8 changes: 8 additions & 0 deletions nt2/containers/container.py
@@ -47,3 +47,11 @@ def reader(self) -> BaseReader:
def remap(self) -> Optional[dict[str, Callable[[str], str]]]:
"""{ str: (str) -> str } : The coordinate/field remap dictionary."""
return self.__remap

def __dask_tokenize__(self) -> tuple[str, str, str]:
"""Provide a deterministic Dask token for container instances."""
return (
self.__class__.__name__,
self.__path,
self.__reader.format.value,
)
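For context, `__dask_tokenize__` lets `dask` derive a deterministic cache key from an object's content rather than its identity, which is what the hook above provides for containers. A minimal standalone sketch (a hypothetical class, not the nt2 one):

```python
from dask.base import tokenize

class Point:
    def __init__(self, x: int, y: int):
        self.x, self.y = x, y

    def __dask_tokenize__(self):
        # token depends on content, so two equal points tokenize identically
        return ("Point", self.x, self.y)

# distinct instances with the same content yield the same token
same = tokenize(Point(1, 2)) == tokenize(Point(1, 2))  # True
```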
60 changes: 13 additions & 47 deletions nt2/plotters/export.py
@@ -161,56 +161,22 @@ def makeFrames(

"""
from loky import get_reusable_executor
from tqdm import tqdm
import os

os.makedirs(fpath, exist_ok=True)

ex = get_reusable_executor(max_workers=num_cpus or (os.cpu_count() or 1))
futures = [
ex.submit(_plot_and_save, ti, t, fpath, plot, data)
for ti, t in enumerate(times)
]
return [
f.result()
for f in tqdm(
futures,
total=len(futures),
desc=f"rendering frames to {fpath}",
unit="frame",
)
]

# from tqdm import tqdm
# import multiprocessing as mp
# import os
#
# ctx = mp.get_context()
# if num_cpus is None:
# num_cpus = os.cpu_count() or 1
#
# tasks = [(ti, t, fpath, plot, data) for ti, t in enumerate(times)]
#
# pool = mp.Pool(num_cpus)
#
# with ctx.Pool(processes=num_cpus) as pool:
# results = pool.starmap_async(_plot_and_save, tasks)
# out = results.get()
#
# return list(tqdm(out))

# global plotAndSave
#
# def plotAndSave(ti: int, t: float) -> bool:
# import matplotlib.pyplot as plt
#
# try:
# if data is None:
# plot(t)
# else:
# plot(t, data)
# plt.savefig(f"{fpath}/{ti:05d}.png")
# plt.close()
# return True
# except Exception as e:
# print(f"Error: {e}")
# return False

# if fpath doesn't exist, create it
# if not os.path.exists(fpath):
# os.makedirs(fpath)
#
# tasks = [(ti, t) for ti, t in enumerate(times)]
# results = [pool.apply_async(plotAndSave, t) for t in tasks]
# return [result.get() for result in tqdm(results)]
52 changes: 52 additions & 0 deletions nt2/tests/test_export.py
@@ -0,0 +1,52 @@
from typing import Any

from nt2.plotters.export import makeFrames


class _FakeFuture:
def __init__(self, value: bool):
self._value = value

def result(self) -> bool:
return self._value


class _FakeExecutor:
def __init__(self):
self.calls: list[tuple[int, float, str, Any, Any]] = []

def submit(self, func, ti, t, fpath, plot, data):
self.calls.append((ti, t, fpath, plot, data))
return _FakeFuture(func(ti, t, fpath, plot, data))


def test_make_frames_uses_executor_with_data(tmp_path, monkeypatch):
ex = _FakeExecutor()

monkeypatch.setattr(
"loky.get_reusable_executor",
lambda max_workers=None: ex,
)

progress: list[tuple[int, int, str, str]] = []

def fake_tqdm(iterable, total, desc, unit):
progress.append((len(list(iterable)), total, desc, unit))
return iterable

monkeypatch.setattr("tqdm.tqdm", fake_tqdm)

called: list[float] = []

def plot_frame(t, d):
called.append(t)

times = [0.0, 1.0, 2.0]
result = makeFrames(plot=plot_frame, times=times, fpath=str(tmp_path), data={"ok": 1})

assert result == [True, True, True]
assert len(ex.calls) == len(times)
assert called == times
assert progress == [(len(times), len(times), f"rendering frames to {tmp_path}", "frame")]
for i in range(len(times)):
assert (tmp_path / f"{i:05d}.png").exists()
20 changes: 20 additions & 0 deletions nt2/tests/test_tokenize.py
@@ -0,0 +1,20 @@
from dask.base import tokenize

from nt2.containers.container import BaseContainer
from nt2.readers.base import BaseReader
from nt2.utils import Format


class _Reader(BaseReader):
@property
def format(self) -> Format:
return Format.HDF5


def test_base_container_has_deterministic_dask_token():
container = BaseContainer(path="/tmp/sim", reader=_Reader(), remap=None)

token1 = tokenize(container)
token2 = tokenize(container)

assert token1 == token2