Converting Arbitrary Snapshots

The convert_to_gridder_format.py tool converts HDF5 simulation snapshots from arbitrary formats into the format required by the gridder. This allows you to process simulations from any code (not just SWIFT) with the FLARES-2 Gridder.

Overview

The conversion process:

Reads particle coordinates and masses from arbitrary HDF5 dataset keys
Creates a spatial cell structure for efficient particle lookup
Sorts particles by cell index
Writes output in the standardized gridder format

The gridder requires a hierarchical cell structure for spatial indexing. The conversion script creates a regular grid of cells (default: 16×16×16 = 4,096 cells) and assigns each particle to a cell based on its position.

Requirements

Python 3.7+
h5py
numpy
mpi4py (optional, for MPI mode)

Basic Usage

python tools/convert_to_gridder_format.py input.hdf5 output.hdf5 \
    --coordinates-key PartType1/Coordinates \
    --masses-key PartType1/Masses \
    --copy-header

Required Arguments

input_file: Path to input HDF5 snapshot
output_file: Path to output HDF5 file
--coordinates-key: HDF5 dataset path for particle coordinates
--masses-key: HDF5 dataset path for particle masses

Optional Arguments

--copy-header: Copy Header group from input to output (recommended)
--header-key: HDF5 key for header group (default: Header)
--particle-type: Output particle type group name (default: PartType1)
--cdim: Number of cells per dimension for spatial indexing (default: 16)
--boxsize X Y Z: Manually specify box size (if not in Header)

Examples

SWIFT-like Format

If your snapshot already uses SWIFT-style keys:

python tools/convert_to_gridder_format.py \
    swift_snapshot.hdf5 \
    gridder_snapshot.hdf5 \
    --coordinates-key PartType1/Coordinates \
    --masses-key PartType1/Masses \
    --copy-header

Custom Format with Non-Standard Keys

For simulations with different naming conventions:

python tools/convert_to_gridder_format.py \
    gadget_snapshot.hdf5 \
    gridder_snapshot.hdf5 \
    --coordinates-key DarkMatter/Positions \
    --masses-key DarkMatter/Mass \
    --copy-header \
    --cdim 32

Without Header (Manual BoxSize)

If your input file doesn't have a Header group:

python tools/convert_to_gridder_format.py \
    custom_snapshot.hdf5 \
    gridder_snapshot.hdf5 \
    --coordinates-key Coordinates \
    --masses-key Masses \
    --boxsize 100.0 100.0 100.0 \
    --cdim 16

Non-Cubic Simulation Box

For simulations with different box sizes in each dimension:

python tools/convert_to_gridder_format.py \
    noncubic_snapshot.hdf5 \
    gridder_snapshot.hdf5 \
    --coordinates-key Coords \
    --masses-key Mass \
    --boxsize 100.0 200.0 150.0 \
    --copy-header

Fine Cell Grid for Large Simulations

For very large simulations, increase cdim for better spatial indexing:

python tools/convert_to_gridder_format.py \
    large_simulation.hdf5 \
    gridder_snapshot.hdf5 \
    --coordinates-key PartType1/Coordinates \
    --masses-key PartType1/Masses \
    --cdim 64 \
    --copy-header

MPI Mode

For very large snapshots, use MPI to parallelize the conversion:

mpirun -np 4 python tools/convert_to_gridder_format.py \
    huge_snapshot.hdf5 \
    gridder_snapshot.hdf5 \
    --coordinates-key PartType1/Coordinates \
    --masses-key PartType1/Masses \
    --copy-header \
    --cdim 32

MPI mode creates:

Per-rank files: gridder_snapshot_rank_0.hdf5, gridder_snapshot_rank_1.hdf5, ...
Virtual file: gridder_snapshot.hdf5 (combines all ranks)

Use the virtual file (gridder_snapshot.hdf5) as input to the gridder.

Output HDF5 Structure

The conversion script produces files with this structure:

/Header                              # Simulation metadata (if --copy-header used)
  BoxSize: [X, Y, Z]                 # Box dimensions
  NumPart_Total: [0, N, 0, 0, 0, 0] # Total particle counts
  Redshift: float                    # Redshift value

/PartType1                           # Dark matter particles
  /Coordinates                       # Particle positions (sorted by cell)
    shape: (N, 3)
    dtype: float64
  /Masses                            # Particle masses (sorted by cell)
    shape: (N,)
    dtype: float64

/Cells                               # Spatial indexing structure
  /Meta-data
    dimension: [cdim, cdim, cdim]    # Number of cells per dimension
    size: [dx, dy, dz]               # Physical size of each cell
  /Counts
    /PartType1                       # Number of particles in each cell
      shape: (cdim³,)
      dtype: int32
  /OffsetsInFile
    /PartType1                       # Starting index for particles in each cell
      shape: (cdim³,)
      dtype: int64

Choosing `cdim`

The cdim parameter controls the cell grid resolution. Choose based on your simulation size:

Simulation Size	Recommended cdim	Total Cells	Use Case
< 1M particles	8-16	512-4,096	Small test simulations
1-10M particles	16-32	4,096-32,768	Medium simulations
10-100M particles	32-64	32,768-262,144	Large simulations
> 100M particles	64-128	262,144-2M	Very large simulations

Guidelines:

Higher cdim → More cells → Better spatial locality → Faster gridder
But: Very high cdim with sparse distributions may waste memory
For uniform distributions: cdim ≈ (nparticles^(1/3)) / 10 is a good starting point

Input File Requirements

Your input HDF5 file must contain:

Required:

Particle coordinates as a (N, 3) array
Particle masses as a (N,) array

Optional (but recommended):

Header/BoxSize: Box dimensions [X, Y, Z]
- If missing, use --boxsize argument
Header/NumPart_Total: Total particle counts
Header/Redshift: Redshift value

Not required:

Cell structure (created by conversion script)
Velocities
Particle IDs (will be created if missing)

Common Issues

Missing BoxSize

Error:

ValueError: BoxSize not found in input file and not provided via --boxsize

Solution: Provide BoxSize manually:

--boxsize 100.0 100.0 100.0

Particles Outside Box

Particles with coordinates outside [0, BoxSize] will be clamped to the nearest cell boundary.

Memory Usage

For very large simulations:

Serial mode: Entire particle array loaded into memory
MPI mode: Particles split across ranks, reducing per-rank memory

Verification

After conversion, verify the output structure:

import h5py

with h5py.File('gridder_snapshot.hdf5', 'r') as f:
    # Check required groups
    assert '/PartType1/Coordinates' in f
    assert '/PartType1/Masses' in f
    assert '/Cells/Counts/PartType1' in f
    assert '/Cells/OffsetsInFile/PartType1' in f

    # Check cell structure
    coords = f['/PartType1/Coordinates'][:]
    cell_counts = f['/Cells/Counts/PartType1'][:]

    print(f"Total particles: {len(coords)}")
    print(f"Particles in cells: {cell_counts.sum()}")
    print(f"Non-empty cells: {(cell_counts > 0).sum()}")

Full Workflow Example

Complete example converting a Gadget snapshot:

# 1. Convert Gadget snapshot to gridder format
python tools/convert_to_gridder_format.py \
    gadget_snapshot_099.hdf5 \
    gridder_input.hdf5 \
    --coordinates-key PartType1/Coordinates \
    --masses-key PartType1/Masses \
    --boxsize 100.0 100.0 100.0 \
    --cdim 32 \
    --copy-header

# 2. Create parameter file (params.yml)
cat > params.yml << EOF
Kernels:
  nkernels: 3
  kernel_radius_1: 1.0
  kernel_radius_2: 2.0
  kernel_radius_3: 5.0

Grid:
  type: uniform
  cdim: 50

Cosmology:
  h: 0.7
  Omega_cdm: 0.25
  Omega_b: 0.05

Tree:
  max_leaf_count: 200

Input:
  filepath: gridder_input.hdf5

Output:
  filepath: output/
  basename: gridded_output.hdf5
  write_masses: 1
EOF

# 3. Run gridder
./build/parent_gridder params.yml 8

# 4. Verify output
ls -lh output/gridded_output.hdf5

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Converting Arbitrary Snapshots

Overview

Requirements

Basic Usage

Required Arguments

Optional Arguments

Examples

SWIFT-like Format

Custom Format with Non-Standard Keys

Without Header (Manual BoxSize)

Non-Cubic Simulation Box

Fine Cell Grid for Large Simulations

MPI Mode

Output HDF5 Structure

Choosing `cdim`

Input File Requirements

Common Issues

Missing BoxSize

Particles Outside Box

Memory Usage

Verification

Full Workflow Example

See Also

FilesExpand file tree

conversion.md

Latest commit

History

conversion.md

File metadata and controls

Converting Arbitrary Snapshots

Overview

Requirements

Basic Usage

Required Arguments

Optional Arguments

Examples

SWIFT-like Format

Custom Format with Non-Standard Keys

Without Header (Manual BoxSize)

Non-Cubic Simulation Box

Fine Cell Grid for Large Simulations

MPI Mode

Output HDF5 Structure

Choosing cdim

Input File Requirements

Common Issues

Missing BoxSize

Particles Outside Box

Memory Usage

Verification

Full Workflow Example

See Also

Choosing `cdim`