A high-performance C++ simulation framework for modeling biolocomotion and premelting dynamics in ice using finite difference methods. This implementation numerically solves coupled partial differential equations for chemotaxis and diffusion with two optimized backends: CPU parallelization via OpenMP SIMD and GPU acceleration via Metal for Apple Silicon.
- Implementations
- Mathematical Model
- Simulation Results
- Features
- Prerequisites
- Installation
- Usage
- Configuration
- Output
- Visualizations
- Performance
- Testing
- License
- Citation
## Implementations

This repository provides two optimized implementations:

**Pragma SIMD (CPU)**
- OpenMP SIMD parallelization with pragma directives
- Works on any platform (macOS, Linux, Windows)
- Uses g++-13 or a compatible compiler
- Best for cross-platform deployment
- See `implementation_pragma/PERFORMANCE_GUIDE.md` for details

**Metal (GPU)**
- Metal GPU acceleration for Apple Silicon and Intel GPUs
- Unified GPU pipeline with data residency optimization
- Uses clang++ with the Metal framework
- 2.0x speedup over CPU OpenMP on a 50×50×1600 grid
- All computational kernels execute on the GPU
- See `implementation_metal/PERFORMANCE_GUIDE.md` for details
| Implementation | Compiler | Platform | Runtime (600 years) |
|---|---|---|---|
| Pragma SIMD (CPU) | g++-13 | Any | 121.6s |
| Metal GPU | clang++ | macOS | 61.9s |
| GPU Speedup | - | - | 2.0x |
Benchmark: run `./scripts/compare_performance.sh` to compare both implementations on your system.
## Mathematical Model

This implementation solves the coupled partial differential equations for biolocomotion and premelting dynamics in ice. The numerical method uses (a minimal update-step sketch follows the list):
- Discretization: Finite difference method on uniform 3D grid
- Time integration: Forward Euler scheme
- Spatial derivatives: Second-order central differences
- Boundary conditions: Periodic or fixed (configurable)
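As a concrete illustration of the scheme above, here is a minimal, self-contained sketch of a single forward-Euler diffusion update with second-order central differences and periodic boundaries. The grid size, field name, and coefficient values are illustrative placeholders, not the framework's actual internals:

```cpp
#include <cstdio>
#include <vector>

// Minimal sketch (not the framework's code): one forward-Euler step of
// 3D diffusion, dc/dt = D * laplacian(c), discretized with second-order
// central differences and periodic boundaries on a toy grid.
int main() {
    const int nx = 8, ny = 8, nz = 16;     // toy grid; the solver uses 50x50x1600
    const double D = 1e-9;                 // diffusion coefficient (placeholder)
    const double dx = 1e-3, dt = 100.0;    // grid spacing and time step (placeholders)
    const double r = D * dt / (dx * dx);   // dimensionless diffusion number

    auto idx = [=](int i, int j, int k) { return (i * ny + j) * nz + k; };
    std::vector<double> c(nx * ny * nz, 0.0), c_new(nx * ny * nz, 0.0);
    c[idx(nx / 2, ny / 2, nz / 2)] = 1.0;  // point-source initial condition

    for (int i = 0; i < nx; ++i)
        for (int j = 0; j < ny; ++j)
            for (int k = 0; k < nz; ++k) {
                // Periodic wrap of the six neighbor indices.
                const int ip = (i + 1) % nx, im = (i + nx - 1) % nx;
                const int jp = (j + 1) % ny, jm = (j + ny - 1) % ny;
                const int kp = (k + 1) % nz, km = (k + nz - 1) % nz;
                const double lap = c[idx(ip, j, k)] + c[idx(im, j, k)]
                                 + c[idx(i, jp, k)] + c[idx(i, jm, k)]
                                 + c[idx(i, j, kp)] + c[idx(i, j, km)]
                                 - 6.0 * c[idx(i, j, k)];
                c_new[idx(i, j, k)] = c[idx(i, j, k)] + r * lap;  // forward Euler
            }

    std::printf("center after one step: %g\n", c_new[idx(nx / 2, ny / 2, nz / 2)]);
    return 0;
}
```

Note that an explicit scheme of this kind is stable only while D·dt/dx² ≤ 1/6 in three dimensions; the production solver additionally includes the chemotaxis coupling terms.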
For the complete mathematical formulation and physical model, see the original paper:
Vachier J and Wettlaufer JS (2022) Biolocomotion and Premelting in Ice. Frontiers in Physics 10:904836. DOI: 10.3389/fphy.2022.904836
## Simulation Results

This simulation models particle migration in ice driven by chemotaxis. The behavior depends on the sign of the chemotaxis coefficient β (a generic form of the governing equations is sketched after this list):
- β < 0 (Attractive): Particles migrate toward higher concentration regions
- β > 0 (Repulsive): Particles migrate away from higher concentration regions
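For orientation, the coupled system has the general Keller–Segel-type structure below. This is an illustrative form consistent with the parameters used here (D_c, D_ρ, β), not necessarily the paper's exact formulation; see Vachier & Wettlaufer (2022) for the actual model:

```latex
% Illustrative chemotaxis--diffusion system (not the paper's exact equations):
\begin{aligned}
\partial_t c    &= D_c \,\nabla^2 c,\\
\partial_t \rho &= D_\rho \,\nabla^2 \rho + \beta\,\nabla\cdot\left(\rho\,\nabla c\right).
\end{aligned}
```

With this sign convention the drift velocity is v = -β∇c, so β < 0 carries particles up the concentration gradient (attractive) and β > 0 carries them down it (repulsive), matching the description above.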
The figures below show concentration and density profiles along the z-axis (vertical direction through the ice column):
Figure 1: Concentration (top) and density (bottom) profiles along the z-axis for attractive chemotaxis showing particle aggregation over 300 years. Particles migrate toward regions of higher concentration, forming distinct peaks in the ice column.
Figure 2: Concentration (top) and density (bottom) profiles along the z-axis for repulsive chemotaxis showing particle dispersion over 300 years. Particles migrate away from regions of higher concentration, spreading throughout the ice column.
## Features

- 3D finite difference solver with stencil operations
- Chemotaxis-driven particle dynamics
- OpenMP SIMD parallelization for CPU efficiency
- Metal GPU pipeline with unified kernel execution
- GPU-resident data eliminates transfer overhead
- Configurable simulation parameters via header file
- Dual output formats: binary and text
- Binary format provides 8x space reduction
- Comprehensive test suite for correctness verification
- Interactive 3D visualization with Plotly
## Prerequisites

- C++ compiler with C++17 support and OpenMP
  - GCC 13.0+ (recommended for the CPU version)
  - Clang 14.0+ (required for the Metal version)
- Make build system
- Python 3.11+ with `uv` for visualization
**macOS:**

```bash
# Install GCC with OpenMP support
brew install gcc@13

# Install Xcode Command Line Tools (for Metal)
xcode-select --install

# Install Python package manager
brew install uv
```

**Linux:**

```bash
# Ubuntu/Debian
sudo apt-get update
sudo apt-get install build-essential g++-13 libomp-dev

# Fedora/RHEL
sudo dnf install gcc-c++ make libomp-devel

# Install uv for Python
curl -LsSf https://astral.sh/uv/install.sh | sh
```

## Installation

Clone the repository:

```bash
git clone https://github.com/jvachier/particle-in-harsh-environment.git
cd particle-in-harsh-environment
```

**Option A: CPU-only (Pragma SIMD)**

```bash
make pragma
```

**Option B: GPU hybrid (Metal, macOS only)**

```bash
make metal
```

**Option C: Build both**

```bash
make all
```

Verify the builds:

```bash
# For CPU version
./implementation_pragma/main_pragma.out --help

# For GPU version
./implementation_metal/main_metal.out --help
```

## Usage

CPU version (works everywhere):

```bash
make run:pragma
```

GPU version (macOS, automatic GPU selection):

```bash
make run:metal
```

- Configure simulation parameters:
  ```bash
  # Edit configuration header
  nano config/simulation_parameters.h
  ```

- Build and run:
  ```bash
  # Build specific implementation
  make pragma  # or make metal

  # Run with default parameters
  cd implementation_pragma && ./main_pragma.out

  # Run with custom parameters
  ./main_pragma.out --years 10 --beta -1e-10
  ```

- Run performance comparison:
  ```bash
  ./scripts/compare_performance.sh
  ```

Expected output:

```
════════════════════════════════════════════════════════
Performance Comparison: Pragma SIMD vs Metal GPU
════════════════════════════════════════════════════════
[1/2] Running Pragma SIMD Implementation...
[2/2] Running Metal GPU Implementation...

┌─────────────────────┬──────────────┬──────────────┐
│ Metric              │ Pragma SIMD  │ Metal GPU    │
├─────────────────────┼──────────────┼──────────────┤
│ Real time           │ 121.60s      │ 61.90s       │
│ User time           │ 676.40s      │ 57.20s       │
│ System time         │ 4.80s        │ 3.10s        │
│ Peak memory         │ 141 MB       │ 183 MB       │
└─────────────────────┴──────────────┴──────────────┘

Performance Gain:
  Speedup:    2.0x
  Time saved: 59.7s (1.0 minutes)
```
## Configuration

Change simulation parameters using command-line arguments; no recompilation is required:
```bash
# Change number of years to simulate
./main_pragma.out --years 10

# Change grid resolution
./main_metal.out --nz 2000

# Change chemotaxis coefficient
./main_pragma.out --beta -1e-10

# Adjust number of threads
./main_metal.out --threads 8

# Multiple parameters
./main_pragma.out --years 5 --beta -1e-10 --threads 8 --nz 2000

# See all options
./main_pragma.out --help
```

Available options:

- `--years VALUE` - Number of years to simulate (default: 12)
- `--beta VALUE` - Chemotaxis coefficient (default: -1e-10)
- `--nz VALUE` - Number of z grid points (default: 1600)
- `--nt VALUE` - Time steps per iteration (default: 15768)
- `--threads VALUE` - Number of OpenMP threads (default: 6)
- `--output-dir PATH` - Set output directory
- `--no-gpu` - Disable Metal GPU (Metal implementation only)
Edit `config/simulation_parameters.h` to change default values:

```cpp
struct GridParameters {
    int nx = 50;     // Grid points in x
    int ny = 50;     // Grid points in y
    int nz = 1600;   // Grid points in z
};

struct TimeParameters {
    int num_years = 12;   // Years to simulate
    int nt = 15768;       // Time steps per year
    double dt = 100.0;    // Time step size (seconds)
};

struct ChemicalParameters {
    double beta = -1e-10;   // Chemotaxis coefficient
    double D_c = 1e-9;      // Concentration diffusion
    double D_rho = 1e-11;   // Density diffusion
};
```

## Output

All simulations write to the `data/` directory in the project root:
```
data/
├── pragma/          # Pragma SIMD output
│   ├── c_0.bin      # Concentration at t=0
│   ├── f_0.bin      # Density at t=0
│   └── ...
└── metal/           # Metal GPU output
    ├── c_0.bin
    ├── f_0.bin
    └── ...
```
Binary format (.bin):
- Little-endian double precision (IEEE 754)
- Layout: contiguous 3D array flattened in row-major order
- Size: nx × ny × nz × 8 bytes per field
- 8x smaller than text format
Access in Python:

```python
import numpy as np

# Read binary file (little-endian float64)
data = np.fromfile('data/metal/c_0.bin', dtype=np.float64)
nx, ny, nz = 50, 50, 1600
assert data.size == nx * ny * nz  # sanity check against the expected grid

# Reshape to a 3D array (row-major order)
c = data.reshape((nx, ny, nz))

# Access point (i, j, k)
i, j, k = 25, 25, 800  # example indices
value = c[i, j, k]
```

Access in C++:
```cpp
#include <fstream>
#include <vector>

int main() {
    const int nx = 50, ny = 50, nz = 1600;
    std::vector<double> data(nx * ny * nz);
    std::ifstream file("data/metal/c_0.bin", std::ios::binary);
    file.read(reinterpret_cast<char*>(data.data()), data.size() * sizeof(double));

    // Access point (i, j, k) in row-major order
    const int i = 25, j = 25, k = 800;  // example indices
    double value = data[i * ny * nz + j * nz + k];
    (void)value;
}
```

## Visualizations

After running the simulation, visualize the results using the interactive Plotly-based tool:
```bash
cd scripts

# Visualize Metal GPU results (default)
uv run python visualize_simulation.py

# Visualize specific implementation
uv run python visualize_simulation.py ../data/pragma/*.bin

# Generate all visualizations
uv run python visualize_simulation.py ../data/metal/*.bin --all
```

The tool produces:

- 3D surface plots of concentration and density fields
- Temporal evolution animations showing dynamics over time
- Profile plots along Z-axis for vertical distribution
- Interactive HTML outputs with zoom, pan, and rotation
- Static PNG exports for publications
For detailed usage: `uv run python visualize_simulation.py --help`
## Performance

- Time per iteration: O(nx × ny × nz) for stencil operations
- Memory: O(nx × ny × nz) for field storage
- Grid size: 50 × 50 × 1600 = 4,000,000 points (see the memory estimate below)
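For scale, a back-of-the-envelope memory estimate, assuming double precision and, as an assumption about the implementation, that old/new copies of both fields are held during the update step:

```latex
% Storage per field at double precision:
50 \times 50 \times 1600 \times 8\ \text{B} = 32\ \text{MB} \approx 30.5\ \text{MiB}
% Four such arrays (old/new copies of c and rho -- an assumption):
4 \times 32\ \text{MB} = 128\ \text{MB}
```

This is consistent with the ~141 MB peak measured for the CPU run below.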
Comprehensive comparison on an Apple M2 for a 600-year simulation (12 iterations × 15768 timesteps):
| Implementation | Real Time | User Time | System Time | Peak Memory | Speedup |
|---|---|---|---|---|---|
| Pragma SIMD (CPU 6 threads) | 121.6s | 676.4s | 4.8s | 141 MB | Baseline |
| Metal GPU | 61.9s | 57.2s | 3.1s | 183 MB | 2.0x |
Key findings:
- GPU provides 2.0x speedup over optimized CPU code
- GPU offloads computation, reducing CPU usage from 676s to 57s
- Memory overhead of 30% due to GPU buffers
- GPU-resident data eliminates transfer bottleneck
The Metal implementation achieves high performance through:

1. **Unified GPU Pipeline**: all three kernels execute on the GPU
   - `oldtonew_kernel` - array copy operation
   - `concentration_field_kernel` - diffusion computation
   - `concentration_field_density_kernel` - chemotaxis computation

2. **GPU-Resident Data**: simulation data stays on the GPU throughout execution (see the sketch after this list)
   - Transfer to GPU: once at startup
   - Transfer from GPU: 12 times for output (vs ~189,000 for the naive approach)
   - Result: 99.99% reduction in transfer overhead

3. **Optimized Thread Configuration**:
   - 1D kernel: 256 threads per group
   - 3D kernels: 8×8×4 = 256 threads per group
   - Empirically tuned for the M2 GPU architecture

4. **Asynchronous Execution**: the GPU runs without blocking the CPU
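Schematically, the data-residency pattern is: upload once, iterate all timesteps on the device, and read back only at output points. The sketch below mimics this control flow with plain C++ stubs; `DeviceBuffer`, `upload()`, `run_kernels_one_step()`, and `download()` are placeholders, not the framework's Metal API:

```cpp
#include <cstdio>
#include <vector>

// Schematic sketch of the GPU data-residency pattern using plain C++ stubs.
struct DeviceBuffer { std::vector<double> data; };

DeviceBuffer upload(const std::vector<double>& host) { return {host}; }
void run_kernels_one_step(DeviceBuffer&, DeviceBuffer&) { /* copy + diffusion + chemotaxis */ }
std::vector<double> download(const DeviceBuffer& buf) { return buf.data; }

int main() {
    const int n = 50 * 50 * 1600;
    const int years = 12, steps_per_year = 15768;
    std::vector<double> c_host(n, 0.0), rho_host(n, 0.0);

    // 1) Transfer to GPU once at startup.
    DeviceBuffer c = upload(c_host), rho = upload(rho_host);

    for (int year = 0; year < years; ++year) {
        // 2) Every timestep runs on the device; no per-step host transfers.
        for (int step = 0; step < steps_per_year; ++step)
            run_kernels_one_step(c, rho);

        // 3) Read back once per output year: 12 transfers total, versus
        //    12 * 15768 = 189,216 if each step were copied back.
        const std::vector<double> snapshot = download(c);
        std::printf("year %d: wrote %zu values\n", year, snapshot.size());
    }
    return 0;
}
```

In the actual implementation, the three Metal kernels listed above run in this inner loop; see `implementation_metal/PERFORMANCE_GUIDE.md` for details.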
```bash
# Run comprehensive benchmark
./scripts/compare_performance.sh

# Results saved to:
# - pragma_timing.log (Pragma SIMD detailed output)
# - metal_timing.log (Metal GPU detailed output)
# - performance_summary.txt (Comparison summary)
```

## Testing

```bash
# Run all tests
make test

# Performance benchmarks
make benchmark
```

## License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.
## Citation

This project is based on research published in:
Vachier J and Wettlaufer JS (2022) Biolocomotion and Premelting in Ice. Front. Phys. 10:904836. DOI: 10.3389/fphy.2022.904836
If you use this code in your research, please cite both the software and the original paper:
```bibtex
@software{vachier2025particle,
  title={Particle in Harsh Environment: High-Performance Simulation Framework},
  author={Vachier, Jeremy},
  year={2025},
  url={https://github.com/jvachier/Particle-in-harsh-environment-share},
  note={C++ implementation with OpenMP SIMD and Metal GPU acceleration}
}

@article{vachier2022biolocomotion,
  title={Biolocomotion and Premelting in Ice},
  author={Vachier, Jeremy and Wettlaufer, John S},
  journal={Frontiers in Physics},
  volume={10},
  pages={904836},
  year={2022},
  publisher={Frontiers Media SA},
  doi={10.3389/fphy.2022.904836}
}
```

Contact:
- Author: Jeremy Vachier
- GitHub: @jvachier

