Different Implementations and Benchmarks
This project focuses on the simulation and parallelization of the 2D Ising Model, a statistical mechanics framework used to describe ferromagnetic systems. The primary goal is to evaluate and compare the performance of different programming paradigms—Serial, OpenMP, and CUDA—in simulating thermal fluctuations and phase transitions.
The simulation utilizes the Metropolis-Hastings Algorithm (Markov Chain Monte Carlo) to sample configurations according to the Boltzmann distribution.
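
The acceptance rule can be illustrated with a minimal pure-Python sketch. It is not the project's code: the function name `metropolis_sweep`, the coupling `J = 1`, the absence of an external field, and the list-of-lists lattice are all assumptions made for illustration.

```python
import math
import random


def metropolis_sweep(spins, beta, rng, J=1.0):
    """Attempt L*L single-spin flips on an L x L lattice of +1/-1 spins.

    Illustrative only: assumes nearest-neighbour coupling J, no external
    field, and periodic boundaries handled with the modulo operator.
    """
    L = len(spins)
    for _ in range(L * L):
        i = rng.randrange(L)
        j = rng.randrange(L)
        # Sum of the four nearest neighbours, wrapping around the edges.
        nn = (spins[(i + 1) % L][j] + spins[(i - 1) % L][j]
              + spins[i][(j + 1) % L] + spins[i][(j - 1) % L])
        # Energy change if spin (i, j) is flipped.
        dE = 2.0 * J * spins[i][j] * nn
        # Metropolis rule: accept downhill moves always,
        # uphill moves with probability exp(-beta * dE).
        if dE <= 0.0 or rng.random() < math.exp(-beta * dE):
            spins[i][j] = -spins[i][j]
    return spins


if __name__ == "__main__":
    lattice = [[1] * 16 for _ in range(16)]
    metropolis_sweep(lattice, beta=0.4, rng=random.Random(42))
    print(sum(map(sum, lattice)) / 16**2)  # magnetisation per spin
```

Passing the random generator explicitly keeps runs reproducible, which is convenient when comparing implementations against each other.
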
The project supports four distinct evolution modes:
- Serial: Single-core CPU execution.
- OpenMP: Multi-core CPU parallelization using a checkerboard update scheme.
- CUDA Global: Baseline GPU implementation using global memory.
- CUDA Shared: Optimized GPU implementation using shared memory and the read-only cache (`__ldg()`) to minimize access latency.

The implementations rely on the following optimizations:

- Lookup Probability Table: Pre-computes the Boltzmann weights for all possible energy changes, avoiding expensive exponential calls during the simulation.
- Checkerboard Algorithm: Updates "Red" and "Black" cells in separate phases to prevent race conditions during parallel updates.
- Periodic Boundary Conditions (PBC):
  - CPU: Implemented via ternary operators for branch prediction efficiency.
  - GPU: Employs halo cells (padding) to ensure memory coalescing and eliminate branch divergence.
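
The three techniques above can be combined in a short NumPy sketch. This is an illustration under stated assumptions, not the project's kernels: the names `make_lookup`, `refresh_halo`, and `checkerboard_sweep` are hypothetical, the coupling is `J = 1` with no external field, and the lattice side `L` is assumed even so that the two colours stay consistent across the periodic wrap.

```python
import numpy as np


def make_lookup(beta, J=1.0):
    """Acceptance probabilities min(1, exp(-beta*dE)) for the five possible dE.

    dE = 2*J*s*nn with s*nn in {-4, -2, 0, 2, 4}; index with (s*nn + 4) // 2.
    """
    dE = 2.0 * J * np.array([-4, -2, 0, 2, 4])
    # Clamping dE at zero gives probability 1 for downhill moves.
    return np.exp(-beta * np.maximum(dE, 0.0))


def refresh_halo(padded):
    """Copy opposite edges into the one-cell border (periodic boundaries)."""
    padded[0, :] = padded[-2, :]
    padded[-1, :] = padded[1, :]
    padded[:, 0] = padded[:, -2]
    padded[:, -1] = padded[:, 1]


def checkerboard_sweep(padded, table, rng):
    """Update one colour, then the other, on an (L+2) x (L+2) padded lattice."""
    L = padded.shape[0] - 2  # assumed even
    ii, jj = np.indices((L, L))
    for colour in (0, 1):
        refresh_halo(padded)
        inner = padded[1:-1, 1:-1]  # view onto the real lattice
        # Neighbour sums read the halo directly, with no wrap-around logic.
        nn = (padded[:-2, 1:-1] + padded[2:, 1:-1]
              + padded[1:-1, :-2] + padded[1:-1, 2:])
        prob = table[(inner * nn + 4) // 2]  # table lookup instead of exp()
        # Sites of one colour have only opposite-colour neighbours,
        # so they can all be updated at once without conflicts.
        flip = ((ii + jj) % 2 == colour) & (rng.random((L, L)) < prob)
        inner[flip] *= -1
    refresh_halo(padded)
    return padded


if __name__ == "__main__":
    L = 32
    rng = np.random.default_rng(0)
    lattice = np.ones((L + 2, L + 2), dtype=np.int8)
    table = make_lookup(beta=0.44)
    for _ in range(100):
        checkerboard_sweep(lattice, table, rng)
    print(lattice[1:-1, 1:-1].mean())  # magnetisation per spin
```

Because every site of one colour depends only on sites of the other colour, the vectorised update here plays the role that threads play in the OpenMP and CUDA versions.
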

Benchmarks were run on the following hardware:

- CPU: Intel® Core™ i5-13420H (4 P-Cores / 4 E-Cores / 12 Threads).
- GPU: NVIDIA GeForce RTX 4050 Laptop (Ada Lovelace Architecture, 2560 CUDA Cores, 24 MB L2 Cache).

| Mode | Time (s) | Speedup (vs Serial) |
|---|---|---|
| Serial | 32.7749 | 1.00x |
| OpenMP | 3.5483 | 9.24x |
| CUDA (Global) | 0.4315 | 75.96x |
| CUDA (Shared) | 0.3793 | 86.42x |

- Latency-bound Regime ($L \le 512$): Shared Memory is superior as it minimizes access latency.
- Bandwidth-bound Regime ($L > 512$): Performance is limited by VRAM bandwidth as data size exceeds L2 capacity.

The project is compiled into a high-performance Python library using PyBind11. The Makefile is located in the src/ directory.
- CUDA Toolkit (including `nvcc` and `curand`)
- OpenMP
- Python 3 and `pybind11`

Fork the repo and run:

    cd src
    make
    cd ..

You can then import `ising2d` in Python and use the library (see `benchmarks.ipynb` for more details).
