GPU/CUDA acceleration via CUDA.jl #3

@jc-macdonald

Add GPU acceleration to OpEngine.jl using CUDA.jl. This is a priority feature and the main motivation for the Julia port over Python.

Motivation

TRIDENT-scale simulations (300 trait bins × 100 depth levels × thousands of timesteps) are compute-bound in reaction-term evaluation and in the diffusion matrix solves. GPU parallelism maps naturally onto:

  • Trait-axis parallelism: each trait bin's reaction term is independent
  • Spatial-axis parallelism: each depth level's reaction term is independent
  • Batch parallelism: ensemble runs over parameter sweeps (model-criticism Pareto studies)
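
As a rough illustration of the trait-axis and spatial-axis parallelism listed above, here is a minimal sketch of a reaction term written as a single fused broadcast. The function `reaction!`, the logistic form, and the names `r` and `K` are all hypothetical and are not taken from OpEngine.jl; the point is only that code written against `AbstractArray` with no scalar indexing should, in principle, accept a `CuArray` unchanged. It is run here on plain CPU arrays.

```julia
# Hypothetical reaction term (not OpEngine.jl's): logistic growth with a
# trait-dependent rate r and a depth-dependent carrying capacity K.
# u and du are (n_trait, n_depth); r has length n_trait; K has length n_depth.
function reaction!(du::AbstractMatrix, u::AbstractMatrix,
                   r::AbstractVector, K::AbstractVector)
    Krow = reshape(K, 1, :)              # 1 × n_depth, broadcasts along depth
    # One fused broadcast over every (trait, depth) pair; no scalar indexing.
    @. du = r * u * (1 - u / Krow)
    return du
end

n_trait, n_depth = 300, 100              # TRIDENT-scale grid from the issue text
u  = fill(0.5, n_trait, n_depth)
du = similar(u)
r  = collect(range(0.1, 1.0; length = n_trait))
K  = fill(2.0, n_depth)

reaction!(du, u, r, K)
```

An ensemble axis could be added the same way, as a third array dimension, so that a whole parameter sweep is evaluated in one broadcast.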

Tasks

  • Abstract array backend: type solver code against AbstractArray throughout so that CuArray drops in
  • GPU-compatible reaction term evaluation (avoid scalar indexing)
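  • GPU-compatible diffusion operator (tridiagonal solve on GPU, using either a vendor library routine or a batched Thomas algorithm; note that NVIDIA's tridiagonal gtsv2 solvers are part of cuSPARSE rather than cuSOLVER)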
  • GPU-compatible IMEX time-stepping
  • Benchmark: CPU vs GPU for TRIDENT-scale problem sizes
  • Batch solver: run N parameter sets simultaneously on GPU (one kernel launch per ensemble)
  • Optional dependency: CUDA.jl as an extension package (OpEngineCUDAExt)
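
To make the "batched Thomas algorithm" task above concrete, here is a minimal sketch under stated assumptions: every batch member (for example, every trait bin) shares the same tridiagonal coefficients, the coefficients live in ordinary host vectors, and the right-hand sides are stored as an `(n_batch, n)` matrix. The sweep over depth levels stays sequential, while each step is a broadcast over the whole batch. The name `batched_thomas!` and the value of `λ` are hypothetical, and the sketch has only been exercised on CPU arrays.

```julia
using LinearAlgebra

# Solve A * x[i, :] = d[i, :] for every batch row i, where A is tridiagonal with
# sub-diagonal a, diagonal b, super-diagonal c (each length n; a[1], c[n] unused).
# Assumption: a, b, c are host-side vectors shared by all batch members.
function batched_thomas!(x::AbstractMatrix, a::AbstractVector, b::AbstractVector,
                         c::AbstractVector, d::AbstractMatrix)
    n  = length(b)
    cp = similar(c)                      # modified super-diagonal (scalars)
    dp = similar(d)                      # modified right-hand sides
    cp[1] = c[1] / b[1]
    @views dp[:, 1] .= d[:, 1] ./ b[1]
    for k in 2:n                         # forward elimination, sequential in depth
        denom = b[k] - a[k] * cp[k-1]
        cp[k] = c[k] / denom
        @views dp[:, k] .= (d[:, k] .- a[k] .* dp[:, k-1]) ./ denom
    end
    @views x[:, n] .= dp[:, n]
    for k in n-1:-1:1                    # back substitution
        @views x[:, k] .= dp[:, k] .- cp[k] .* x[:, k+1]
    end
    return x
end

# Implicit diffusion step (I - λ L) x = d with a hypothetical λ = dt * D / dz^2.
n, n_batch = 100, 300
λ = 0.4
a = fill(-λ, n); b = fill(1 + 2λ, n); c = fill(-λ, n)
d = rand(n_batch, n)
x = similar(d)
batched_thomas!(x, a, b, c, d)

# CPU reference for one batch member, using the standard-library Tridiagonal type.
A = Tridiagonal(a[2:end], b, c[1:end-1])
x_ref = A \ d[1, :]
```

Storing the batch along the first dimension keeps each `x[:, k]` slice contiguous in Julia's column-major layout, which is the access pattern a GPU broadcast would likely prefer. If coefficients differ per batch member, `a`, `b`, and `c` would need to become matrices and the scalar recurrence for `cp` would become a broadcast as well.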

Dependencies

Cross-references

  • trident: primary consumer; GPU execution enables full-resolution sweeps for the convergence studies
  • model-criticism / ModelCriticism.jl: Pareto front computation over solver settings benefits from batch GPU execution

Boundary

This issue covers solver acceleration only: no changes to OpSystem.jl or the model specification layer.
