A reinforcement learning project for lunar alighting simulation using C++ with LibTorch and Python gym environment communication.
This project implements reinforcement learning algorithms (A2C, PPO) for lunar alighting control using:
- C++ Backend: High-performance RL algorithms with CUDA acceleration
- Python Gym Server: Environment simulation and communication via ZMQ
- LibTorch: Deep learning framework with CUDA support
- SDL3: Graphics and simulation rendering
- Linux Ubuntu (tested on Ubuntu 20.04+)
- NVIDIA GPU with CUDA Compute Capability 7.5 or higher
- Minimum 4GB VRAM recommended
- 8GB+ RAM
- CUDA Version: 12.6
- Compute Architecture: 7.5 (configured in CMakeLists.txt)
- CUDA Toolkit Path: /usr/local/cuda/
- LibTorch: 2.19.2 (CUDA enabled)
- SDL3: Graphics library
- ZeroMQ: Message communication
- msgpack-c: Serialization
- glm: Mathematics library
- spdlog: Logging framework
# Download CUDA 12.6 from NVIDIA
wget https://developer.download.nvidia.com/compute/cuda/12.6.0/local_installers/cuda_12.6.0_560.28.01_linux.run
sudo sh cuda_12.6.0_560.28.01_linux.run
# Add CUDA to PATH
echo 'export PATH=/usr/local/cuda/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc
# Update package manager
sudo apt update
# Install required system packages
sudo apt install -y build-essential cmake git pkg-config
# Install SDL3 dependencies
sudo apt install -y libsdl3-dev libgl1-mesa-dev libglu1-mesa-dev
# Install ZeroMQ
sudo apt install -y libzmq3-dev
# Install msgpack
sudo apt install -y libmsgpack-dev
# Install glm (mathematics library)
sudo apt install -y libglm-dev
# Clone vcpkg
git clone https://github.com/Microsoft/vcpkg.git
cd vcpkg
./bootstrap-vcpkg.sh
# Add vcpkg to PATH
echo 'export PATH=/path/to/vcpkg:$PATH' >> ~/.bashrc
source ~/.bashrc
# Install spdlog
./vcpkg install spdlog
# Install other required packages
./vcpkg install fmt
The project uses LibTorch 2.19.2 with CUDA 12.6 support.
# Create External directory in project root
mkdir -p External
# Download LibTorch (CUDA 12.6 version)
cd External
wget https://download.pytorch.org/libtorch/cu126/libtorch-shared-with-deps-latest.zip
unzip libtorch-shared-with-deps-latest.zip
rm libtorch-shared-with-deps-latest.zip
Note: The LibTorch files should be placed in External/libtorch/ under the project root, for example:
/home/moinshaikh/CLionProjects/LunarAlightingRL/External/libtorch/
# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate
# Install Python dependencies
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
pip install gymnasium numpy pyzmq msgpack
# Create build directory (choose one)
mkdir cmake-build-release
# OR
mkdir cmake-build-debug
# Configure with CMake
cd cmake-build-release
cmake .. -DCMAKE_BUILD_TYPE=Release
# OR for debug
cd cmake-build-debug
cmake .. -DCMAKE_BUILD_TYPE=Debug
# Build the project
make -j$(nproc)
- Release Build (cmake-build-release/): Optimized for performance
- Debug Build (cmake-build-debug/): Includes debugging symbols
Choose the appropriate build directory based on your needs.
The CMakeLists.txt includes explicit CUDA library paths:
# CUDA Runtime (from toolkit)
"/usr/local/cuda/lib64/libcudart.so"
# CUDA Driver (system libs)
"/usr/lib/x86_64-linux-gnu/libcuda.so.1"
# NVRTC (Compiler library)
"/usr/local/cuda/lib64/libnvrtc.so"
- Target Architecture: 7.5
- CUDA Arch List: "7.5" (set in CMakeLists.txt)
- cuDNN Version: 9 (configured as CAFFE2_USE_CUDNN)
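Because these library paths are hard-coded, a quick preflight check before running CMake can save a failed configure step. The following is a minimal sketch using only the Python standard library. The function name `missing_paths` is illustrative and not part of the project, and the LibTorch entry assumes the script is run from the project root with LibTorch unpacked into External/libtorch/.

```python
from pathlib import Path

# Paths taken from this README; adjust them if your installation differs.
REQUIRED_PATHS = [
    "/usr/local/cuda/lib64/libcudart.so",
    "/usr/lib/x86_64-linux-gnu/libcuda.so.1",
    "/usr/local/cuda/lib64/libnvrtc.so",
    "External/libtorch",  # relative to the project root (assumption)
]


def missing_paths(paths):
    """Return the entries of `paths` that do not exist on this machine."""
    return [p for p in paths if not Path(p).exists()]


if __name__ == "__main__":
    missing = missing_paths(REQUIRED_PATHS)
    if missing:
        print("Missing paths:")
        for p in missing:
            print("  " + p)
    else:
        print("All expected CUDA and LibTorch paths were found.")
```

If a path is reported missing, either install the corresponding component or update the path in CMakeLists.txt to match your system.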
For a complete training run with data logging and visualization:
# 1. Start the Python Logger (captures training data)
python3 Logger.py
# 2. In a new terminal, start the Gym Server
python3 start_gym_server.py
# 3. In another terminal, build and run the C++ client
# For release build
cd cmake-build-release
./LunarAlightingRL
# OR for debug build
cd cmake-build-debug
./LunarAlightingRL
The logger captures training metrics and saves them to JSON files:
# Activate Python environment
source .venv/bin/activate
# Start the logger (runs in background)
python3 Logger.py &
The logger will create:
- training_data.json: Main training metrics and episode data
- test_data.json: Test run data
- realistic_training_data.json: Realistic simulation data
# In a new terminal
source .venv/bin/activate
python3 start_gym_server.py
The server will:
- Start on port 10201
- Wait for C++ client connections
- Provide the lunar alighting environment
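Since the C++ client expects the server to be up first, it can help to wait until the port accepts connections before launching the client. The helper below is a sketch under assumptions: it presumes the server listens over TCP on localhost, and `wait_for_port` is a hypothetical name that does not exist in the project.

```python
import socket
import time


def wait_for_port(host="127.0.0.1", port=10201, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds.

    Returns False if no connection succeeds before `timeout` seconds elapse.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Open and immediately close a connection as a liveness probe.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

A launcher script could call `wait_for_port()` after starting start_gym_server.py and only run ./LunarAlightingRL when it returns True.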
# From project root (choose build type)
# Release build (optimized)
cd cmake-build-release
./LunarAlightingRL
# Debug build (with debugging symbols)
cd cmake-build-debug
./LunarAlightingRL
The system generates comprehensive training data and visualizations:
- training_data.json: Complete training metrics and episode results
- test_data.json: Test episode data
- realistic_training_data.json: Realistic simulation scenarios
The training process automatically generates the following analysis charts:
Figure: Learning progress showing reward improvement over training episodes
Figure: Model behavior analysis including success rates and landing metrics
Figure: Training efficiency metrics including FPS and convergence analysis
Additional test outputs are available in:
- test_output/: Test run visualizations
- integration_test_output/: Integration test results
- realistic_output/: Realistic simulation analysis
The system generates detailed JSON files containing training metrics:
training_data.json contains complete training session data:
{
  "metadata": {
    "algorithm": "PPO",
    "env_name": "LunarAlighting-v1",
    "num_envs": 8,
    "batch_size": 40,
    "max_frames": 10000000,
    "reward_threshold": 160
  },
  "training_metrics": [
    {
      "update": 10,
      "total_frames": 3520,
      "fps": 3519999901696.0,
      "average_reward": 392.0,
      "episode_count": 8,
      "policy_loss": -0.0017563585424795747,
      "value_loss": 0.2610202431678772,
      "entropy": 1.3344768285751343,
      "success_rate": 1.0
    }
  ],
  "episodes": [
    {
      "episode": 1,
      "reward": 393.8386535644531,
      "length": 1000,
      "success": true,
      "crash": false,
      "final_altitude": 0.0,
      "final_velocity": 0.0,
      "fuel_used": 0.0
    }
  ]
}
test_data.json: Template file for test runs with metadata structure.
realistic_training_data.json: Data from realistic lunar alighting scenarios with enhanced physics.
- policy_loss: Loss from the policy network (action selection)
- value_loss: Loss from the value network (state evaluation)
- entropy: Exploration measure (higher = more exploration)
- success_rate: Fraction of episodes that end in a successful landing (0.0 to 1.0; the sample above reports 1.0)
- average_reward: Mean reward across episodes
- fps: Frames per second during training
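To inspect these metrics without opening the JSON by hand, a short script can pull out the latest update and the episode success fraction. This is a minimal sketch that assumes the file follows the sample structure shown above. `summarize` and `summarize_file` are illustrative names, not part of the project.

```python
import json


def summarize(data):
    """Summarize a dict shaped like the training_data.json sample."""
    metrics = data.get("training_metrics", [])
    episodes = data.get("episodes", [])
    latest = metrics[-1] if metrics else {}
    successes = sum(1 for e in episodes if e.get("success"))
    return {
        "algorithm": data.get("metadata", {}).get("algorithm"),
        "latest_update": latest.get("update"),
        "latest_average_reward": latest.get("average_reward"),
        "episodes": len(episodes),
        # None when no episodes were logged, to avoid dividing by zero.
        "episode_success_rate": successes / len(episodes) if episodes else None,
    }


def summarize_file(path="training_data.json"):
    """Load a training log from disk and summarize it."""
    with open(path, encoding="utf-8") as f:
        return summarize(json.load(f))
```

Calling `summarize_file()` from the directory that holds training_data.json returns a small dict that is easy to print or compare between runs.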
LunarAlightingRL/
├── CMakeLists.txt # Main build configuration
├── main.cpp # Entry point
├── External/
│ └── libtorch/ # LibTorch library files
├── include/ # Header files
│ ├── Generator/ # Action generators
│ ├── Distribution/ # Probability distributions
│ ├── Model/ # Neural network models
│ └── Algorithms/ # RL algorithms (A2C, PPO)
├── src/ # Source implementations
├── LunarAlighting/ # Simulation and communication
├── GymServer/ # Python gym environment
├── UnitsTest/ # Unit tests
└── start_gym_server.py # Server startup script
- Reinforcement Learning Algorithms: A2C and PPO implementations
- Neural Network Models: MLP and CNN base architectures
- CUDA Acceleration: GPU-based tensor operations
- Real-time Communication: ZMQ-based C++/Python communication
- Modular Design: Extensible generator and distribution systems
- CUDA not found:
  nvidia-smi       # Check GPU availability
  nvcc --version   # Check CUDA compiler
- Library path issues:
  export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
- LibTorch not found: Ensure LibTorch is in External/libtorch/
- Missing dependencies: Install all required system packages
- vcpkg issues: Use the provided vcpkg configuration
- Server connection: Ensure Python server is running before C++ client
- Port conflicts: Check that port 10201 is available
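One way to check for a port conflict before starting the server is to try binding the port yourself. The sketch below uses only the standard library and assumes a TCP port on localhost. `port_is_free` is a hypothetical helper, not something the project provides.

```python
import socket


def port_is_free(port=10201, host="127.0.0.1"):
    """Return True if a TCP socket can bind host:port, False if the bind fails."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            return False
    return True


if __name__ == "__main__":
    if port_is_free():
        print("Port 10201 is available.")
    else:
        print("Port 10201 is already in use.")
```

The check is conservative: a port that was released very recently may still be reported as in use for a short time.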
- The project is optimized for NVIDIA GPUs with Compute Capability 7.5+
- CUDA acceleration provides significant speedup for tensor operations
- Memory usage scales with batch size and network complexity
- Fork the repository
- Create a feature branch
- Make changes and test thoroughly
- Submit a pull request
This project is provided for research and educational purposes. Please check the license terms for all dependencies.
For issues and questions, please use the GitHub issue tracker.