This project implements a Deep Q-Network (DQN) agent for an environment with continuous, pixel-level observations and a discrete action space. The goal was to learn a near-optimal policy directly from pixel-level observations while ensuring stable, convergent training within a limited budget.
- Train a reinforcement learning agent to maximize long-term rewards.
- Use DQN components to stabilize training and balance exploration and exploitation.
- Evaluate performance across internal validation and official challenge submissions.
- Observations: Pixel-level environment frames.
- DQN components (a sketch of how these pieces fit together follows this list):
- Replay buffer (capacity of 100,000 transitions).
- Target network with soft updates.
- ε-greedy exploration: ε annealed from 1.0 → 0.2 over 100,000 steps.
- Delayed training start after 2,000 steps to ensure buffer diversity.
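
A minimal sketch of these components is shown below. The buffer capacity, ε schedule endpoints, and warm-up threshold follow the values above; the PyTorch framework, the linear annealing shape, and the Polyak coefficient τ = 0.005 are assumptions, since the report does not specify them.

```python
import random
from collections import deque

import numpy as np
import torch


class ReplayBuffer:
    """Fixed-size FIFO buffer of (state, action, reward, next_state, done) transitions."""

    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        batch = random.sample(self.buffer, batch_size)
        states, actions, rewards, next_states, dones = map(np.array, zip(*batch))
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.buffer)


def epsilon_by_step(step, eps_start=1.0, eps_end=0.2, anneal_steps=100_000):
    """ε schedule: 1.0 → 0.2 over 100,000 steps (linear shape is an assumption)."""
    frac = min(step / anneal_steps, 1.0)
    return eps_start + frac * (eps_end - eps_start)


def soft_update(online_net, target_net, tau=0.005):
    """Polyak-average the online weights into the target network (τ value is an assumption)."""
    with torch.no_grad():
        for p, p_target in zip(online_net.parameters(), target_net.parameters()):
            p_target.mul_(1.0 - tau).add_(tau * p)
```

The delayed training start simply means no gradient updates are taken until 2,000 environment steps have filled the buffer.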
- Hyperparameters (used in the single-update sketch after this list):
- Minibatch size: 256
- Learning rate: 0.0001
- Discount factor (γ): 0.99
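
With these values, one gradient step looks roughly like the sketch below. The Huber loss, the Adam optimizer, and the tensor handling are assumptions; the report only fixes the batch size, learning rate, and discount factor.

```python
import torch
import torch.nn.functional as F

GAMMA = 0.99
BATCH_SIZE = 256
LEARNING_RATE = 1e-4


def dqn_update(online_net, target_net, optimizer, buffer, device="cpu"):
    """One Q-learning update on a sampled minibatch (Huber loss assumed)."""
    states, actions, rewards, next_states, dones = buffer.sample(BATCH_SIZE)

    states = torch.as_tensor(states, dtype=torch.float32, device=device)
    actions = torch.as_tensor(actions, dtype=torch.int64, device=device)
    rewards = torch.as_tensor(rewards, dtype=torch.float32, device=device)
    next_states = torch.as_tensor(next_states, dtype=torch.float32, device=device)
    dones = torch.as_tensor(dones, dtype=torch.float32, device=device)

    # Q(s, a) for the actions actually taken.
    q_values = online_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)

    # Bootstrapped target: r + γ · max_a' Q_target(s', a'), zeroed at terminal states.
    with torch.no_grad():
        next_q = target_net(next_states).max(dim=1).values
        targets = rewards + GAMMA * (1.0 - dones) * next_q

    loss = F.smooth_l1_loss(q_values, targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The optimizer would be constructed once, e.g. `torch.optim.Adam(online_net.parameters(), lr=LEARNING_RATE)` (optimizer choice is likewise an assumption).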
- Training: 1,000 episodes total; an outline of the episode loop follows below.
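
The loop below outlines how the pieces above connect over 1,000 episodes, reusing the helpers sketched earlier. The Gymnasium-style `reset`/`step` API and the one-update-per-environment-step schedule are assumptions not stated in the report.

```python
import random

import torch


def train(env, online_net, target_net, optimizer, buffer,
          num_episodes=1_000, warmup_steps=2_000, device="cpu"):
    """1,000-episode training loop with ε-greedy acting and soft target updates."""
    episode_returns, losses = [], []
    global_step = 0

    for _ in range(num_episodes):
        state, _ = env.reset()
        done, episode_return = False, 0.0

        while not done:
            # ε-greedy action selection with the annealed schedule.
            if random.random() < epsilon_by_step(global_step):
                action = env.action_space.sample()
            else:
                with torch.no_grad():
                    obs = torch.as_tensor(state, dtype=torch.float32, device=device)
                    action = int(online_net(obs.unsqueeze(0)).argmax(dim=1))

            next_state, reward, terminated, truncated, _ = env.step(action)
            done = terminated or truncated
            buffer.push(state, action, reward, next_state, float(done))
            state = next_state
            episode_return += reward
            global_step += 1

            # Delayed start: updates begin only after 2,000 environment steps.
            if global_step >= warmup_steps:
                losses.append(dqn_update(online_net, target_net, optimizer, buffer, device))
                soft_update(online_net, target_net)

        episode_returns.append(episode_return)
    return episode_returns, losses
```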
- Monitoring: Reward and loss plots were used to track convergence (plotting sketch below).
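
The monitoring amounts to something like the following; matplotlib and the two-panel layout are assumptions, as the report only states that reward and loss curves were tracked.

```python
import matplotlib.pyplot as plt


def plot_training_curves(episode_returns, losses):
    """Per-episode return and per-update TD loss, used to eyeball convergence."""
    fig, (ax_r, ax_l) = plt.subplots(1, 2, figsize=(10, 4))
    ax_r.plot(episode_returns)
    ax_r.set_xlabel("Episode")
    ax_r.set_ylabel("Return")
    ax_l.plot(losses)
    ax_l.set_xlabel("Update step")
    ax_l.set_ylabel("TD loss")
    fig.tight_layout()
    plt.show()
```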
- Agent performance improved rapidly in early episodes and stabilized at high reward levels (>0.9).
- The loss curve followed expected DQN dynamics: initial instability, then structured updates with persistent variance, since the bootstrapped TD targets keep shifting as the target network is updated.
- Final evaluation (an evaluation-loop sketch follows this list):
- Internal test set: average return 0.9632 over 50 episodes.
- Challenge server: 0.964 score.
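
The internal figure was obtained by running the greedy policy for 50 episodes; an evaluation loop of roughly this shape would produce it (the environment API follows the same Gymnasium-style assumptions as the training sketch).

```python
import torch


def evaluate(env, online_net, num_episodes=50, device="cpu"):
    """Average undiscounted return of the greedy (ε = 0) policy over 50 episodes."""
    returns = []
    for _ in range(num_episodes):
        state, _ = env.reset()
        done, total = False, 0.0
        while not done:
            with torch.no_grad():
                obs = torch.as_tensor(state, dtype=torch.float32, device=device)
                action = int(online_net(obs.unsqueeze(0)).argmax(dim=1))
            state, reward, terminated, truncated, _ = env.step(action)
            done = terminated or truncated
            total += reward
        returns.append(total)
    return sum(returns) / len(returns)
```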
Takeaway: The chosen hyperparameters and DQN design yielded a robust policy that converged quickly and matched the challenge benchmark.