LiuDaiH/rl-experiments

RL Experiments 🧠

Introductory reinforcement learning experiments: implemented from scratch, understood line by line

中文 | English


🇨🇳 中文

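About this project

This is the companion experiment code for my study of Sutton & Barto's "Reinforcement Learning: An Introduction".

Every experiment is hand-written from scratch, without relying on frameworks such as gym or stable-baselines, so that every line of code can be fully understood.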

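Experiment list

| # | Experiment | Chapter | Core concepts |
|---|------------|---------|---------------|
| 1 | Multi-armed bandit, ε-greedy | Ch.2 | Exploration vs. exploitation, incremental updates, ε-greedy |
| 2 | Q-learning grid world | Ch.6 | TD learning, Q-table, ε-greedy policy, convergence |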
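As an illustration of the concepts listed for experiment 1, here is a minimal sketch of an ε-greedy agent with incremental sample-mean updates. It is not the code in `bandit/epsilon_greedy.py`: the function name `run_bandit`, the Gaussian reward model with unit variance, and the default parameters are all assumptions made for this example, and it uses the standard-library `random` module instead of numpy to stay self-contained.

```python
import random


def run_bandit(true_means, epsilon=0.1, steps=1000, seed=0):
    """Run one epsilon-greedy agent on a k-armed Gaussian bandit (hypothetical setup)."""
    rng = random.Random(seed)
    k = len(true_means)
    q = [0.0] * k  # current value estimate for each arm
    n = [0] * k    # number of times each arm was pulled
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(k)  # explore: pick a uniformly random arm
        else:
            best = max(q)           # exploit: pick a best arm, ties broken at random
            arm = rng.choice([a for a in range(k) if q[a] == best])
        reward = rng.gauss(true_means[arm], 1.0)  # assumed reward noise: unit variance
        n[arm] += 1
        # Incremental update: new mean = old mean + (sample - old mean) / count
        q[arm] += (reward - q[arm]) / n[arm]
        total_reward += reward
    return q, n, total_reward


if __name__ == "__main__":
    estimates, pulls, total = run_bandit([0.2, 0.8, 1.5, -0.5])
    print("estimates:", [round(v, 2) for v in estimates])
    print("pull counts:", pulls)
```

The incremental form avoids storing past rewards: each estimate is nudged toward the latest sample by a step of 1/n, which reproduces the running average exactly.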

How to run

# Install dependencies (only numpy + matplotlib are needed)
pip install numpy matplotlib

# Experiment 1: bandit
python bandit/epsilon_greedy.py

# Experiment 2: grid world
python gridworld/q_learning.py

Learning resources

Author

Liu Daihong (岱宗): first-year undergraduate, doing RL research in a team with classmates.


🇬🇧 English

About

Hands-on experiments accompanying my study of Sutton & Barto's "Reinforcement Learning: An Introduction".

Every experiment is built from scratch — no Gym, no Stable-Baselines. Every line is meant to be understood.

Experiments

| # | Experiment | Chapter | Key Concepts |
|---|------------|---------|--------------|
| 1 | Multi-Armed Bandit (ε-greedy) | Ch.2 | Exploration vs Exploitation, Incremental Updates |
| 2 | Q-learning Grid World | Ch.6 | TD Learning, Q-table, Policy Convergence |
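To make the ideas behind experiment 2 concrete, here is a minimal sketch of tabular Q-learning on a small grid. It is not the code in `gridworld/q_learning.py`: the 4x4 layout, the reward of -1 per step, the hyperparameters, and the names `step`, `train`, and `greedy_path` are all assumptions chosen for illustration, and only the standard library is used.

```python
import random

SIZE = 4                                    # hypothetical 4x4 grid
START, GOAL = (0, 0), (SIZE - 1, SIZE - 1)  # top-left start, bottom-right goal
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Apply a move; walking into a wall leaves the agent in place."""
    dr, dc = MOVES[action]
    row = min(max(state[0] + dr, 0), SIZE - 1)
    col = min(max(state[1] + dc, 0), SIZE - 1)
    nxt = (row, col)
    return nxt, -1.0, nxt == GOAL           # assumed reward: -1 on every step


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a Q-table with epsilon-greedy exploration and one-step TD updates."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(MOVES) for r in range(SIZE) for c in range(SIZE)}
    for _ in range(episodes):
        state, done, steps = START, False, 0
        while not done and steps < 200:     # cap episode length as a safeguard
            if rng.random() < epsilon:
                action = rng.randrange(len(MOVES))
            else:
                best = max(q[state])
                action = rng.choice([a for a, v in enumerate(q[state]) if v == best])
            nxt, reward, done = step(state, action)
            # Q-learning target bootstraps from the best next action (off-policy)
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state, steps = nxt, steps + 1
    return q


def greedy_path(q, max_steps=50):
    """Follow the greedy policy from START, stopping at GOAL or after max_steps."""
    state, path = START, [START]
    while state != GOAL and len(path) <= max_steps:
        action = max(range(len(MOVES)), key=lambda a: q[state][a])
        state, _, _ = step(state, action)
        path.append(state)
    return path


if __name__ == "__main__":
    print(greedy_path(train()))
```

Because every step costs -1, the learned greedy policy should settle on a shortest route to the goal, which in this assumed layout takes six moves.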

Run

pip install numpy matplotlib
python bandit/epsilon_greedy.py
python gridworld/q_learning.py

Author

Liu Daihong — Freshman undergrad, exploring RL with a research group.
