RL Experiments 🧠

Introductory reinforcement learning experiments: implemented from scratch, understood line by line

🇨🇳 中文 (Chinese)

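About this project

This is the companion experiment code I wrote while studying Sutton & Barto's "Reinforcement Learning: An Introduction".

Every experiment is written by hand from scratch, without relying on frameworks such as gym or stable-baselines, so that every line of code can be fully understood.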

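Experiments

#	Experiment	Chapter	Key Concepts
1	Multi-armed bandit (ε-greedy)	Ch.2	Exploration vs. exploitation, incremental updates, ε-greedy
2	Q-learning grid world	Ch.6	TD learning, Q-table, ε-greedy policy, convergence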
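To make the Ch.2 concepts in the table concrete, here is a minimal sketch of ε-greedy action selection with the incremental sample-average update. It is not the code in `bandit/epsilon_greedy.py`; the function name `run_bandit`, the Gaussian reward model, and the default hyperparameters are assumptions chosen only for illustration.

```python
import numpy as np


def run_bandit(true_means, epsilon=0.1, steps=1000, seed=0):
    """Run one epsilon-greedy agent on a k-armed bandit (hypothetical helper).

    Assumption: each arm pays a reward drawn from a unit-variance Gaussian
    centred on its true mean. Returns the estimates Q and pull counts N.
    """
    rng = np.random.default_rng(seed)
    k = len(true_means)
    Q = np.zeros(k)              # estimated value of each arm
    N = np.zeros(k, dtype=int)   # number of times each arm was pulled

    for _ in range(steps):
        if rng.random() < epsilon:
            a = int(rng.integers(k))   # explore: pick an arm uniformly at random
        else:
            a = int(np.argmax(Q))      # exploit: pick the best current estimate
        reward = rng.normal(true_means[a], 1.0)
        N[a] += 1
        # Incremental update: new estimate = old + (reward - old) / n
        Q[a] += (reward - Q[a]) / N[a]
    return Q, N


if __name__ == "__main__":
    Q, N = run_bandit([0.2, 0.8, 0.5])
    print("estimates:", np.round(Q, 2))
    print("pull counts:", N)
```

The incremental form keeps only a running estimate and a count per arm, so no reward history has to be stored. Setting `epsilon=0` makes the agent purely greedy, which is a quick way to see why some exploration is needed.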

How to run

# Install dependencies (only numpy + matplotlib are needed)
pip install numpy matplotlib

# Experiment 1: multi-armed bandit
python bandit/epsilon_greedy.py

# Experiment 2: grid world
python gridworld/q_learning.py

Resources

Textbook: Sutton & Barto, Reinforcement Learning: An Introduction (2nd Ed.)
Notes: rl-study-notes

Author

Liu Daihong (岱宗): first-year undergraduate, doing RL research in a team with classmates.

🇬🇧 English

About

Hands-on experiments accompanying my study of Sutton & Barto's "Reinforcement Learning: An Introduction".

Every experiment is built from scratch — no Gym, no Stable-Baselines. Every line is meant to be understood.

Experiments

#	Experiment	Chapter	Key Concepts
1	Multi-Armed Bandit (ε-greedy)	Ch.2	Exploration vs Exploitation, Incremental Updates
2	Q-learning Grid World	Ch.6	TD Learning, Q-table, Policy Convergence
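As a rough illustration of the Ch.6 row, the sketch below runs tabular Q-learning with an ε-greedy behaviour policy on a tiny grid. It is not the code in `gridworld/q_learning.py`: the 4x4 layout, the start and goal cells, the reward of -1 per step, and every name and hyperparameter here are assumptions made for the example.

```python
import numpy as np

# Assumed toy environment: 4x4 grid, start at top-left, goal at bottom-right.
SIZE = 4
GOAL = (SIZE - 1, SIZE - 1)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Move one cell (walls keep the agent in place); every step costs -1."""
    r = min(max(state[0] + ACTIONS[action][0], 0), SIZE - 1)
    c = min(max(state[1] + ACTIONS[action][1], 0), SIZE - 1)
    next_state = (r, c)
    return next_state, -1.0, next_state == GOAL


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning; returns a Q-table of shape (SIZE, SIZE, 4)."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((SIZE, SIZE, len(ACTIONS)))
    for _ in range(episodes):
        state, done = (0, 0), False
        while not done:
            if rng.random() < epsilon:
                action = int(rng.integers(len(ACTIONS)))   # explore
            else:
                action = int(np.argmax(Q[state]))          # exploit
            next_state, reward, done = step(state, action)
            # TD target bootstraps from the greedy value of the next state;
            # a terminal next state contributes no future value.
            future = 0.0 if done else gamma * np.max(Q[next_state])
            idx = state + (action,)
            Q[idx] += alpha * (reward + future - Q[idx])
            state = next_state
    return Q


def greedy_path(Q, max_steps=SIZE * SIZE):
    """Follow the greedy policy from the start; the step cap guards against loops."""
    state, path = (0, 0), [(0, 0)]
    for _ in range(max_steps):
        state, _reward, done = step(state, int(np.argmax(Q[state])))
        path.append(state)
        if done:
            break
    return path


if __name__ == "__main__":
    Q = q_learning()
    print("greedy path:", greedy_path(Q))
```

The update uses the maximum over next-state action values rather than the value of the action actually taken next, which is what makes this Q-learning (off-policy) instead of SARSA. Because the table starts at zero and every step costs -1, untried actions look attractive at first, so the agent explores even when `epsilon` is small.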

Run

pip install numpy matplotlib
python bandit/epsilon_greedy.py
python gridworld/q_learning.py

Author

Liu Daihong — Freshman undergrad, exploring RL with a research group.

