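Artificial intelligence is one of the most exciting technologies of the 21st century. Artificial intelligence means human-like intelligence, and human intelligence spans perception, decision-making, and cognition (from intuition to reasoning, planning, consciousness, and so on). Perception answers the "what": deep learning has matched or surpassed human performance on a number of perception benchmarks. Decision-making answers the "how": reinforcement learning has achieved some success in areas such as games and robotics. Cognition answers the "why": knowledge graphs, causal inference, continual learning, and other third-generation AI approaches are still under active research.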

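Reinforcement learning solves sequential decision problems by learning from feedback, which is why I believe it is a key to artificial general intelligence. AI 1.0 was the symbolist school and AI 2.0 the connectionist school; whether AI 3.0 combines the two or takes an entirely new path, it cannot do without the behaviorist school, because that is how natural intelligence learns. I am especially fond of reinforcement learning and deeply drawn to its framework: an agent grows by interacting with its environment. Isn't that exactly how life evolves?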

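As an independent AI researcher, I learned along the way from Zhihu, Bilibili, GitHub, WeChat official accounts, and all kinds of blogs, and I am very grateful for what people share in the internet age. Here I have organized and shared my own experience with reinforcement learning, both to make my own study easier and in the hope that it helps anyone who comes across this Zhihu post. Reinforcement learning still faces many problems, of course; I hope we can solve them together and make reinforcement learning better!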

https://zhuanlan.zhihu.com/p/104224859

1. Videos (from getting started to giving up)

1.5 Stanford_Emma Brunskill_CS234: Reinforcement Learning | Winter 2019

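2. Books

2.1 The reinforcement learning bible by Rich Sutton: Chinese edition, English e-book, code ★★★★★

Essential foundational reading; helps you grasp the essence of reinforcement learning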
https://item.jd.com/12696004.html
http://incompleteideas.net/book/the-book-2nd.html
https://github.com/AndyYue1893/reinforcement-learning-an-introduction

2.2 Python深度学习：基于PyTorch [Deep Learning with Python and PyTorch] ★★★★★

Concise, clear approach; classic, distilled content; a foundation for deep reinforcement learning research
https://item.jd.com/12590209.html

2.3 Python强化学习实战 (Hands-On Reinforcement Learning with Python)_Sudharsan Ravichandiran, with code ★★★★

Quick to get started with; clear code
https://item.jd.com/12506442.html
https://github.com/AndyYue1893/Hands-On-Reinforcement-Learning-With-Python

2.4 强化学习精要 (Essentials of Reinforcement Learning)_冯超 (Feng Chao) ★★★★

From the basics to the frontier, with code
https://item.jd.com/12344157.html

2.5 Reinforcement Learning With Open AI TensorFlow and Keras Using Python_OpenAI

Practice-oriented (extraction code: av5p)
https://pan.baidu.com/share/init?surl=nQpNbhkI-3WucSD0Mk7Qcg

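3. Tutorials

4. Code

Besides https://github.com/AndyYue1893/spinningup and https://github.com/DLR-RM/stable-baselines3, I recommend the following individual implementations as references. Code is sometimes more important than the paper!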

4.1 sweetice

https://github.com/AndyYue1893/Deep-reinforcement-learning-with-pytorch

4.2 张楚珩 (Zhang Chuheng)

https://github.com/zhangchuheng123/Reinforcement-Implementation

5. Algorithms

What exactly are the differences between the two major RL schools behind DeepMind and OpenAI?
https://www.zhihu.com/question/316626294/answer/627373838

Three classic algorithms, traced back to their sources

5.1 DQN (continuous states, discrete actions)

Mnih, Volodymyr, et al. "Human-level control through deep reinforcement learning." Nature 518.7540 (2015): 529. (Nature version)
https://storage.googleapis.com/deepmind-data/assets/papers/DeepMindNature14236Paper.pdf
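
Because the action set is discrete, the maximum over next-state action values can be taken by simple enumeration. The snippet below is a minimal sketch of the one-step target used in DQN-style training, y = r + gamma * max over a' of Q_target(s', a'), with no bootstrapping on terminal transitions. The function name `dqn_targets` and the plain-list inputs are illustrative assumptions, not code from the paper or from any library.

```python
def dqn_targets(rewards, next_q_values, dones, gamma=0.99):
    """Compute one-step TD targets for a batch of transitions.

    rewards:        list of floats, one reward per transition
    next_q_values:  list of lists; next_q_values[i][a] is the target
                    network's estimate of Q(s'_i, a) for each discrete action a
    dones:          list of bools; True if the transition ended the episode
    gamma:          discount factor (0.99 here is an assumed default)
    """
    targets = []
    for r, q_next, done in zip(rewards, next_q_values, dones):
        # Terminal transitions do not bootstrap from the next state.
        bootstrap = 0.0 if done else gamma * max(q_next)
        targets.append(r + bootstrap)
    return targets
```

In a full agent these targets would be regressed against Q(s, a) for the actions actually taken, with `next_q_values` produced by a periodically updated target network.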

5.2 DDPG (continuous states, continuous actions)

Silver, David, et al. "Deterministic policy gradient algorithms." ICML. 2014.

5.3 A3C & A2C

Mnih, Volodymyr, et al. "Asynchronous methods for deep reinforcement learning." International conference on machine learning. 2016.
https://www.researchgate.net/publication/301847678_Asynchronous_Methods_for_Deep_Reinforcement_Learning
https://openai.com/blog/baselines-acktr-a2c/

6. Environments

6.1 OpenAI Gym

http://gym.openai.com

6.2 Emo Todorov MuJoCo

http://www.mujoco.org

6.3 General-purpose grid world environment class

https://zhuanlan.zhihu.com/p/28109312
https://cs.stanford.edu/people/karpathy/reinforcejs/index.html
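
To make the idea of a grid world environment class concrete, here is a minimal sketch with a Gym-like `reset`/`step` interface and a random agent interacting with it. The class name `GridWorld`, the reward values (-1 per step, 0 on reaching the goal), and the helper `run_random_episode` are assumptions made for illustration; they are not taken from the linked article or from reinforcejs.

```python
import random


class GridWorld:
    """A tiny square grid world (illustrative, not from any library)."""

    # Action index -> (row delta, column delta): up, down, left, right.
    ACTIONS = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}

    def __init__(self, size=4):
        self.size = size
        self.goal = (size - 1, size - 1)  # bottom-right corner
        self.pos = (0, 0)                 # start in the top-left corner

    def reset(self):
        """Put the agent back at the start and return the initial state."""
        self.pos = (0, 0)
        return self.pos

    def step(self, action):
        """Apply an action; return (next_state, reward, done)."""
        dr, dc = self.ACTIONS[action]
        # Clip the move so the agent stays inside the grid.
        row = min(max(self.pos[0] + dr, 0), self.size - 1)
        col = min(max(self.pos[1] + dc, 0), self.size - 1)
        self.pos = (row, col)
        done = self.pos == self.goal
        reward = 0.0 if done else -1.0  # assumed reward scheme
        return self.pos, reward, done


def run_random_episode(env, max_steps=1000, seed=0):
    """Run one episode with uniformly random actions.

    Returns (total_reward, steps_taken).
    """
    rng = random.Random(seed)
    env.reset()
    total_reward, steps = 0.0, 0
    for _ in range(max_steps):
        action = rng.choice(list(env.ACTIONS))
        _, reward, done = env.step(action)
        total_reward += reward
        steps += 1
        if done:
            break
    return total_reward, steps
```

Replacing the random choice with a learned policy (for example tabular Q-learning over the grid cells) turns this loop into a complete, if tiny, reinforcement learning experiment.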

7. Frameworks/Platforms

What is currently the best library for training large-scale reinforcement learning algorithms?
https://www.zhihu.com/question/377263715/answer/1120555103

7.1 OpenAI Baselines & Stable Baselines

Highly integrated; a classic worth reading
https://github.com/openai/baselines
https://github.com/hill-a/stable-baselines

7.2 Baidu PARL

Highly extensible, good reproducibility, user-friendly
https://github.com/paddlepaddle/parl

7.3 DeepMind OpenSpiel (supports Debian and Ubuntu only)

28 board and card games and 24 algorithms
https://github.com/deepmind/open_spiel

7.4 Tsinghua tianshou

A fast, modularized framework with a Pythonic API
Faithfully reproduces results from the papers
https://github.com/thu-ml/tianshou

8. Papers

8.1 Papers recommended by Spinning Up ★★★★★

https://zhuanlan.zhihu.com/p/50343077

8.2 NeuronDance ★★★★★

https://blog.csdn.net/gsww404

8.3 张楚珩 (Zhang Chuheng), Tsinghua ★★★★

https://zhuanlan.zhihu.com/p/46600521

8.4 paperswithcode (with code) ★★★★

https://www.paperswithcode.com/area/playing-games
https://github.com/AndyYue1893/pwc

9. PPT

9.1 Reinforcement learning_Nando de Freitas_DeepMind_2019

https://pan.baidu.com/s/1KF10W9GifZCDf9T4FY2H9Q

9.2 Policy Optimization_Pieter Abbeel_OpenAI/UC Berkeley/Gradescope

https://pan.baidu.com/s/1KF10W9GifZCDf9T4FY2H9Q

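10. Conferences & Journals

10.1 Conferences

AAAI, NeurIPS (formerly NIPS), ICML, ICLR, IJCAI, AAMAS, IROS, etc.

10.2 Journals

AI, JMLR, JAIR, Machine Learning, JAAMAS, etc.

10.3 Rankings of computer science and AI conferences (journals)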

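11. WeChat Official Accounts

1 深度强化学习实验室 (Deep Reinforcement Learning Lab)
2 机器之心 (Synced)
3 AI科技评论 (AI Technology Review)
4 新智元 (AI Era)
5 学术头条 (Academic Headlines)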

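12. Zhihu

12.1 Experts

田渊栋 (Tian Yuandong), Flood Sung, 许铁-巡洋舰科技 (WeChat official account of the same name),
周博磊 (Zhou Bolei), 俞扬 (Yu Yang), 张楚珩 (Zhang Chuheng), 天津包子馅儿, JQWang2048, and the experts they mutually follow

12.2 Columns

David Silver强化学习公开课中文讲解及实践 (Chinese walkthrough and practice for David Silver's RL open course; by 叶强, a classic)
强化学习知识大讲堂 (by 天津包子馅儿, author of 《深入浅出强化学习：原理入门》)
智能单元 (by 杜克, Floodsung, and wxam; focused on artificial general intelligence. Flood Sung's "深度学习论文阅读路线图 Deep Learning Papers Reading Roadmap" is excellent; see also Flood Sung's "最前沿：深度强化学习的强者之路")
深度强化学习落地方法论 (a methodology for deploying deep RL; by an expert from Xi'an Jiaotong University with extensive hands-on experience)
深度强化学习 (Zhihu: JQWang2048, GitHub: NeuronDance, CSDN: J. Q. Wang)
神经网络与强化学习 (reading notes on "Reinforcement Learning: An Introduction")
强化学习基础David Silver笔记 (notes on David Silver's RL fundamentals; by 陈雄辉, Nanjing University, DiDi AI Labs)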

13. Blogs

These bloggers have an exceptional depth of understanding!

13.1 lilianweng (OpenAI)

https://lilianweng.github.io/lil-log/

13.2 J. Q. Wang