All Resources
DeepMimic
- [Paper] DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills
- Proximal Policy Optimization Algorithms
- A Comprehensive Guide to Machine Learning
- Xue Bin (Jason) Peng: Towards a Virtual Stuntman
- SIGGRAPH 2018: DeepMimic paper (main video)
Policy Gradients
- (DDPG) Continuous Control with Deep Reinforcement Learning
- Stack Exchange: When can we interchange integration and differentiation?
- Daniel Takeshi: Going Deeper Into Reinforcement Learning: Fundamentals of Policy Gradients
On Policy vs. Off Policy
- Stack Overflow: Explanation of On-Policy vs. Off-Policy
- Simple Explanation of Off-Policy Q-Learning and On-Policy SARSA
- Action policies: ϵ-greedy, ϵ-soft, softmax
- Q-Learning and SARSA algorithm pseudocode
Textbooks
- Sutton and Barto’s “Reinforcement Learning: An Introduction” Book
- A Comprehensive Guide to Machine Learning