Deep RL Course documentation
Quiz
The best way to learn and to avoid the illusion of competence is to test yourself. This will help you find where you need to reinforce your knowledge.
Q1: What are the advantages of policy-gradient over value-based methods? (Check all that apply)
Q2: What is the Policy Gradient Theorem?
Solution
The Policy Gradient Theorem is a formula that lets us rewrite the gradient of the objective function J(θ) in a form we can estimate from sampled trajectories, without having to differentiate the state distribution.
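For reference, one common way to write the result is sketched below. The notation is an assumption, since the quiz itself does not define symbols: π_θ is the policy with parameters θ, τ is a trajectory sampled by following π_θ, and R(τ) is the return of that trajectory.

```latex
\nabla_\theta J(\theta)
  = \mathbb{E}_{\tau \sim \pi_\theta}\!\left[
      \sum_{t} \nabla_\theta \log \pi_\theta(a_t \mid s_t)\, R(\tau)
    \right]
```

The only term that is differentiated is the log-probability of the actions taken, which the policy network provides directly. The environment dynamics and the state distribution do not appear inside the gradient.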

Q3: What’s the difference between policy-based methods and policy-gradient methods? (Check all that apply)
Q4: Why do we use gradient ascent instead of gradient descent to optimize J(θ)?
Congrats on finishing this quiz 🥳! If you missed some elements, take time to read the chapter again to reinforce (😏) your knowledge.