Price: USD 100
Language: English
Overview
In this course, you will learn about several algorithms that can learn near-optimal policies through trial-and-error interaction with the environment: learning from the agent's own experience. Learning from actual experience is striking because it requires no prior knowledge of the environment's dynamics, yet can still attain optimal behavior. We will cover intuitively simple but powerful Monte Carlo methods and temporal difference learning methods, including Q-learning. We will wrap up the course by investigating how we can get the best of both worlds: algorithms that combine model-based planning (similar to dynamic programming) with temporal difference updates to radically accelerate learning.
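To give a flavor of the kind of algorithm covered, here is a minimal sketch of tabular Q-learning on a hypothetical five-state chain problem. The environment (states, actions, rewards), the hyperparameters, and the `step` function are illustrative assumptions for this sketch, not material from the course itself.

```python
import random

# Toy chain MDP (hypothetical): states 0..4, action 0 moves left,
# action 1 moves right; reaching state 4 gives reward 1 and ends the episode.
N_STATES, N_ACTIONS = 5, 2
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.3  # illustrative hyperparameters

def step(state, action):
    """Deterministic chain dynamics: right moves toward the goal, left away."""
    next_state = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    done = next_state == N_STATES - 1
    return next_state, reward, done

q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
rng = random.Random(0)

for episode in range(500):
    state, done = 0, False
    while not done:
        # Epsilon-greedy action selection: learn from the agent's own experience.
        if rng.random() < EPSILON:
            action = rng.randrange(N_ACTIONS)
        else:
            action = max(range(N_ACTIONS), key=lambda a: q[state][a])
        next_state, reward, done = step(state, action)
        # Q-learning update: bootstrap from the greedy value of the next state,
        # requiring no prior model of the environment's dynamics.
        target = reward + (0.0 if done else GAMMA * max(q[next_state]))
        q[state][action] += ALPHA * (target - q[state][action])
        state = next_state

# The learned greedy policy moves right in every non-terminal state.
policy = [max(range(N_ACTIONS), key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

Because Q-learning's update bootstraps off the maximizing action rather than the action actually taken, it learns the optimal value function even while the agent explores with a randomized policy.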
Type: Online
Sample-based Learning Methods
Start Date: February 10th 2021
The information on this page is taken from the course's description on the Coursera platform.
Course Structure
Tags
About the Instructor
Martha White and Adam White, University of Alberta
No reviews at the moment.