$100

Sample-based Learning Methods

Created by: Martha White, Adam White, University of Alberta

English

Overview

In this course, you will learn about several algorithms that can learn near-optimal policies through trial-and-error interaction with the environment, that is, from the agent’s own experience. Learning from actual experience is striking because it requires no prior knowledge of the environment’s dynamics, yet it can still attain optimal behavior. We will cover intuitively simple but powerful Monte Carlo methods, as well as temporal difference learning methods, including Q-learning. We will wrap up the course by investigating how to get the best of both worlds: algorithms that combine model-based planning (similar to dynamic programming) with temporal difference updates to radically accelerate learning.

Type: Online

This course includes

  • Approx. 22 hours to complete
  • Earn a Certificate upon completion
  • Start instantly and learn on your own schedule.

All Levels

Start Date: February 10th 2021

The information used on this page is how each course is described on the Coursera platform.

About the Instructor

Martha White, Adam White, University of Alberta

Copyright © 2020 Skillqore, Inc. All Rights Reserved.