Chapter 2: Confusion about Reward and Return

georg.novotny2 · September 1, 2021, 6:44pm

Hello there,

in chapter 2.1 you define the reward as

$G_t = R_{t+1} + \gamma R_{t+2} + \gamma^2 R_{t+3} + \dots + \gamma^{H-1} R_H$

Isn’t that the return? At least Google defines it as the return.

So the reward should then just be the “points” the agent receives by transitioning from state s to s’?

albertoezquerro · September 2, 2021, 8:18am

Yes, the return is the discounted sum of all the rewards, while the reward is the single value the agent receives for one transition. I’ve rewritten this in the notebook to avoid confusion.
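
To make the distinction concrete, here is a minimal sketch in Python. It is not taken from the course notebook: the function name `discounted_return` and the example reward values are made up for illustration. Each entry of `rewards` is a single reward (the first one plays the role of $R_{t+1}$), and the value returned is the return $G_t$.

```python
def discounted_return(rewards, gamma):
    """Compute G_t = rewards[0] + gamma * rewards[1] + gamma**2 * rewards[2] + ...

    `rewards` is a hypothetical list of per-step rewards, with rewards[0]
    standing for R_{t+1}. `gamma` is the discount factor.
    """
    g = 0.0
    # Work backwards through the episode using G_k = R_{k+1} + gamma * G_{k+1}.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Three individual rewards collected over three transitions (illustrative values).
rewards = [1.0, 0.0, 2.0]

# The return is one number summarising all of them:
# 1.0 + 0.5 * 0.0 + 0.5**2 * 2.0 = 1.5
print(discounted_return(rewards, gamma=0.5))  # prints 1.5
```

With `gamma=1.0` the same function gives the plain, undiscounted sum of the rewards.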