Chapter 2: Confusion about Reward and Return

Hello there,

In chapter 2.1 you define the reward as

$G_t = R_{t+1} + \gamma R_{t+2} + \gamma^2 R_{t+3} + \dots + \gamma^{H-1} R_H$

Isn’t that the return? At least Google defines it as the return.

So the reward should then just be the “points” the agent receives by transitioning from state s to s’?

Hello @georg.novotny2 ,

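Yes, you're right: that formula is the return, i.e. the discounted sum of the rewards, while the reward is the value the agent receives for a single transition. I’ve rewritten this in the notebook to avoid confusion.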
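
To make the distinction concrete, here is a minimal Python sketch. The function name `discounted_return` and the example numbers are made up for illustration and are not taken from the notebook. Each entry in `rewards` is one reward from one transition; the return is what you get when you add them up with discounting.

```python
def discounted_return(rewards, gamma):
    """Compute the return G_t from the rewards [R_{t+1}, R_{t+2}, ...].

    `rewards` and `gamma` are illustrative inputs, not values from the course.
    """
    g = 0.0
    # Walk backwards through the episode using G_k = R_{k+1} + gamma * G_{k+1}.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Three transitions, each yielding a single reward.
rewards = [1.0, 0.0, 2.0]
gamma = 0.5

# 1.0 + 0.5 * 0.0 + 0.5**2 * 2.0
print(discounted_return(rewards, gamma))  # 1.5
```

With `gamma = 1` the same function gives the plain, undiscounted sum of the rewards.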