CS231n Lecture 14 (Reinforcement Learning): what does the bar in the value function mean?

I'm studying CS231N, lecture 14, "Reinforcement Learning". In the lecture, the instructor mentioned the value function, which is shown in the picture:

picture of value function

What does the vertical bar between r_t and s_0 mean? I thought it was something like the bar in a conditional probability, but I'm not sure. Or is it just a division?

There is 1 answer

AudioBubble (best answer)

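It's the conditioning bar, read as "given", not a division. Strictly speaking, because it sits inside an expectation, the expression is a conditional expectation rather than a conditional probability. It means the expected discounted sum of the rewards r_t, given that the initial state s_0 is s and that actions are chosen by following policy pi.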
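To make the "given" concrete, here is a minimal sketch that estimates a value function by sampling. It assumes the slide shows the standard definition, V^pi(s) = E[ sum over t >= 0 of gamma^t * r_t | s_0 = s, pi ]. The two-state MDP, the `policy` and `step` functions, and the discount factor of 0.9 are all invented for illustration and are not from the lecture. Everything to the right of the bar is what gets held fixed while sampling: every episode starts in s_0 = s, and every action comes from pi.

```python
import random

GAMMA = 0.9  # discount factor (assumed value, not from the lecture)


def policy(state):
    # Hypothetical fixed policy pi: always choose action "a".
    return "a"


def step(state, action, rng):
    # Toy 2-state MDP, invented for illustration.
    # State 0: reward 1, then move to state 1 with probability 0.5.
    # State 1: absorbing, reward 0 forever.
    if state == 0:
        next_state = 1 if rng.random() < 0.5 else 0
        return 1.0, next_state
    return 0.0, 1


def discounted_return(s0, rng, horizon=100):
    # One sample of sum_t gamma^t * r_t with the start state fixed to s0.
    # The infinite sum is truncated at `horizon` steps.
    state, total, discount = s0, 0.0, 1.0
    for _ in range(horizon):
        reward, state = step(state, policy(state), rng)
        total += discount * reward
        discount *= GAMMA
    return total


def estimate_value(s0, episodes=20000, seed=0):
    # Averaging many sampled returns approximates E[ ... | s_0 = s0, pi ].
    rng = random.Random(seed)
    returns = (discounted_return(s0, rng) for _ in range(episodes))
    return sum(returns) / episodes


if __name__ == "__main__":
    print(estimate_value(0))  # should be close to 1 / (1 - 0.5 * GAMMA)
    print(estimate_value(1))  # state 1 never pays a reward in this toy MDP
```

Changing the argument of `estimate_value` changes the condition to the right of the bar, which is why V is a function of s. In this toy MDP the exact value of state 0 solves V(0) = 1 + 0.5 * GAMMA * V(0), so the sampled average can be checked against 1 / (1 - 0.5 * GAMMA).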