Reinforcement learning/superintelligence/values/Bostrom: Often, the learning algorithm involves the gradual construction of some kind of evaluation function, which assigns values to states, state-action pairs, or policies.
Problem: The evaluation function, which is continuously updated in light of experience, could be regarded as incorporating a form of learning about value. However, what is being learned is not new final values but increasingly accurate estimates of the instrumental values of reaching particular states (or of taking particular actions in particular states, or of following particular policies). Insofar as a reinforcement-learning agent can be described as having a final goal, that goal remains constant: to maximize future reward. And reward consists of specially designated percepts received from the environment. Therefore, the wireheading syndrome (the agent seizing control of its own reward signal instead of pursuing the intended task) remains a likely outcome in any reinforcement-learning agent that develops a world model sophisticated enough to suggest this alternative way of maximizing reward. >Values/superintelligence/Bostrom.
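
Illustration (not from Bostrom): the point can be sketched with a minimal tabular Q-learning example. Everything in it is a hypothetical assumption made for illustration: the two-state toy environment, the reward numbers, the discount factor, and the names `step`, `learn_q`, and `greedy_action`. The table `q` plays the role of the evaluation function over state-action pairs. Its entries change with experience, while the objective (maximize discounted future reward) stays fixed throughout.

```python
# Toy sketch under stated assumptions; not a model of any real system.
GAMMA = 0.9   # discount factor (assumed)
ALPHA = 0.5   # learning rate (assumed)

TASK, TAMPERED = 0, 1        # hypothetical states
DO_TASK, TAMPER = 0, 1       # hypothetical actions
ACTIONS = (DO_TASK, TAMPER)


def step(state, action):
    """Return (next_state, reward) for the hypothetical environment.

    Doing the intended task yields a small reward. Tampering yields
    nothing at first but moves the agent into a state in which the
    reward percept is pinned at its maximum, whatever it does.
    """
    if state == TAMPERED:
        return TAMPERED, 10.0
    if action == TAMPER:
        return TAMPERED, 0.0
    return TASK, 1.0


def learn_q(sweeps=500):
    """Tabular Q-learning with deterministic sweeps over all pairs."""
    q = {(s, a): 0.0 for s in (TASK, TAMPERED) for a in ACTIONS}
    for _ in range(sweeps):
        for (s, a) in list(q):
            s2, r = step(s, a)
            # The learning target never changes form: reward plus
            # discounted best estimated future value.
            target = r + GAMMA * max(q[(s2, b)] for b in ACTIONS)
            q[(s, a)] += ALPHA * (target - q[(s, a)])
    return q


def greedy_action(q, state):
    """Pick the action with the highest estimated value in `state`."""
    return max(ACTIONS, key=lambda a: q[(state, a)])
```

Under these assumed numbers, `learn_q()` should converge to an estimate of roughly 90 for tampering and roughly 82 for doing the task in the initial state, so `greedy_action(q, TASK)` selects `TAMPER`. The estimates are instrumental values relative to the unchanged goal of reward maximization. No new final value is acquired at any point.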

