Peter Norvig on Information Value - Dictionary of Arguments
Norvig I 629
Information value/AI research/Norvig/Russell: One of the most important parts of decision making is knowing what questions to ask. Tests are often expensive and sometimes hazardous (both directly and because of associated delays). Their importance depends on two factors: whether the test results would lead to a significantly better (…) plan, and how likely the various test results are.
Example: (…) an oil company is hoping to buy one of n indistinguishable blocks of ocean-drilling rights. (…) exactly one of the blocks contains oil worth C dollars, while the others are worthless. The asking price of each block is C/n dollars. (…) a seismologist offers the company the results of a survey of block number 3, which indicates definitively whether the block contains oil. How much should the company be willing to pay for the information?
With probability 1/n, the survey will indicate oil in block 3. In this case, the company will buy block 3 for C/n dollars and make a profit of C − C/n = (n − 1)C/n dollars. With probability (n − 1)/n, the survey will show that the block contains no oil, in which case the company will buy a different block. (…) the company should be willing to pay the seismologist up to C/n dollars for the information: the information is worth as much as the block itself.
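The arithmetic behind this example can be checked with a short script. The numbers below (n = 10 blocks, C = $1,000,000) are hypothetical, chosen only for illustration; in the no-oil case the probability of oil among the remaining blocks rises from 1/n to 1/(n − 1):

```python
# Hypothetical numbers: n = 10 blocks, oil worth C = 1,000,000 dollars.
n, C = 10, 1_000_000

price = C / n  # asking price per block

# Without the survey: any block the company buys wins with probability 1/n,
# so the expected profit is zero.
ev_without = (1 / n) * C - price

# With the survey of block 3:
#  - probability 1/n it shows oil: buy block 3, profit C - C/n
#  - probability (n-1)/n it shows no oil: buy one of the other n-1 blocks,
#    where the oil probability has risen to 1/(n-1)
ev_with = (1 / n) * (C - price) + ((n - 1) / n) * ((1 / (n - 1)) * C - price)

value_of_information = ev_with - ev_without
print(value_of_information)  # C/n = 100000.0, the price of one block
```

The difference in expected profit is exactly C/n dollars, confirming that the survey is worth as much as a block.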
The value of information derives from the fact that with the information, one’s course of action can be changed to suit the actual situation. One can discriminate according to the situation, whereas without the information, one has to do what’s best on average over the possible situations.
Def Information value/Norvig: (…) the value of a given piece of information is defined to be the difference in expected value between best actions before and after information is obtained. >Multi-attribute utility/AI Research, >Decision networks/Norvig.
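In symbols, using the standard decision-theoretic notation (e is the current evidence, α the best action given e, α_{e_j} the best action after observing E_j = e_j), this definition of the value of perfect information reads:

```latex
VPI_{e}(E_j) \;=\; \left( \sum_{e_j} P(E_j = e_j \mid e)\; EU(\alpha_{e_j} \mid e,\, E_j = e_j) \right) \;-\; EU(\alpha \mid e)
```

That is: the expected utility of acting optimally after the observation, averaged over the possible observation outcomes, minus the expected utility of the best action available now.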
Norvig I 631
Nonnegative value of information: can [information] actually have negative expected value? Intuitively, one should expect this to be impossible. After all, one could in the worst case just ignore the information and pretend that one has never received it. This is confirmed by the following theorem, which applies to any decision-theoretic agent: The expected value of information is nonnegative. >Software agents/Norvig.
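The nonnegativity theorem can be illustrated numerically. The sketch below (a minimal model, assuming a single discrete state variable, finitely many actions, and a given utility table) brute-force checks that perfect information about the state never has negative expected value:

```python
import random

def vpi(prior, utilities):
    """Value of perfect information about a discrete state variable S.

    prior: list of probabilities P(S = s) for each state s.
    utilities: utilities[a][s] = utility of action a when the state is s.
    """
    # Best expected utility acting on the prior alone.
    eu_before = max(sum(p * u for p, u in zip(prior, row)) for row in utilities)
    # After observing S = s, pick the best action for that state;
    # average over the prior probability of each observation outcome.
    eu_after = sum(p * max(row[s] for row in utilities)
                   for s, p in enumerate(prior))
    return eu_after - eu_before

# Random decision problems: VPI is never negative (up to rounding error).
random.seed(0)
for _ in range(1000):
    k = random.randint(2, 5)   # number of states
    m = random.randint(2, 5)   # number of actions
    raw = [random.random() for _ in range(k)]
    prior = [x / sum(raw) for x in raw]
    utilities = [[random.uniform(-1, 1) for _ in range(k)] for _ in range(m)]
    assert vpi(prior, utilities) >= -1e-12
```

The check works because the best action chosen per observed state can never do worse, in expectation, than committing to one action in advance; at worst the agent chooses the same action in every state, in which case the VPI is zero.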
Agents/AI/information value: A sensible agent should ask questions in a reasonable order, should avoid asking questions that are irrelevant, should take into account the importance of each piece of information in relation to its cost, and should stop asking questions when that is appropriate. All of these capabilities can be achieved by using the value of information as a guide.
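A minimal sketch of such a guide is a myopic greedy policy: at each step, query the single observation with the highest value of information net of cost, and stop when no observation is worth its price. The function and variable names below are hypothetical, and `vpi` and `cost` are assumed to be supplied by the caller:

```python
def next_query(candidates, vpi, cost):
    """Myopic information-gathering step.

    candidates: iterable of observable variables (hypothetical names).
    vpi, cost: callables mapping a variable to its value of
               information / its observation cost.
    Returns the best variable to observe next, or None when no
    observation is worth its cost (time to stop asking and act).
    """
    best = max(candidates, key=lambda e: vpi(e) - cost(e), default=None)
    if best is None or vpi(best) <= cost(best):
        return None
    return best

# Toy usage with made-up numbers: the survey is worth asking about,
# the weather is not.
query = next_query(['survey', 'weather'],
                   {'survey': 100, 'weather': 5}.get,
                   {'survey': 30, 'weather': 10}.get)
print(query)  # 'survey'
```

Because it considers one observation at a time, this myopic policy is only a heuristic; as noted below, it can be far from optimal when observations are valuable only in combination.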
Norvig I 639
The theory of information value was explored first in the context of statistical experiments, where a quasi-utility (entropy reduction) was used (Lindley, 1956)(1). The Russian control theorist Ruslan Stratonovich (1965)(2) developed the more general theory presented here, in which information has value by virtue of its ability to affect decisions. Stratonovich’s work was not known in the West, where Ron Howard (1966)(3) pioneered the same idea. His paper ends with the remark “If information value theory and associated decision theoretic structures do not in the future occupy a large part of the education of engineers, then the engineering profession will find that its traditional role of managing scientific and economic resources for the benefit of man has been forfeited to another profession.” To date, the implied revolution in managerial methods has not occurred.
Recent work by Krause and Guestrin (2009)(4) shows that computing the exact nonmyopic value of information is intractable even in polytree networks. There are other cases, more restricted than general value of information, in which the myopic algorithm does provide a provably good approximation to the optimal sequence of observations (Krause et al., 2008)(5). In some cases (for example, looking for treasure buried in one of n places), ranking experiments in order of success probability divided by cost gives an optimal solution (Kadane and Simon, 1977)(6).
1. Lindley, D. V. (1956). On a measure of the information provided by an experiment. Annals of Mathematical Statistics, 27(4), 986-1005.
2. Stratonovich, R. L. (1965). On value of information. Izvestiya of USSR Academy of Sciences, Technical Cybernetics, 5, 3-12.
3. Howard, R. A. (1966). Information value theory. IEEE Transactions on Systems Science and Cybernetics.
4. Krause, A. and Guestrin, C. (2009). Optimal value of information in graphical models. JAIR, 35, 557-.
5. Krause, A., McMahan, B., Guestrin, C., and Gupta, A. (2008). Robust submodular observation selection. JMLR, 9, 2761-2801.
6. Kadane, J. B. and Simon, H. A. (1977). Optimal strategies for a class of constrained sequential problems. Annals of Statistics, 5, 237-255.
Stuart J. Russell & Peter Norvig: Artificial Intelligence: A Modern Approach. Upper Saddle River, NJ 2010.