Bayesianism: interprets probability as a degree of belief. See also subjective probability, objective probability, chance, likelihood.
_____________
Annotation: The above characterizations of concepts are neither definitions nor exhaustive presentations of the problems related to them. They are intended only as brief introductions to the contributions below. – Lexicon of Arguments.
Peter Norvig on Bayesianism - Dictionary of Arguments
Norvig I 503
Bayesianism/Norvig/Russell: Bayes’ rule allows unknown probabilities to be computed from known conditional probabilities, usually in the causal direction. Applying Bayes’ rule with many pieces of evidence runs into the same scaling problems as does the full joint distribution.
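The causal-direction use of Bayes' rule can be sketched with a small numerical example (the disease/symptom figures below are illustrative, not from the source): the conditional probabilities are known in the causal direction, P(effect | cause), and the rule computes the diagnostic direction, P(cause | effect).

```python
# Bayes' rule: P(cause | effect) = P(effect | cause) * P(cause) / P(effect),
# where the conditionals are known in the causal direction.
# All numbers below are made up for illustration.

def bayes_rule(p_effect_given_cause, p_cause, p_effect_given_not_cause):
    """Posterior P(cause | effect) from causal-direction conditionals."""
    p_not_cause = 1.0 - p_cause
    # Normalizing constant P(effect), by the law of total probability.
    p_effect = (p_effect_given_cause * p_cause
                + p_effect_given_not_cause * p_not_cause)
    return p_effect_given_cause * p_cause / p_effect

# Hypothetical disease with prior 0.01; a symptom appears with
# probability 0.9 given the disease, 0.05 otherwise.
posterior = bayes_rule(0.9, 0.01, 0.05)
print(round(posterior, 4))  # ≈ 0.1538
```

With many pieces of evidence, the normalizing term requires a joint distribution over all the evidence variables, which is the scaling problem the passage refers to.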
Conditional independence brought about by direct causal relationships in the domain might allow the full joint distribution to be factored into smaller, conditional distributions.
The naive Bayes model assumes the conditional independence of all effect variables, given a single cause variable, and grows linearly with the number of effects.
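The naive Bayes factorization can be sketched as follows (the flu/fever/cough tables are illustrative placeholders, not from the source): given the cause, each effect is conditionally independent, so the joint factors into P(Cause) · Π_i P(E_i | Cause), and one small table per effect suffices.

```python
import math

# Naive Bayes: P(Cause | e1..en) ∝ P(Cause) * Π_i P(e_i | Cause).
# The probability tables below are made up for illustration.

prior = {"flu": 0.1, "healthy": 0.9}
# One small table per effect variable: the number of parameters grows
# linearly with the number of effects, not exponentially.
cond = {
    "fever": {"flu": 0.8, "healthy": 0.05},
    "cough": {"flu": 0.7, "healthy": 0.10},
}

def naive_bayes_posterior(evidence):
    """Posterior over the cause given observed boolean effects."""
    log_scores = {}
    for cause, p in prior.items():
        log_score = math.log(p)
        for effect, observed in evidence.items():
            p_e = cond[effect][cause]
            log_score += math.log(p_e if observed else 1.0 - p_e)
        log_scores[cause] = log_score
    # Normalize back to probabilities.
    z = sum(math.exp(s) for s in log_scores.values())
    return {c: math.exp(s) / z for c, s in log_scores.items()}

post = naive_bayes_posterior({"fever": True, "cough": True})
print(post)  # flu ≈ 0.926, healthy ≈ 0.074
```

Working in log space, as above, is the usual way to keep the product of many small conditionals numerically stable.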
Norvig I 505
Bayesian probabilistic reasoning has been used in AI since the 1960s, especially in medical diagnosis. It was used not only to make a diagnosis from available evidence, but also to select further questions and tests by using the theory of information value (…) when available evidence was inconclusive (Gorry, 1968(1); Gorry et al., 1973(2)). One system outperformed human experts in the diagnosis of acute abdominal illnesses (de Dombal et al., 1974)(3). Lucas et al. (2004)(4) gives an overview. These early Bayesian systems suffered from a number of problems, however. Because they lacked any theoretical model of the conditions they were diagnosing, they were vulnerable to unrepresentative data occurring in situations for which only a small sample was available (de Dombal et al., 1981)(5). Even more fundamentally, because they lacked a concise formalism (…) for representing and using conditional independence information, they depended on the acquisition, storage, and processing of enormous tables of probabilistic data. Because of these difficulties, probabilistic methods for coping with uncertainty fell out of favor in AI from the 1970s to the mid-1980s.
The naive Bayes model for joint distributions has been studied extensively in the pattern recognition literature since the 1950s (Duda and Hart, 1973)(6). It has also been used, often unwittingly, in information retrieval, beginning with the work of Maron (1961)(7). The probabilistic foundations of this technique, (…) were elucidated by Robertson and Sparck Jones (1976)(8).
Independence: Domingos and Pazzani (1997)(9) provide an explanation
Norvig I 506
for the surprising success of naive Bayesian reasoning even in domains where the independence assumptions are clearly violated. >Bayesian Networks/Norvig.
1. Gorry, G. A. (1968). Strategies for computer-aided diagnosis. Mathematical Biosciences, 2(3–4), 293–318.
2. Gorry, G. A., Kassirer, J. P., Essig, A., and Schwartz, W. B. (1973). Decision analysis as the basis for computer-aided management of acute renal failure. American Journal of Medicine, 55, 473–484.
3. de Dombal, F. T., Leaper, D. J., Horrocks, J. C., and Staniland, J. R. (1974). Human and computer-aided diagnosis of abdominal pain: Further report with emphasis on performance of clinicians. British Medical Journal, 1, 376–380.
4. Lucas, P., van der Gaag, L., and Abu-Hanna, A. (2004). Bayesian networks in biomedicine and health-care. Artificial Intelligence in Medicine.
5. de Dombal, F. T., Staniland, J. R., and Clamp, S. E. (1981). Geographical variation in disease presentation. Medical Decision Making, 1, 59–69.
6. Duda, R. O. and Hart, P. E. (1973). Pattern classification and scene analysis. Wiley.
7. Maron, M. E. (1961). Automatic indexing: An experimental inquiry. JACM, 8(3), 404–417.
8. Robertson, S. E. and Sparck Jones, K. (1976). Relevance weighting of search terms. J. American Society for Information Science, 27, 129–146.
9. Domingos, P. and Pazzani, M. (1997). On the optimality of the simple Bayesian classifier under zero–one loss. Machine Learning, 29, 103–130.
_____________
Explanation of symbols: Roman numerals indicate the source, arabic numerals indicate the page number. ((s) …): Comment by the sender of the contribution. The note [Author1]Vs[Author2] or [Author]Vs[term] is an addition from the Dictionary of Arguments. If a German edition is specified, the page numbers refer to this edition.
Stuart J. Russell
Artificial Intelligence: A Modern Approach. Upper Saddle River, NJ 2010