Hypotheses: Hypotheses are assumptions made before an experiment so that the experiment's results can be compared against them. A hypothesis must be informed by at least a rudimentary theory, which determines which objects belong to the domain, which concepts are used, which consequences are possible, and what cannot belong to it. In the course of theory formation, the assumptions, the test results, and the theory's set of concepts and sentences mutually correct one another. See also theories, methods, verification.
Norvig I 768
Hypotheses/examples/learning/AI Research/Norvig/Russell: [a learning rule for decisions will need examples.] Examples [are] described by attributes.
Norvig I 769
Each hypothesis predicts that a certain set of examples - namely, those that satisfy its candidate definition - will be examples of the goal predicate ((s) i.e. a description of the resulting action). This set is called the extension of the predicate. Two hypotheses with different extensions are therefore logically inconsistent with each other, because they disagree on their predictions for at least one example.
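The notion of an extension can be sketched concretely. In this hypothetical illustration (the names and attributes are ours, not Norvig/Russell's), a hypothesis is a boolean predicate over attribute dictionaries, and its extension is the subset of a finite example set satisfying its candidate definition:

```python
# Hypothetical sketch: hypotheses as boolean predicates over attribute
# dictionaries; the extension is the set of examples predicted positive.

examples = [
    {"patrons": "full", "hungry": True},   # example 0
    {"patrons": "none", "hungry": False},  # example 1
    {"patrons": "some", "hungry": True},   # example 2
]

def extension(hypothesis, examples):
    """Indices of the examples the hypothesis predicts to be positive."""
    return {i for i, e in enumerate(examples) if hypothesis(e)}

h1 = lambda e: e["hungry"]             # candidate definition 1
h2 = lambda e: e["patrons"] == "full"  # candidate definition 2

print(extension(h1, examples))  # {0, 2}
print(extension(h2, examples))  # {0}
```

Since the two extensions differ, h1 and h2 disagree on example 2, so they are logically inconsistent with each other in exactly the sense described above.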
Norvig I 770
An example can be a false negative for the hypothesis, if the hypothesis says it should be negative but in fact it is positive. An example can be a false positive for the hypothesis, if the hypothesis says it should be positive but in fact it is negative.
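This bookkeeping can be stated as a small function. The following is a minimal sketch with hypothetical names (not code from the book):

```python
def classify(hypothesis, example, actually_positive):
    """Relate one labeled example to a hypothesis (a boolean predicate)."""
    predicted_positive = hypothesis(example)
    if predicted_positive and not actually_positive:
        return "false positive"  # hypothesis says positive, example is negative
    if not predicted_positive and actually_positive:
        return "false negative"  # hypothesis says negative, example is positive
    return "consistent"

h = lambda e: e["hungry"]  # toy candidate definition
print(classify(h, {"hungry": False}, actually_positive=True))   # false negative
print(classify(h, {"hungry": True},  actually_positive=False))  # false positive
```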
Current-best-hypothesis search: The idea behind current-best-hypothesis search is to maintain a single hypothesis, and to adjust it as new examples arrive in order to maintain consistency. The basic algorithm was described by John Stuart Mill (1843)(1), and may well have appeared even earlier.
Generalization: The extension of the hypothesis must be increased to include [the example].
Norvig I 771
Specialization: The extension of the hypothesis must be decreased to exclude the example.
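The generalize/specialize loop can be sketched for purely conjunctive hypotheses over discrete attributes. This is not Norvig and Russell's pseudocode; it assumes one simple choice of operators (generalization = dropping conditions, specialization = adding conditions) and all helper names are ours:

```python
# Hypothetical sketch of current-best-hypothesis learning for conjunctive
# hypotheses represented as {attribute: required_value} dictionaries.

def consistent(h, example, positive):
    predicted = all(example.get(a) == v for a, v in h.items())
    return predicted == positive

def current_best_learn(data):
    """data: list of (attribute-dict, is_positive). Returns a conjunction."""
    positives = [e for e, pos in data if pos]
    if not positives:
        return None
    h = dict(positives[0])  # start with the first positive example
    for example, positive in data:
        if consistent(h, example, positive):
            continue
        if positive:   # false negative -> generalize: drop failing conditions
            h = {a: v for a, v in h.items() if example.get(a) == v}
        else:          # false positive -> specialize: add an excluding condition
            for a, v in positives[0].items():
                if a not in h and example.get(a) != v:
                    h[a] = v
                    break
        # a full implementation would re-check all earlier examples here
    return h

data = [({"hungry": True,  "patrons": "full"}, True),
        ({"hungry": False, "patrons": "full"}, False),
        ({"hungry": True,  "patrons": "some"}, True)]
print(current_best_learn(data))  # {'hungry': True}
```

On this toy data the first positive example seeds the hypothesis, the negative example is already excluded, and the second positive example forces a generalization that drops the `patrons` condition.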
Norvig I 772
Current-best-learning algorithm: (…) generalization and specialization are also logical relationships between hypotheses. If hypothesis h1, with definition C1, is a generalization of hypothesis h2 with definition C2, then we must have
∀x C2(x) ⇒ C1(x).
Therefore in order to construct a generalization of h2, we simply need to find a definition
C1 that is logically implied by C2.
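For instance, dropping a conjunct always yields a generalization: if C2(x) = Alternate(x) ∧ Hungry(x) and C1(x) = Hungry(x), then ∀x C2(x) ⇒ C1(x). A brute-force check over all truth assignments (predicate names taken from the book's restaurant domain, the code itself is our sketch):

```python
from itertools import product

# C2 = Alternate ∧ Hungry; dropping the Alternate conjunct gives C1 = Hungry.
C2 = lambda alternate, hungry: alternate and hungry
C1 = lambda alternate, hungry: hungry

# Verify ∀x C2(x) ⇒ C1(x) over every truth assignment:
implied = all(C1(a, h) for a, h in product([False, True], repeat=2) if C2(a, h))
print(implied)  # True
```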
Norvig I 773
The current-best-learning algorithm and its variants have been used in many machine learning systems, starting with Patrick Winston’s (1970)(2) “arch-learning” program.
Problems/VsCBL algorithm: 1. Checking all the previous examples over again for each modification is very expensive. 2. The search process may involve a great deal of backtracking.
Backtracking: arises because the current-best-hypothesis approach has to choose a particular hypothesis as its best guess even though it does not have enough data yet to be sure of the choice. Solution: what we can do instead is to keep around all and only those hypotheses that are consistent with all the data so far. >Prior knowledge/AI Research.
Version space learning algorithm: Assuming that the original hypothesis space does in fact contain the right answer, the reduced disjunction [of hypotheses] must still contain the right answer because only incorrect hypotheses have been removed. The set of hypotheses remaining is called the version space, and the learning algorithm (…) is called the version space learning algorithm (also the candidate elimination algorithm).
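The most literal (and exponentially inefficient) reading of this idea is to enumerate a small finite hypothesis space and drop every hypothesis inconsistent with each new example; the S- and G-boundary sets of the actual candidate-elimination algorithm are a compact encoding of exactly this surviving set. A hypothetical sketch (names and toy data are ours):

```python
# Hypothetical sketch: version space as the set of all hypotheses
# consistent with every example seen so far.

def version_space(hypotheses, data):
    """hypotheses: dict name -> predicate; data: (example, label) pairs."""
    vs = dict(hypotheses)
    for example, positive in data:
        vs = {n: h for n, h in vs.items() if h(example) == positive}
    return vs

hypotheses = {
    "hungry":      lambda e: e["hungry"],
    "full":        lambda e: e["patrons"] == "full",
    "always_true": lambda e: True,
}
data = [({"hungry": True,  "patrons": "full"}, True),
        ({"hungry": False, "patrons": "none"}, False)]
print(sorted(version_space(hypotheses, data)))  # ['full', 'hungry']
```

Only incorrect hypotheses are removed, so if the original space contained the right answer, the surviving set still does.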
Norvig I 776
Problems/VsVersion space learning algorithm:
a) If the domain contains noise or insufficient attributes for exact classification, the version space will always collapse.
b) If we allow unlimited disjunction in the hypothesis space, the S-set [the most specific boundary] will always contain a single most-specific hypothesis, namely, the disjunction of the descriptions of the positive examples seen to date. Similarly, the G-set [the most general boundary] will contain just the negation of the disjunction of the descriptions of the negative examples.
c) For some hypothesis spaces, the number of elements in the S-set or G-set may grow exponentially in the number of attributes, even though efficient learning algorithms exist for those hypothesis spaces.
Noise: To date, no completely successful solution has been found for the problem of noise. The problem of disjunction can be addressed by allowing only limited forms of disjunction or by including a generalization hierarchy of more general predicates.
The pure version space algorithm was first applied in the Meta-DENDRAL system, which was designed to learn rules for predicting how molecules would break into pieces in a mass spectrometer (Buchanan and Mitchell, 1978)(3). >Prior knowledge/Norvig.
1. Mill, J. S. (1843). A System of Logic, Ratiocinative and Inductive: Being a Connected View of the Principles of Evidence, and Methods of Scientific Investigation. J.W. Parker, London.
2. Winston, P. H. (1970). Learning structural descriptions from examples. Technical report MAC-TR-76, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology.
3. Buchanan, B. G., Mitchell, T. M., Smith, R. G., and Johnson, C. R. (1978). Models of learning systems.
In Encyclopedia of Computer Science and Technology, Vol. 11. Dekker.
Norvig I: Stuart J. Russell & Peter Norvig, Artificial Intelligence: A Modern Approach. Upper Saddle River, NJ 2010.