Peter Norvig on Environment - Dictionary of Arguments
Norvig I 401
Environment/planning/real world/representation/artificial intelligence/Norvig/Russell: algorithms for planning (…) extend both the representation language and the way the planner interacts with the environment. >Planning/Norvig, >Agents/Norvig.
New: [we now have] a) actions with duration and b) plans that are organized hierarchically.
Hierarchy: Hierarchy also lends itself to efficient plan construction because the planner can solve a problem at an abstract level before delving into details
1st approach: “plan first, schedule later”: (…) we divide the overall problem into a planning phase in which actions are selected, with some ordering constraints, to meet the goals of the problem, and a later scheduling phase, in which temporal information is added to the plan to ensure that it meets resource and deadline constraints.
Norvig I 404
Critical path: Mathematically speaking, critical-path problems are easy to solve because they are defined as a conjunction of linear inequalities on the start and end times. When we introduce resource constraints, the resulting constraints on start and end times become more complicated.
Norvig I 405
Scheduling: The “cannot overlap” constraint is a disjunction of two linear inequalities, one for each possible ordering. The introduction of disjunctions turns out to make scheduling with resource constraints NP-hard. >NP-Problems.
Non-overlapping: [when we assume non-overlapping] every scheduling problem can be solved by a non-overlapping sequence that avoids all resource conflicts, provided that each action is feasible by itself. If a scheduling problem is proving very difficult, however, it may not be a good idea to solve it this way - it may be better to reconsider the actions and constraints, in case that leads to a much easier scheduling problem. Thus, it makes sense to integrate planning and scheduling by taking into account durations and overlaps during the construction of a partial-order plan.
Heuristics: partial-order planners can detect resource constraint violations in much the same way they detect conflicts with causal links. Heuristics can be devised to estimate the total completion time of a plan. This is currently an active area of research (see below).
Norvig I 406
Real world planning: AI systems will probably have to do what humans appear to do: plan at higher levels of abstraction. A reasonable plan for the Hawaii vacation might be “Go to
San Francisco airport (…)” ((s) which might be in a different direction). (…) planning can occur both before and during the execution of the plan (…).
Solution: hierarchical decomposition: hierarchical task networks (HTN).
Norvig I 407
a high-level plan achieves the goal from a given state if at least one of its implementations achieves the goal from that state. The “at least one” in this definition is crucial - not all implementations need to achieve the goal, because the agent gets
Norvig I 408
to decide which implementation it will execute. Thus, the set of possible implementations in HTN planning - each of which may have a different outcome - is not the same as the set of possible outcomes in nondeterministic planning. It can be shown that the right collection of HLAs can result in the time complexity of blind search dropping from exponential in the solution depth to linear in the solution depth, although devising such a collection of HLAs may be a nontrivial task in itself.
Norvig I 409
Plan library: The key to HTN planning, then, is the construction of a plan library containing known methods for implementing complex, high-level actions. One method of constructing the library is to learn the methods from problem-solving experience. (>Representation/AI research, >Learning/AI research).
Learning/AI: In this way, the agent can become more and more competent over time as new methods are built on top of old methods. One important aspect of this learning process is the ability to generalize the methods that are constructed, eliminating detail that is specific to the problem instance (…).
Norvig I 410
Nondeterministic action: problem: downward refinement is much too conservative for a real world environment. See >Terminology/Norvig for “demonic nondeterminism” and “angelic nondeterminism”.
Norvig I 411
Reachable sets: The key idea is that the agent can choose which element of the reachable set it ends up in when it executes the HLA; thus, an HLA with multiple refinements is more “powerful” than the same HLA (hig level action) with fewer refinements. The notion of reachable sets yields a straightforward algorithm: search among highlevel plans, looking for one whose reachable set intersects the goal; once that happens, the algorithm can commit to that abstract plan, knowing that it works, and focus on refining the plan further.
Norvig I 415
Unknown environment/planning/nondeterministic domains: [problems here are] sensorless planning (also known as conformant planning) for environments with no observations; contingency planning for partially observable and nondeterministic environments; and online planning and replanning for unknown environments.
Norvig I 417
Sensorless planning: In classical planning, where the closed-world assumption is made, we would assume that any fluent not mentioned in a state is false, but in sensorless (and partially observable) planning we have to switch to an open-world assumption in which states contain both positive and negative fluents, and if a fluent does not appear, its value is unknown. Thus, the belief state corresponds exactly to the set of possible worlds that satisfy the formula.
Norvig I 423
Online replanning: The online agent has a choice of how carefully to monitor the environment. We distinguish three levels: a) Action monitoring: before executing an action, the agent verifies that all the preconditions still hold, b) Plan monitoring: before executing an action, the agent verifies that the remaining plan will still succeed, c) Goal monitoring: before executing an action, the agent checks to see if there is a better set of goals it could be trying to achieve.
Norvig I 425
Multi-agent planning: A multibody problem is still a “standard” single-agent problem as long as the relevant sensor information collected by each body can be pooled - either centrally or within each body - to form a common estimate of the world state that then informs the execution of the overall plan; in this case, the multiple bodies act as a single body. When communication constraints make this impossible, we have
Norvig I 426
what is sometimes called a decentralized planning problem: (…) the subplan constructed for each body may need to include explicit communicative actions with other bodies.
Norvig I 429
Convention: A convention is any constraint on the selection of joint plans.
Communication: In the absence of a convention, agents can use communication to achieve common knowledge of a feasible joint plan.
Plan recognition: works when a single action (or short sequence of actions) is enough to determine a joint plan unambiguously. Note that communication can work as well with competitive agents as with cooperative ones.
Norvig I 430
The most difficult multi-agent problems involve both cooperation with members of one’s own team and competition against members of opposing teams, all without centralized control.
Norvig I 431
Time constraints in plans: Planning with time constraints was first dealt with by DEVISER (Vere, 1983(1)). The representation of time in plans was addressed by Allen (1984(2)) and by Dean et al. (1990)(3) in the FORBIN system. NONLIN+ (Tate and Whiter, 1984)(4) and SIPE (Wilkins, 1988(5), 1990(6)) could reason about the allocation of limited resources to various plan steps.
Forward state-space search: The two planners SAPA (Do and Kambhampati, 2001)(7) and T4 (Haslum and Geffner, 2001)(8) both used forward state-space search with sophisticated heuristics to handle actions with durations and resources.
Human heuristics: An alternative is to use very expressive action languages, but guide them by human-written domain-specific heuristics, as is done by ASPEN (Fukunaga et al., 1997)(9), HSTS (Jonsson et al., 2000)(10), and IxTeT (Ghallab and Laruelle, 1994)(11).
Norvig I 432
Hybrid planning-and-scheduling systems: ISIS (Fox et al., 1982(12); Fox, 1990(13)) has been used for job shop scheduling at Westinghouse, GARI (Descotte and Latombe, 1985)(14) planned the machining and construction of mechanical parts, FORBIN was used for factory control, and NONLIN+ was used for naval logistics planning. We chose to present planning and scheduling as two separate problems; (Cushing et al., 2007)(15) show that this can lead to incompleteness on certain problems.
Scheduling: The literature on scheduling is presented in a classic survey article (Lawler et al., 1993)(16), a recent book (Pinedo, 2008)(17), and an edited handbook (Blazewicz et al., 2007)(18).
Abstraction hierarchy: The ABSTRIPS system (Sacerdoti, 1974)(19) introduced the idea of an abstraction hierarchy, whereby planning at higher levels was permitted to ignore lower-level preconditions of actions in order to derive the general structure of a working plan. Austin Tate’s Ph.D. thesis (1975b) and work by Earl Sacerdoti (1977)(20) developed the basic ideas of HTN planning in its modern form. Many practical planners, including O-PLAN and SIPE, are HTN planners. Yang (1990)(21) discusses properties of actions that make HTN planning efficient. Erol, Hendler, and Nau (1994(22), 1996(23)) present a complete hierarchical decomposition planner as well as a range of complexity results for pure HTN planners. Our presentation of HLAs and angelic semantics is due to Marthi et al. (2007(24), 2008(25)). Kambhampati et al. (1998)(26) have proposed an approach in which decompositions are just another form of plan refinement, similar to the refinements for non-hierarchical partial-order planning.
Explanation-based learning: The technique of explanation-based learning (…) has been applied in several systems as a means of generalizing previously computed plans, including SOAR (Laird et al., 1986)(27) and PRODIGY (Carbonell et al., 1989)(28).
Case-based planning: An alternative approach is to store previously computed plans in their original form and then reuse them to solve new, similar problems by analogy to the original problem. This is the approach taken by the field called case-based planning (Carbonell, 1983(29); Alterman, 1988(30); Hammond, 1989(31)). Kambhampati (1994)(32) argues that case-based planning should be analyzed as a form of refinement planning and provides a formal foundation for case-based partial-order planning.
Norvig I 433
Conformant planning: Goldman and Boddy (1996)(33) introduced the term conformant planning, noting that sensorless plans are often effective even if the agent has sensors. The first moderately efficient conformant planner was Smith and Weld’s (1998)(34) Conformant Graphplan or CGP. Ferraris and Giunchiglia (2000)(35) and Rintanen (1999)(36) independently developed SATPLAN-based conformant planners. Bonet and Geffner (2000)(37) describe a conformant planner based on heuristic search in the space of >belief states (…).
Norvig I 434
Reactive planning: In the mid-1980s, pessimism about the slow run times of planning systems led to the proposal of reflex agents called reactive planning systems (Brooks, 1986(38); Agre and Chapman, 1987)(39). PENGI (Agre and Chapman, 1987)(39) could play a (fully observable) video game by using Boolean circuits combined with a “visual” representation of current goals and the agent’s internal state.
Policies: “Universal plans” (Schoppers, 1987(40), 1989(41)) were developed as a lookup table method for reactive planning, but turned out to be a rediscovery of the idea of policies that had long been used in Markov decision processes (…). >Open Universe/AI research).
1. Vere, S. A. (1983). Planning in time: Windows and durations for activities and goals. PAMI, 5, 246-267.
2. Allen, J. F. (1984). Towards a general theory of action and time. AIJ, 23, 123-154.
3. Dean, T., Kanazawa, K., and Shewchuk, J. (1990). Prediction, observation and estimation in planning and control. In 5th IEEE International Symposium on Intelligent Control, Vol. 2, pp. 645-650.
4. Tate, A. and Whiter, A. M. (1984). Planning with multiple resource constraints and an application to a naval planning problem. In Proc. First Conference on AI Applications, pp. 410-416.
5. Wilkins, D. E. (1988). Practical Planning: Extending the AI Planning Paradigm. Morgan Kaufmann.
6. Wilkins, D. E. (1990). Can AI planners solve practical problems? Computational Intelligence, 6(4), 232-246.
7. Do, M. B. and Kambhampati, S. (2003). Planning as constraint satisfaction: solving the planning graph by compiling it into CSP. AIJ, 132(2), 151-182.
8. Haslum, P. and Geffner, H. (2001). Heuristic planning with time and resources. In Proc. IJCAI-01 Workshop on Planning with Resources.
9. Fukunaga, A. S., Rabideau, G., Chien, S., and Yan, D. (1997). ASPEN: A framework for automated planning and scheduling of spacecraft control and operations. In Proc. International Symposium on AI,
Robotics and Automation in Space, pp. 181-187.
10. Jonsson, A., Morris, P., Muscettola, N., Rajan, K., and Smith, B. (2000). Planning in interplanetary space: Theory and practice. In AIPS-00, pp. 177-186.
11. Ghallab, M. and Laruelle, H. (1994). Representation and control in IxTeT, a temporal planner. In AIPS-94, pp. 61-67.
12. Fox, M. S., Allen, B., and Strohm, G. (1982). Job shop scheduling: An investigation in constraint directed reasoning. In AAAI-82, pp. 155-158.
13. Fox, M. S. (1990). Constraint-guided scheduling: A short history of research at CMU. Computers in
Industry, 14(1–3), 79-88
14. Descotte, Y. and Latombe, J.-C. (1985). Making compromises among antagonist constraints in a planner. AIJ, 27, 183–217.
15. Cushing,W., Kambhampati, S.,Mausam, and Weld, D. S. (2007). When is temporal planning really temporal? In IJCAI-07.
16. Lawler, E. L., Lenstra, J. K., Kan, A., and Shmoys, D. B. (1993). Sequencing and scheduling: Algorithms and complexity. In Graves, S. C., Zipkin, P. H., and Kan, A. H. G. R. (Eds.), Logistics of Production and Inventory: Handbooks in Operations Research and Management Science, Volume 4, pp. 445 - 522. North-Holland.
17. Pinedo, M. (2008). Scheduling: Theory, Algorithms, and Systems. Springer Verlag.
18. Blazewicz, J., Ecker, K., Pesch, E., Schmidt, G., and Weglarz, J. (2007). Handbook on Scheduling: Models and Methods for Advanced Planning (International Handbooks on Information Systems). Springer-Verlag New York, Inc.
19. Sacerdoti, E. D. (1974). Planning in a hierarchy of abstraction spaces. AIJ, 5(2), 115–135.
20. Sacerdoti, E. D. (1977). A Structure for Plans and Behavior. Elsevier/North-Holland
21. Yang, Q. (1990). Formalizing planning knowledge for hierarchical planning. Computational Intelligence, 6, 12–24.
22. Erol, K., Hendler, J., and Nau, D. S. (1994). HTN planning: Complexity and expressivity. In AAAI-94,
23. Erol, K., Hendler, J., and Nau, D. S. (1996). Complexity results for HTN planning. AIJ, 18(1), 69–93.
24. Marthi, B., Russell, S. J., and Wolfe, J. (2007). Angelic semantics for high-level actions. In ICAPS-07.
25. Marthi, B., Russell, S. J., and Wolfe, J. (2008). Angelic hierarchical planning: Optimal and online algorithms. In ICAPS-08.
26. Kambhampati, S., Mali, A. D., and Srivastava, B. (1998). Hybrid planning for partially hierarchical domains. In AAAI-98, pp. 882–888.
27. Laird, J., Rosenbloom, P. S., and Newell, A. (1986). Chunking in Soar: The anatomy of a general learning mechanism. Machine Learning, 1, 11–46.
28. Carbonell, J. G., Knoblock, C. A., and Minton, S. (1989). PRODIGY: An integrated architecture for planning and learning. Technical report CMU-CS- 89-189, Computer Science Department, Carnegie-
29. Carbonell, J. G. (1983). Derivational analogy and its role in problem solving. In AAAI-83, pp. 64–69.
30. Alterman, R. (1988). Adaptive planning. Cognitive Science, 12, 393–422.
31. Hammond, K. (1989). Case-Based Planning: Viewing Planning as a Memory Task. Academic Press.
32. Kambhampati, S. (1994). Exploiting causal structure to control retrieval and refitting during plan reuse. Computational Intelligence, 10, 213–244
33. Goldman, R. and Boddy, M. (1996). Expressive planning and explicit knowledge. In AIPS-96, pp. 110–117.
34. Goldman, R. and Boddy, M. (1996). Expressive planning and explicit knowledge. In AIPS-96, pp. 110–117.
35. Smith, D. E. and Weld, D. S. (1998). Conformant Graphplan. In AAAI-98, pp. 889–896.
36. Rintanen, J. (1999). Improvements to the evaluation of quantified Boolean formulae. In IJCAI-99,
37. Bonet, B. and Geffner, H. (2005). An algorithm better than AO∗? In AAAI-05.
38. Brooks, R. A. (1986). A robust layered control system for a mobile robot. IEEE Journal of Robotics and Automation, 2, 14–23.
39. Agre, P. E. and Chapman, D. (1987). Pengi: an implementation of a theory of activity. In IJCAI-87, pp. 268–272.
40. Schoppers, M. J. (1987). Universal plans for reactive robots in unpredictable environments. In IJCAI-
87, pp. 1039–1046.
41. Schoppers, M. J. (1989). In defense of reaction plans as caches. AIMag, 10(4), 51–60._____________Explanation of symbols: Roman numerals indicate the source, arabic numerals indicate the page number. The corresponding books are indicated on the right hand side. ((s)…): Comment by the sender of the contribution. Translations: Dictionary of Arguments The note [Author1]Vs[Author2] or [Author]Vs[term] is an addition from the Dictionary of Arguments. If a German edition is specified, the page numbers refer to this edition.
Stuart J. Russell
Artificial Intelligence: A Modern Approach Upper Saddle River, NJ 2010