|
CS 598, Section EA: Decision Making Under Uncertainty. University of Illinois, Urbana-Champaign, Spring 2006. Tentative Course Syllabus |
| Approximate Schedule | (updated 2/15/2006) |
| Session | Date | Topic | Readings | Sample Application Reading |
Assignment |
|---|---|---|---|---|---|
| |
|
Introduction to Decision-Making | Mobile Robotics: inverted helicopter video (by Andrew Ng), retiarius video (by Andrew Ng) |
Signup for paper presentations begins | |
| |
|
Conformant, Nondeterministic Planning |
[Cimatti & Roveri; JAIR 2000], Slides lec.2 (based on slides by Jose Ambite, Paolo Traverso, and Rune Jensen)
|
||
| |
|
Planning using OBDDs |
[Bertoli et al.; IJCAI 2001], Slides lec.3 (using slides by Jose Ambite, Paolo Traverso, and Rune Jensen)
|
||
| |
|
Planning with Sensing |
[Bertoli & Pistore; ICAPS 2004], Slides lec.4 (using some slides by Son Tran, Chitta Baral)
|
||
| |
|
Planning with Sensing |
[Bertoli & Pistore; ICAPS 2004], Slides lec.4 (using some slides by Son Tran, Chitta Baral)
|
||
| |
|
Markov-Decision Problems (MDPs) |
[Littman; Brown U. thesis 1996] ch. 2, Slides lec.7 (using slides by Craig Boutilier)
|
||
| |
|
Decision Making Workshop |
Agenda for the workshop will be published soon
|
||
| |
|
Markov-Decision Problems (MDPs) |
[Littman; Brown U. thesis 1996] ch. 2, Slides lec.5 (using slides by Craig Boutilier)
|
||
| |
|
Markov-Decision Problems (MDPs) |
[Littman; Brown U. thesis 1996] ch. 2, Slides lec.5 (using slides by Craig Boutilier)
|
||
| |
|
Reinforcement Learning |
[Kaelbling & Littman; JAIR '96], Slides lec.8 (using slides by Jeremy Wyatt)
|
||
| |
|
Approximate Value Functions |
[Sutton & Barto '98] ch. 8, Slides lec.9 (using slides by Jeremy Wyatt, Ron Parr, Craig Boutilier, and Eduardo Alonso)
|
||
| |
|
Partially Observable MDPs (POMDPs) |
[Littman; Brown U. thesis 1996] ch. 6-7, Slides lec.10 (using slides by Craig Boutilier)
|
Signup for paper presentations ends | |
| |
|
Partially Observable MDPs (POMDPs) |
[Littman; Brown U. thesis 1996] ch. 6-7, Slides lec.10 (using slides by Craig Boutilier)
|
Proposal 1 due | |
| |
|
Partially Observable MDPs (POMDPs) |
[Littman; Brown U. thesis 1996] ch. 6-7, Slides lec.10 (using slides by Craig Boutilier)
|
| |
| |
|
NO CLASS | |
||
| |
|
POMDP approximation | Peter Young | ||
| |
|
OPEN (POMDPs or factored planning) | |
Extended proposal due |
|
| |
|
Paper presentation: Poker Playing and Crossword Puzzles | Mark Richards | ||
| |
|
Paper presentation: MDPs | Dave Killian | ||
| |
|
Paper presentation: k-armed Bandit | [Cicirello & Smith; AAAI'05] The Max K-Armed Bandit
|
Deepak Ramachandran | |
| |
|
Projects mid-semester review | |
3-5 min. presentations in class | |
| |
|
Paper presentation: Information Gathering | TBA TBA |
Soumi Sinha | |
| |
|
Paper presentation: POMDPs in NLG/NLP |
[Boutilier, Dearden, & Goldszmidt 2000], Slides lec.24 (Dafna Shahaf)
|
Dafna Shahaf | |
| |
|
Paper presentation: Developmental Reinforcement Learning | Mike Connor | ||
| |
|
Paper presentation: Utility elicitation and POMDPs (PL;ST) | TBA TBA |
Eric Bengtson | |
| |
|
Paper presentation: POMDPs | Jason Skowronski | ||
| |
|
Paper presentation: Exploration | Anthony Cozzie | ||
| |
|
Poster session (5pm-7pm) | |
Final project submission |
| Papers for Presentations |
| Number | Topic (PO = Partially Observable; PL = Planning; ST = Stochastic; SM = Semantics; RL = Reinforcement Learning) | Paper/s | Sample Application Reading |
Presenter |
|---|---|---|---|---|
| |
Information Gathering | TBA TBA |
Soumi Sinha | |
| |
Sensing Actions (PO, PL) | TBA TBA |
||
| |
Sensing Actions (PO, PL) | TBA TBA |
||
| |
Triggered Actions | TBA TBA |
||
| |
Formalisms for Planning w/Sensing (PL) | TBA TBA |
||
| |
planning w/sensing (SM) | TBA TBA |
||
| |
POMDPs (RL;ST) | TBA TBA |
||
| |
Exploration (ST) | TBA TBA |
Anthony Cozzie | |
| |
Utility elicitation and POMDPs (PL;ST) | TBA TBA |
Eric Bengtson | |
| |
POMDP models of customers | TBA TBA |
||
| |
POMDP approximation | TBA TBA |
Peter Young | |
| |
POMDP approximation | TBA TBA |
||
| |
POMDP approximation | TBA TBA |
||
| |
POMDP approximation | |||
| |
First-Order MDPs | |||
| |
Factored MDPs | |||
| |
Monitoring POMDPs | |||
| |
MDPs | Dave Killian | ||
| |
MDPs | |||
| |
MDPs (survey) | |||
| |
Reinforcement Learning (survey) | |||
| |
MDPs & POMDPs (survey) | |||
| |
Programmable RL agents | |||
| |
Inverse RL | |||
| |
Function Approximation in MDPs | |||
| |
Reward Shaping in MDPs | |||
| |
Factored MDPs | |||
| |
POMDPs as DBNs | |||
| |
Relational MDPs | |||
| |
Approximating POMDPs | |||
| |
POMDPs | Jason Skowronski | ||
| |
Approximating POMDPs | |||
| |
Approximating POMDPs | |||
| |
Approximating POMDPs | |||
| |
Planning with Nondeterminism and Sensing | |||
| |
Planning with Sensing | |||
| |
Planning with Sensing | |||
| |
Planning with Sensing: unifying view | |||
| |
Risk-sensitive planning | |||
| |
Poker Playing | Mark Richards | ||
| |
Solving Crossword Puzzles | Mark Richards | ||
| |
k-armed Bandit Problems | Deepak Ramachandran | ||
| |
Developmental Reinforcement Learning | Michael Connor |
| Possible Projects |
| Number | Topic | Presenter |
|---|---|---|
| |
Kriegspiel game player | |
| |
First-Order POMDPs | |
| |
Exact Factored MDPs | |
| |
Conformant planning using Logical Filtering | |
| |
Activity detection using filtering (any method) | |
| |
SLAM2.0 improved and applied to a mobile robot | |
| |
Poker Playing | |
| |
Bridge Player | |
| |
First-Order Factored Planning | |
| |
Adventure-Game Exploration using Commonsense knowledge | |
| |
Approximate Deterministic POMDPs via Logical Filtering | |
| |
Probabilistic Resolution in Dynamic Bayesian Networks | |
| |
POMDPs approximated via DBNs | |
| |
Robot control using a POMDP | |
| |
First-Order Reinforcement Learning | |
| |
Reshaping rewards using commonsense knowledge | |
| |
Partially observable LSA-based robot control architecture | |
| |
Reinforcement learning in deterministic domains | |
| |
Controlling a Levitating Robot | |
| |
Robot localization for a basketball game | |
| |
LSA-based control system for a robotic arm | |
| |
Reward shaping for POMDPs |
| Comments to Eyal Amir |