UIUC CS 598, Section EA
Decision Making Under Uncertainty
University of Illinois at Urbana-Champaign
Spring 2006
Tentative Course Syllabus

Approximate Schedule (updated 2/15/2006)

Each entry lists the session number and date, the topic, readings, sample application reading, and any assignment.

Session 1 (Jan 18): Introduction to Decision-Making
  Sample application reading (Mobile Robotics): inverted helicopter video (by Andrew Ng), retiarius video (by Andrew Ng)
  Assignment: Signup for paper presentations begins

Session 2 (Jan 20): Conformant, Nondeterministic Planning
  Readings: [Cimatti & Roveri; JAIR 2000]; Slides lec.2 (based on slides by Jose Ambite, Paolo Traverso, and Rune Jensen)

Session 3 (Jan 25): Planning using OBDDs
  Readings: [Bertoli et al.; IJCAI 2001]; Slides lec.3 (using slides by Jose Ambite, Paolo Traverso, and Rune Jensen)

Session 4 (Jan 27): Planning with Sensing
  Readings: [Bertoli & Pistore; ICAPS 2004]; Slides lec.4 (using some slides by Son Tran and Chitta Baral)

Session 5 (Feb 1): Planning with Sensing (continued)
  Readings: [Bertoli & Pistore; ICAPS 2004]; Slides lec.4 (using some slides by Son Tran and Chitta Baral)

Session 7 (Feb 8): Markov Decision Problems (MDPs)
  Readings: [Littman; Brown U. thesis 1996] ch. 2; Slides lec.7 (using slides by Craig Boutilier)

Session 8 (Feb 10, 9am-12:30pm): Decision Making Workshop
  The workshop agenda will be published soon.

Session 9 (Feb 15): Markov Decision Problems (MDPs)
  Readings: [Littman; Brown U. thesis 1996] ch. 2; Slides lec.5 (using slides by Craig Boutilier)

Session 10 (Feb 17): Markov Decision Problems (MDPs), continued
  Readings: [Littman; Brown U. thesis 1996] ch. 2; Slides lec.5 (using slides by Craig Boutilier)

Session 11 (Feb 22): Reinforcement Learning
  Readings: [Kaelbling, Littman & Moore; JAIR '96]; Slides lec.8 (using slides by Jeremy Wyatt)

Session 12 (Feb 24): Approximate Value Functions
  Readings: [Sutton & Barto '98] ch. 8; Slides lec.9 (using slides by Jeremy Wyatt, Ron Parr, Craig Boutilier, and Eduardo Alonso)

Session 13 (Mar 1): Partially Observable MDPs (POMDPs)
  Readings: [Littman; Brown U. thesis 1996] ch. 6-7; Slides lec.10 (using slides by Craig Boutilier)
  Assignment: Signup for paper presentations ends

Session 14 (Mar 3): Partially Observable MDPs (POMDPs)
  Readings: [Littman; Brown U. thesis 1996] ch. 6-7; Slides lec.10 (using slides by Craig Boutilier)
  Assignment: Proposal 1 due

Session 15 (Mar 8): Partially Observable MDPs (POMDPs)
  Readings: [Littman; Brown U. thesis 1996] ch. 6-7; Slides lec.10 (using slides by Craig Boutilier)

Session 16 (Mar 10): NO CLASS

Session 17 (Mar 15): POMDP Approximation
  Presenter: Peter Young

Session 18 (Mar 17): OPEN (POMDPs or factored planning)
  Assignment: Extended proposal due

Session 19 (Mar 29): Paper presentation: Poker Playing and Crossword Puzzles
  Presenter: Mark Richards

Session 20 (Mar 31): Paper presentation: MDPs
  Presenter: Dave Killian

Session 21 (Apr 5): Paper presentation: k-Armed Bandit
  Readings: [Cicirello & Smith; AAAI'05], The Max K-Armed Bandit
  Presenter: Deepak Ramachandran

Session 22 (Apr 7): Mid-semester project review
  Assignment: 3-5 min. presentations in class

Session 23 (Apr 12): Paper presentation: Information Gathering
  Readings: TBA
  Presenter: Soumi Sinha

Session 24 (Apr 14): Paper presentation: POMDPs in NLG/NLP
  Presenter: Dafna Shahaf

Session 25 (Apr 19): Paper presentation: Developmental Reinforcement Learning
  Presenter: Mike Connor

Session 26 (Apr 21): Paper presentation: Utility Elicitation and POMDPs (PL; ST)
  Readings: TBA
  Presenter: Eric Bengtson

Session 27 (Apr 26): Paper presentation: POMDPs
  Presenter: Jason Skowronski

Session 28 (Apr 28): Paper presentation: Exploration
  Presenter: Anthony Cozzie

Session 29 (May 3): Poster session (5pm-7pm)
  Assignment: Final project submission

Papers for Presentations

 
Topic key: PO = Partially Observable; PL = Planning; ST = Stochastic; SM = Semantics; RL = Reinforcement Learning. Papers and sample application readings marked TBA are to be announced.

1. Information Gathering; paper TBA. Presenter: Soumi Sinha
2. Sensing Actions (PO, PL); paper TBA
3. Sensing Actions (PO, PL); paper TBA
4. Triggered Actions; paper TBA
5. Formalisms for Planning w/Sensing (PL); paper TBA
6. Planning w/Sensing (SM); paper TBA
7. POMDPs (RL; ST); paper TBA
8. Exploration (ST); paper TBA. Presenter: Anthony Cozzie
9. Utility Elicitation and POMDPs (PL; ST); paper TBA. Presenter: Eric Bengtson
10. POMDP models of customers; paper TBA
11. POMDP approximation; paper TBA. Presenter: Peter Young
12. POMDP approximation; paper TBA
13. POMDP approximation; paper TBA
14. POMDP approximation
15. First-Order MDPs
16. Factored MDPs
17. Monitoring POMDPs
18. MDPs. Presenter: Dave Killian
19. MDPs
20. MDPs (survey)
21. Reinforcement Learning (survey)
22. MDPs & POMDPs (survey)
23. Programmable RL agents
24. Inverse RL
25. Function Approximation in MDPs
26. Reward Shaping in MDPs
27. Factored MDPs
28. POMDPs as DBNs
29. Relational MDPs
30. Approximating POMDPs
31. POMDPs. Presenter: Jason Skowronski
32. Approximating POMDPs
33. Approximating POMDPs
34. Approximating POMDPs
35. Planning with Nondeterminism and Sensing
36. Planning with Sensing
37. Planning with Sensing
38. Planning with Sensing: unifying view
39. Risk-sensitive planning
40. Poker Playing. Presenter: Mark Richards
41. Solving Crossword Puzzles. Presenter: Mark Richards
42. k-Armed Bandit Problems. Presenter: Deepak Ramachandran
43. Developmental Reinforcement Learning. Presenter: Michael Connor

Possible Projects

1. Kriegspiel game player
2. First-Order POMDPs
3. Exact Factored MDPs
4. Conformant planning using Logical Filtering
5. Activity detection using filtering (any method)
6. SLAM2.0 improved and applied to a mobile robot
7. Poker Playing
8. Bridge Player
9. First-Order Factored Planning
10. Adventure-Game Exploration using Commonsense Knowledge
11. Approximate Deterministic POMDPs via Logical Filtering
12. Probabilistic Resolution in Dynamic Bayesian Networks
13. POMDPs approximated via DBNs
14. Robot control using a POMDP
15. First-Order Reinforcement Learning
16. Reshaping rewards using commonsense knowledge
17. Partially observable LSA-based robot control architecture
18. Reinforcement learning in deterministic domains
19. Controlling a Levitating Robot
20. Robot localization for a basketball game
21. LSA-based control system for a robotic arm
22. Reward shaping for POMDPs


Send comments to Eyal Amir.