Main.HomePage History

December 06, 2010, at 08:19 PM by David Hsu -
Deleted line 53:

Leonidas Guibas, Stanford University\\

December 06, 2010, at 08:19 PM by David Hsu -
Added line 59:

Daniela Rus, Massachusetts Institute of Technology\\

December 06, 2010, at 02:15 PM by David Hsu -
Added lines 8-10:
  • Dec 6, 2010
    For poster presenters, the poster board size is roughly A0 (portrait), 1 meter on the long edge.
November 08, 2010, at 11:10 PM by David Hsu -
Added lines 8-10:
  • Nov 8, 2010
    If you intend to submit a poster, please use this LaTeX style file to format your poster abstract for submission.
October 21, 2010, at 02:00 PM by David Hsu -
Added lines 8-10:
  • Oct 21, 2010
    A preliminary program is now available.
October 06, 2010, at 03:18 PM by David Hsu -
Added lines 8-10:
  • Oct 1, 2010
    The registration site is open.
Changed lines 12-15 from:

All authors were notified of the decisions on the submissions. Please submit the final version of accepted papers by

Oct 1, 2010.
to:

All authors were notified of the decisions on the submissions. Please submit the final version of accepted papers by Oct 1, 2010.

Changed lines 15-16 from:

As a result of many requests, the paper submission deadline will be extended to

Jul 15, 11:59pm (Samoa Time).
to:

As a result of many requests, the paper submission deadline will be extended to Jul 15, 11:59pm (Samoa Time).

September 14, 2010, at 04:17 PM by David Hsu -
Changed line 11 from:
Oct 1, 2010.
to:
Oct 1, 2010.
September 14, 2010, at 04:16 PM by David Hsu -
Changed lines 8-13 from:
  • July 8, 2010\\
to:
  • Sep 13, 2010
    All authors were notified of the decisions on the submissions. Please submit the final version of accepted papers by
Oct 1, 2010.
  • Jul 8, 2010\\
September 13, 2010, at 04:12 PM by David Hsu -
Changed line 72 from:
  • Oct 14, 2010 Final paper due
to:
  • Oct 1, 2010 Final paper due
September 13, 2010, at 03:18 PM by David Hsu -
Deleted line 58:

Vijay Kumar, University of Pennsylvania\\

Added line 61:

Vijay Kumar, University of Pennsylvania\\

September 13, 2010, at 03:18 PM by David Hsu -
Changed line 59 from:

Vijay Kumar', University of Pennsylvania\\

to:

Vijay Kumar, University of Pennsylvania\\

September 13, 2010, at 03:17 PM by David Hsu -
Changed line 59 from:

Joel Burdick, California Institute of Technology\\

to:

Vijay Kumar', University of Pennsylvania\\

September 08, 2010, at 02:41 PM by David Hsu -
Added line 76:

We wish to thank the following organizations for their support of WAFR 2010:

September 08, 2010, at 02:34 PM by David Hsu -
Changed lines 76-78 from:
to:
September 08, 2010, at 02:24 PM by David Hsu -
Added lines 74-78:

Sponsors

July 12, 2010, at 08:35 AM by David Hsu -
Changed line 75 from:

Contact

to:

Contact

July 12, 2010, at 08:32 AM by David Hsu -
Added lines 74-77:

Contact

E-mail: wafr AT comp.nus.edu.sg

July 09, 2010, at 12:34 PM by David Hsu -
Changed line 70 from:
  • Jul 15, 2010, 11:59pm (Samoa Time) Paper submission deadline
to:
  • Jul 15, 2010, 11:59pm (Samoa Time) Paper submission deadline
July 09, 2010, at 12:32 PM by David Hsu -
Changed line 10 from:
Jul 15, 11:59pm (Samoa Time).
to:
Jul 15, 11:59pm (Samoa Time).
July 09, 2010, at 12:27 PM by David Hsu -
Changed line 10 from:
Jul 15, 11:59pm (Samoa Time).
to:
Jul 15, 11:59pm (Samoa Time).
July 08, 2010, at 10:14 PM by David Hsu -
Added lines 8-10:
  • July 8, 2010
    As a result of many requests, the paper submission deadline will be extended to
Jul 15, 11:59pm (Samoa Time).
July 08, 2010, at 10:08 PM by David Hsu -
Changed line 67 from:
  • Jul 9, 2010 Paper submission deadline
to:
  • Jul 15, 2010, 11:59pm (Samoa Time) Paper submission deadline
May 21, 2010, at 09:05 PM by David Hsu -
Added lines 8-17:
  • May 21, 2010
    The organization committee received comments from a number of people regarding the new workshop format and its potential implication on the acceptance rate for contributed papers.
In response to the feedback from the community, we have decided to make some adjustments to the workshop format:
  • Presentations of all contributed papers will have the same amount of time.
  • Time dedicated for discussion may be planned at the end of sessions.
This way, topics for discussion can be focused and tailored to the interest of workshop participants. As stated earlier, one primary objective of WAFR 2010 is to encourage interactions and discussions among the attendees.
We would also like to stress that there is no specific limit on the number of accepted papers. It is anticipated that the number of accepted papers will be comparable to earlier WAFRs. All accepted papers will have the same maximum page length in the published proceedings.
May 21, 2010, at 08:51 PM by David Hsu -
Changed lines 19-22 from:

WAFR 2010 will adopt a format that places strong emphasis on interactions and discussions among participants. The technical program will consist of six invited papers for 1-hour presentations, roughly six 1-hour presentation and twelve ½-hour presentations selected from contributed papers. All papers will be made available on-line prior to the workshop. Each presentation will be followed by ample time for discussion. All accepted papers have the same page limit in the published proceedings. The Program Committee will determine on the length of presentations for the accepted papers. The goal will be to encourage interesting and constructive discussions throughout the workshop.

to:

WAFR 2010 will place strong emphasis on interactions and discussions among participants. Each presentation will be followed by ample time for discussions. There may also be time dedicated for discussions at the end of sessions. To facilitate discussions, all papers will be made available on-line prior to the workshop. The goal will be to encourage interesting and constructive discussions throughout the workshop.

All accepted papers will have the same maximum page length in the published proceedings.

May 19, 2010, at 04:05 PM by David Hsu -
Added lines 9-10:
March 29, 2010, at 01:05 PM by David Hsu -
Changed line 22 from:

Leonida Guibas, Stanford University\\

to:

Leonidas Guibas, Stanford University\\

March 28, 2010, at 03:43 PM by David Hsu -
Changed line 47 from:

Katsu Yamane, Disney Researc & Carnegie Mellon University

to:

Katsu Yamane, Disney Research & Carnegie Mellon University

March 28, 2010, at 03:43 PM by David Hsu -
Changed line 47 from:

Katsu Yamane, Carnegie Mellon University

to:

Katsu Yamane, Disney Researc & Carnegie Mellon University

March 17, 2010, at 09:28 AM by David Hsu -
Changed line 10 from:

Invited speakers are now confirmed.

to:

Invited speakers are now confirmed.

March 17, 2010, at 09:25 AM by David Hsu -
Changed line 24 from:

Jean-Pierre Merlet | INRIA Sophia Antipolis\\

to:

Jean-Pierre Merlet, INRIA Sophia Antipolis\\

March 17, 2010, at 09:24 AM by David Hsu -
Changed line 25 from:

Jose del Millan, EPFL

to:

Jose del Millan, EPFL\\

Changed line 27 from:

Moshe Shoham, Technion - Israel Institute of Technology\\

to:

Moshe Shoham, Technion - Israel Institute of Technology

March 17, 2010, at 09:24 AM by David Hsu -
Changed line 10 from:

The list of invited speakers is confirmed.

to:

Invited speakers are now confirmed.

March 17, 2010, at 09:22 AM by David Hsu -
Changed line 20 from:

Invited Speakers

to:

Invited Speakers

March 17, 2010, at 09:20 AM by David Hsu -
Added lines 9-10:
  • March 17, 2010
    The list of invited speakers is confirmed.
March 17, 2010, at 09:18 AM by David Hsu -
Changed lines 22-23 from:

Jean-Pierre Merlet | INRIA Sophia Antipolis

to:

Jean-Pierre Merlet | INRIA Sophia Antipolis
Jose del Millan, EPFL

Changed line 26 from:

Jose del Millan, EPFL

to:
March 17, 2010, at 09:17 AM by David Hsu -
Changed lines 20-25 from:

Leonida Guibas', Stanford University

to:

Leonida Guibas, Stanford University
Leslie Kaelbling, Massachusetts Institute of Technology
Jean-Pierre Merlet | INRIA Sophia Antipolis Yoshihiko Nakamura, University of Tokyo
Moshe Shoham, Technion - Israel Institute of Technology
Jose del Millan, EPFL

March 17, 2010, at 09:03 AM by David Hsu -
Added line 20:

Leonida Guibas', Stanford University

March 17, 2010, at 09:01 AM by David Hsu -
Changed lines 31-38 from:

Srinivas Akella, University of North Carolina at Charlotte
Joel Burdick, California Institute of Technology
Dan Halperin, Tel Aviv University
Seth Hutchinson, University of Illinois at Urbana-Champaign
Jean-Paul Laumond, LAAS-CNRS
Stephane Redon, INRIA
Daniela Rus, Massachusetts Institute of Technology
Katsu Yamane, Carnegie Mellon University

to:

Srinivas Akella, University of North Carolina at Charlotte
Joel Burdick, California Institute of Technology
Dan Halperin, Tel Aviv University
Seth Hutchinson, University of Illinois at Urbana-Champaign
Jean-Paul Laumond, LAAS-CNRS
Stephane Redon, INRIA
Daniela Rus, Massachusetts Institute of Technology
Katsu Yamane, Carnegie Mellon University

March 17, 2010, at 09:00 AM by David Hsu -
Changed lines 23-26 from:

David Hsu, National University of Singapore
Volkan Isler, University of Minnesota
Jean-Claude Latombe, Stanford University
Ming C. Lin, University of North Carolina, Chapel Hill

to:

David Hsu, National University of Singapore
Volkan Isler, University of Minnesota
Jean-Claude Latombe, Stanford University
Ming C. Lin, University of North Carolina, Chapel Hill

March 17, 2010, at 08:57 AM by David Hsu -
Added lines 16-18:

Invited Speakers

February 15, 2010, at 06:05 PM by David Hsu -
Changed lines 28-33 from:
Srinivas Akella
University of North Carolina at Charlotte
Joel Burdick
California Institute of Technology
Dan Halperin
Tel Aviv University
Seth Hutchinson
University of Illinois at Urbana-Champaign
Jean-Paul Laumond
LAAS-CNRS
Stephane Redon
INRIA
Daniela Rus
Massachusetts Institute of Technology
Katsu Yamane
Carnegie Mellon University
to:

Srinivas Akella, University of North Carolina at Charlotte
Joel Burdick, California Institute of Technology
Dan Halperin, Tel Aviv University
Seth Hutchinson, University of Illinois at Urbana-Champaign
Jean-Paul Laumond, LAAS-CNRS
Stephane Redon, INRIA
Daniela Rus, Massachusetts Institute of Technology
Katsu Yamane, Carnegie Mellon University

February 15, 2010, at 06:00 PM by David Hsu -
Changed line 29 from:
to:
Changed line 33 from:
Daniela Rus
Massachusetts Institute of Technology
Katsu Yamane
Carnegie Mellon University
to:
Daniela Rus
Massachusetts Institute of Technology
Katsu Yamane
Carnegie Mellon University
February 15, 2010, at 05:53 PM by David Hsu -
Added lines 27-33:
Srinivas Akella
University of North Carolina at Charlotte
Joel Burdick
California Institute of Technology
Dan Halperin
Tel Aviv University
Seth Hutchinson
University of Illinois at Urbana-Champaign
Jean-Paul Laumond
LAAS-CNRS
Stephane Redon
INRIA
Daniela Rus
Massachusetts Institute of Technology
Katsu Yamane
Carnegie Mellon University
February 15, 2010, at 04:47 PM by David Hsu -
Changed lines 20-23 from:

David Hsu
Volkan Isler
Jean-Claude Latombe
Ming C. Lin

to:

David Hsu, National University of Singapore
Volkan Isler, University of Minnesota
Jean-Claude Latombe, Stanford University
Ming C. Lin, University of North Carolina, Chapel Hill

February 15, 2010, at 04:43 PM by David Hsu -
Changed line 5 from:

Robot algorithms are a fundamental component of robotic systems. These algorithms process inputs from sensors that provide noisy and partial data, build geometric and physical models of the world, plan high- and low-level actions at different time horizons, and execute these actions on actuators with limited precision. The design and analysis of robot algorithms raise a unique combination of questions in many fields, including control theory, computational geometry and topology, geometrical and physical modeling, reasoning under uncertainty, probabilistic algorithms, game theory, and theoretical computer science.

to:

Robot algorithms are a fundamental component of robotic systems. These algorithms process inputs from sensors that provide noisy and partial data, build geometric and physical models of the world, plan high- and low-level actions at different time horizons, and execute these actions on actuators with limited precision. The design and analysis of robot algorithms raise a unique combination of questions from many fields, including control theory, computational geometry and topology, geometrical and physical modeling, reasoning under uncertainty, probabilistic algorithms, game theory, and theoretical computer science.

February 15, 2010, at 04:41 PM by David Hsu -
Changed line 3 from:

The International Workshop on the Algorithmic Foundations of Robotics (WAFR) is a single-track workshop devoted to recent advances on algorithmic problems in robotics. The workshop proceedings will be published in a hardcover volume in the Springer STAR series, and selected papers will be invited for publication in a special issue of the International Journal of Robotics Research. WAFR 2010 will return to the format of the early WAFRs and place strong emphasis on interactions and discussions among participants.

to:

The International Workshop on the Algorithmic Foundations of Robotics (WAFR) is a single-track workshop devoted to recent advances on algorithmic problems in robotics. The workshop proceedings will be published in a hardcover volume in the Springer STAR series, and selected papers will be invited for publication in a special issue of the International Journal of Robotics Research. WAFR 2010 will return to the format of the early WAFRs and place strong emphasis on interactions and discussions among participants.

February 15, 2010, at 04:36 PM by David Hsu -
Changed lines 30-33 from:
  • Jul 9, 2010 Paper submission deadline.
  • Sep 14, 2010 Notification of accepted papers.
  • Oct 14, 2010 Final paper due.
  • Dec13-15, 2010 Workshop in Singapore.
to:
  • Jul 9, 2010 Paper submission deadline
  • Sep 14, 2010 Notification of accepted papers
  • Oct 14, 2010 Final paper due
  • Dec13-15, 2010 Workshop in Singapore
February 15, 2010, at 04:35 PM by David Hsu -
Changed lines 3-10 from:

Algorithms are a fundamental component of robotic systems: they control or reason about motion and perception in the physical world. They receive input from noisy sensors, consider geometric and physical constraints, and operate on the world through imprecise actuators. The design and analysis of robot algorithms therefore raises a unique combination of questions in control theory, computational and differential geometry, and computer science.

The Workshop on the Algorithmic Foundations of Robotics (WAFR) is a single-track workshop with submitted and invited papers on advances on algorithmic problems in robotics. The workshop proceedings will be published in a hard-cover volume in the Springer STAR series, and selected papers will be invited for publication in a special issue of the International Journal of Robotics Research.

The topics of interest are very broad since the focus of WAFR is on algorithm development and analysis rather than specific problems or applications. Increasingly, robotics algorithms are finding use in areas far beyond the traditional scope of robots. Therefore, while we encourage submissions on "fundamental" topics such as complexity, completeness, and computational geometry, we also welcome papers in applications such as computational biology, virtual environments, sensor networks, manufacturing, and medical robotics. Papers on algorithmic developments in "traditional" areas of robotics, such as motion planning, manipulation, sensing, and mobile robotics, as well as papers in newer areas such as distributed robotics and simultaneous localization and mapping, are also encouraged.

WAFR 2010 will be held in Singapore, December 13-15, 2010.

to:

The International Workshop on the Algorithmic Foundations of Robotics (WAFR) is a single-track workshop devoted to recent advances on algorithmic problems in robotics. The workshop proceedings will be published in a hardcover volume in the Springer STAR series, and selected papers will be invited for publication in a special issue of the International Journal of Robotics Research. WAFR 2010 will return to the format of the early WAFRs and place strong emphasis on interactions and discussions among participants.

Robot algorithms are a fundamental component of robotic systems. These algorithms process inputs from sensors that provide noisy and partial data, build geometric and physical models of the world, plan high- and low-level actions at different time horizons, and execute these actions on actuators with limited precision. The design and analysis of robot algorithms raise a unique combination of questions in many fields, including control theory, computational geometry and topology, geometrical and physical modeling, reasoning under uncertainty, probabilistic algorithms, game theory, and theoretical computer science.

Added lines 8-15:

Topics

The focus of WAFR is on the design and analysis of robot algorithms from both theoretical and practical angles. The topics of interest are very broad. We encourage papers on fundamental algorithmic issues, such as complexity, completeness, machine learning, probabilistic reasoning, and new programming paradigms, to name a few. We also encourage papers on applications of robot algorithms to important or new domains, such as manufacturing, legged locomotion, distributed robotics, human-robot interaction, surgical robots, intelligent prosthetics, and brain-controlled robots. Furthermore, robot algorithms are being applied in domains beyond the traditional scope of robotics, e.g., computational biology, computer animation, sensor networks, etc. Papers on these topics are also welcomed.

Format

WAFR 2010 will adopt a format that places strong emphasis on interactions and discussions among participants. The technical program will consist of six invited papers for 1-hour presentations, roughly six 1-hour presentation and twelve ½-hour presentations selected from contributed papers. All papers will be made available on-line prior to the workshop. Each presentation will be followed by ample time for discussion. All accepted papers have the same page limit in the published proceedings. The Program Committee will determine on the length of presentations for the accepted papers. The goal will be to encourage interesting and constructive discussions throughout the workshop.

February 04, 2010, at 09:30 PM by David Hsu -
Changed lines 26-28 from:
  • Jul 9, 2010 Submission deadline.
to:
  • Jul 9, 2010 Paper submission deadline.
  • Sep 14, 2010 Notification of accepted papers.
  • Oct 14, 2010 Final paper due.
December 27, 2009, at 08:47 PM by David Hsu -
Changed line 14 from:

General Chairs

to:

Workshop Co-Chairs

December 19, 2009, at 09:13 PM by David Hsu -
Changed line 1 from:
to:
December 19, 2009, at 09:12 PM by David Hsu -
Changed line 1 from:
to:
December 19, 2009, at 09:05 PM by David Hsu -
Added lines 1-2:
December 19, 2009, at 08:26 PM by David Hsu -
Added lines 6-7:

WAFR 2010 will be held in Singapore, December 13-15, 2010.

December 19, 2009, at 08:24 PM by David Hsu -
Changed lines 22-24 from:
to:
  • Jul 9, 2010 Submission deadline.
  • Dec13-15, 2010 Workshop in Singapore.
December 19, 2009, at 06:31 PM by David Hsu -
Added lines 11-15:

David Hsu
Volkan Isler
Jean-Claude Latombe
Ming C. Lin

December 17, 2009, at 01:05 PM by David Hsu -
Added lines 1-6:

Algorithms are a fundamental component of robotic systems: they control or reason about motion and perception in the physical world. They receive input from noisy sensors, consider geometric and physical constraints, and operate on the world through imprecise actuators. The design and analysis of robot algorithms therefore raises a unique combination of questions in control theory, computational and differential geometry, and computer science.

The Workshop on the Algorithmic Foundations of Robotics (WAFR) is a single-track workshop with submitted and invited papers on advances on algorithmic problems in robotics. The workshop proceedings will be published in a hard-cover volume in the Springer STAR series, and selected papers will be invited for publication in a special issue of the International Journal of Robotics Research.

The topics of interest are very broad since the focus of WAFR is on algorithm development and analysis rather than specific problems or applications. Increasingly, robotics algorithms are finding use in areas far beyond the traditional scope of robots. Therefore, while we encourage submissions on "fundamental" topics such as complexity, completeness, and computational geometry, we also welcome papers in applications such as computational biology, virtual environments, sensor networks, manufacturing, and medical robotics. Papers on algorithmic developments in "traditional" areas of robotics, such as motion planning, manipulation, sensing, and mobile robotics, as well as papers in newer areas such as distributed robotics and simultaneous localization and mapping, are also encouraged.

December 16, 2009, at 11:03 PM by David Hsu -
Deleted lines 3-6:

Important Dates

Added lines 9-10:

Important Dates

December 16, 2009, at 10:51 PM by David Hsu -
Changed line 8 from:

Co-Chairs

to:

General Chairs

December 16, 2009, at 10:49 PM by David Hsu -
Changed line 1 from:

General Information

to:

News

December 10, 2009, at 11:44 AM by Lim Zhan Wei -
Changed lines 3-14 from:

x

x

x

x

x

x

to:
Changed lines 6-21 from:

x

x

x

x

x

x

x

x

to:
Deleted lines 8-30:

x

x

x

x

x

x

x

x

x

December 10, 2009, at 11:43 AM by Lim Zhan Wei -
Added line 4:
Added line 6:
Added line 8:
Added line 10:
Added line 12:
Added line 18:
Added line 20:
Added line 22:
Added line 24:
Added line 26:
Added line 28:
Added line 30:
Added line 36:
Added line 38:
Added line 40:
Added line 42:
Added line 44:
Added line 46:
Added line 48:
Added line 50:
December 10, 2009, at 11:42 AM by Lim Zhan Wei -
Changed lines 3-10 from:
to:

x x x x x x

Changed lines 12-19 from:
to:

x x x x x x x x

Added lines 23-31:

x x x x x x x x x

December 10, 2009, at 11:41 AM by Lim Zhan Wei -
Added lines 4-10:
Added lines 14-20:
Added lines 22-27:
December 10, 2009, at 11:41 AM by Lim Zhan Wei -
Changed lines 1-10 from:

General Information

Important Dates

Co-Chairs

Program Committee

to:

General Information

Important Dates

Co-Chairs

Program Committee

December 10, 2009, at 11:40 AM by Lim Zhan Wei -
Deleted lines 0-2:

WAFR: Workshop on the Algorithmic Foundations of Robotics

December 10, 2009, at 11:16 AM by Lim Zhan Wei -
Changed line 1 from:

WAFR: Workshop on the Algorithmic Foundations of Robotics

to:

WAFR: Workshop on the Algorithmic Foundations of Robotics

December 10, 2009, at 11:15 AM by Lim Zhan Wei -
Changed lines 4-7 from:

General Information (main page)

Important Dates (main page)

Co-Chairs (main page)

Program Committee (main page)

to:

General Information

Important Dates

Co-Chairs

Program Committee

December 10, 2009, at 11:14 AM by Lim Zhan Wei -
Changed lines 1-14 from:

APPL is a C++ implementation of the SARSOP algorithm [1], using the factored MOMDP representation [2]. It takes as input a POMDP model in the POMDP or POMDPX file format and produces a policy file. It also contains a simple simulator for evaluating the quality of the computed policy. More information can be found at here. For bug reports and suggestions, please email motion@comp.nus.edu.sg

  1. H. Kurniawati, D. Hsu, and W.S. Lee. SARSOP: Efficient point-based POMDP planning by approximating optimally reachable belief spaces. In Proc. Robotics: Science and Systems, 2008.
  2. S.C.W. Ong, S.W. Png, D. Hsu, and W.S. Lee. POMDPs for robotic tasks with mixed observability. In Proc. Robotics: Science and Systems, 2009.
navigation | grasping | target tracking
integrated exploration
to:

WAFR: Workshop on the Algorithmic Foundations of Robotics

General Information (main page)

Important Dates (main page)

Co-Chairs (main page)

Program Committee (main page)

November 29, 2009, at 01:43 PM by David Hsu -
Changed lines 3-4 from:
  1. H. Kurniawati, D. Hsu, and W.S. Lee. SARSOP: Efficient point-based POMDP planning by approximating optimally reachable belief spaces. In Proc. Robotics: Science and Systems, 2008.
  2. S.C.W. Ong, S.W. Png, D. Hsu, and W.S. Lee. POMDPs for robotic tasks with mixed observability. In Proc. Robotics: Science and Systems, 2009.
to:
  1. H. Kurniawati, D. Hsu, and W.S. Lee. SARSOP: Efficient point-based POMDP planning by approximating optimally reachable belief spaces. In Proc. Robotics: Science and Systems, 2008.
  2. S.C.W. Ong, S.W. Png, D. Hsu, and W.S. Lee. POMDPs for robotic tasks with mixed observability. In Proc. Robotics: Science and Systems, 2009.
November 29, 2009, at 11:59 AM by David Hsu -
Changed lines 3-4 from:
  1. H. Kurniawati, D. Hsu, and W.S. Lee. SARSOP: Efficient point-based POMDP planning by approximating optimally reachable belief spaces. In Proc. Robotics: Science and Systems, 2008.
  2. S.C.W. Ong, S.W. Png, D. Hsu, and W.S. Lee. POMDPs for robotic tasks with mixed observability. In Proc. Robotics: Science and Systems, 2009.
to:
  1. H. Kurniawati, D. Hsu, and W.S. Lee. SARSOP: Efficient point-based POMDP planning by approximating optimally reachable belief spaces. In Proc. Robotics: Science and Systems, 2008.
  2. S.C.W. Ong, S.W. Png, D. Hsu, and W.S. Lee. POMDPs for robotic tasks with mixed observability. In Proc. Robotics: Science and Systems, 2009.
November 25, 2009, at 05:37 PM by 219.74.241.45 -
Changed lines 5-18 from:

Download

README (release 0.3, 15-Feb-2009)
source code (Linux, Mac OS X)
source code (Windows)

navigation

grasping

Target tracking



Integrated exploration


to:
navigation | grasping | target tracking
integrated exploration
November 25, 2009, at 04:58 PM by 219.74.241.45 -
Changed lines 1-4 from:

Approximate POMDP Planning (APPL) Toolkit

APPL is a C++ implementation of the SARSOP algorithm [1], using the factored MOMDP representation [2]. It takes as input a POMDP model in the POMDP or POMDPX file format and produces a policy file. It also contains a simple simulator for evaluating the quality of the computed policy. More information can be found at here

to:

APPL is a C++ implementation of the SARSOP algorithm [1], using the factored MOMDP representation [2]. It takes as input a POMDP model in the POMDP or POMDPX file format and produces a policy file. It also contains a simple simulator for evaluating the quality of the computed policy. More information can be found at here. For bug reports and suggestions, please email motion@comp.nus.edu.sg

  1. H. Kurniawati, D. Hsu, and W.S. Lee. SARSOP: Efficient point-based POMDP planning by approximating optimally reachable belief spaces. In Proc. Robotics: Science and Systems, 2008.
  2. S.C.W. Ong, S.W. Png, D. Hsu, and W.S. Lee. POMDPs for robotic tasks with mixed observability. In Proc. Robotics: Science and Systems, 2009.
Changed lines 20-26 from:

For bug reports and suggestions, please email motion@comp.nus.edu.sg

Reference

[1] H. Kurniawati, D. Hsu, and W.S. Lee. SARSOP: Efficient point-based POMDP planning by approximating optimally reachable belief spaces. In Proc. Robotics: Science and Systems, 2008.

[2] S.C.W. Ong, S.W. Png, D. Hsu, and W.S. Lee. POMDPs for robotic tasks with mixed observability. In Proc. Robotics: Science and Systems, 2009.

to:
November 24, 2009, at 03:27 PM by 172.18.179.104 -
Deleted lines 1-3:
November 23, 2009, at 02:56 PM by 172.18.178.85 -
Added lines 2-4:
November 20, 2009, at 03:28 PM by 172.18.179.173 -
Deleted line 4:
November 20, 2009, at 03:28 PM by 172.18.179.173 -
Added lines 3-5:

APPL is a C++ implementation of the SARSOP algorithm [1], using the factored MOMDP representation [2]. It takes as input a POMDP model in the POMDP or POMDPX file format and produces a policy file. It also contains a simple simulator for evaluating the quality of the computed policy. More information can be found at here

Deleted line 11:

APPL is a C++ implementation of the SARSOP algorithm [1], using the factored MOMDP representation [2]. It takes as input a POMDP model in the POMDP or POMDPX file format and produces a policy file. It also contains a simple simulator for evaluating the quality of the computed policy. More information can be found at here

November 20, 2009, at 03:28 PM by 172.18.179.173 -
November 20, 2009, at 03:28 PM by 172.18.179.173 -
Added line 3:

Download

November 20, 2009, at 03:27 PM by 172.18.179.173 -
Changed lines 4-8 from:

README (release 0.3, 15-Feb-2009)

source code (Linux, Mac OS X) source code (Windows)

to:
README (release 0.3, 15-Feb-2009)
source code (Linux, Mac OS X)
source code (Windows)
November 20, 2009, at 03:27 PM by 172.18.179.173 -
Changed lines 4-6 from:

README? (release 0.3, 15-Feb-2009)

to:

README (release 0.3, 15-Feb-2009)

November 20, 2009, at 03:27 PM by 172.18.179.173 -
Changed line 4 from:

Attach:README.txt | README (release 0.3, 15-Feb-2009)

to:

README? (release 0.3, 15-Feb-2009)

November 20, 2009, at 03:26 PM by 172.18.179.173 -
Changed line 4 from:

Attach:README.txt|README (release 0.3, 15-Feb-2009)

to:

Attach:README.txt | README (release 0.3, 15-Feb-2009)

November 20, 2009, at 03:25 PM by 172.18.179.173 -
Changed line 4 from:

README (release 0.3, 15-Feb-2009)

to:

Attach:README.txt|README (release 0.3, 15-Feb-2009)

November 20, 2009, at 03:24 PM by 172.18.179.173 -
Changed lines 4-5 from:

README (release 0.3, 15-Feb-2009) source code (Linux, Mac OS X) source code (Windows)

to:

README (release 0.3, 15-Feb-2009) source code (Linux, Mac OS X) source code (Windows)

Changed line 17 from:

For bug reports and suggestions, please email *motion@comp.nus.edu.sg

to:

For bug reports and suggestions, please email motion@comp.nus.edu.sg

November 20, 2009, at 03:23 PM by 172.18.179.173 -
Changed line 15 from:

For bug reports and suggestions, please email

to:

For bug reports and suggestions, please email *motion@comp.nus.edu.sg

November 20, 2009, at 03:22 PM by 172.18.179.173 -
Changed line 15 from:

For bug reports and suggestions, please email <motion@comp.nus.edu.sg>.

to:

For bug reports and suggestions, please email

November 20, 2009, at 03:21 PM by 172.18.179.173 -
Changed line 10 from:

Target tracking
to:

Target tracking
November 20, 2009, at 03:21 PM by 172.18.179.173 -
Changed line 10 from:

Target tracking
to:

Target tracking
Changed lines 12-13 from:

Integrated exploration
to:

Integrated exploration


November 20, 2009, at 03:20 PM by 172.18.179.173 -
Changed lines 11-12 from:
to:


November 20, 2009, at 03:20 PM by 172.18.179.173 -
November 20, 2009, at 03:20 PM by 172.18.179.173 -
Changed lines 12-14 from:

Integrated exploration
to:

Integrated exploration
November 20, 2009, at 03:19 PM by 172.18.179.173 -
Changed lines 8-10 from:

navigation
| grasping

Target tracking

Integrated exploration
to:

navigation

grasping

Target tracking

Integrated exploration
November 20, 2009, at 03:18 PM by 172.18.179.173 -
Changed line 8 from:

navigation | grasping
to:

navigation
| grasping
November 20, 2009, at 03:18 PM by 172.18.179.173 -
Changed line 8 from:

navigation | grasping
to:

navigation | grasping
November 20, 2009, at 03:17 PM by 172.18.179.173 -
Changed lines 8-9 from:

navigation

grasping
to:

navigation | grasping
November 20, 2009, at 03:16 PM by 172.18.179.173 -
Changed lines 8-11 from:

navigation

grasping

Target tracking

Integrated exploration
to:

navigation

grasping

Target tracking

Integrated exploration
November 20, 2009, at 03:16 PM by 172.18.179.173 -
Changed lines 8-12 from:

navigation

grasping

Target tracking

Integrated exploration
to:

navigation

grasping

Target tracking

Integrated exploration
Added lines 14-15:

Reference

November 20, 2009, at 03:13 PM by 172.18.179.173 -
Changed lines 8-11 from:

[navigation] | grasping | Target tracking | Integrated exploration
to:

navigation

grasping

Target tracking

Integrated exploration
November 20, 2009, at 03:13 PM by 172.18.179.173 -
Changed line 8 from:

[ navigation] | grasping | Target tracking | Integrated exploration
to:

[navigation] | grasping | Target tracking | Integrated exploration
November 20, 2009, at 03:12 PM by 172.18.179.173 -
Changed line 8 from:

navigation | grasping | Target tracking | Integrated exploration
to:

[ navigation] | grasping | Target tracking | Integrated exploration
November 20, 2009, at 03:11 PM by 172.18.179.173 -
Changed lines 8-11 from:

navigation

grasping

Integrated exploration

Target tracking
to:

navigation | grasping | Target tracking | Integrated exploration
November 20, 2009, at 03:10 PM by 172.18.179.173 -
Changed lines 1-3 from:

Approximate POMDP Planning (APPL) Toolkit

to:

Approximate POMDP Planning (APPL) Toolkit

Changed lines 8-11 from:
to:

navigation

grasping

Integrated exploration

Target tracking
November 20, 2009, at 03:02 PM by 172.18.179.173 -
Changed line 6 from:

APPL is a C++ implementation of the SARSOP algorithm [1], using the factored MOMDP representation [2]. It takes as input a POMDP model in the POMDP or POMDPX file format and produces a policy file. It also contains a simple simulator for evaluating the quality of the computed policy. More information can be found at http://motion.comp.nus.edu.sg/projects/pomdp/pomdp.html.

to:

APPL is a C++ implementation of the SARSOP algorithm [1], using the factored MOMDP representation [2]. It takes as input a POMDP model in the POMDP or POMDPX file format and produces a policy file. It also contains a simple simulator for evaluating the quality of the computed policy. More information can be found here.

November 20, 2009, at 03:01 PM by 172.18.179.173 -
Changed lines 6-10 from:

APPL is a C++ implementation of the SARSOP algorithm [1]. It takes in a POMDP model file in Tony Cassandra's POMDP file format and produces a policy file. It also contains a simple simulator for evaluating the quality of the computed policy. APPL has been tested on a number of large POMDPs with up to 15,000 states. More information is available here.

H. Kurniawati, D. Hsu, and W.S. Lee. SARSOP: Efficient point-based POMDP planning by approximating optimally reachable belief spaces. In Proc. Robotics: Science and Systems, 2008.

to:

APPL is a C++ implementation of the SARSOP algorithm [1], using the factored MOMDP representation [2]. It takes as input a POMDP model in the POMDP or POMDPX file format and produces a policy file. It also contains a simple simulator for evaluating the quality of the computed policy. More information can be found at http://motion.comp.nus.edu.sg/projects/pomdp/pomdp.html.

For bug reports and suggestions, please email <motion@comp.nus.edu.sg>.

[1] H. Kurniawati, D. Hsu, and W.S. Lee. SARSOP: Efficient point-based POMDP planning by approximating optimally reachable belief spaces. In Proc. Robotics: Science and Systems, 2008.

[2] S.C.W. Ong, S.W. Png, D. Hsu, and W.S. Lee. POMDPs for robotic tasks with mixed observability. In Proc. Robotics: Science and Systems, 2009.

November 20, 2009, at 02:53 PM by 172.18.179.173 -
Changed lines 1-768 from:


Overview

PomdpX is an XML file format for describing POMDPs (partially observable Markov decision processes), MOMDPs (mixed observability Markov decision processes) [1], and MDPs (Markov decision processes) in a factored representation. It allows multiple state, observation, action, and reward variables to be specified in a model. The specified model must have at least one state, action, and reward variable, while observation variables are optional. Each state variable must be specified as either partially observed (the default) or fully observed. Thus a PomdpX input document can specify a POMDP (all state variables partially observed), a MOMDP (a mixture of partially observed and fully observed state variables), or an MDP (all state variables fully observed, no observation variables). In general, the model can be represented by the dynamic Bayesian network (DBN) in Figure 1.1. Each of x_t, y_t, o_t, and a_t represents possibly multiple variables: x_t represents the fully observed state variables and y_t the partially observed state variables. (The reward variables are omitted to prevent clutter.)


Figure 1.1 – The general model specified in a PomdpX document. Each of x_t, y_t, o_t, and a_t represents multiple variables. The state (s_t) is represented as multiple fully observed (x_t) and partially observed (y_t) state variables.

PomdpX Tutorial

This section provides a tutorial-style introduction to the PomdpX format. We make no assumptions about the user’s familiarity with existing POMDP solvers.

Example Problem

As our running example, we encode a modified version of the RockSample problem, first proposed by Smith and Simmons [2], into the PomdpX format. It models a rover on an exploration mission that can earn rewards by sampling rocks in its immediate area. Consider a map of size 1 × 3 as shown in Figure 2.1, with one rock at the left end and the terminal state at the right end. The rover starts off at the center and its possible actions are A = {West, East, Sample, Check}. The DBN for the RockSample problem is shown in Figure 2.2.


Figure 2.1 – The 1 × 3 RockSample problem world.

This is a trivial problem but is adequate to showcase the salient features of PomdpX. As with the original version of the problem, the Sample action samples the rock at the rover’s current location. If the rock is good, the rover receives a reward of 10 and the rock becomes bad. If the rock is bad, it receives a penalty of −10. Moving into the terminal area yields a reward of 10. A penalty of −100 is imposed for moving off the grid or for sampling in a grid where there is no rock. All other moves have no cost or reward. The Check action returns a noisy observation from O = {Good, Bad}.


Figure 2.2 – Dynamic Bayesian network of the RockSample problem. The rover’s position is fully observed whereas the rock type is partially observed.

Example 1. A PomdpX document.


<?xml version="1.0" encoding="ISO-8859-1"?>
<pomdpx version="0.1" id="rockSample"
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xsi:noNamespaceSchemaLocation="pomdpx.xsd">
      <Description>             · · · </Description>
      <Discount>                · · · </Discount>
      <Variable>                · · · </Variable>
      <InitialStateBelief>      · · · </InitialStateBelief>
      <StateTransitionFunction> · · · </StateTransitionFunction>
      <ObsFunction>             · · · </ObsFunction>
      <RewardFunction>          · · · </RewardFunction>
</pomdpx>

File Format Structure

A PomdpX document consists of a header and a pomdpx root element which in turn contains child elements, as shown in Example 1 above. The first line of the document is an XML declaration stating that the document adheres to the XML 1.0 standard and that the encoding of the document is ISO-8859-1. Other encodings such as UTF-8 are also possible.

<pomdpx> Tag

Continuing with the example above, the second line contains the root-element of a PomdpX document—the pomdpx element—which has the following attributes:

  • version
  • id – optional name for the specified model.
  • xmlns:xsi – defines xsi as the XML Schema namespace.
  • xsi:noNamespaceSchemaLocation – this is where we put our XML Schema definition, pomdpx.xsd. The PomdpX input should be validated with this schema to ensure that it conforms to the format.

The conventional ordering of the child elements is Description, Discount, Variable and thereafter InitialStateBelief, StateTransitionFunction, ObsFunction and RewardFunction. However, this ordering is not strictly required and the elements may be permuted. Description is an optional, short description of the specified model. The other child elements specify the POMDP tuple (S, A, O, T, Z, R, γ) and the initial belief b_0.

In general these elements should all be present, and each can appear only once. ObsFunction may be omitted if there are no observation variables in the model. Similarly, InitialStateBelief may be omitted if all state variables are fully observed (for example, in an MDP model). The child elements of pomdpx are described in greater detail in the following subsections.

<Description> Tag

This is an optional tag that one may provide to give a brief description of the specified problem. For example:

Example 2. Contents of Description.


 <Description> RockSample problem for map size 1 x 3.
 Rock is at 0, Rover’s initial position is at 1.
 Exit is at 2.
 </Description>

<Discount> Tag

This specifies the discount factor γ, which has to be a real-valued number. For our RockSample problem, we use a discount factor of 0.95, entered as shown:

Example 3. Contents of Discount.


 <Discount> 0.95 </Discount>

<Variable> Tag

The state, action and observation variables which factorize the state space S, action space A, and observation space O are declared within the Variable element. Reward variables R are also declared here. Example 4 gives the declaration of the variables for the RockSample problem.

Each state variable is declared with the <StateVar> tag. It contains the following attributes:

  • vnamePrev – identifier for the variable’s start state.
  • vnameCurr – identifier for the variable’s end state.
  • fullyObs – set to true if the variable is fully observed. The default is false. Thus for the variable rock in Example 4, it is partially observed, as implied by the omission of the fullyObs attribute.

Example 4. Variable declaration. Defining S, A, O, and R variables.


<Variable>
     <StateVar vnamePrev="rover_0" vnameCurr="rover_1"
      fullyObs="true">
         <NumValues>3</NumValues>
     </StateVar>
     <StateVar vnamePrev="rock_0" vnameCurr="rock_1">
         <ValueEnum>good bad</ValueEnum>
     </StateVar>
     <ObsVar vname="obs_sensor">
         <ValueEnum>ogood obad</ValueEnum>
     </ObsVar>
     <ActionVar vname="action_rover">
         <ValueEnum>amw ame ac as</ValueEnum>
     </ActionVar>
     <RewardVar vname="reward_rover" />
</Variable>

The possible values that a variable can assume are specified with either the <NumValues> or the <ValueEnum> tag. In the former, we give an integer indicating the number of values/states of the variable. For instance, in the example, the rover is declared with three possible values. The values are subsequently referenced internally using numerals, starting from 0 and prepended with ‘s’. Hence the states of the rover variable are s0, s1 and s2. When using <NumValues>, it is up to the user to attach semantic meaning to the values; in our example, s0 denotes the left grid, s1 the center and s2 the right terminal grid.

In the latter, the user must manually enumerate all the possible values/states the variable may take on. In our example, the rock has two possible values: it is either good or bad.

The observation and action variables are also declared similarly with the <ObsVar> and <ActionVar> tags respectively. Both require the attribute vname which serves as the identifier for the variable. The possible values that an observation or action can assume can also be specified with either <NumValues> or <ValueEnum>. If <NumValues> is used, ‘o’ and ‘a’ would be prepended to the values of observation and action variables respectively.
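The naming rule for <NumValues> can be sketched in a few lines of Python. This is only an illustration of the convention described above; the function name and the dictionary are hypothetical and not part of any PomdpX tool.

```python
# Prefixes taken from the text: 's' for state, 'o' for observation,
# 'a' for action variables. The helper itself is a hypothetical illustration.
PREFIX = {"StateVar": "s", "ObsVar": "o", "ActionVar": "a"}


def num_values_names(tag, count):
    """Return the implicit value names for a variable declared with <NumValues>."""
    return [PREFIX[tag] + str(i) for i in range(count)]


# The rover in Example 4 is declared with <NumValues>3</NumValues>.
rover_values = num_values_names("StateVar", 3)
print(rover_values)
```

For the rover this yields the names s0, s1 and s2 used in the later examples.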

In the case of <ValueEnum>, the user will once again need to enumerate all possible values/states manually. In our example, for the action_rover variable, we enumerate all the four possible actions. ‘amw’ is a mnemonic for action move west and ‘ac’ stands for action check and so on.

Finally, reward variables are declared with the <RewardVar> tags which must contain the vname attribute. The vname serves as an identifier for the reward variable. The <RewardVar> is an empty XML tag and no values are specified. Note that we may use the XML shorthand of <RewardVar vname="· · · " /> to close an empty tag here.

<InitialStateBelief> Tag

This is an optional tag. It specifies the initial belief b_0 and may be omitted if all state variables are fully observed. The PomdpX format allows the initial belief to be specified as multiple multiplicative factors, with each <CondProb> tag specifying one of these factors. In our running RockSample problem, since the initial belief is not conditional on anything, it is factored as b_0 = P(rover_0 | ∅) P(rock_0 | ∅). We need two <CondProb> tags to specify it fully, as shown below.

Example 5. Contents of InitialStateBelief.


  <InitialStateBelief>
      <CondProb>
         <Var>rover_0</Var>
         <Parent>null</Parent>
         <Parameter> · · · </Parameter>
      </CondProb>
      <CondProb>
         <Var>rock_0</Var>
         <Parent>null</Parent>
         <Parameter> · · · </Parameter>
      </CondProb>
   </InitialStateBelief>

The <CondProb> tag has no attributes and requires the following three child tags:

  • <Var> – identifies the factor being specified. Only identifiers declared as vnamePrev of state variables are allowed here (see Section 2.2.4).
  • <Parent> – the set of conditioning variables. Identifiers declared as vnamePrev or vnameCurr of state variables are allowed here, but only in certain combinations. Referring to Figure 1.1, we only allow conditioning arrows from x_t (fully observed variables) to y_t (partially observed variables) and not the other way round. Specifically, a vnameCurr identifier is allowed as a parent only if the variable is fully observed. In addition, the keyword null may be used to signify the absence of any conditioning variables.
  • <Parameter> – specifies the actual probabilities in the factor and is described in detail in Section 2.3.

The previous example is somewhat cumbersome to declare if we have too many state variables. We could have alternatively specified b0 as simply the joint belief of all state variables, P (rover_0, rock_0), with a single <CondProb> tag as shown in Example 6.

Example 6. Initial joint belief specification.


<InitialStateBelief>
     <CondProb>
         <Var>rover_0 rock_0</Var>
         <Parent>null</Parent>
         <Parameter> · · · </Parameter>
     </CondProb>
</InitialStateBelief>

<StateTransitionFunction> Tag

This specifies the transition function T, which in general is the product of the individual transition functions of the state variables in the model. Each <CondProb> tag specifies the transition function of one state variable. For our RockSample problem, with reference to Figure 2.2, the overall transition function is: P(rover_1, rock_1 | action_rover, rover_0, rock_0) = P(rover_1 | action_rover, rover_0) × P(rock_1 | action_rover, rover_0, rock_0).

This is translated to the following in PomdpX. One can see that it is very similar to its equational counterpart, only it has XML tags wrapped around it. We need to provide two CondProb elements, one each for the variable rover and rock.

Example 7. Contents of StateTransitionFunction.


 <StateTransitionFunction>
      <CondProb>
           <Var>rover_1</Var>
           <Parent>action_rover rover_0</Parent>
           <Parameter> · · · </Parameter>
      </CondProb>
      <CondProb>
           <Var>rock_1</Var>
           <Parent>action_rover rover_0 rock_0</Parent>
           <Parameter> · · · </Parameter>
      </CondProb>
 </StateTransitionFunction>

As described in 2.2.5, the <Var> tag identifies the state variable whose transition function is being specified. In this case, only identifiers declared as the vnameCurr attribute of state variables are allowed here.

The identifiers within the <Parent> tag identify the conditioning variables in the transition function. They may be identifiers which had been declared as either the vnamePrev or vnameCurr attributes of state variables, or identifiers which had been declared as the vname attribute of action variables (see Section 2.2.4). Once again, we point out the caveat that PomdpX only allows certain combinations of vnamePrev and vnameCurr. One may only use vnameCurr identifiers within the <Parent> tag if the variable is fully observed. We defer the description of <Parameter> tag to Section 2.3 as it is fairly involved.

<ObsFunction> Tag

This specifies the observation function Z, which in general is the multiplicative result of the individual observation functions of each observation variable in the model. Each <CondProb> tag specifies one of these individual observation functions. In the RockSample problem, the probability of an observation is conditional on taking an action and ending in a new state. Thus its parents are action_rover, rover_1 and rock_1, as given in Example 8.

Example 8. Contents of ObsFunction.


 <ObsFunction>
       <CondProb>
            <Var>obs_sensor</Var>
            <Parent>action_rover rover_1 rock_1</Parent>
            <Parameter> · · · </Parameter>
       </CondProb>
 </ObsFunction>

For each CondProb element, the identifier within the <Var> tags identifies the observation variable whose observation function is being specified. The identifiers within the <Parent> tags identify the conditioning variables in the observation function. Identifiers that appear within the <Var> tags must have been declared as the vname attribute of observation variables. Identifiers that appear within the <Parent> tags must have been declared as the vnameCurr attribute of state variables, or the vname attribute of action variables (see Section 2.2.4). Parameter specifies the actual probabilities in the function and will be described in Section 2.3.

<RewardFunction> Tag

This specifies the reward function R, which in general is the additive result of the individual reward functions of each reward variable in the model. Each <Func> tag specifies one of these individual reward functions. For our RockSample problem, the reward depends on the action taken at the current state, thus its parents are action_rover, rover_0 and rock_0. This is shown in Example 9.

Example 9. Contents of RewardFunction.


 <RewardFunction>
       <Func>
            <Var>reward_rover</Var>
            <Parent>action_rover rover_0 rock_0</Parent>
            <Parameter> · · · </Parameter>
       </Func>
 </RewardFunction>

Similar to the <CondProb> tag, the <Func> tag has no attributes and requires the following three child tags to be defined:

  • <Var> – this identifies the reward variable whose reward function is being specified. Only identifiers that had been declared as the vname attribute of reward variables may appear here.

  • <Parent> – this identifies the domain of the reward function. All identifiers declared as vnamePrev or vnameCurr attributes of state variables, vname attribute of action variables or vname attribute of observation variables are allowed here.
  • <Parameter> – specifies the actual values in the function and is described in detail in Section 2.3.

<Parameter> Tag

The <Parameter> tag is a fairly complicated component of PomdpX, introducing several new keywords and symbols, thus it warrants an individual section in itself. It has an optional attribute called type, which has possible values TBL (default) and DD, short for table and decision diagram, respectively. We will describe how to encode the RockSample problem both in TBL and DD.

Table Type (TBL)

When the <Parameter> tag appears as a child of a CondProb element, it must contain <Entry> child tags. Each Entry element specifies the probability entry of a function table. The <Entry> tag itself must consist of the following:

  • <Instance> – declares all the variables for the probability function. Each variable value must correspond to the identifiers that appear between the enclosing <Parent> tag, followed by the identifier that appears between the enclosing <Var> tag.
  • <ProbTable> – specifies the actual numerical values of the probabilities.

This is best illustrated by Example 10 below. With reference to Figure 2.2, we show the full encoding of the rock’s transition function for the rover’s action of moving West. From the example, the <Var> tag declares that we are defining the transition function for the variable rock (line 3). It is conditional on action_rover, rover_0 and rock_0, which appear between the <Parent> tags (line 4). The first <Entry> set (lines 6–9) specifies:

P(rock_1 = good | action_rover = amw, rover_0 = s0, rock_0 = good) = 1.0.

In this case, when action_rover is amw, and rock_0 is good, rock_1 will be good as well, since a move action will not disturb its state. Conversely, if action_rover is amw, and rock_0 is good it is impossible for rock_1 to be bad as specified by lines 18–29.

Note that order matters here, and it can be the source of subtle bugs if overlooked. As mentioned before, the conditioning variables declared between the <Instance> tags (the first three elements in line 7) follow the order in which they appear in the enclosing <Parent> tag; the last element corresponds to the variable being defined. One may arbitrarily re-order the conditioning variables as long as they match up within the <Parent> and <Instance> tags and the last element is always the identifier defined by <Var>. The convention that we adopt is to declare actions first, then fully observed variables, followed by partially observed variables.
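The positional rule can be made explicit with a small Python sketch that pairs each <Instance> token with the identifier it refers to. The helper name is hypothetical; it only restates the ordering rule above and is not taken from any PomdpX parser.

```python
def label_instance(parent_text, var_text, instance_text):
    """Pair each <Instance> token with its identifier.

    Tokens follow the order of the <Parent> identifiers, and the
    <Var> identifier comes last. (Hypothetical helper for illustration.)
    """
    names = parent_text.split() + var_text.split()
    tokens = instance_text.split()
    if len(names) != len(tokens):
        raise ValueError("expected one token per Parent identifier plus Var")
    return dict(zip(names, tokens))


# The first entry of the rock's transition function for moving West.
labelled = label_instance(
    "action_rover rover_0 rock_0", "rock_1", "amw s0 good good"
)
print(labelled)
```

Swapping two identifiers in <Parent> without swapping the corresponding tokens in <Instance> would silently assign values to the wrong variables, which is the kind of bug the note above warns about.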

Example 10. Contents of Parameter type="TBL", within CondProb.


1. <StateTransitionFunction>
2.       <CondProb>
3.           <Var>rock_1</Var>
4.           <Parent>action_rover rover_0 rock_0</Parent>
5.           <Parameter type = "TBL">
6.               <Entry>
7.                    <Instance>amw s0 good good</Instance>
8.                    <ProbTable>1.0</ProbTable>
9.               </Entry>
10.              <Entry>
11.                   <Instance>amw s1 good good</Instance>
12.                   <ProbTable>1.0</ProbTable>
13.              </Entry>
14.              <Entry>
15.                   <Instance>amw s2 good good</Instance>
16.                   <ProbTable>1.0</ProbTable>
17.              </Entry>
18.              <Entry>
19.                   <Instance>amw s0 good bad</Instance>
20.                   <ProbTable>0.0</ProbTable>
21.              </Entry>
22.              <Entry>
23.                   <Instance>amw s1 good bad</Instance>
24.                   <ProbTable>0.0</ProbTable>
25.              </Entry>
26.              <Entry>
27.                   <Instance>amw s2 good bad</Instance>
28.                   <ProbTable>0.0</ProbTable>
29.              </Entry>
30.              <Entry>
31.                   <Instance>amw s0 bad good</Instance>
32.                   <ProbTable>0.0</ProbTable>
33.              </Entry>
34.              <Entry>
35.                   <Instance>amw s1 bad good</Instance>
36.                   <ProbTable>0.0</ProbTable>
37.              </Entry>
38.              <Entry>
39.                   <Instance>amw s2 bad good</Instance>
40.                   <ProbTable>0.0</ProbTable>
41.              </Entry>
42.              <Entry>
43.                   <Instance>amw s0 bad bad</Instance>
44.                   <ProbTable>1.0</ProbTable>
45.              </Entry>
46.              <Entry>
47.                   <Instance>amw s1 bad bad</Instance>
48.                   <ProbTable>1.0</ProbTable>
49.              </Entry>
50.              <Entry>
51.                   <Instance>amw s2 bad bad</Instance>
52.                   <ProbTable>1.0</ProbTable>
53.              </Entry>
54.          </Parameter>
55.      </CondProb>
56. </StateTransitionFunction>

It seems a bit daunting that it takes 56 lines just to declare the transition function for the rock for a simple 1 × 3 grid. And this only for the rover’s action of moving West. But XML is verbose by nature and that is the price to pay for interoperability and extensibility. However, PomdpX does provide several convenience features to ease the encoding task.

First and foremost, lines 18–41 are actually redundant, since any entry not specified is assumed to be zero. Secondly, we observe that the first three <Entry> sets (lines 6–17) are very similar: they differ only in the state of rover_0, and s0 to s2 are all the possible states of the rover. In such a situation, we may use the wildcard character “*”, which means that the entry holds for all possible values that could appear in that position. Therefore, lines 6–17 can be replaced by just one <Entry> tag, and the same is true for lines 42–53. Example 10 is rewritten more succinctly as Example 11.

Example 11. Usage of wildcard character *.


1. <StateTransitionFunction>
2.       <CondProb>
3.           <Var>rock_1</Var>
4.           <Parent>action_rover rover_0 rock_0</Parent>
5.           <Parameter type = "TBL">
6.               <Entry>
7.                    <Instance>amw * good good</Instance>
8.                    <ProbTable>1.0</ProbTable>
9.               </Entry>
10.              <Entry>
11.                   <Instance>amw * bad bad</Instance>
12.                   <ProbTable>1.0</ProbTable>
13.              </Entry>
14.          </Parameter>
15.      </CondProb>
16. </StateTransitionFunction>

As some probabilities of the rock ’s transition are zero, they may be conveniently left out. However in certain cases, some variables may have all non-zero transition probabilities. PomdpX specifically provides another special character “-” to handle this. The “-” character means cycle through all possible values that could appear here and match the listed probabilities (in <ProbTable>) accordingly. Hence, Example 11 can also be expressed as:

Example 12. Usage of character -.


1. <StateTransitionFunction>
2.       <CondProb>
3.           <Var>rock_1</Var>
4.           <Parent>action_rover rover_0 rock_0</Parent>
5.           <Parameter type = "TBL">
6.               <Entry>
7.                     <Instance>amw * good - </Instance>
8.                     <ProbTable>1.0 0.0</ProbTable>
9.               </Entry>
10.              <Entry>
11.                    <Instance>amw * bad - </Instance>
12.                    <ProbTable>0.0 1.0</ProbTable>
13.              </Entry>
14.          </Parameter>
15.      </CondProb>
16. </StateTransitionFunction>

Although it is not obvious here, one can imagine that if both entries were non-zero, the use of “-” would save us from having to specify another <Entry> set.

With the introduction of the “-” character, the first <Entry> set (lines 6–9) in Example 12 is in effect specifying the following:

P(rock_1 = good | action_rover = amw, rover_0 = *, rock_0 = good) = 1.0 and P(rock_1 = bad | action_rover = amw, rover_0 = *, rock_0 = good) = 0.0.

There is also an implicit ordering in Example 12. For instance, the usage of “-” for the first <Entry> set (lines 6–9), considers the possible values of rock to be good first then bad, hence the <ProbTable> entries are listed as (1.0 0.0) rather than (0.0 1.0). This “internal” order is actually taken from the way rock is declared in the <ValueEnum> tag (see Section 2.2.4), in which its possible values were declared to be first good then bad.

In the quest for further compression, there is a final modification we can make to Example 12. We observe that the two <Entry> sets are complementary, differing only in the states of rock_0 and the <ProbTable> entries. Thus, employing the same trick as in Example 12, we can replace the states of rock_0 with a “-”. This gives us Example 13.

Example 13. Usage of double -.


1. <StateTransitionFunction>
2.       <CondProb>
3.           <Var>rock_1</Var>
4.           <Parent>action_rover rover_0 rock_0</Parent>
5.           <Parameter type = "TBL">
6.               <Entry>
7.                    <Instance>amw * - - </Instance>
8.                    <ProbTable>1.0 0.0 0.0 1.0</ProbTable>
9.               </Entry>
10.          </Parameter>
11.      </CondProb>
12. </StateTransitionFunction>

By using double “-”, the single <Entry> set in Example 13 is equivalent to specifying the following:

P(rock_1 = good | action_rover = amw, rover_0 = *, rock_0 = good) = 1.0
P(rock_1 = bad | action_rover = amw, rover_0 = *, rock_0 = good) = 0.0
P(rock_1 = good | action_rover = amw, rover_0 = *, rock_0 = bad) = 0.0
and
P(rock_1 = bad | action_rover = amw, rover_0 = *, rock_0 = bad) = 1.0.
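The expansion of “*” and “-” can be sketched in Python as follows. This is a minimal illustration that assumes the semantics described in this section (values cycle in declaration order, with the rightmost “-” varying fastest); the function and variable names are hypothetical and this is not the parser used by any solver.

```python
from itertools import product

# Value orderings as declared in Example 4.
DOMAINS = {
    "action_rover": ["amw", "ame", "ac", "as"],
    "rover_0": ["s0", "s1", "s2"],
    "rock_0": ["good", "bad"],
    "rock_1": ["good", "bad"],
}


def expand_entry(variables, instance, table):
    """Expand one <Entry> into a dict {assignment tuple: number}.

    variables: <Parent> identifiers followed by the <Var> identifier.
    instance:  the <Instance> tokens, possibly containing '*' or '-'.
    table:     the <ProbTable> numbers, indexed over the '-' positions.
    """
    choices = [
        DOMAINS[name] if token in ("*", "-") else [token]
        for name, token in zip(variables, instance)
    ]
    dash_positions = [i for i, token in enumerate(instance) if token == "-"]
    expanded = {}
    for assignment in product(*choices):
        # Row-major index into the table, rightmost '-' varying fastest.
        index = 0
        for pos in dash_positions:
            domain = DOMAINS[variables[pos]]
            index = index * len(domain) + domain.index(assignment[pos])
        expanded[assignment] = table[index]
    return expanded


variables = ["action_rover", "rover_0", "rock_0", "rock_1"]
double_dash = expand_entry(variables, ["amw", "*", "-", "-"], [1.0, 0.0, 0.0, 1.0])
print(len(double_dash), double_dash[("amw", "s1", "bad", "bad")])
```

Under these assumptions the single double “-” entry expands to the same table as the two entries that use one “-” each, one per value of rock_0.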

The <ProbTable> entries in Example 13 are in effect a 2 × 2 identity matrix. Hence the PomdpX format also allows the keyword identity to be used in lieu of enumerating all the ones and zeros (as in line 8). Therefore Examples 13 and 14 are functionally equivalent.

Example 14. Usage of keyword identity.


1. <StateTransitionFunction>
2.       <CondProb>
3.           <Var>rock_1</Var>
4.           <Parent>action_rover rover_0 rock_0</Parent>
5.           <Parameter type = "TBL">
6.               <Entry>
7.                    <Instance>amw * - - </Instance>
8.                    <ProbTable>identity</ProbTable>
9.               </Entry>
10.          </Parameter>
11.      </CondProb>
12. </StateTransitionFunction>

Another recognized keyword which may also be used in the <ProbTable> tags is uniform. This is equivalent to the probability 1/n repeated n times, where n is the number of possible values that could appear here. For example, the <Entry> tag below,

Example 15. Usage of keyword uniform.


<InitialStateBelief>
     <CondProb>
         <Var>rock_0</Var>
         <Parent>null</Parent>
         <Parameter type = "TBL">
             <Entry>
                  <Instance> - </Instance>
                  <ProbTable>uniform</ProbTable>
             </Entry>
         </Parameter>
     </CondProb>
</InitialStateBelief>

gives: P(rock_0 = good | ∅) = 0.5 and P(rock_0 = bad | ∅) = 0.5, which specifies our initial belief that the rock has equal probability of being good or bad.
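As a rough illustration, the two keywords can be thought of as shorthand for the number lists below. The function name is hypothetical, and treating identity as a flattened n × n identity matrix follows the 2 × 2 case described above rather than any formal definition.

```python
def expand_keyword(keyword, n):
    """Return the numbers a <ProbTable> keyword stands for (illustrative only).

    uniform  -> the probability 1/n repeated n times
    identity -> an n x n identity matrix, flattened row by row
    """
    if keyword == "uniform":
        return [1.0 / n] * n
    if keyword == "identity":
        return [1.0 if row == col else 0.0 for row in range(n) for col in range(n)]
    raise ValueError("unknown keyword: " + keyword)


# The rock has two possible values, good and bad.
print(expand_keyword("uniform", 2))
print(expand_keyword("identity", 2))
```

For the rock, uniform stands for the two probabilities 0.5 and 0.5, and identity stands for the same four numbers written out in Example 13.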

Besides being a child of the CondProb element, the <Parameter> tag may also appear as a child of the Func element, which is used to define the reward function. In this case, the <Entry> tag within the <Parameter> must contain the following:

  • <Instance> – declares values of all the variables for the reward function. Each variable value must correspond to the identifiers that appear between the enclosing <Parent> tag.
  • <ValueTable> – specifies the actual numerical reward.

Example 16 shows a snippet defining the reward function for the rover. In this example, the <Entry> specifies:

R_reward_rover(action_rover = ame, rover_0 = s1, rock_0 = *) = 10.

By now, the wildcard character “*” should be familiar to the user. Its use here denotes the fact that the rover will obtain a reward of 10 moving East from s1 (to the terminal state), regardless of whether the rock is good or bad.

Note that the characters “*” and “-” can be used in a similar manner as described in the previous sections. However, the keywords uniform and identity cannot appear between <ValueTable> tags, since those keywords only make sense for probabilities and not rewards.
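A reward entry of the kind described above might look like the fragment embedded in the following Python sketch, which parses it with the standard xml.etree.ElementTree module. The fragment is reconstructed from the description in the text (reward 10 for moving East from s1, regardless of the rock) and is an assumption, not a copy of the original example.

```python
import xml.etree.ElementTree as ET

# Reconstructed from the prose description; treat the exact layout as an assumption.
SNIPPET = """
<RewardFunction>
  <Func>
    <Var>reward_rover</Var>
    <Parent>action_rover rover_0 rock_0</Parent>
    <Parameter type="TBL">
      <Entry>
        <Instance>ame s1 *</Instance>
        <ValueTable>10</ValueTable>
      </Entry>
    </Parameter>
  </Func>
</RewardFunction>
"""

func = ET.fromstring(SNIPPET).find("Func")
parents = func.findtext("Parent").split()
# Each entry becomes (instance tokens, reward value).
entries = [
    (entry.findtext("Instance").split(), float(entry.findtext("ValueTable")))
    for entry in func.iter("Entry")
]
print(parents, entries)
```

Note that, unlike in a CondProb element, the <Instance> tokens here cover only the <Parent> identifiers, since the reward variable itself takes no enumerated values.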

We reiterate here that any probability or value entries of a function table which are not specified within a <Parameter> tag are assumed to be zero. Furthermore, a particular probability or value entry can also be specified more than once. The definition that appears last within a <Parameter> tag is the one that will take effect. This is convenient for specifying exceptions to a more general specification. The full compact version of the PomdpX input file for the RockSample problem with <Parameter type="TBL"> is given in Appendix A.
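The "default zero, last definition wins" rule can be sketched in Python as below. The three entries are an assumed encoding of the Sample rewards stated in the problem description (−100 where there is no rock, 10 for a good rock, −10 for a bad rock at s0); they are not copied from Appendix A, and the function name is hypothetical.

```python
from itertools import product

ACTIONS = ["amw", "ame", "ac", "as"]
ROVER = ["s0", "s1", "s2"]
ROCK = ["good", "bad"]
DOMAINS = (ACTIONS, ROVER, ROCK)


def build_table(entries):
    """Build a full reward table from (instance, value) pairs.

    Unspecified cells default to zero; later entries overwrite earlier ones.
    Only the '*' wildcard is handled in this sketch.
    """
    table = {key: 0.0 for key in product(*DOMAINS)}
    for instance, value in entries:
        choices = [
            domain if token == "*" else [token]
            for domain, token in zip(DOMAINS, instance)
        ]
        for key in product(*choices):
            table[key] = value
    return table


sample_rewards = build_table([
    (("as", "*", "*"), -100.0),      # general rule: sampling is penalised...
    (("as", "s0", "good"), 10.0),    # ...except at s0, where the rock is
    (("as", "s0", "bad"), -10.0),
])
print(sample_rewards[("as", "s1", "good")], sample_rewards[("as", "s0", "good")])
```

Writing the general rule first and the exceptions afterwards keeps the specification short; reversing the order would let the general rule overwrite the exceptions.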

Appendix A

RockSample.pomdpx, type="TBL"

Full Specification of RockSample problem in PomdpX.

   <?xml version="1.0" encoding="ISO-8859-1"?>
   <pomdpx version="0.1" id="rockSample"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="pomdpx.xsd">
       <Description>RockSample problem for map size 1 x 3.
         Rock is at 0, Rover's initial position is at 1.
         Exit is at 2.
       </Description>
       <Discount>0.95</Discount>
       <Variable>
           <StateVar vnamePrev="rover_0" vnameCurr="rover_1"
             fullyObs="true">
               <NumValues>3</NumValues>
           </StateVar>
           <StateVar vnamePrev="rock_0" vnameCurr="rock_1">
               <ValueEnum>good bad</ValueEnum>
           </StateVar>
           <ObsVar vname="obs_sensor">
               <ValueEnum>ogood obad</ValueEnum>
           </ObsVar>
           <ActionVar vname="action_rover">
               <ValueEnum>amw ame ac as</ValueEnum>
           </ActionVar>
           <RewardVar vname="reward_rover" />
       </Variable>
       <InitialStateBelief>
           <CondProb>
               <Var>rover_0</Var>
               <Parent>null</Parent>
               <Parameter type="TBL">
                   <Entry>
                       <Instance>-</Instance>
                       <ProbTable>0.0 1.0 0.0</ProbTable>
                   </Entry>
               </Parameter>
           </CondProb>
           <CondProb>
               <Var>rock_0</Var>
               <Parent>null</Parent>
               <Parameter type="TBL">
                   <Entry>
                       <Instance>-</Instance>
                       <ProbTable>uniform</ProbTable>
                   </Entry>
               </Parameter>
           </CondProb>
       </InitialStateBelief>
       <StateTransitionFunction>
           <CondProb>
               <Var>rover_1</Var>
               <Parent>action_rover rover_0</Parent>
               <Parameter type="TBL">
                   <Entry>
                       <Instance>amw s0 s2</Instance>
                       <ProbTable>1.0</ProbTable>
                   </Entry>
                   <Entry>
                       <Instance>amw s1 s0</Instance>
                       <ProbTable>1.0</ProbTable>
                   </Entry>
                   <Entry>
                       <Instance>ame s0 s1</Instance>
                       <ProbTable>1.0</ProbTable>
                   </Entry>
                   <Entry>
                       <Instance>ame s1 s2</Instance>
                       <ProbTable>1.0</ProbTable>
                   </Entry>
                   <Entry>
                       <Instance>ac s0 s0</Instance>
                       <ProbTable>1.0</ProbTable>
                   </Entry>
                   <Entry>
                       <Instance>ac s1 s1</Instance>
                       <ProbTable>1.0</ProbTable>
                   </Entry>
                   <Entry>
                       <Instance>as s0 s0</Instance>
                       <ProbTable>1.0</ProbTable>
                   </Entry>
                   <Entry>
                       <Instance>as s1 s2</Instance>
                       <ProbTable>1.0</ProbTable>
                   </Entry>
                   <Entry>
                       <Instance>* s2 s2</Instance>
                       <ProbTable>1.0</ProbTable>
                   </Entry>
               </Parameter>
           </CondProb>
           <CondProb>
               <Var>rock_1</Var>
               <Parent>action_rover rover_0 rock_0</Parent>
               <Parameter>
                   <Entry>
                       <Instance>amw * - -</Instance>
                       <ProbTable>1.0 0.0 0.0 1.0</ProbTable>
                   </Entry>
                   <Entry>
                       <Instance>ame * - -</Instance>
                       <ProbTable>identity</ProbTable>
                   </Entry>
                   <Entry>
                       <Instance>ac * - -</Instance>
                       <ProbTable>identity</ProbTable>
                   </Entry>
                   <Entry>
                       <Instance>as * - -</Instance>
                       <ProbTable>identity</ProbTable>
                   </Entry>
                   <Entry>
                       <Instance>as s0 * -</Instance>
                       <ProbTable>0.0 1.0</ProbTable>
                   </Entry>
               </Parameter>
           </CondProb>
       </StateTransitionFunction>
       <ObsFunction>
           <CondProb>
               <Var>obs_sensor</Var>
               <Parent>action_rover rover_1 rock_1</Parent>
               <Parameter type="TBL">
                   <Entry>
                       <Instance>amw * * -</Instance>
                       <ProbTable>1.0 0.0</ProbTable>
                   </Entry>
                   <Entry>
                       <Instance>ame * * -</Instance>
                       <ProbTable>1.0 0.0</ProbTable>
                   </Entry>
                   <Entry>
                       <Instance>as * * -</Instance>
                       <ProbTable>1.0 0.0</ProbTable>
                   </Entry>
                   <Entry>
                       <Instance>ac s0 - -</Instance>
                       <ProbTable>1.0 0.0 0.0 1.0</ProbTable>
                   </Entry>
                   <Entry>
                       <Instance>ac s1 - -</Instance>
                       <ProbTable>0.8 0.2 0.2 0.8</ProbTable>
                   </Entry>
                   <Entry>
                       <Instance>ac s2 * -</Instance>
                       <ProbTable>1.0 0.0</ProbTable>
                   </Entry>
               </Parameter>
           </CondProb>
       </ObsFunction>
       <RewardFunction>
           <Func>
               <Var>reward_rover</Var>
               <Parent>action_rover rover_0 rock_0</Parent>
               <Parameter type="TBL">
                   <Entry>
                       <Instance>ame s1 *</Instance>
                       <ValueTable>10</ValueTable>
                   </Entry>
                   <Entry>
                       <Instance>amw s0 *</Instance>
                       <ValueTable>-100</ValueTable>
                   </Entry>
                   <Entry>
                       <Instance>as s1 *</Instance>
                       <ValueTable>-100</ValueTable>
                   </Entry>
                   <Entry>
                       <Instance>as s0 good</Instance>
                       <ValueTable>10</ValueTable>
                   </Entry>
                   <Entry>
                       <Instance>as s0 bad</Instance>
                       <ValueTable>-10</ValueTable>
                   </Entry>
               </Parameter>
           </Func>
       </RewardFunction>
   </pomdpx>
to:

Approximate POMDP Planning (APPL) Toolkit

README (release 0.3, 15-Feb-2009) source code (Linux, Mac OS X) source code (Windows)

APPL is a C++ implementation of the SARSOP algorithm [1]. It takes in a POMDP model file in Tony Cassandra's POMDP file format and produces a policy file. It also contains a simple simulator for evaluating the quality of the computed policy. APPL has been tested on a number of large POMDPs with up to 15,000 states. More information is available here.

H. Kurniawati, D. Hsu, and W.S. Lee. SARSOP: Efficient point-based POMDP planning by approximating optimally reachable belief spaces. In Proc. Robotics: Science and Systems, 2008.

November 20, 2009, at 02:45 PM by 172.18.179.173 -
Deleted lines 3-4:

(:pagelist:)

November 20, 2009, at 02:42 PM by 172.18.179.173 -
Added lines 4-5:

(:pagelist:)

November 20, 2009, at 02:20 PM by 172.18.179.173 -
Changed line 328 from:

2.3.1 Table Type (TBL)

to:

Table Type (TBL)

November 20, 2009, at 02:20 PM by 172.18.179.173 -
Changed line 321 from:

<Parameter> Tag

to:

<Parameter> Tag

November 20, 2009, at 02:20 PM by 172.18.179.173 -
Changed line 297 from:

2.2.8 <RewardFunction> Tag

to:

<RewardFunction> Tag

November 20, 2009, at 02:20 PM by 172.18.179.173 -
Changed line 269 from:

2.2.7 <ObsFunction> Tag

to:

<ObsFunction> Tag

November 20, 2009, at 02:20 PM by 172.18.179.173 -
Changed line 228 from:

2.2.6 <StateTransitionFunction> Tag

to:

<StateTransitionFunction> Tag

November 20, 2009, at 02:19 PM by 172.18.179.173 -
Changed line 120 from:

2.2.4 <Variable> Tag

to:

<Variable> Tag

November 20, 2009, at 02:19 PM by 172.18.179.173 -
Changed line 112 from:

2.2.3 <Discount> Tag

to:

<Discount> Tag

November 20, 2009, at 02:19 PM by 172.18.179.173 -
Changed line 101 from:

2.2.2 <Description> Tag

to:

<Description> Tag

November 20, 2009, at 02:18 PM by 172.18.179.173 -
Changed line 78 from:

""<pomdpx>"" Tag

to:

<pomdpx> Tag

November 20, 2009, at 02:18 PM by 172.18.179.173 -
Changed line 78 from:

2.2.1 [<pomdpx>] Tag

to:

""<pomdpx>"" Tag

November 20, 2009, at 02:17 PM by 172.18.178.201 -
Deleted line 3:
November 20, 2009, at 02:16 PM by 172.18.178.201 -
Changed line 79 from:

2.2.1 <pomdpx> Tag

to:

2.2.1 [<pomdpx>] Tag

November 20, 2009, at 01:57 PM by 172.18.178.201 -
Added line 1:

Table of Contents

Added line 3:

(:num:)

November 20, 2009, at 01:37 PM by 172.18.179.173 -
Changed lines 1-7 from:
to:

(:toc:) template?

Overview

PomdpX is an XML file format for describing POMDPs (partially observable Markov decision processes), MOMDPs (mixed observability Markov decision processes) [1] and MDPs (Markov decision processes) in a factored representation. It allows multiple state, observation, action and reward variables to be specified in a model. The specified model must have at least one state, action and reward variable, while the observation variable is optional. Each state variable must be specified as either partially observed (default) or fully observed. Thus a PomdpX input document can specify a POMDP (all state variables partially observed), a MOMDP (a mixture of partially observed and fully observed state variables), or an MDP (all state variables fully observed, no observation variables). In general, the model can be represented by the dynamic Bayesian network (DBN) in Figure 1.1. Each of x_t, y_t, o_t and a_t represents possibly multiple variables: x_t represents the fully observed state variables, while y_t represents the partially observed state variables. (The reward variables are omitted to prevent clutter.)


Figure 1.1 – The general model specified in a PomdpX document. Each of x_t, y_t, o_t and a_t represents multiple variables. The state (s_t) is represented as multiple fully observed (x_t) and partially observed (y_t) state variables.

PomdpX Tutorial

This section gives a tutorial-style introduction to the PomdpX format. We make no assumptions about the user’s familiarity with existing POMDP solvers.

Example Problem

Our running example for encoding into the PomdpX format is a modified version of the RockSample problem, first proposed by Smith and Simmons [2]. It models a rover on an exploration mission that can earn rewards by sampling rocks in its immediate area. Consider a map of size 1 × 3 as shown in Figure 2.1, with one rock at the left end and the terminal state at the right end. The rover starts off at the center and its possible actions are A = {West, East, Sample, Check}. The DBN for the RockSample problem is shown in Figure 2.2.


Figure 2.1 – The 1 × 3 RockSample problem world.

This is a trivial problem but is adequate to showcase the salient features of PomdpX. As with the original version of the problem, the Sample action samples the rock at the rover’s current location. If the rock is good, the rover receives a reward of 10 and the rock becomes bad. If the rock is bad, the rover receives a penalty of −10. Moving into the terminal area yields a reward of 10. A penalty of −100 is imposed for moving off the grid or for sampling in a grid cell where there is no rock. All other moves have no cost or reward. The Check action returns a noisy observation from O = {Good, Bad}.


Figure 2.2 – Dynamic Bayesian network of the RockSample problem. The rover’s position is fully observed whereas the rock type is partially observed.

Example 1. A PomdpX document.


<?xml version="1.0" encoding="ISO-8859-1"?>
<pomdpx version="0.1" id="rockSample"
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xsi:noNamespaceSchemaLocation="pomdpx.xsd">
      <Description>             · · · </Description>
      <Discount>                · · · </Discount>
      <Variable>                · · · </Variable>
      <InitialStateBelief>      · · · </InitialStateBelief>
      <StateTransitionFunction> · · · </StateTransitionFunction>
      <ObsFunction>             · · · </ObsFunction>
      <RewardFunction>          · · · </RewardFunction>
</pomdpx>

File Format Structure

A PomdpX document consists of a header and a pomdpx root element which in turn contains child elements, as shown in Example 1. The first line of the document is the XML declaration, which states that the document adheres to the XML 1.0 standard and that the encoding of the document is ISO-8859-1. Other encodings such as UTF-8 are also possible.

2.2.1 <pomdpx> Tag

Continuing with the example above, the second line contains the root-element of a PomdpX document—the pomdpx element—which has the following attributes:

  • version
  • id – optional name for the specified model.
  • xmlns:xsi – defines xsi as the XML Schema namespace.
  • xsi:noNamespaceSchemaLocation – this is where we put our XML Schema definition, pomdpx.xsd. The PomdpX input should be validated against this schema to ensure that it is well-formed and conforms to the format.
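For readers who want to inspect these attributes programmatically, the following Python sketch reads them with the standard library's xml.etree.ElementTree. It is only an illustration and not part of APPL; the embedded document is a cut-down stand-in for a real PomdpX file. Note that ElementTree checks well-formedness only and does not validate against pomdpx.xsd.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# A cut-down document; a real PomdpX file has more child elements.
DOCUMENT = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<pomdpx version="0.1" id="rockSample"
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xsi:noNamespaceSchemaLocation="pomdpx.xsd">
   <Discount>0.95</Discount>
</pomdpx>"""

root = ET.fromstring(DOCUMENT)
version = root.get("version")
model_id = root.get("id")
# Namespaced attributes are stored under the key "{namespace-uri}local-name".
schema = root.get(f"{{{XSI}}}noNamespaceSchemaLocation")
print(version, model_id, schema)
```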

The conventional ordering of the child elements is Description, Discount, Variable and thereafter: InitialStateBelief, StateTransitionFunction, ObsFunction and RewardFunction. However, this ordering is not strictly required and one may permute their orderings. Description is an optional, short description of the specified model. The other child elements specify the POMDP tuple (S, A, O, T, Z, R, γ) and the initial belief b_0.

In general these elements should all be present, and each can appear only once. ObsFunction may be omitted if there are no observation variables in the model. Similarly, InitialStateBelief may be omitted if all state variables are fully observed (for example, in an MDP model). The child elements of pomdpx are described in greater detail in the following subsections.
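The presence rules above can be checked mechanically. The sketch below is a hypothetical helper, not an APPL feature: it treats Description, InitialStateBelief and ObsFunction as optional (a simplifying assumption, since the latter two are only optional under the conditions just described) and reports missing or repeated child elements.

```python
import xml.etree.ElementTree as ET
from collections import Counter

REQUIRED = {"Discount", "Variable", "StateTransitionFunction", "RewardFunction"}
# Assumption: treated as always optional here for simplicity.
OPTIONAL = {"Description", "InitialStateBelief", "ObsFunction"}


def check_children(root):
    """Return a list of human-readable problems with the root's child elements."""
    counts = Counter(child.tag for child in root)
    problems = [f"missing <{tag}>" for tag in sorted(REQUIRED - set(counts))]
    for tag, n in sorted(counts.items()):
        if tag not in REQUIRED | OPTIONAL:
            problems.append(f"unexpected <{tag}>")
        elif n > 1:
            problems.append(f"<{tag}> appears {n} times")
    return problems


good = ET.fromstring(
    "<pomdpx><Discount>0.95</Discount><Variable/>"
    "<StateTransitionFunction/><RewardFunction/></pomdpx>"
)
bad = ET.fromstring(
    "<pomdpx><Variable/><Variable/>"
    "<StateTransitionFunction/><RewardFunction/></pomdpx>"
)
```

Because the check counts tags rather than positions, it accepts any permutation of the child elements, in line with the ordering rule above.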

2.2.2 <Description> Tag

This is an optional tag that one may provide to give a brief description of the specified problem. For example:

Example 2. Contents of Description.


 <Description> RockSample problem for map size 1 x 3.
 Rock is at 0, Rover’s initial position is at 1.
 Exit is at 2.
 </Description>

2.2.3 <Discount> Tag

This specifies the discount factor γ, which has to be a real-valued number. For our RockSample problem, we will be using a discount factor of 0.95, entered as shown:

Example 3. Contents of Discount.


 <Discount> 0.95 </Discount>

2.2.4 <Variable> Tag

The state, action and observation variables which factorize the state S, action A, and observation O spaces are declared within the Variable element. Reward variables, R are also declared here. Example 4 gives the declaration of the variables for the RockSample problem.

Each state variable is declared with the <StateVar> tag. It contains the following attributes:

  • vnamePrev – identifier for the variable’s start state.
  • vnameCurr – identifier for the variable’s end state.
  • fullyObs – set to true if the variable is fully observed. The default is false. Thus for the variable rock in Example 4, it is partially observed, as implied by the omission of the fullyObs attribute.

Example 4. Variable declaration. Defining S, A, O, and R variables.


<Variable>
     <StateVar vnamePrev="rover_0" vnameCurr="rover_1"
      fullyObs="true">
         <NumValues>3</NumValues>
     </StateVar>
     <StateVar vnamePrev="rock_0" vnameCurr="rock_1">
         <ValueEnum>good bad</ValueEnum>
     </StateVar>
     <ObsVar vname="obs_sensor">
         <ValueEnum>ogood obad</ValueEnum>
     </ObsVar>
     <ActionVar vname="action_rover">
         <ValueEnum>amw ame ac as</ValueEnum>
     </ActionVar>
     <RewardVar vname="reward_rover" />
</Variable>

The possible values that a variable can assume are specified with either the <NumValues> or the <ValueEnum> tag. In the former, we give an integer to indicate the number of values/states for the variable. For instance, in the example, the rover is declared with three possible values. The values are subsequently referenced internally using numerals, starting from 0 and prepended with ‘s’. Hence the states for the rover variable would be s0, s1 and s2. When using <NumValues>, it is up to the user to attach semantic meaning to the values. In our example, s0 denotes the left grid, s1 the center and s2 the right terminal grid.

In the latter, the user has to manually enumerate all the possible values/states the variable may take on. In our example, the rock has two possible values: good and bad.

The observation and action variables are also declared similarly with the <ObsVar> and <ActionVar> tags respectively. Both require the attribute vname which serves as the identifier for the variable. The possible values that an observation or action can assume can also be specified with either <NumValues> or <ValueEnum>. If <NumValues> is used, ‘o’ and ‘a’ would be prepended to the values of observation and action variables respectively.
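The naming rule for <NumValues> is simple enough to state as code. The Python sketch below is illustrative only (value_names is a hypothetical helper); it generates the internal value names using the prefixes described above.

```python
# Prefix prepended to the numeral for each kind of variable.
PREFIX = {"StateVar": "s", "ObsVar": "o", "ActionVar": "a"}


def value_names(var_tag, num_values):
    """Return the value names implied by <NumValues> for the given variable tag."""
    prefix = PREFIX[var_tag]
    return [f"{prefix}{i}" for i in range(num_values)]


rover_values = value_names("StateVar", 3)
```

For the rover declared with <NumValues>3</NumValues>, this yields s0, s1 and s2.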

In the case of <ValueEnum>, the user will once again need to enumerate all possible values/states manually. In our example, for the action_rover variable, we enumerate all the four possible actions. ‘amw’ is a mnemonic for action move west and ‘ac’ stands for action check and so on.

Finally, reward variables are declared with the <RewardVar> tags which must contain the vname attribute. The vname serves as an identifier for the reward variable. The <RewardVar> is an empty XML tag and no values are specified. Note that we may use the XML shorthand of <RewardVar vname="· · · " /> to close an empty tag here.

2.2.5 <InitialStateBelief> Tag

This is an optional tag. It specifies the initial belief b_0, and may be omitted if all state variables are fully observed. The PomdpX format allows the initial belief to be specified as multiple multiplicative factors, with each <CondProb> tag specifying one of these factors. In our running RockSample problem, since the initial belief is not conditional on anything, it is factored as b_0 = P(rover_0 | ∅) P(rock_0 | ∅). We will need two <CondProb> tags to specify it fully as shown below.

Example 5. Contents of InitialStateBelief.


  <InitialStateBelief>
      <CondProb>
         <Var>rover_0</Var>
         <Parent>null</Parent>
         <Parameter> · · · </Parameter>
      </CondProb>
      <CondProb>
         <Var>rock_0</Var>
         <Parent>null</Parent>
         <Parameter> · · · </Parameter>
      </CondProb>
   </InitialStateBelief>

The <CondProb> tag has no attributes and requires the following three child tags:

  • <Var> – identifies the factor being specified. Only identifiers declared as vnamePrev of state variables are allowed here (see Section 2.2.4).
  • <Parent> – the set of conditioning variables. Identifiers declared as vnamePrev or vnameCurr of state variables are allowed here, but only in certain combinations. Referring to Figure 1.1, we only allow conditioning arrows from x_t (fully observed variables) to y_t (partially observed variables) and not the other way round. Specifically, a vnameCurr identifier is allowed as a parent only if the variable is fully observed. In addition, the keyword null may be used to signify the absence of any conditioning variables.
  • <Parameter> – specifies the actual probabilities in the factor and is described in detail in Section 2.3.

The previous example is somewhat cumbersome to declare if we have too many state variables. We could have alternatively specified b0 as simply the joint belief of all state variables, P (rover_0, rock_0), with a single <CondProb> tag as shown in Example 6.

Example 6. Initial joint belief specification.


<InitialStateBelief>
     <CondProb>
         <Var>rover_0 rock_0</Var>
         <Parent>null</Parent>
         <Parameter> · · · </Parameter>
     </CondProb>
</InitialStateBelief>
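To make the relation between the factored and the joint specification concrete, the following Python sketch multiplies the two factors of b_0 into a joint table. The numbers are illustrative assumptions for RockSample: the rover starts at s1 with certainty and the rock is good or bad with equal probability.

```python
from itertools import product

# One factor per <CondProb> in the factored specification.
rover_belief = {"s0": 0.0, "s1": 1.0, "s2": 0.0}
rock_belief = {"good": 0.5, "bad": 0.5}

# The joint belief is the product of the independent factors.
joint = {
    (rover, rock): rover_belief[rover] * rock_belief[rock]
    for rover, rock in product(rover_belief, rock_belief)
}
```

The joint table has 3 × 2 = 6 entries, whereas the factored form needs only 3 + 2 = 5; the gap widens quickly as more state variables are added.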

2.2.6 <StateTransitionFunction> Tag

This specifies the transition function T , which in general is the multiplicative result of the individual transition functions of each state variable in the model. Each <CondProb> tag specifies the transition function for each state variable. For our RockSample problem, with reference to Figure 2.2, the overall transition function is: P (rover_1, rock_1|action_rover, rover_0, rock_0) = P (rover_1|action_rover, rover_0) × P (rock_1|action_rover, rover_0, rock_0).

This is translated to the following in PomdpX. One can see that it is very similar to its equational counterpart, only it has XML tags wrapped around it. We need to provide two CondProb elements, one each for the variable rover and rock.

Example 7. Contents of StateTransitionFunction.


 <StateTransitionFunction>
      <CondProb>
           <Var>rover_1</Var>
           <Parent>action_rover rover_0</Parent>
           <Parameter> · · · </Parameter>
      </CondProb>
      <CondProb>
           <Var>rock_1</Var>
           <Parent>action_rover rover_0 rock_0</Parent>
           <Parameter> · · · </Parameter>
      </CondProb>
 </StateTransitionFunction>

As described in 2.2.5, the <Var> tag identifies the state variable whose transition function is being specified. In this case, only identifiers declared as the vnameCurr attribute of state variables may be allowed here.

The identifiers within the <Parent> tag identify the conditioning variables in the transition function. They may be identifiers which had been declared as either the vnamePrev or vnameCurr attributes of state variables, or identifiers which had been declared as the vname attribute of action variables (see Section 2.2.4). Once again, we point out the caveat that PomdpX only allows certain combinations of vnamePrev and vnameCurr. One may only use vnameCurr identifiers within the <Parent> tag if the variable is fully observed. We defer the description of <Parameter> tag to Section 2.3 as it is fairly involved.
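The factorization of the transition function can also be read as code: the joint probability is the product of one factor per state variable. The Python sketch below is an illustration under assumptions, not APPL's implementation. The move table and the rule that sampling at s0 turns the rock bad are our reading of the RockSample model, including the assumption that moving West from s0 leads to the terminal grid s2.

```python
# Assumed deterministic rover moves: (action, rover_0) -> rover_1.
MOVES = {
    ("amw", "s0"): "s2", ("amw", "s1"): "s0",
    ("ame", "s0"): "s1", ("ame", "s1"): "s2",
    ("ac", "s0"): "s0", ("ac", "s1"): "s1",
    ("as", "s0"): "s0", ("as", "s1"): "s2",
}


def p_rover(rover_1, action, rover_0):
    """P(rover_1 | action_rover, rover_0); s2 is absorbing."""
    target = "s2" if rover_0 == "s2" else MOVES[(action, rover_0)]
    return 1.0 if rover_1 == target else 0.0


def p_rock(rock_1, action, rover_0, rock_0):
    """P(rock_1 | action_rover, rover_0, rock_0)."""
    if action == "as" and rover_0 == "s0":
        return 1.0 if rock_1 == "bad" else 0.0   # sampling leaves a bad rock
    return 1.0 if rock_1 == rock_0 else 0.0      # otherwise the rock is unchanged


def p_joint(rover_1, rock_1, action, rover_0, rock_0):
    """Overall transition probability as the product of the two factors."""
    return p_rover(rover_1, action, rover_0) * p_rock(rock_1, action, rover_0, rock_0)
```

Each function corresponds to one CondProb element, and its arguments after the first correspond to the identifiers listed in the <Parent> tag.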

2.2.7 <ObsFunction> Tag

This specifies the observation function Z, which in general is the multiplicative result of the individual observation functions of each observation variable in the model. Each <CondProb> tag specifies one of these individual observation functions. In the RockSample problem, the probability of an observation is conditional on taking an action and ending in a new state. Thus its parents are action_rover, rover_1 and rock_1, as given in Example 8.

Example 8. Contents of ObsFunction.


 <ObsFunction>
       <CondProb>
            <Var>obs_sensor</Var>
            <Parent>action_rover rover_1 rock_1</Parent>
            <Parameter> · · · </Parameter>
       </CondProb>
 </ObsFunction>

For each CondProb element, the identifier within the <Var> tags identifies the observation variable whose observation function is being specified. The identifiers within the <Parent> tags identify the conditioning variables in the observation function. Identifiers that appear within the <Var> tags must have been declared as the vname attribute of observation variables. Identifiers that appear within the <Parent> tags must have been declared as the vnameCurr attribute of state variables, or the vname attribute of action variables (see Section 2.2.4). Parameter specifies the actual probabilities in the function and will be described in Section 2.3.

2.2.8 <RewardFunction> Tag

This specifies the reward function R, which in general is the additive result of the individual reward functions of each reward variable in the model. Each <Func> tag specifies one of these individual reward functions. For our RockSample problem, the reward depends on the action taken at the current state, thus its parents are action_rover, rover_0 and rock_0. This is shown in Example 9.

Example 9. Contents of RewardFunction.


 <RewardFunction>
       <Func>
            <Var>reward_rover</Var>
            <Parent>action_rover rover_0 rock_0</Parent>
            <Parameter> · · · </Parameter>
       </Func>
 </RewardFunction>

Similar to the <CondProb> tag, the <Func> tag has no attributes and requires the following three child tags to be defined:

  • <Var> – this identifies the reward variable whose reward function is being specified. Only identifiers that had been declared as the vname attribute of reward variables may appear here.

  • <Parent> – this identifies the domain of the reward function. All identifiers declared as vnamePrev or vnameCurr attributes of state variables, vname attribute of action variables or vname attribute of observation variables are allowed here.
  • <Parameter> – specifies the actual values in the function and is described in detail in Section 2.3.

<Parameter> Tag

The <Parameter> tag is a fairly complicated component of PomdpX, introducing several new keywords and symbols, thus it warrants an individual section in itself. It has an optional attribute called type, which has possible values TBL (default) and DD, short for table and decision diagram, respectively. We will describe how to encode the RockSample problem both in TBL and DD.

2.3.1 Table Type (TBL)

When the <Parameter> tag appears as a child of a CondProb element, it must contain <Entry> child tags. Each Entry element specifies the probability entry of a function table. The <Entry> tag itself must consist of the following:

  • <Instance> – declares all the variables for the probability function. Each variable value must correspond to the identifiers that appear between the enclosing <Parent> tag, followed by the identifier that appears between the enclosing <Var> tag.
  • <ProbTable> – specifies the actual numerical values of the probabilities.

This is best illustrated by Example 10 below. With reference to Figure 2.2, we show the full encoding of the rock's transition function for the rover's action of moving West. From the example, the <Var> tag declares that we are defining the transition function for the variable rock (line 3). It is conditional on action_rover, rover_0 and rock_0, which appear between the <Parent> tags (line 4). The first <Entry> set (lines 6–9) specifies:

P (rock_1 = good|action_rover = amw, rover_0 = s0, rock_0 = good) = 1.0.

In this case, when action_rover is amw, and rock_0 is good, rock_1 will be good as well, since a move action will not disturb its state. Conversely, if action_rover is amw, and rock_0 is good it is impossible for rock_1 to be bad as specified by lines 18–29.

Note that order matters here and it might be the source of some subtle bugs if overlooked. As mentioned before, the conditioning variables declared between the <Instance> tags (first three elements in line 7) correspond to the order in which they appear in the enclosing <Parent> tag, and the last element corresponds to the variable being defined. One may arbitrarily re-order the conditioning variables as long as they match up within the <Parent> and <Instance> tags and the last element is always the identifier defined by <Var>. The convention that we adopt is to declare actions first, then fully observed variables, followed by partially observed variables.
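The positional matching described above can be sketched as follows. The helper label_instance is hypothetical and only illustrates the rule: the tokens of <Instance> are paired with the <Parent> identifiers in order, and the final token is paired with the <Var> identifier.

```python
def label_instance(parent_text, var_text, instance_text):
    """Pair each <Instance> token with the variable it refers to."""
    names = parent_text.split() + var_text.split()   # parents first, then Var
    values = instance_text.split()
    if len(names) != len(values):
        raise ValueError(
            f"expected {len(names)} values in <Instance>, got {len(values)}"
        )
    return dict(zip(names, values))


labelled = label_instance(
    "action_rover rover_0 rock_0",   # contents of <Parent>
    "rock_1",                        # contents of <Var>
    "amw s0 good good",              # contents of <Instance>
)
```

A length mismatch is reported as an error here, but a mere swap of two tokens with the same set of values would go unnoticed, which is exactly the kind of subtle bug the ordering convention is meant to prevent.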

Example 10. Contents of Parameter type="TBL", within CondProb.


1. <StateTransitionFunction>
2.       <CondProb>
3.           <Var>rock_1</Var>
4.           <Parent>action_rover rover_0 rock_0</Parent>
5.           <Parameter type = "TBL">
6.               <Entry>
7.                    <Instance>amw s0 good good</Instance>
8.                    <ProbTable>1.0</ProbTable>
9.               </Entry>
10.              <Entry>
11.                   <Instance>amw s1 good good</Instance>
12.                   <ProbTable>1.0</ProbTable>
13.              </Entry>
14.              <Entry>
15.                   <Instance>amw s2 good good</Instance>
16.                   <ProbTable>1.0</ProbTable>
17.              </Entry>
18.              <Entry>
19.                   <Instance>amw s0 good bad</Instance>
20.                   <ProbTable>0.0</ProbTable>
21.              </Entry>
22.              <Entry>
23.                   <Instance>amw s1 good bad</Instance>
24.                   <ProbTable>0.0</ProbTable>
25.              </Entry>
26.              <Entry>
27.                   <Instance>amw s2 good bad</Instance>
28.                   <ProbTable>0.0</ProbTable>
29.              </Entry>
30.              <Entry>
31.                   <Instance>amw s0 bad good</Instance>
32.                   <ProbTable>0.0</ProbTable>
33.              </Entry>
34.              <Entry>
35.                   <Instance>amw s1 bad good</Instance>
36.                   <ProbTable>0.0</ProbTable>
37.              </Entry>
38.              <Entry>
39.                   <Instance>amw s2 bad good</Instance>
40.                   <ProbTable>0.0</ProbTable>
41.              </Entry>
42.              <Entry>
43.                   <Instance>amw s0 bad bad</Instance>
44.                   <ProbTable>1.0</ProbTable>
45.              </Entry>
46.              <Entry>
47.                   <Instance>amw s1 bad bad</Instance>
48.                   <ProbTable>1.0</ProbTable>
49.              </Entry>
50.              <Entry>
51.                   <Instance>amw s2 bad bad</Instance>
52.                   <ProbTable>1.0</ProbTable>
53.              </Entry>
54.          </Parameter>
55.      </CondProb>
56. </StateTransitionFunction>

It seems a bit daunting that it takes 56 lines just to declare the transition function for the rock for a simple 1 × 3 grid, and this covers only the rover’s action of moving West. But XML is verbose by nature and that is the price to pay for interoperability and extensibility. However, PomdpX does provide several convenience features to ease the encoding task.

First and foremost, lines 18–41 are actually redundant since any entry not specified is assumed to be zero. Secondly, we observe that the first three <Entry> sets (lines 6–17) are very similar. They differ only in the state of rover_0, and together s0 to s2 cover all the possible states of the rover. In such a situation, we may use the wildcard character “*”, which means that the entry holds for all possible values that could appear in that position. Therefore, lines 6–17 could be replaced by just one <Entry> tag, and the same is true for lines 42–53. Example 10 is re-written more succinctly and shown as Example 11.

Example 11. Usage of wildcard character *.


1. <StateTransitionFunction>
2.       <CondProb>
3.           <Var>rock_1</Var>
4.           <Parent>action_rover rover_0 rock_0</Parent>
5.           <Parameter type = "TBL">
6.               <Entry>
7.                    <Instance>amw * good good</Instance>
8.                    <ProbTable>1.0</ProbTable>
9.               </Entry>
10.              <Entry>
11.                   <Instance>amw * bad bad</Instance>
12.                   <ProbTable>1.0</ProbTable>
13.              </Entry>
14.          </Parameter>
15.      </CondProb>
16. </StateTransitionFunction>
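The effect of the wildcard can be sketched in Python as an expansion back into explicit instances. This is an illustration of the semantics only; expand_wildcards and the DOMAINS table are hypothetical names, with the value lists taken from the RockSample variable declarations.

```python
from itertools import product

# Possible values of each variable, in declaration order.
DOMAINS = {
    "action_rover": ["amw", "ame", "ac", "as"],
    "rover_0": ["s0", "s1", "s2"],
    "rock_0": ["good", "bad"],
    "rock_1": ["good", "bad"],
}


def expand_wildcards(names, instance):
    """Replace every '*' by each possible value of the variable in that position."""
    choices = [
        DOMAINS[name] if token == "*" else [token]
        for name, token in zip(names, instance.split())
    ]
    return list(product(*choices))


names = ["action_rover", "rover_0", "rock_0", "rock_1"]
expanded = expand_wildcards(names, "amw * good good")
```

The single instance "amw * good good" expands to the three instances with rover_0 equal to s0, s1 and s2, each of which receives the same probability from <ProbTable>.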

As some probabilities of the rock's transition are zero, they may be conveniently left out. However, in certain cases, some variables may have all non-zero transition probabilities. PomdpX specifically provides another special character “-” to handle this. The “-” character means cycle through all possible values that could appear here and match the listed probabilities (in <ProbTable>) accordingly. Hence, Example 11 can also be expressed as:

Example 12. Usage of character -.


1. <StateTransitionFunction>
2.       <CondProb>
3.           <Var>rock_1</Var>
4.           <Parent>action_rover rover_0 rock_0</Parent>
5.           <Parameter type = "TBL">
6.               <Entry>
7.                     <Instance>amw * good - </Instance>
8.                     <ProbTable>1.0 0.0</ProbTable>
9.               </Entry>
10.              <Entry>
11.                    <Instance>amw * bad - </Instance>
12.                    <ProbTable>0.0 1.0</ProbTable>
13.              </Entry>
14.          </Parameter>
15.      </CondProb>
16. </StateTransitionFunction>

Although it is not obvious here, one can imagine that if both entries were non-zero, the use of “-” would save us from having to specify another <Entry> set.

With the introduction of the “-” character, the first <Entry> set (lines 6–9) in Example 12 is in effect specifying the following:

P (rock_1 = good | action_rover = amw, rover_0 = *, rock_0 = good) = 1.0 and P (rock_1 = bad | action_rover = amw, rover_0 = *, rock_0 = good) = 0.0.

There is also an implicit ordering in Example 12. For instance, the “-” in the first <Entry> set (lines 6–9) considers the possible values of rock to be good first and then bad; hence the <ProbTable> entries are listed as (1.0 0.0) rather than (0.0 1.0). This “internal” order is taken from the way rock is declared in the <ValueEnum> tag (see Section 2.2.4), in which its possible values were declared to be first good, then bad.

In the quest for further compression, there is a final modification we can make to Example 12. We observe that the two <Entry> sets are complementary, differing only in the state of rock_0 and in the <ProbTable> entries. Thus, employing the same trick as in Example 12, we can replace the state of rock_0 with a “-”. This gives us Example 13.

Example 13. Usage of double -.


1. <StateTransitionFunction>
2.       <CondProb>
3.           <Var>rock_1</Var>
4.           <Parent>action_rover rover_0 rock_0</Parent>
5.           <Parameter type = "TBL">
6.               <Entry>
7.                    <Instance>amw * - - </Instance>
8.                    <ProbTable>1.0 0.0 0.0 1.0</ProbTable>
9.               </Entry>
10.          </Parameter>
11.      </CondProb>
12. </StateTransitionFunction>

By using double “-”, the single <Entry> set in Example 13 is equivalent to specifying the following:

P (rock_1 = good | action_rover = amw, rover_0 = *, rock_0 = good) = 1.0
P (rock_1 = bad | action_rover = amw, rover_0 = *, rock_0 = good) = 0.0
P (rock_1 = good | action_rover = amw, rover_0 = *, rock_0 = bad) = 0.0
and
P (rock_1 = bad | action_rover = amw, rover_0 = *, rock_0 = bad) = 1.0.
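
The expansion rules for “*” and “-” can be made concrete with a short Python sketch. It only illustrates the semantics described above and is not the parser of any PomdpX tool; the function name expand_entry and the domains argument are assumptions introduced here. The rightmost “-” is assumed to vary fastest, which is the ordering implied by the four probabilities listed above.

```python
from itertools import product


def expand_entry(instance, prob_table, domains):
    """Expand '*' and '-' in one <Entry> into explicit table entries.

    instance   -- tokens from <Instance>, e.g. ["amw", "*", "-", "-"]
    prob_table -- numbers from <ProbTable>, one per combination of '-' values
    domains    -- possible values for each position, in declaration order
    """
    dash_pos = [i for i, tok in enumerate(instance) if tok == "-"]
    star_pos = [i for i, tok in enumerate(instance) if tok == "*"]

    # product() varies the last '-' position fastest (assumed ordering).
    dash_combos = list(product(*(domains[i] for i in dash_pos)))
    if len(dash_combos) != len(prob_table):
        raise ValueError("ProbTable length does not match the '-' positions")

    expanded = {}
    for combo, prob in zip(dash_combos, prob_table):
        # '*' repeats the same probability for every value at that position.
        for stars in product(*(domains[i] for i in star_pos)):
            values = list(instance)
            for i, v in zip(dash_pos, combo):
                values[i] = v
            for i, v in zip(star_pos, stars):
                values[i] = v
            expanded[tuple(values)] = prob
    return expanded


# Domains for action_rover, rover_0, rock_0 and rock_1 in the running example.
domains = [
    ["amw", "ame", "ac", "as"],
    ["s0", "s1", "s2"],
    ["good", "bad"],
    ["good", "bad"],
]
table = expand_entry(["amw", "*", "-", "-"], [1.0, 0.0, 0.0, 1.0], domains)
```

Under these assumptions the single entry expands to twelve explicit entries, one for each combination of rover_0, rock_0 and rock_1.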

The <ProbTable> entries in Example 13 are in effect a 2 × 2 identity matrix. Hence the PomdpX format also allows the keyword identity to be used instead of enumerating all the ones and zeros (as in line 8). Examples 13 and 14 are therefore functionally equivalent.

Example 14. Usage of keyword identity.


1. <StateTransitionFunction>
2.       <CondProb>
3.           <Var>rock_1</Var>
4.           <Parent>action_rover rover_0 rock_0</Parent>
5.           <Parameter type = "TBL">
6.               <Entry>
7.                    <Instance>amw * - - </Instance>
8.                    <ProbTable>identity</ProbTable>
9.               </Entry>
10.          </Parameter>
11.      </CondProb>
12. </StateTransitionFunction>

Another recognized keyword which may also be used in the <ProbTable> tags is uniform. This is equivalent to the probability 1/n repeated n times, where n is the number of possible values that could appear here. For example, the <Entry> tag below,

Example 15. Usage of keyword uniform.


<InitialStateBelief>
     <CondProb>
         <Var>rock_0</Var>
         <Parent>null</Parent>
         <Parameter type = "TBL">
             <Entry>
                  <Instance> - </Instance>
                  <ProbTable>uniform</ProbTable>
             </Entry>
         </Parameter>
     </CondProb>
</InitialStateBelief>

gives: P (rock_0 = good | ∅) = 0.5 and P (rock_0 = bad | ∅) = 0.5, which specifies our initial belief that the rock has equal probability of being good or bad.
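
As a rough illustration of what the two keywords stand for, the Python sketch below expands them into explicit probability lists. The function name expand_keyword is hypothetical, and the sketch assumes the single-variable case where n is the number of values of the variable in question.

```python
def expand_keyword(keyword, n):
    """Return the flat probability list a <ProbTable> keyword stands for."""
    if keyword == "uniform":
        # Probability 1/n repeated n times.
        return [1.0 / n] * n
    if keyword == "identity":
        # Row-major n x n identity matrix: 1.0 on the diagonal, 0.0 elsewhere.
        return [1.0 if row == col else 0.0
                for row in range(n) for col in range(n)]
    raise ValueError(f"unknown keyword: {keyword}")
```

For the rock variable, which has two values, identity corresponds to the four numbers written out in Example 13 and uniform to the two probabilities above.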

Besides being a child of the CondProb element, the <Parameter> tag may also appear as a child of the Func element, which is used to define the reward function. In this case, the <Entry> tag within the <Parameter> must contain the following:

  • <Instance> – declares values of all the variables for the reward function. Each variable value must correspond to the identifiers that appear between the enclosing <Parent> tag.
  • <ValueTable> – specifies the actual numerical reward.

Example 16 shows a snippet defining the reward function for the rover. In this example, the <Entry> specifies:

R_reward_rover (action_rover = ame, rover_0 = s1, rock_0 = *) = 10.

By now, the wildcard character “*” should be familiar to the user. Its use here denotes the fact that the rover will obtain a reward of 10 moving East from s1 (to the terminal state), regardless of whether the rock is good or bad.

Note that the characters “*” and “-” can be used in a similar manner as described in the previous sections. However, the keywords uniform and identity cannot appear between <ValueTable> tags, since those keywords only make sense for probabilities and not rewards.

We reiterate here that any probability or value entries of a function table which are not specified within a <Parameter> tag are assumed to be zero. Furthermore, a particular probability or value entry can also be specified more than once. The definition that appears last within a <Parameter> tag is the one that will take effect. This is convenient for specifying exceptions to a more general specification. The full compact version of the PomdpX input file for the RockSample problem with <Parameter type="TBL"> is given in Appendix A.
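
The two rules in this paragraph (unspecified entries are zero, and the last definition wins) can be sketched in Python as follows. The function build_table is a hypothetical helper, and the entries are assumed to be already expanded into explicit assignments. The sample data mirrors the rock_1 entries for the as action in Appendix A, where a general rule is followed by an exception for s0.

```python
from collections import defaultdict


def build_table(entries):
    """Build a function table from (assignment, value) pairs in document order."""
    table = defaultdict(float)      # unspecified entries read as 0.0
    for assignment, value in entries:
        table[assignment] = value   # a later entry overwrites an earlier one
    return table


entries = [
    # General rule: sampling leaves the rock unchanged.
    (("as", "s0", "good", "good"), 1.0),
    (("as", "s0", "bad", "bad"), 1.0),
    # Exception listed later: sampling at s0 always leaves a bad rock.
    (("as", "s0", "good", "good"), 0.0),
    (("as", "s0", "good", "bad"), 1.0),
]
table = build_table(entries)
```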

Appendix A

RockSample.pomdpx, type="TBL"

Full Specification of RockSample problem in PomdpX.

   <?xml version="1.0" encoding="ISO-8859-1"?>
   <pomdpx version="0.1" id="rockSample"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="pomdpx.xsd">
         <Description>RockSample problem for map size 1 x 3.
           Rock is at 0, Rover’s initial position is at 1.
           Exit is at 2.
         </Description>
         <Discount>0.95</Discount>
         <Variable>
              <StateVar vnamePrev="rover_0" vnameCurr="rover_1"
                fullyObs="true">
                  <NumValues>3</NumValues>
              </StateVar>
              <StateVar vnamePrev="rock_0" vnameCurr="rock_1">
                  <ValueEnum>good bad</ValueEnum>
              </StateVar>
              <ObsVar vname="obs_sensor">
                  <ValueEnum>ogood obad</ValueEnum>
              </ObsVar>
              <ActionVar vname="action_rover">
                  <ValueEnum>amw ame ac as</ValueEnum>
              </ActionVar>
              <RewardVar vname="reward_rover" />
         </Variable>
         <InitialStateBelief>
              <CondProb>
                  <Var>rover_0</Var>
                  <Parent>null</Parent>
                  <Parameter type="TBL">
                      <Entry>
                          <Instance> - </Instance>
                          <ProbTable>0.0 1.0 0.0</ProbTable>
                      </Entry>
                  </Parameter>
              </CondProb>
              <CondProb>
                  <Var>rock_0</Var>
                  <Parent>null</Parent>
                  <Parameter type="TBL">
                      <Entry>
                          <Instance>-</Instance>
                          <ProbTable>uniform</ProbTable>
                      </Entry>
                  </Parameter>
              </CondProb>
         </InitialStateBelief>
      <StateTransitionFunction>
          <CondProb>
              <Var>rover_1</Var>
              <Parent>action_rover rover_0</Parent>
              <Parameter type="TBL">
                  <Entry>
                      <Instance>amw s0 s2</Instance>
                      <ProbTable>1.0</ProbTable>
                  </Entry>
                  <Entry>
                      <Instance>amw s1 s0</Instance>
                      <ProbTable>1.0</ProbTable>
                  </Entry>
                  <Entry>
                      <Instance>ame s0 s1</Instance>
                      <ProbTable>1.0</ProbTable>
                  </Entry>
                  <Entry>
                      <Instance>ame s1 s2</Instance>
                      <ProbTable>1.0</ProbTable>
                  </Entry>
                  <Entry>
                      <Instance>ac s0 s0</Instance>
                      <ProbTable>1.0</ProbTable>
                  </Entry>
                  <Entry>
                      <Instance>ac s1 s1</Instance>
                      <ProbTable>1.0</ProbTable>
                  </Entry>
                  <Entry>
                      <Instance>as s0 s0</Instance>
                      <ProbTable>1.0</ProbTable>
                  </Entry>
                  <Entry>
                      <Instance>as s1 s2</Instance>
                      <ProbTable>1.0</ProbTable>
                  </Entry>
                  <Entry>
                      <Instance>* s2 s2</Instance>
                      <ProbTable>1.0</ProbTable>
                  </Entry>
           </Parameter>
       </CondProb>
       <CondProb>
           <Var>rock_1</Var>
           <Parent>action_rover rover_0 rock_0</Parent>
           <Parameter>
               <Entry>
                   <Instance>amw * - - </Instance>
                   <ProbTable>1.0 0.0 0.0 1.0</ProbTable>
               </Entry>
               <Entry>
                   <Instance>ame * - - </Instance>
                   <ProbTable>identity</ProbTable>
               </Entry>
               <Entry>
                   <Instance>ac * - - </Instance>
                   <ProbTable>identity</ProbTable>
               </Entry>
               <Entry>
                   <Instance>as * - - </Instance>
                   <ProbTable>identity</ProbTable>
               </Entry>
               <Entry>
                   <Instance>as s0 * - </Instance>
                   <ProbTable>0.0 1.0</ProbTable>
               </Entry>
           </Parameter>
       </CondProb>
   </StateTransitionFunction>
   <ObsFunction>
       <CondProb>
           <Var>obs_sensor</Var>
           <Parent>action_rover rover_1 rock_1</Parent>
           <Parameter type="TBL">
               <Entry>
                   <Instance>amw * * - </Instance>
                   <ProbTable>1.0 0.0</ProbTable>
               </Entry>
               <Entry>
                   <Instance>ame * * - </Instance>
                   <ProbTable>1.0 0.0</ProbTable>
               </Entry>
               <Entry>
                   <Instance>as * * - </Instance>
                   <ProbTable>1.0 0.0</ProbTable>
               </Entry>
               <Entry>
                   <Instance>ac s0 - - </Instance>
                   <ProbTable>1.0 0.0 0.0 1.0</ProbTable>
               </Entry>
               <Entry>
                   <Instance>ac s1 - - </Instance>
                   <ProbTable>0.8 0.2 0.2 0.8</ProbTable>
               </Entry>
                <Entry>
                     <Instance>ac s2 * - </Instance>
                     <ProbTable>1.0 0.0</ProbTable>
                </Entry>
            </Parameter>
        </CondProb>
    </ObsFunction>
    <RewardFunction>
        <Func>
            <Var>reward_rover</Var>
            <Parent>action_rover rover_0 rock_0</Parent>
            <Parameter type="TBL">
                <Entry>
                     <Instance>ame s1 *</Instance>
                     <ValueTable>10</ValueTable>
                </Entry>
                <Entry>
                     <Instance>amw s0 *</Instance>
                     <ValueTable>-100</ValueTable>
                </Entry>
                <Entry>
                     <Instance>as s1 *</Instance>
                     <ValueTable>-100</ValueTable>
                </Entry>
                <Entry>
                     <Instance>as s0 good</Instance>
                     <ValueTable>10</ValueTable>
                </Entry>
                <Entry>
                     <Instance>as s0 bad</Instance>
                     <ValueTable>-10</ValueTable>
                </Entry>
            </Parameter>
        </Func>
    </RewardFunction>
 </pomdpx>
November 20, 2009, at 01:04 PM by 172.18.179.173 -
Changed lines 7-10 from:
to:
November 20, 2009, at 01:02 PM by 172.18.179.173 -
Changed lines 7-10 from:
to:
November 20, 2009, at 12:48 PM by 172.18.179.173 -
Deleted lines 5-8:
November 20, 2009, at 12:47 PM by 172.18.179.173 -
Changed line 11 from:
to:
November 20, 2009, at 12:47 PM by 172.18.179.173 -
Changed line 11 from:
to:
November 20, 2009, at 12:46 PM by 172.18.179.173 -
Changed line 11 from:
to:
November 20, 2009, at 12:46 PM by 172.18.179.173 -
Changed line 11 from:
to:
November 20, 2009, at 12:41 PM by 172.18.179.173 -
Added line 4:
Added line 6:
Added line 8:
Added line 10:
November 20, 2009, at 12:40 PM by 172.18.179.173 -
Changed line 1 from:

PomdpX Documentation

to:

PomdpX Documentation

November 20, 2009, at 12:40 PM by 172.18.179.173 -
Changed lines 1-767 from:

(:toc:)

Overview

PomdpX is an XML file format for describing POMDPs (partially observable Markov decision processes), MOMDPs (mixed observability Markov decision processes) [1] and MDPs (Markov decision processes) in a factored representation. It allows multiple state, observation, action and reward variables to be specified in a model. The specified model must have at least one state, action and reward variable, while the observation variable is optional. Each state variable must be specified as either partially observed (default) or fully observed. Thus a PomdpX input document can specify a POMDP (all state variables partially observed), a MOMDP (a mixture of partially observed and fully observed state variables), or an MDP (all state variables fully observed, no observation variables). In general, the model can be represented by the dynamic Bayesian network (DBN) in Figure 1.1. Each of x_t, y_t, o_t and a_t represents possibly multiple variables: x_t represents fully observed state variables, while y_t represents partially observed state variables. (The reward variables are omitted to prevent clutter.)


Figure 1.1 – The general model specified in a PomdpX document. Each of x_t, y_t, o_t and a_t represents multiple variables. The state (s_t) is represented as multiple fully observed (x_t) and partially observed (y_t) state variables.

PomdpX Tutorial

The purpose of this section is to provide a tutorial-like approach to using the PomdpX format. We make no assumptions about the user’s familiarity with existing POMDP solvers.

Example Problem

We will be using a modified version of the RockSample problem, first proposed by Smith and Simmons [2], as our running example to encode into the PomdpX format. It models a rover on an exploration mission that can achieve rewards by sampling rocks in its immediate area. Consider a map of size 1 × 3 as shown in Figure 2.1, with one rock at the left end and the terminal state at the right end. The rover starts off at the center and its possible actions are A = {West, East, Sample, Check}. The DBN for the RockSample problem is shown in Figure 2.2.


Figure 2.1 – The 1 × 3 RockSample problem world.

This is a trivial problem but is adequate to showcase the salient features of PomdpX. As with the original version of the problem, the Sample action samples the rock at the rover’s current location. If the rock is good, the rover receives a reward of 10 and the rock becomes bad. If the rock is bad, it receives a penalty of −10. Moving into the terminal area yields a reward of 10. A penalty of −100 is imposed for moving off the grid and for sampling in a grid where there is no rock. All other moves have no cost or reward. The Check action returns a noisy observation from O = {Good, Bad}.


Figure 2.2 – Dynamic Bayesian network of the RockSample problem. The rover’s position is fully observed whereas the rock type is partially observed.

Example 1. A PomdpX document.


<?xml version="1.0" encoding="ISO-8859-1"?>
<pomdpx version="0.1" id="rockSample"
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xsi:noNamespaceSchemaLocation="pomdpx.xsd">
      <Description>             · · · </Description>
      <Discount>                · · · </Discount>
      <Variable>                · · · </Variable>
      <InitialStateBelief>      · · · </InitialStateBelief>
      <StateTransitionFunction> · · · </StateTransitionFunction>
      <ObsFunction>             · · · </ObsFunction>
      <RewardFunction>          · · · </RewardFunction>
</pomdpx>

File Format Structure

A PomdpX document consists of a header and a pomdpx root element which in turn contains child elements, as shown in Example 1. The first line of the document is an XML processing instruction which defines that the document adheres to the XML 1.0 standard and that the encoding of the document is ISO-8859-1. Other encodings such as UTF-8 are also possible.

2.2.1 <pomdpx> Tag

Continuing with the example above, the second line contains the root-element of a PomdpX document—the pomdpx element—which has the following attributes:

  • version
  • id – optional name for the specified model.
  • xmlns:xsi – defines xsi as the XML Schema namespace.
  • xsi:noNamespaceSchemaLocation – this is where we put our XML Schema definition, pomdpx.xsd. The PomdpX input should be validated with this schema to ensure well-formedness.

The conventional ordering of the child elements is Description, Discount, Variable and thereafter: InitialStateBelief, StateTransitionFunction, ObsFunction and RewardFunction. However, this ordering is not strictly required and one may permute their orderings. Description is an optional, short description of the specified model. The other child elements specify the POMDP tuple (S, A, O, T, Z, R, γ) and the initial belief b0.

In general these elements should all be present, and each can appear only once. ObsFunction may be omitted if there are no observation variables in the model. Similarly, InitialStateBelief may be omitted if all state variables are fully observed (for example, in an MDP model). The child elements of pomdpx are described in greater detail in the following subsections.
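
The presence rules above can be checked mechanically. The following Python sketch uses the standard library's xml.etree.ElementTree; the function name check_children is hypothetical, and treating Discount, Variable, StateTransitionFunction and RewardFunction as mandatory is an assumption made for this sketch, since the text only says the elements "should" be present. The check does not depend on the order of the child elements, and it does not replace validation against pomdpx.xsd.

```python
import xml.etree.ElementTree as ET

# Assumption: these four are treated as mandatory; the text names only the
# three in OPTIONAL as elements that may be omitted.
REQUIRED = ("Discount", "Variable", "StateTransitionFunction", "RewardFunction")
OPTIONAL = ("Description", "InitialStateBelief", "ObsFunction")


def check_children(xml_text):
    """Return a list of problems found among the children of <pomdpx>."""
    root = ET.fromstring(xml_text)
    if root.tag != "pomdpx":
        return [f"root element is <{root.tag}>, expected <pomdpx>"]

    # Count how often each child tag occurs, ignoring document order.
    counts = {}
    for child in root:
        counts[child.tag] = counts.get(child.tag, 0) + 1

    problems = []
    for tag in REQUIRED:
        if tag not in counts:
            problems.append(f"missing <{tag}>")
    for tag, n in counts.items():
        if tag not in REQUIRED + OPTIONAL:
            problems.append(f"unknown child <{tag}>")
        elif n > 1:
            problems.append(f"<{tag}> appears {n} times")
    return problems
```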

2.2.2 <Description> Tag

This is an optional tag that one may provide to give a brief description of the specified problem. For example:

Example 2. Contents of Description.


 <Description> RockSample problem for map size 1 x 3.
 Rock is at 0, Rover’s initial position is at 1.
 Exit is at 2.
 </Description>

2.2.3 <Discount> Tag

This specifies the discount factor γ, which has to be a real-valued number. For our RockSample problem, we will be using a discount factor of 0.95, entered as shown:

Example 3. Contents of Discount.


 <Discount> 0.95 </Discount>

2.2.4 <Variable> Tag

The state, action and observation variables which factorize the state space S, action space A and observation space O are declared within the Variable element. Reward variables R are also declared here. Example 4 gives the declaration of the variables for the RockSample problem.

Each state variable is declared with the <StateVar> tag. It contains the following attributes:

  • vnamePrev – identifier for the variable’s start state.
  • vnameCurr – identifier for the variable’s end state.
  • fullyObs – set to true if the variable is fully observed. The default is false. Thus for the variable rock in Example 4, it is partially observed, as implied by the omission of the fullyObs attribute.

Example 4. Variable declaration. Defining S, A, O, and R variables.


<Variable>
     <StateVar vnamePrev="rover_0" vnameCurr="rover_1"
      fullyObs="true">
         <NumValues>3</NumValues>
     </StateVar>
     <StateVar vnamePrev="rock_0" vnameCurr="rock_1">
         <ValueEnum>good bad</ValueEnum>
     </StateVar>
     <ObsVar vname="obs_sensor">
         <ValueEnum>ogood obad</ValueEnum>
     </ObsVar>
     <ActionVar vname="action_rover">
         <ValueEnum>amw ame ac as</ValueEnum>
     </ActionVar>
     <RewardVar vname="reward_rover" />
</Variable>

The possible values that a variable can assume are specified with either the <NumValues> or the <ValueEnum> tag. In the former, we give an integer to indicate the number of values/states for the variable. For instance, in the example, the rover is declared with three possible values. The values are subsequently referenced internally using numerals, starting from 0 and prepended with ‘s’. Hence the states for the rover variable are s0, s1 and s2. When using <NumValues>, it is up to the user to attach semantic meaning to the values; in our example, s0 denotes the left grid, s1 the center and s2 the right terminal grid.

In the latter, the user has to manually enumerate all the possible values/states the variable may take on. In our example, the rock has two possible values: it is either good or bad.

The observation and action variables are also declared similarly with the <ObsVar> and <ActionVar> tags respectively. Both require the attribute vname which serves as the identifier for the variable. The possible values that an observation or action can assume can also be specified with either <NumValues> or <ValueEnum>. If <NumValues> is used, ‘o’ and ‘a’ would be prepended to the values of observation and action variables respectively.

In the case of <ValueEnum>, the user will once again need to enumerate all possible values/states manually. In our example, for the action_rover variable, we enumerate all the four possible actions. ‘amw’ is a mnemonic for action move west and ‘ac’ stands for action check and so on.
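
The naming rules for <NumValues> and <ValueEnum> can be summarized in a small Python sketch. The function value_names and its arguments are hypothetical names chosen for illustration; only the prefixes ‘s’, ‘o’ and ‘a’ come from the description above.

```python
# Prefix prepended to generated value names, per variable tag.
PREFIX = {"StateVar": "s", "ObsVar": "o", "ActionVar": "a"}


def value_names(tag, num_values=None, value_enum=None):
    """Return the value names of a variable declared with either tag style."""
    if value_enum is not None:
        # <ValueEnum>: the user lists the names, separated by whitespace.
        return value_enum.split()
    # <NumValues>: names are generated as prefix + index, starting from 0.
    return [f"{PREFIX[tag]}{i}" for i in range(num_values)]
```

For example, value_names("StateVar", num_values=3) yields the rover states s0, s1 and s2, while value_names("ActionVar", value_enum="amw ame ac as") returns the four enumerated actions unchanged.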

Finally, reward variables are declared with the <RewardVar> tags which must contain the vname attribute. The vname serves as an identifier for the reward variable. The <RewardVar> is an empty XML tag and no values are specified. Note that we may use the XML shorthand of <RewardVar vname="· · · " /> to close an empty tag here.

2.2.5 <InitialStateBelief> Tag

This is an optional tag. It specifies the initial belief b0, and may be omitted if all state variables are fully observed. The PomdpX format allows the initial belief to be specified as multiple multiplicative factors, with each <CondProb> tag specifying one of these factors. In our running RockSample problem, since the initial belief is not conditional on anything, it is factored as b0 = P (rover_0|∅)P (rock_0|∅). We will need two <CondProb> tags to specify it fully as shown below.

Example 5. Contents of InitialStateBelief.


  <InitialStateBelief>
      <CondProb>
         <Var>rover_0</Var>
         <Parent>null</Parent>
         <Parameter> · · · </Parameter>
      </CondProb>
      <CondProb>
         <Var>rock_0</Var>
         <Parent>null</Parent>
         <Parameter> · · · </Parameter>
      </CondProb>
   </InitialStateBelief>

The <CondProb> tag has no attributes and requires the following three child tags:

  • <Var> – identifies the factor being specified. Only identifiers declared as vnamePrev of state variables are allowed here (see Section 2.2.4).
  • <Parent> – the set of conditioning variables. Only identifiers declared as vnamePrev or vnameCurr of state variables are allowed here. The previous statement is actually slightly misleading, as PomdpX allows only certain combinations of vnamePrev and vnameCurr identifiers. Referring to Figure 1.1, we only allow conditioning arrows from x_t (fully observed variables) to y_t (partially observed variables) and not the other way round. Specifically, a vnameCurr identifier is allowed as parent only if the variable is fully observed. In addition, the keyword null may be used to signify the absence of any conditioning variables.
  • <Parameter> – specifies the actual probabilities in the factor and is described in detail in Section 2.3.

The previous example is somewhat cumbersome to declare if we have too many state variables. We could have alternatively specified b0 as simply the joint belief of all state variables, P (rover_0, rock_0), with a single <CondProb> tag as shown in Example 6.

Example 6. Initial joint belief specification.


<InitialStateBelief>
     <CondProb>
         <Var>rover_0 rock_0</Var>
         <Parent>null</Parent>
         <Parameter> · · · </Parameter>
     </CondProb>
</InitialStateBelief>

2.2.6 <StateTransitionFunction> Tag

This specifies the transition function T, which in general is the multiplicative result of the individual transition functions of each state variable in the model. Each <CondProb> tag specifies the transition function for one state variable. For our RockSample problem, with reference to Figure 2.2, the overall transition function is: P (rover_1, rock_1|action_rover, rover_0, rock_0) = P (rover_1|action_rover, rover_0) × P (rock_1|action_rover, rover_0, rock_0).

This is translated to the following in PomdpX. One can see that it is very similar to its equational counterpart, only it has XML tags wrapped around it. We need to provide two CondProb elements, one each for the variable rover and rock.

Example 7. Contents of StateTransitionFunction.


 <StateTransitionFunction>
      <CondProb>
           <Var>rover_1</Var>
           <Parent>action_rover rover_0</Parent>
           <Parameter> · · · </Parameter>
      </CondProb>
      <CondProb>
           <Var>rock_1</Var>
           <Parent>action_rover rover_0 rock_0</Parent>
           <Parameter> · · · </Parameter>
      </CondProb>
 </StateTransitionFunction>

As described in Section 2.2.5, the <Var> tag identifies the state variable whose transition function is being specified. In this case, only identifiers declared as the vnameCurr attribute of state variables are allowed here.

The identifiers within the <Parent> tag identify the conditioning variables in the transition function. They may be identifiers which had been declared as either the vnamePrev or vnameCurr attributes of state variables, or identifiers which had been declared as the vname attribute of action variables (see Section 2.2.4). Once again, we point out the caveat that PomdpX only allows certain combinations of vnamePrev and vnameCurr. One may only use vnameCurr identifiers within the <Parent> tag if the variable is fully observed. We defer the description of <Parameter> tag to Section 2.3 as it is fairly involved.
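
To make the multiplicative structure of T concrete, the Python sketch below multiplies the two factors of the RockSample transition function. The function names p_rover, p_rock and p_joint are hypothetical, and the probabilities are transcribed by hand from the table entries in Appendix A rather than read from a PomdpX file.

```python
# Deterministic rover moves for non-terminal positions, as listed in Appendix A.
NEXT_ROVER = {
    ("amw", "s0"): "s2", ("amw", "s1"): "s0",
    ("ame", "s0"): "s1", ("ame", "s1"): "s2",
    ("ac", "s0"): "s0", ("ac", "s1"): "s1",
    ("as", "s0"): "s0", ("as", "s1"): "s2",
}


def p_rover(action, rover_0, rover_1):
    """P(rover_1 | action_rover, rover_0)."""
    # The terminal state s2 is absorbing for every action.
    target = "s2" if rover_0 == "s2" else NEXT_ROVER[(action, rover_0)]
    return 1.0 if rover_1 == target else 0.0


def p_rock(action, rover_0, rock_0, rock_1):
    """P(rock_1 | action_rover, rover_0, rock_0)."""
    if action == "as" and rover_0 == "s0":
        # Sampling at the rock's location always leaves a bad rock.
        return 1.0 if rock_1 == "bad" else 0.0
    # Otherwise the rock keeps its state (identity).
    return 1.0 if rock_1 == rock_0 else 0.0


def p_joint(action, rover_0, rock_0, rover_1, rock_1):
    """Overall transition probability as the product of the two factors."""
    return (p_rover(action, rover_0, rover_1)
            * p_rock(action, rover_0, rock_0, rock_1))
```

Each factor corresponds to one <CondProb> element, and for every choice of action and current state the product sums to one over all next states.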

2.2.7 <ObsFunction> Tag

This specifies the observation function Z, which in general is the multiplicative result of the individual observation functions of each observation variable in the model. Each <CondProb> tag specifies one of these individual observation functions. In the RockSample problem, the probability of an observation is conditional on taking an action and ending in a new state. Thus its parents are action_rover, rover_1 and rock_1, as given in Example 8.

Example 8. Contents of ObsFunction.


 <ObsFunction>
       <CondProb>
            <Var>obs_sensor</Var>
            <Parent>action_rover rover_1 rock_1</Parent>
            <Parameter> · · · </Parameter>
       </CondProb>
 </ObsFunction>

For each CondProb element, the identifier within the <Var> tags identifies the observation variable whose observation function is being specified. The identifiers within the <Parent> tags identify the conditioning variables in the observation function. Identifiers that appear within the <Var> tags must have been declared as the vname attribute of observation variables. Identifiers that appear within the <Parent> tags must have been declared as the vnameCurr attribute of state variables, or the vname attribute of action variables (see Section 2.2.4). <Parameter> specifies the actual probabilities in the function and is described in Section 2.3.

2.2.8 <RewardFunction> Tag

This specifies the reward function R, which in general is the additive result of the individual reward functions of each reward variable in the model. Each <Func> tag specifies one of these individual reward functions. For our RockSample problem, the reward depends on the action taken at the current state, thus its parents are action_rover, rover_0 and rock_0. This is shown in Example 9.

Example 9. Contents of RewardFunction.


 <RewardFunction>
       <Func>
            <Var>reward_rover</Var>
            <Parent>action_rover rover_0 rock_0</Parent>
            <Parameter> · · · </Parameter>
       </Func>
 </RewardFunction>

Similar to the <CondProb> tag, the <Func> tag has no attributes and requires the following three child tags to be defined:

  • <Var> – this identifies the reward variable whose reward function is being specified. Only identifiers that had been declared as the vname attribute of reward variables may appear here.

  • <Parent> – this identifies the domain of the reward function. All identifiers declared as vnamePrev or vnameCurr attributes of state variables, vname attribute of action variables or vname attribute of observation variables are allowed here.
  • <Parameter> – specifies the actual values in the function and is described in detail in Section 2.3.

<Parameter> Tag

The <Parameter> tag is a fairly complicated component of PomdpX, introducing several new keywords and symbols, thus it warrants an individual section in itself. It has an optional attribute called type, which has possible values TBL (default) and DD, short for table and decision diagram, respectively. We will describe how to encode the RockSample problem both in TBL and DD.

2.3.1 Table Type (TBL)

When the <Parameter> tag appears as a child of a CondProb element, it must contain <Entry> child tags. Each Entry element specifies the probability entry of a function table. The <Entry> tag itself must consist of the following:

  • <Instance> – declares all the variables for the probability function. Each variable value must correspond to the identifiers that appear between the enclosing <Parent> tag, followed by the identifier that appears between the enclosing <Var> tag.
  • <ProbTable> – specifies the actual numerical values of the probabilities.

This is best illustrated by Example 10 below. With reference to Figure 2.2, we show the full encoding of the rock's transition function for the rover's action of moving West. From the example, the <Var> tag declares that we are defining the transition function for the variable rock (line 3). It is conditional on action_rover, rover_0 and rock_0, which appear between the <Parent> tags (line 4). The first <Entry> set (lines 6–9) specifies:

P (rock_1 = good|action_rover = amw, rover_0 = s0, rock_0 = good) = 1.0.

In this case, when action_rover is amw, and rock_0 is good, rock_1 will be good as well, since a move action will not disturb its state. Conversely, if action_rover is amw, and rock_0 is good it is impossible for rock_1 to be bad as specified by lines 18–29.

Note that order matters here and it might be the source of some subtle bugs if overlooked. As mentioned before, the conditioning variables declared between the <Instance> tags (first three elements in line 7) correspond to the order in which they appear in the enclosing <Parent> tag; the last element corresponds to the variable being defined. One may arbitrarily re-order the conditioning variables as long as they match up within the <Parent> and <Instance> tags and the last element is always the identifier defined by <Var>. The convention that we adopt is to declare actions first, then fully observed variables, followed by partially observed variables.
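
The positional correspondence between <Parent>, <Var> and <Instance> can be sketched in Python as below. The helper label_instance is a hypothetical name used only for illustration; it pairs each token of an <Instance> with the identifier it refers to and rejects entries with the wrong number of tokens.

```python
def label_instance(parent, var, instance):
    """Pair each <Instance> token with the variable it refers to.

    parent   -- text of the <Parent> tag, e.g. "action_rover rover_0 rock_0"
    var      -- text of the <Var> tag, e.g. "rock_1"
    instance -- text of the <Instance> tag, e.g. "amw s0 good good"
    """
    names = parent.split() + [var]   # parents in order, defined variable last
    values = instance.split()
    if len(names) != len(values):
        raise ValueError("Instance must have one token per parent plus one for Var")
    return dict(zip(names, values))
```

Swapping two identifiers in <Parent> without swapping the matching tokens in <Instance> would silently assign values to the wrong variables, which is the kind of subtle bug the paragraph above warns about.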

Example 10. Contents of Parameter type="TBL", within CondProb.


1. <StateTransitionFunction>
2.       <CondProb>
3.           <Var>rock_1</Var>
4.           <Parent>action_rover rover_0 rock_0</Parent>
5.           <Parameter type = "TBL">
6.               <Entry>
7.                    <Instance>amw s0 good good</Instance>
8.                    <ProbTable>1.0</ProbTable>
9.               </Entry>
10.              <Entry>
11.                   <Instance>amw s1 good good</Instance>
12.                   <ProbTable>1.0</ProbTable>
13.              </Entry>
14.              <Entry>
15.                   <Instance>amw s2 good good</Instance>
16.                   <ProbTable>1.0</ProbTable>
17.              </Entry>
18.              <Entry>
19.                   <Instance>amw s0 good bad</Instance>
20.                   <ProbTable>0.0</ProbTable>
21.              </Entry>
22.              <Entry>
23.                   <Instance>amw s1 good bad</Instance>
24.                   <ProbTable>0.0</ProbTable>
25.              </Entry>
26.              <Entry>
27.                   <Instance>amw s2 good bad</Instance>
28.                   <ProbTable>0.0</ProbTable>
29.              </Entry>
30.              <Entry>
31.                   <Instance>amw s0 bad good</Instance>
32.                   <ProbTable>0.0</ProbTable>
33.              </Entry>
34.              <Entry>
35.                   <Instance>amw s1 bad good</Instance>
36.                   <ProbTable>0.0</ProbTable>
37.              </Entry>
38.              <Entry>
39.                   <Instance>amw s2 bad good</Instance>
40.                   <ProbTable>0.0</ProbTable>
41.              </Entry>
42.              <Entry>
43.                   <Instance>amw s0 bad bad</Instance>
44.                   <ProbTable>1.0</ProbTable>
45.              </Entry>
46.              <Entry>
47.                   <Instance>amw s1 bad bad</Instance>
48.                   <ProbTable>1.0</ProbTable>
49.              </Entry>
50.              <Entry>
51.                   <Instance>amw s2 bad bad</Instance>
52.                   <ProbTable>1.0</ProbTable>
53.              </Entry>
54.          </Parameter>
55.      </CondProb>
56. </StateTransitionFunction>

It may seem daunting that it takes 56 lines just to declare the transition function of the rock for a simple 1 × 3 grid, and only for the rover's action of moving West. XML is verbose by nature, and that is the price to pay for interoperability and extensibility. However, PomdpX does provide several convenience features to ease the encoding task.

First and foremost, lines 18–41 are actually redundant, since any entry not specified is assumed to be zero. Secondly, we observe that the first three <Entry> sets (lines 6–17) are very similar. They differ only in the state of rover_0, and s0 to s2 are all the possible states of the rover. In such a situation, we may use the wildcard character “*”, which means that the entry holds for all possible values that could appear in that position. Therefore, lines 6–17 can be replaced by just one <Entry> set, and the same is true for lines 42–53. Example 10 is rewritten more succinctly as Example 11.

Example 11. Usage of wildcard character *.


1. <StateTransitionFunction>
2.       <CondProb>
3.           <Var>rock_1</Var>
4.           <Parent>action_rover rover_0 rock_0</Parent>
5.           <Parameter type = "TBL">
6.               <Entry>
7.                    <Instance>amw * good good</Instance>
8.                    <ProbTable>1.0</ProbTable>
9.               </Entry>
10.              <Entry>
11.                   <Instance>amw * bad bad</Instance>
12.                   <ProbTable>1.0</ProbTable>
13.              </Entry>
14.          </Parameter>
15.      </CondProb>
16. </StateTransitionFunction>

As some probabilities of the rock's transition are zero, they may be conveniently left out. In certain cases, however, a variable may have transition probabilities that are all non-zero. PomdpX provides another special character, “-”, to handle this. The “-” character means: cycle through all possible values that could appear in that position and match them, in order, with the probabilities listed in <ProbTable>. Hence, Example 11 can also be expressed as:

Example 12. Usage of character -.


1. <StateTransitionFunction>
2.       <CondProb>
3.           <Var>rock_1</Var>
4.           <Parent>action_rover rover_0 rock_0</Parent>
5.           <Parameter type = "TBL">
6.               <Entry>
7.                     <Instance>amw * good - </Instance>
8.                     <ProbTable>1.0 0.0</ProbTable>
9.               </Entry>
10.              <Entry>
11.                    <Instance>amw * bad - </Instance>
12.                    <ProbTable>0.0 1.0</ProbTable>
13.              </Entry>
14.          </Parameter>
15.      </CondProb>
16. </StateTransitionFunction>

Although it is not obvious here, one can imagine that if both entries were non-zero, the use of “-” would save us from having to specify another <Entry> set.
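As a hedged sketch of that situation, suppose that moving West could damage a good rock with probability 0.1. These numbers are purely hypothetical and are not part of the RockSample model; they only show how one “-” entry stands in for two single-value entries.

```xml
<!-- Hypothetical probabilities, for illustration only -->
<Entry>
    <!-- P(rock_1 = good) = 0.9 and P(rock_1 = bad) = 0.1, given rock_0 = good -->
    <Instance>amw * good -</Instance>
    <ProbTable>0.9 0.1</ProbTable>
</Entry>

<!-- The same two probabilities written without the "-" character -->
<Entry>
    <Instance>amw * good good</Instance>
    <ProbTable>0.9</ProbTable>
</Entry>
<Entry>
    <Instance>amw * good bad</Instance>
    <ProbTable>0.1</ProbTable>
</Entry>
```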

With the introduction of the “-” character, the first <Entry> set (lines 6–9) in Example 12 is in effect specifying the following:

P (rock_1 = good|action_rover = amw, rover_0 = ∗, rock_0 = good) = 1.0 and P (rock_1 = bad|action_rover = amw, rover_0 = ∗, rock_0 = good) = 0.0.

There is also an implicit ordering in Example 12. For instance, the “-” in the first <Entry> set (lines 6–9) considers the possible values of rock to be good first and then bad; hence the <ProbTable> entries are listed as (1.0 0.0) rather than (0.0 1.0). This “internal” order is taken from the way rock is declared in the <ValueEnum> tag (see Section 2.2.4), in which its possible values were declared as good first, then bad.
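To see why the declaration order matters, consider a hypothetical variant in which the rock's values are declared in the opposite order. This variant is an assumption made for illustration and is not how the RockSample model is declared. The two fragments below would then belong together.

```xml
<!-- Hypothetical declaration order: bad first, then good -->
<StateVar vnamePrev="rock_0" vnameCurr="rock_1">
    <ValueEnum>bad good</ValueEnum>
</StateVar>

<!-- Same distribution as lines 6-9 of Example 12, now listed as (bad, good) -->
<Entry>
    <Instance>amw * good -</Instance>
    <ProbTable>0.0 1.0</ProbTable>
</Entry>
```

The probabilities themselves are unchanged: rock_1 is still good with probability 1.0. Only the position of each number within <ProbTable> moves.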

In the quest for further compression, there is a final modification we can make to Example 12. We observe that the two <Entry> sets are somewhat complementary, differing only in the state of rock_0 and in the <ProbTable> entries. Employing the same trick as in Example 12, we can replace the state of rock_0 with a “-”. This gives us Example 13.

Example 13. Usage of double -.


1. <StateTransitionFunction>
2.       <CondProb>
3.           <Var>rock_1</Var>
4.           <Parent>action_rover rover_0 rock_0</Parent>
5.           <Parameter type = "TBL">
6.               <Entry>
7.                    <Instance>amw * - - </Instance>
8.                    <ProbTable>1.0 0.0 0.0 1.0</ProbTable>
9.               </Entry>
10.          </Parameter>
11.      </CondProb>
12. </StateTransitionFunction>

By using double “-”, the single <Entry> set in Example 13 is equivalent to specifying the following:

P (rock_1 = good|action_rover = amw, rover_0 = ∗, rock_0 = good) = 1.0
P (rock_1 = bad|action_rover = amw, rover_0 = ∗, rock_0 = good) = 0.0
P (rock_1 = good|action_rover = amw, rover_0 = ∗, rock_0 = bad) = 0.0
and
P (rock_1 = bad|action_rover = amw, rover_0 = ∗, rock_0 = bad) = 1.0.

The <ProbTable> entries in Example 13 are in effect a 2 × 2 identity matrix. Hence the PomdpX format also allows the keyword identity to be used in lieu of enumerating all the ones and zeros (as in line 8). Examples 13 and 14 are therefore functionally equivalent.

Example 14. Usage of keyword identity.


1. <StateTransitionFunction>
2.       <CondProb>
3.           <Var>rock_1</Var>
4.           <Parent>action_rover rover_0 rock_0</Parent>
5.           <Parameter type = "TBL">
6.               <Entry>
7.                    <Instance>amw * - - </Instance>
8.                    <ProbTable>identity</ProbTable>
9.               </Entry>
10.          </Parameter>
11.      </CondProb>
12. </StateTransitionFunction>

Another recognized keyword that may be used within the <ProbTable> tag is uniform. It is equivalent to the probability 1/n repeated n times, where n is the number of possible values that could appear here. For example, the <Entry> tag below,

Example 15. Usage of keyword uniform.


<InitialStateBelief>
     <CondProb>
         <Var>rock_0</Var>
         <Parent>null</Parent>
         <Parameter type = "TBL">
             <Entry>
                  <Instance> - </Instance>
                  <ProbTable>uniform</ProbTable>
             </Entry>
         </Parameter>
     </CondProb>
</InitialStateBelief>

gives: P (rock_0 = good|∅) = 0.5 and P (rock_0 = bad|∅) = 0.5, which specifies our initial belief that the rock has equal probability of being good or bad.
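Following the 1/n rule stated above, with n = 2 values for the rock, the same entry can be sketched with the probabilities written out explicitly:

```xml
<Entry>
    <Instance>-</Instance>
    <!-- 1/n repeated n times, with n = 2 (good, bad) -->
    <ProbTable>0.5 0.5</ProbTable>
</Entry>
```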

Besides being a child of the CondProb element, the <Parameter> tag may also appear as a child of the Func element, which is used to define the reward function. In this case, the <Entry> tag within the <Parameter> must contain the following:

  • <Instance> – declares the values of all the variables for the reward function. Each value must correspond, in order, to the identifiers that appear within the enclosing <Parent> tag.
  • <ValueTable> – specifies the actual numerical reward.

Example 16 shows a snippet defining the reward function for the rover. In this example, the <Entry> specifies:

R_reward_rover (action_rover = ame, rover_0 = s1, rock_0 = ∗) = 10.

By now, the wildcard character “*” should be familiar to the user. Its use here denotes the fact that the rover will obtain a reward of 10 moving East from s1 (to the terminal state), regardless of whether the rock is good or bad.
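Based on the RewardFunction block of Appendix A, the entry discussed here can be sketched as follows. Only this one entry is shown, and the identifiers are written with underscores on the assumption that they match the names used in Example 10.

```xml
<RewardFunction>
    <Func>
        <Var>reward_rover</Var>
        <Parent>action_rover rover_0 rock_0</Parent>
        <Parameter type="TBL">
            <Entry>
                <!-- Moving East from s1 yields 10, whatever the rock's state -->
                <Instance>ame s1 *</Instance>
                <ValueTable>10</ValueTable>
            </Entry>
        </Parameter>
    </Func>
</RewardFunction>
```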

Note that the characters “*” and “-” can be used in a similar manner as described in the previous sections. However, the keywords uniform and identity cannot appear between <ValueTable> tags, since those keywords only make sense for probabilities and not rewards.
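As an illustration of “-” inside a reward table, the two sampling rewards at s0 listed in Appendix A (10 for a good rock, -10 for a bad one) could plausibly be merged into a single entry. This merged form is a sketch that assumes <ValueTable> accepts a list of values in the same way <ProbTable> does; it does not appear in the appendix.

```xml
<Entry>
    <!-- Values follow the ValueEnum order of rock: good, then bad -->
    <Instance>as s0 -</Instance>
    <ValueTable>10 -10</ValueTable>
</Entry>
```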

We reiterate here that any probability or value entries of a function table that are not specified within a <Parameter> tag are assumed to be zero. Furthermore, a particular probability or value entry can be specified more than once; the definition that appears last within a <Parameter> tag is the one that takes effect. This is convenient for specifying exceptions to a more general specification. The full compact version of the PomdpX input file for the RockSample problem with <Parameter type="TBL"> is given in Appendix A.
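The rock's transition function in Appendix A appears to rely on exactly this rule for the sample action. The fragment below repeats those two entries, with comments added here to spell out one reading of them: a general rule followed by an exception that overrides it.

```xml
<Entry>
    <!-- General rule: sampling leaves the rock's state unchanged -->
    <Instance>as * - -</Instance>
    <ProbTable>identity</ProbTable>
</Entry>
<Entry>
    <!-- Exception, listed later so that it takes effect:
         after sampling at s0 the rock is bad, whatever it was before -->
    <Instance>as s0 * -</Instance>
    <ProbTable>0.0 1.0</ProbTable>
</Entry>
```

If the two entries were swapped, the identity entry would come last and would overwrite the exception for s0, so the order of the entries is part of the specification.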

Appendix A

RockSample.pomdpx, type="TBL"

Full Specification of RockSample problem in PomdpX.

   <?xml version="1.0" encoding="ISO-8859-1"?>
   <pomdpx version="0.1" id="rockSample"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="pomdpx.xsd">
         <Description>RockSample problem for map size 1 x 3.
           Rock is at 0, Rover's initial position is at 1.
           Exit is at 2.
         </Description>
         <Discount>0.95</Discount>
         <Variable>
              <StateVar vnamePrev="rover_0" vnameCurr="rover_1"
                fullyObs="true">
                  <NumValues>3</NumValues>
              </StateVar>
              <StateVar vnamePrev="rock_0" vnameCurr="rock_1">
                  <ValueEnum>good bad</ValueEnum>
              </StateVar>
              <ObsVar vname="obs_sensor">
                  <ValueEnum>ogood obad</ValueEnum>
              </ObsVar>
              <ActionVar vname="action_rover">
                  <ValueEnum>amw ame ac as</ValueEnum>
              </ActionVar>
              <RewardVar vname="reward_rover" />
         </Variable>
         <InitialStateBelief>
              <CondProb>
                  <Var>rover_0</Var>
                  <Parent>null</Parent>
                  <Parameter type="TBL">
                        <Entry>
                            <Instance> - </Instance>
                            <ProbTable>0.0 1.0 0.0</ProbTable>
                        </Entry>
              </Parameter>
         </CondProb>
         <CondProb>
              <Var>rock_0</Var>
              <Parent>null</Parent>
              <Parameter type="TBL">
                  <Entry>
                      <Instance>-</Instance>
                      <ProbTable>uniform</ProbTable>
                  </Entry>
              </Parameter>
         </CondProb>
      </InitialStateBelief>
      <StateTransitionFunction>
          <CondProb>
              <Var>rover_1</Var>
              <Parent>action_rover rover_0</Parent>
              <Parameter type="TBL">
                  <Entry>
                      <Instance>amw s0 s2</Instance>
                      <ProbTable>1.0</ProbTable>
                  </Entry>
                  <Entry>
                      <Instance>amw s1 s0</Instance>
                      <ProbTable>1.0</ProbTable>
                  </Entry>
                  <Entry>
                      <Instance>ame s0 s1</Instance>
                      <ProbTable>1.0</ProbTable>
                  </Entry>
                  <Entry>
                      <Instance>ame s1 s2</Instance>
                      <ProbTable>1.0</ProbTable>
                  </Entry>
                  <Entry>
                      <Instance>ac s0 s0</Instance>
                      <ProbTable>1.0</ProbTable>
                  </Entry>
                  <Entry>
                      <Instance>ac s1 s1</Instance>
                      <ProbTable>1.0</ProbTable>
                  </Entry>
                  <Entry>
                      <Instance>as s0 s0</Instance>
                      <ProbTable>1.0</ProbTable>
                  </Entry>
                  <Entry>
                      <Instance>as s1 s2</Instance>
                      <ProbTable>1.0</ProbTable>
                  </Entry>
                  <Entry>
                      <Instance>* s2 s2</Instance>
                      <ProbTable>1.0</ProbTable>
                  </Entry>
           </Parameter>
       </CondProb>
       <CondProb>
           <Var>rock_1</Var>
           <Parent>action_rover rover_0 rock_0</Parent>
           <Parameter type="TBL">
               <Entry>
                   <Instance>amw * - - </Instance>
                   <ProbTable>1.0 0.0 0.0 1.0</ProbTable>
               </Entry>
               <Entry>
                   <Instance>ame * - - </Instance>
                   <ProbTable>identity</ProbTable>
               </Entry>
               <Entry>
                   <Instance>ac * - - </Instance>
                   <ProbTable>identity</ProbTable>
               </Entry>
               <Entry>
                   <Instance>as * - - </Instance>
                   <ProbTable>identity</ProbTable>
               </Entry>
               <Entry>
                   <Instance>as s0 * - </Instance>
                   <ProbTable>0.0 1.0</ProbTable>
               </Entry>
           </Parameter>
       </CondProb>
   </StateTransitionFunction>
   <ObsFunction>
       <CondProb>
           <Var>obs_sensor</Var>
           <Parent>action_rover rover_1 rock_1</Parent>
           <Parameter type="TBL">
               <Entry>
                   <Instance>amw * * - </Instance>
                   <ProbTable>1.0 0.0</ProbTable>
               </Entry>
               <Entry>
                   <Instance>ame * * - </Instance>
                   <ProbTable>1.0 0.0</ProbTable>
               </Entry>
               <Entry>
                   <Instance>as * * - </Instance>
                   <ProbTable>1.0 0.0</ProbTable>
               </Entry>
               <Entry>
                   <Instance>ac s0 - - </Instance>
                   <ProbTable>1.0 0.0 0.0 1.0</ProbTable>
               </Entry>
               <Entry>
                   <Instance>ac s1 - - </Instance>
                   <ProbTable>0.8 0.2 0.2 0.8</ProbTable>
               </Entry>
                <Entry>
                     <Instance>ac s2 * - </Instance>
                     <ProbTable>1.0 0.0</ProbTable>
                </Entry>
            </Parameter>
        </CondProb>
    </ObsFunction>
    <RewardFunction>
        <Func>
            <Var>reward_rover</Var>
            <Parent>action_rover rover_0 rock_0</Parent>
            <Parameter type="TBL">
                <Entry>
                     <Instance>ame s1 *</Instance>
                     <ValueTable>10</ValueTable>
                </Entry>
                <Entry>
                     <Instance>amw s0 *</Instance>
                     <ValueTable>-100</ValueTable>
                </Entry>
                <Entry>
                     <Instance>as s1 *</Instance>
                     <ValueTable>-100</ValueTable>
                </Entry>
                <Entry>
                     <Instance>as s0 good</Instance>
                     <ValueTable>10</ValueTable>
                </Entry>
                <Entry>
                     <Instance>as s0 bad</Instance>
                     <ValueTable>-10</ValueTable>
                </Entry>
            </Parameter>
        </Func>
    </RewardFunction>
 </pomdpx>
to:
November 20, 2009, at 12:27 PM by 172.18.179.173 -
Changed line 1 from:

(::toc::)

to:

(:toc:)

November 20, 2009, at 12:27 PM by 172.18.179.173 -
November 20, 2009, at 12:25 PM by 172.18.179.173 -
Changed line 1 from:

(:toc:)

to:

(::toc::)

November 19, 2009, at 10:05 PM by 220.255.134.72 -
Changed line 585 from:

Appendix A

to:

Appendix A

September 23, 2009, at 09:49 AM by 172.18.178.201 -
Changed line 66 from:
      <RewardFunction>         · · · </RewardFunction>
to:
      <RewardFunction>          · · · </RewardFunction>
September 22, 2009, at 06:00 PM by 172.18.178.220 -
Changed line 767 from:

</pomdpx>

to:
 </pomdpx>
September 18, 2009, at 02:11 PM by 172.18.179.74 -
Changed line 585 from:

Appendix A

to:

Appendix A

September 18, 2009, at 02:11 PM by 172.18.179.74 -
Added lines 583-767:

Appendix A

RockSample.pomdpx, type="TBL"

Full Specification of RockSample problem in PomdpX.

   <?xml version="1.0" encoding="ISO-8859-1"?>
   <pomdpx version="0.1" id="rockSample"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="pomdpx.xsd">
         <Description>RockSample problem for map size 1 x 3.
           Rock is at 0, Rover’s initial position is at 1.
           Exit is at 2.
         </Description>
         <Discount>0.95</Discount>
         <Variable>
              <StateVar vnamePrev="rover 0" vnameCurr="rover 1"
                fullyObs="true">
                  <NumValues>3</NumValues>
              </StateVar>
              <StateVar vnamePrev="rock 0" vnameCurr="rock 1">
                  <ValueEnum>good bad</ValueEnum>
              </StateVar>
              <ObsVar vname="obs sensor">
                  <ValueEnum>ogood obad</ValueEnum>
              </ObsVar>
              <ActionVar vname="action rover">
                  <ValueEnum>amw ame ac as</ValueEnum>
              </ActionVar>
              <RewardVar vname="reward rover" />
         </Variable>
         <InitialStateBelief>
              <CondProb>
                  <Var>rover 0</Var>
                  <Parent>null</Parent>
                  <Parameter type="TBL">
                        <Entry>
                            <Instance> - </Instance>
                            <ProbTable>0.0 1.0 0.0</ProbTable>
                        </Entry>
              </Parameter>
         </CondProb>
         <CondProb>
              <Var>rock 0</Var>
              <Parent>null</Parent>
              <Parameter type="TBL">
                  <Entry>
                      <Instance>-</Instance>
                      <ProbTable>uniform</ProbTable>
                  </Entry>
              </Parameter>
         </CondProb>
      </InitialStateBelief>
      <StateTransitionFunction>
          <CondProb>
              <Var>rover 1</Var>
              <Parent>action rover rover 0</Parent>
              <Parameter type="TBL">
                  <Entry>
                      <Instance>amw s0 s2</Instance>
                      <ProbTable>1.0</ProbTable>
                  </Entry>
                  <Entry>
                      <Instance>amw s1 s0</Instance>
                      <ProbTable>1.0</ProbTable>
                  </Entry>
                  <Entry>
                      <Instance>ame s0 s1</Instance>
                      <ProbTable>1.0</ProbTable>
                  </Entry>
                  <Entry>
                      <Instance>ame s1 s2</Instance>
                      <ProbTable>1.0</ProbTable>
                  </Entry>
                  <Entry>
                      <Instance>ac s0 s0</Instance>
                      <ProbTable>1.0</ProbTable>
                  </Entry>
                  <Entry>
                      <Instance>ac s1 s1</Instance>
                      <ProbTable>1.0</ProbTable>
                  </Entry>
                  <Entry>
                      <Instance>as s0 s0</Instance>
                      <ProbTable>1.0</ProbTable>
                  </Entry>
                  <Entry>
                      <Instance>as s1 s2</Instance>
                      <ProbTable>1.0</ProbTable>
                  </Entry>
                  <Entry>
                      <Instance>* s2 s2</Instance>
                      <ProbTable>1.0</ProbTable>
                  </Entry>
           </Parameter>
       </CondProb>
       <CondProb>
           <Var>rock 1</Var>
           <Parent>action rover rover 0 rock 0</Parent>
           <Parameter>
               <Entry>
                   <Instance>amw * - - </Instance>
                   <ProbTable>1.0 0.0 0.0 1.0</ProbTable>
               </Entry>
               <Entry>
                   <Instance>ame * - - </Instance>
                   <ProbTable>identity</ProbTable>
               </Entry>
               <Entry>
                   <Instance>ac * - - </Instance>
                   <ProbTable>identity</ProbTable>
               </Entry>
               <Entry>
                   <Instance>as * - - </Instance>
                   <ProbTable>identity</ProbTable>
               </Entry>
               <Entry>
                   <Instance>as s0 * - </Instance>
                   <ProbTable>0.0 1.0</ProbTable>
               </Entry>
           </Parameter>
       </CondProb>
   </StateTransitionFunction>
   <ObsFunction>
       <CondProb>
           <Var>obs sensor</Var>
           <Parent>action rover rover 1 rock 1</Parent>
           <Parameter type="TBL">
               <Entry>
                   <Instance>amw * * - </Instance>
                   <ProbTable>1.0 0.0</ProbTable>
               </Entry>
               <Entry>
                   <Instance>ame * * - </Instance>
                   <ProbTable>1.0 0.0</ProbTable>
               </Entry>
               <Entry>
                   <Instance>as * * - </Instance>
                   <ProbTable>1.0 0.0</ProbTable>
               </Entry>
               <Entry>
                   <Instance>ac s0 - - </Instance>
                   <ProbTable>1.0 0.0 0.0 1.0</ProbTable>
               </Entry>
               <Entry>
                   <Instance>ac s1 - - </Instance>
                   <ProbTable>0.8 0.2 0.2 0.8</ProbTable>
               </Entry>
                <Entry>
                     <Instance>ac s2 * - </Instance>
                     <ProbTable>1.0 0.0</ProbTable>
                </Entry>
            </Parameter>
        </CondProb>
    </ObsFunction>
    <RewardFunction>
        <Func>
            <Var>reward rover</Var>
            <Parent>action rover rover 0 rock 0</Parent>
            <Parameter type="TBL">
                <Entry>
                     <Instance>ame s1 *</Instance>
                     <ValueTable>10</ValueTable>
                </Entry>
                <Entry>
                     <Instance>amw s0 *</Instance>
                     <ValueTable>-100</ValueTable>
                </Entry>
                <Entry>
                     <Instance>as s1 *</Instance>
                     <ValueTable>-100</ValueTable>
                </Entry>
                <Entry>
                     <Instance>as s0 good</Instance>
                     <ValueTable>10</ValueTable>
                </Entry>
                <Entry>
                     <Instance>as s0 bad</Instance>
                     <ValueTable>-10</ValueTable>
                </Entry>
            </Parameter>
        </Func>
    </RewardFunction>

</pomdpx>

September 18, 2009, at 01:56 PM by 172.18.179.74 -
Changed lines 507-510 from:

P (rock 1 = good|action rover = amw, rover 0 = ∗, rock 0 = good) = 1.0 P (rock 1 = bad|action rover = amw, rover 0 = ∗, rock 0 = good) = 0.0 P (rock 1 = good|action rover = amw, rover 0 = ∗, rock 0 = bad) = 0.0 and

to:

P (rock 1 = good|action rover = amw, rover 0 = ∗, rock 0 = good) = 1.0
P (rock 1 = bad|action rover = amw, rover 0 = ∗, rock 0 = good) = 0.0
P (rock 1 = good|action rover = amw, rover 0 = ∗, rock 0 = bad) = 0.0
and\\

September 18, 2009, at 01:51 PM by 172.18.179.74 -
Changed line 554 from:

gives: P (rock 0 = good|∅) = 1 and P (rock 0 = bad|∅) = 2 , which specifies our

to:

gives: P (rock 0 = good|∅) = 0.5 and P (rock 0 = bad|∅) = 0.5 , which specifies our

September 18, 2009, at 01:45 PM by 172.18.179.74 -
Changed lines 50-51 from:

Figure 2.2 – Dynamic Bayesian network of the RockSample problem. The

rover’s position is fully observed whereas the rock type is partially observed.

to:

Figure 2.2 – Dynamic Bayesian network of the RockSample problem. The rover’s position is fully observed whereas the rock type is partially observed.
Changed lines 323-325 from:

itself. It has an optional attribute called type, which has possible values TBL (default) and DD, short for table and decision diagram, respectively. We will describe how to encode the RockSample problem both in TBL and DD.

to:

itself. It has an optional attribute called type, which has possible values TBL (default) and DD, short for table and decision diagram, respectively. We will describe how to encode the RockSample problem both in TBL and DD.

September 18, 2009, at 01:32 PM by 172.18.179.74 -
Changed line 229 from:

This specifies the transition function '_T , which in general is the multiplicative

to:

This specifies the transition function T , which in general is the multiplicative

Changed line 270 from:

This specifies the observation function Z, which in general is the multiplicative

to:

This specifies the observation function Z, which in general is the multiplicative

Changed line 277 from:

Example 8. Contents of ObsFunction.

to:

Example 8. Contents of ObsFunction.

Changed line 304 from:

Example 9. Contents of RewardFunction.

to:

Example 9. Contents of RewardFunction.

September 18, 2009, at 01:29 PM by 172.18.179.74 -
Changed line 242 from:

Example 7. Contents of StateTransitionFunction.

to:

Example 7. Contents of StateTransitionFunction.

September 18, 2009, at 01:27 PM by 172.18.179.74 -
Changed line 229 from:

This specifies the transition function T , which in general is the multiplicative

to:

This specifies the transition function '_T , which in general is the multiplicative

Changed lines 234-235 from:
  P (rover_1, rock_1|action_rover, rover_0, rock_0) =
  P (rover_1|action_rover, rover_0) × P (rock_1|action_rover, rover_0, rock_0).
to:

P (rover_1, rock_1|action_rover, rover_0, rock_0) = P (rover_1|action_rover, rover_0) × P (rock_1|action_rover, rover_0, rock_0).

September 18, 2009, at 01:26 PM by 172.18.179.74 -
Changed line 218 from:

Example 6. Initial joint belief specification.

to:

Example 6. Initial joint belief specification.

Changed line 228 from:

2.2.6 <StateTransitionFunction> Tag

to:

2.2.6 <StateTransitionFunction> Tag

September 18, 2009, at 01:12 PM by 172.18.179.74 -
Changed lines 210-212 from:
  • <Parent> – the set of conditioning variables. Only identifiers declared as vnamePrev or vnameCurr of state variables are allowed here. The previous statement is actually slightly misleading, as PomdpX allows certain combinations of vnamePrev and vnameCurr identifiers. Referring to Figure 1.1, we only allow conditioning arrows from xt (fully observed variables) to yt (partially observed variables) and not the other way round. Specifically, a vnameCurr identifier is allowed as parent only if the variable is fully

observed. In addition, the keyword null may be used to signify the absence of any conditioning variables.

to:
  • <Parent> – the set of conditioning variables. Only identifiers declared as vnamePrev or vnameCurr of state variables are allowed here. The previous statement is actually slightly misleading, as PomdpX allows certain combinations of vnamePrev and vnameCurr identifiers. Referring to Figure 1.1, we only allow conditioning arrows from xt (fully observed variables) to yt (partially observed variables) and not the other way round. Specifically, a vnameCurr identifier is allowed as parent only if the variable is fully observed. In addition, the keyword null may be used to signify the absence of any vconditioning variables.
September 18, 2009, at 01:11 PM by 172.18.179.74 -
Changed lines 211-212 from:

observed. In addition, the keyword null may be used to signify the absence of any

to:

observed. In addition, the keyword null may be used to signify the absence of any

Changed lines 233-234 from:

Each <CondProb> tag specifies the transition function for each state variable. For our RockSample problem, with reference to Figure 2.2, the overall transition

to:

Each <CondProb> tag specifies the transition function for each state variable. For our RockSample problem, with reference to Figure 2.2, the overall transition

September 18, 2009, at 01:10 PM by 172.18.179.74 -
Changed lines 130-131 from:
  • fullyObs – set to true if the variable is fully observed. The default is false. Thus for the variable rock in Example 4, it is partially observed, as implied by the omission of the fullyObs attribute.
to:
  • fullyObs – set to true if the variable is fully observed. The default is false. Thus for the variable rock in Example 4, it is partially observed, as implied by the omission of the fullyObs attribute.
Changed lines 186-189 from:

belief to be specified as multiple multiplicative factors, with each <CondProb> tag specifying one of these factors. From our running RockSample problem, since the initial belief is not conditional on anything, it is factored as b0 = P (rover_0|∅)P (rock_0|∅). We will need two <CondProb> tags to specify it fully

to:

belief to be specified as multiple multiplicative factors, with each <CondProb> tag specifying one of these factors. From our running RockSample problem, since the initial belief is not conditional on anything, it is factored as b0 = P (rover_0|∅)P (rock_0|∅). We will need two <CondProb> tags to specify it fully

September 18, 2009, at 01:04 PM by 172.18.179.74 -
Changed lines 6-7 from:

PomdpX is an XML file format for describing pomdps (partially observable Markov decision processes), momdps (mixed observability Markov decision processes)[1] and mdps (Markov decision processes) in a factored representation. It

to:

PomdpX is an XML file format for describing POMDPs (partially observable Markov decision processes), MOMDPs (mixed observability Markov decision processes)[1] and MDPs (Markov decision processes) in a factored representation. It

September 18, 2009, at 12:10 PM by 172.18.179.74 -
Changed lines 86-88 from:

schema to ensure well-formedness.1

The conventional ordering of the child elements is Description, Discount,

to:

schema to ensure well-formedness.

The conventional ordering of the child elements is Description, Discount,

Changed line 92 from:

description of the specified model. The other child elements specify the pomdp

to:

description of the specified model. The other child elements specify the POMDP

September 18, 2009, at 12:08 PM by 172.18.179.74 -
Changed line 29 from:

2.1 Example Problem

to:

Example Problem

Changed line 71 from:

2.2 File Format Structure

to:

File Format Structure

Changed line 324 from:

2.3 <Parameter> Tag

to:

<Parameter> Tag

September 18, 2009, at 12:05 PM by 172.18.179.74 -
September 18, 2009, at 12:05 PM by 172.18.179.74 -
Changed line 78 from:

2.2.1 <pomdpx> Tag

to:

2.2.1 <pomdpx> Tag

Changed line 80 from:

a PomdpX document—the pomdpx element—which has the following attributes:

to:

a PomdpX document—the pomdpx element—which has the following attributes:

September 18, 2009, at 11:58 AM by 172.18.179.74 -
Changed lines 6-8 from:

PomdpX1 is an XML file format for describing pomdps (partially observable Markov decision processes), momdps (mixed observability Markov decision pro- cesses)[1] and mdps (Markov decision processes) in a factored representation. It

to:

PomdpX is an XML file format for describing pomdps (partially observable Markov decision processes), momdps (mixed observability Markov decision processes)[1] and mdps (Markov decision processes) in a factored representation. It

September 18, 2009, at 11:54 AM by 172.18.179.74 -
Changed line 493 from:

Example 13. Usage of double -.

to:

Example 13. Usage of double -.

Added lines 509-587:

By using double “-”, the single <Entry> set in Example 13 is equivalent to specifying the following:

P (rock_1 = good|action_rover = amw, rover_0 = ∗, rock_0 = good) = 1.0,
P (rock_1 = bad|action_rover = amw, rover_0 = ∗, rock_0 = good) = 0.0,
P (rock_1 = good|action_rover = amw, rover_0 = ∗, rock_0 = bad) = 0.0 and
P (rock_1 = bad|action_rover = amw, rover_0 = ∗, rock_0 = bad) = 1.0.

The <ProbTable> entries in Example 13 are in effect a 2 × 2 identity matrix. PomdpX therefore also accepts the keyword identity in place of the enumerated ones and zeros (as on line 8 of Example 13). Examples 13 and 14 are functionally equivalent.

Example 14. Usage of keyword identity.


1. <StateTransitionFunction>
2.       <CondProb>
3.           <Var>rock_1</Var>
4.           <Parent>action_rover rover_0 rock_0</Parent>
5.           <Parameter type = "TBL">
6.               <Entry>
7.                    <Instance>amw * - - </Instance>
8.                    <ProbTable>identity</ProbTable>
9.               </Entry>
10.          </Parameter>
11.      </CondProb>
12. </StateTransitionFunction>
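The equivalence of Examples 13 and 14 can be checked with a short Python sketch. It only illustrates the semantics described in the text and is not the parser of any PomdpX tool; the helper name identity_probs is hypothetical.

```python
# Illustrative sketch only: expands the `identity` keyword into explicit
# probabilities. The helper name is hypothetical, not a PomdpX API.

def identity_probs(n):
    """Return an n x n identity matrix flattened row by row."""
    return [1.0 if row == col else 0.0 for row in range(n) for col in range(n)]

# Values of rock in the order declared in <ValueEnum>.
rock_values = ["good", "bad"]
n = len(rock_values)
probs = identity_probs(n)

# Pair every (rock_0, rock_1) combination with its probability. The last
# "-" (rock_1) varies fastest, matching the expansion of Example 13.
table = {
    (prev, curr): probs[i * n + j]
    for i, prev in enumerate(rock_values)
    for j, curr in enumerate(rock_values)
}
```

For the two-valued rock variable, probs holds the same four numbers that line 8 of Example 13 lists by hand.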

Another recognized keyword which may also be used in the <ProbTable> tags is uniform. This is equivalent to the probability 1/n repeated n times, where n is the number of possible values that could appear here. For example, the <Entry> tag below,

Example 15. Usage of keyword uniform.


<InitialStateBelief>
     <CondProb>
         <Var>rock_0</Var>
         <Parent>null</Parent>
         <Parameter type = "TBL">
             <Entry>
                  <Instance> - </Instance>
                  <ProbTable>uniform</ProbTable>
             </Entry>
         </Parameter>
     </CondProb>
</InitialStateBelief>

gives: P (rock_0 = good|∅) = 1/2 and P (rock_0 = bad|∅) = 1/2, which specifies our initial belief that the rock has equal probability of being good or bad.
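As a minimal Python sketch of what the uniform keyword stands for (the helper name uniform_probs is hypothetical and not part of PomdpX):

```python
# Illustrative sketch only: expands the `uniform` keyword into the
# probability 1/n repeated n times.

def uniform_probs(n):
    """Return the probability 1/n repeated n times."""
    if n <= 0:
        raise ValueError("a variable needs at least one value")
    return [1.0 / n] * n

# Values of rock in the order declared in <ValueEnum>.
rock_values = ["good", "bad"]
belief = dict(zip(rock_values, uniform_probs(len(rock_values))))
```

Here belief maps each rock value to 0.5, which is the initial belief of Example 15.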

Besides being a child of the CondProb element, the <Parameter> tag may also appear as a child of the Func element which is used to define the reward function. In this case, the <Entry> tag within the <Parameter> must contain the following:

  • <Instance> – declares values of all the variables for the reward function. Each variable value must correspond to the identifiers that appear between the enclosing <Parent> tag.
  • <ValueTable> – specifies the actual numerical reward.

Example 16 shows a snippet defining the reward function for the rover. In this example, the <Entry> specifies:

R_{reward_rover} (action_rover = ame, rover_0 = s1, rock_0 = ∗) = 10.

By now, the wildcard character “*” should be familiar to the user. Its use here denotes the fact that the rover will obtain a reward of 10 moving East from s1 (to the terminal state), regardless of whether the rock is good or bad.

Note that the characters “*” and “-” can be used in a similar manner as described in the previous sections. However, the keywords uniform and identity cannot appear between <ValueTable> tags, since those keywords only make sense for probabilities and not rewards.

We reiterate here that any probability or value entries of a function table which are not specified within a <Parameter> tag are assumed to be zero. Furthermore, a particular probability or value entry can be specified more than once, in which case the definition that appears last within the <Parameter> tag takes effect. This is convenient for specifying exceptions to a more general specification. The full compact version of the PomdpX input file for the RockSample problem with <Parameter type="TBL"> is given in Appendix A.
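The two rules above (unspecified entries are zero, and the last definition wins) can be sketched in Python as follows. This is an illustration under stated assumptions, not the behaviour of a particular parser: the function names are hypothetical, and the general rule with reward -1.0 is invented purely to show how an exception overrides it. Only the reward of 10 for moving East from s1 comes from the text.

```python
from itertools import product

ACTIONS = ["amw", "ame", "ac", "as"]
ROVER = ["s0", "s1", "s2"]
ROCK = ["good", "bad"]
DOMAINS = [ACTIONS, ROVER, ROCK]  # same order as the <Parent> tag

def build_table(entries):
    """Apply entries in file order; a later entry overwrites an earlier one."""
    table = {}
    for instance, value in entries:
        # "*" stands for every value of the variable at that position.
        choices = [dom if tok == "*" else [tok]
                   for dom, tok in zip(DOMAINS, instance.split())]
        for key in product(*choices):
            table[key] = value
    return table

def lookup(table, key):
    """Entries that were never specified default to zero."""
    return table.get(key, 0.0)

entries = [
    ("ame * *", -1.0),   # HYPOTHETICAL general rule, for illustration only
    ("ame s1 *", 10.0),  # the exception described in the text
]
reward = build_table(entries)
```

Looking up ("ame", "s1", "good") returns the overriding value 10.0, while an entry that was never mentioned, such as ("amw", "s0", "good"), falls back to 0.0.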

September 17, 2009, at 12:44 PM by 172.18.179.93 -
Changed lines 428-430 from:

[@ 1. <StateTransitionFunction>

to:

[@1. <StateTransitionFunction>

Added lines 446-506:

As some probabilities of the rock's transition are zero, they may be conveniently left out. However, in certain cases, some variables may have all non-zero transition probabilities. PomdpX provides another special character “-” to handle this. The “-” character means: cycle through all possible values that could appear here and match the listed probabilities (in <ProbTable>) accordingly. Hence, Example 11 can also be expressed as:

Example 12. Usage of character -.


1. <StateTransitionFunction>
2.       <CondProb>
3.           <Var>rock_1</Var>
4.           <Parent>action_rover rover_0 rock_0</Parent>
5.           <Parameter type = "TBL">
6.               <Entry>
7.                     <Instance>amw * good - </Instance>
8.                     <ProbTable>1.0 0.0</ProbTable>
9.               </Entry>
10.              <Entry>
11.                    <Instance>amw * bad - </Instance>
12.                    <ProbTable>0.0 1.0</ProbTable>
13.              </Entry>
14.          </Parameter>
15.      </CondProb>
16. </StateTransitionFunction>

Although it is not obvious here, one can imagine that if both entries were non-zero, the use of “-” would save us from having to specify another <Entry> tag.

With the introduction of the “-” character, the first <Entry> set (lines 6–9) in Example 12 is in effect specifying the following:

P (rock_1 = good|action_rover = amw, rover_0 = ∗, rock_0 = good) = 1.0 and P (rock_1 = bad|action_rover = amw, rover_0 = ∗, rock_0 = good) = 0.0.

There is also an implicit ordering in Example 12. For instance, the usage of “-” for the first <Entry> set (lines 6–9), considers the possible values of rock to be good first then bad, hence the <ProbTable> entries are listed as (1.0 0.0) rather than (0.0 1.0). This “internal” order is actually taken from the way rock is declared in the <ValueEnum> tag (see Section 2.2.4), in which its possible values were declared to be first good then bad.
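The cycling and ordering behaviour of “-” can be sketched in Python. This is an illustration of the semantics described above rather than the code of any PomdpX parser; expand_entry and the DOMAINS dictionary are hypothetical names, and the value lists follow the declaration order given in the text.

```python
from itertools import product

# Values of each variable in declaration order.
DOMAINS = {
    "action_rover": ["amw", "ame", "ac", "as"],
    "rover_0": ["s0", "s1", "s2"],
    "rock_0": ["good", "bad"],
    "rock_1": ["good", "bad"],
}

def expand_entry(columns, instance, probs):
    """Expand one <Entry> whose <Instance> may contain '*' and '-'.

    columns  -- the <Parent> identifiers followed by the <Var> identifier
    instance -- the <Instance> text, aligned with columns
    probs    -- the <ProbTable> numbers, one per combination of '-' values
    """
    tokens = instance.split()
    dash_cols = [c for c, t in zip(columns, tokens) if t == "-"]
    # product() follows declaration order and varies the last '-' fastest.
    dash_combos = list(product(*(DOMAINS[c] for c in dash_cols)))
    if len(dash_combos) != len(probs):
        raise ValueError("ProbTable length does not match the '-' columns")
    table = {}
    for dash_values, prob in zip(dash_combos, probs):
        fixed = dict(zip(dash_cols, dash_values))
        choices = []
        for col, tok in zip(columns, tokens):
            if tok == "-":
                choices.append([fixed[col]])   # value chosen for this step
            elif tok == "*":
                choices.append(DOMAINS[col])   # applies to every value
            else:
                choices.append([tok])          # literal value
        for key in product(*choices):
            table[key] = prob
    return table

columns = ["action_rover", "rover_0", "rock_0", "rock_1"]
first_entry = expand_entry(columns, "amw * good -", [1.0, 0.0])
```

Because rock is declared as good then bad, the first number in the list is paired with rock_1 = good and the second with rock_1 = bad, for every rover position.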

In the quest for further compression, there is a final modification we can make to Example 12. We observe that the two <Entry> sets are complementary, differing only in the state of rock_0 and in their <ProbTable> entries. Thus, employing the same trick as in Example 12, we can replace the state of rock_0 with a “-”. This gives us Example 13.

Example 13. Usage of double -.


1. <StateTransitionFunction>
2.       <CondProb>
3.           <Var>rock_1</Var>
4.           <Parent>action_rover rover_0 rock_0</Parent>
5.           <Parameter type = "TBL">
6.               <Entry>
7.                    <Instance>amw * - - </Instance>
8.                    <ProbTable>1.0 0.0 0.0 1.0</ProbTable>
9.               </Entry>
10.          </Parameter>
11.      </CondProb>
12. </StateTransitionFunction>
September 16, 2009, at 12:18 PM by 172.18.179.175 -
Added line 331:
Changed lines 339-341 from:
  P (rock_1 = good|action_rover = amw, rover 0 = s0, rock_0 = good) = 1.0.
to:

P (rock_1 = good|action_rover = amw, rover_0 = s0, rock_0 = good) = 1.0.

Added lines 356-447:

Example 10. Contents of Parameter type="TBL", within CondProb.


1. <StateTransitionFunction>
2.       <CondProb>
3.           <Var>rock_1</Var>
4.           <Parent>action_rover rover_0 rock_0</Parent>
5.           <Parameter type = "TBL">
6.               <Entry>
7.                    <Instance>amw s0 good good</Instance>
8.                    <ProbTable>1.0</ProbTable>
9.               </Entry>
10.              <Entry>
11.                   <Instance>amw s1 good good</Instance>
12.                   <ProbTable>1.0</ProbTable>
13.              </Entry>
14.              <Entry>
15.                   <Instance>amw s2 good good</Instance>
16.                   <ProbTable>1.0</ProbTable>
17.              </Entry>
18.              <Entry>
19.                   <Instance>amw s0 good bad</Instance>
20.                   <ProbTable>0.0</ProbTable>
21.              </Entry>
22.              <Entry>
23.                   <Instance>amw s1 good bad</Instance>
24.                   <ProbTable>0.0</ProbTable>
25.              </Entry>
26.              <Entry>
27.                   <Instance>amw s2 good bad</Instance>
28.                   <ProbTable>0.0</ProbTable>
29.              </Entry>
30.              <Entry>
31.                   <Instance>amw s0 bad good</Instance>
32.                   <ProbTable>0.0</ProbTable>
33.              </Entry>
34.              <Entry>
35.                   <Instance>amw s1 bad good</Instance>
36.                   <ProbTable>0.0</ProbTable>
37.              </Entry>
38.              <Entry>
39.                   <Instance>amw s2 bad good</Instance>
40.                   <ProbTable>0.0</ProbTable>
41.              </Entry>
42.              <Entry>
43.                   <Instance>amw s0 bad bad</Instance>
44.                   <ProbTable>1.0</ProbTable>
45.              </Entry>
46.              <Entry>
47.                   <Instance>amw s1 bad bad</Instance>
48.                   <ProbTable>1.0</ProbTable>
49.              </Entry>
50.              <Entry>
51.                   <Instance>amw s2 bad bad</Instance>
52.                   <ProbTable>1.0</ProbTable>
53.              </Entry>
54.          </Parameter>
55.      </CondProb>
56. </StateTransitionFunction>

It seems a bit daunting that it takes 56 lines just to declare the transition function for the rock for a simple 1 × 3 grid. And this only for the rover’s action of moving West. But XML is verbose by nature and that is the price to pay for interoperability and extensibility. However, PomdpX does provide several convenience features to ease the encoding task.

First and foremost, lines 18–41 are actually redundant since any entry not specified is assumed to be zero. Secondly, we observe that the first three <Entry> sets (lines 6–17) are very similar: they differ only in the state of rover_0, and s0 to s2 are all the possible states of the rover. In such a situation, we may use the wildcard character “*”, which means that the entry holds for all possible values that could appear at that position. Therefore, lines 6–17 can be replaced by just one <Entry> tag, and the same is true for lines 42–53. Example 10 is rewritten more succinctly as Example 11.

Example 11. Usage of wildcard character *.

1. <StateTransitionFunction>
2.       <CondProb>
3.           <Var>rock_1</Var>
4.           <Parent>action_rover rover_0 rock_0</Parent>
5.           <Parameter type = "TBL">
6.               <Entry>
7.                    <Instance>amw * good good</Instance>
8.                    <ProbTable>1.0</ProbTable>
9.               </Entry>
10.              <Entry>
11.                   <Instance>amw * bad bad</Instance>
12.                   <ProbTable>1.0</ProbTable>
13.              </Entry>
14.          </Parameter>
15.      </CondProb>
16. </StateTransitionFunction>
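To make the effect of the wildcard concrete, the following Python sketch expands the two entries of Example 11 back into explicit table rows. It is only an illustration; expand_wildcards and DOMAINS are hypothetical names, not part of any PomdpX library.

```python
from itertools import product

# Values of each variable in declaration order.
DOMAINS = {
    "action_rover": ["amw", "ame", "ac", "as"],
    "rover_0": ["s0", "s1", "s2"],
    "rock_0": ["good", "bad"],
    "rock_1": ["good", "bad"],
}

def expand_wildcards(columns, instance, prob):
    """Expand each '*' token into every value of its variable.

    columns  -- the <Parent> identifiers followed by the <Var> identifier
    instance -- the <Instance> text, aligned with columns
    prob     -- the single probability in <ProbTable>
    """
    choices = [DOMAINS[col] if tok == "*" else [tok]
               for col, tok in zip(columns, instance.split())]
    return {combo: prob for combo in product(*choices)}

columns = ["action_rover", "rover_0", "rock_0", "rock_1"]
table = {}
table.update(expand_wildcards(columns, "amw * good good", 1.0))
table.update(expand_wildcards(columns, "amw * bad bad", 1.0))
```

The two wildcard entries expand to the six non-zero rows of Example 10; the six zero rows need not be stored because unspecified entries default to zero.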
September 16, 2009, at 11:28 AM by 172.18.179.175 -
Added lines 324-352:

2.3 <Parameter> Tag

The <Parameter> tag is a fairly complicated component of PomdpX, introducing several new keywords and symbols, thus it warrants an individual section in itself. It has an optional attribute called type, which has possible values TBL (default) and DD, short for table and decision diagram, respectively. We will describe how to encode the RockSample problem both in TBL and DD.

2.3.1 Table Type (TBL)

When the <Parameter> tag appears as a child of a CondProb element, it must contain <Entry> child tags. Each Entry element specifies the probability entry of a function table. The <Entry> tag itself must consist of the following:

  • <Instance> – declares all the variables for the probability function. Each variable value must correspond to the identifiers that appear between the enclosing <Parent> tag, followed by the identifier that appears between the enclosing <Var> tag.
  • <ProbTable> – specifies the actual numerical values of the probabilities.

This is best illustrated by Example 10 below. With reference to Figure 2.2, we show the full encoding of the rock's transition function for the rover's action of moving West. From the example, the <Var> tag declares that we are defining the transition function for the variable rock (line 3). It is conditional on action_rover, rover_0 and rock_0, which appear between the <Parent> tags (line 4). The first <Entry> set (lines 6–9) specifies:

  P (rock_1 = good|action_rover = amw, rover 0 = s0, rock_0 = good) = 1.0.

In this case, when action_rover is amw, and rock_0 is good, rock_1 will be good as well, since a move action will not disturb its state. Conversely, if action_rover is amw, and rock_0 is good it is impossible for rock_1 to be bad as specified by lines 18–29.

Note that order matters here, and overlooking it can be the source of subtle bugs. As mentioned before, the conditioning variables declared between the <Instance> tags (the first three elements in line 7) correspond to the order in which they appear in the enclosing <Parent> tag, and the last element corresponds to the variable being defined. One may arbitrarily re-order the conditioning variables as long as they match up within the <Parent> and <Instance> tags and the last element is always the identifier defined by <Var>. The convention that we adopt is to declare actions first, then fully observed variables, followed by partially observed variables.
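The ordering rule can be made explicit with a small Python sketch that labels each <Instance> token with the variable it belongs to. The function name label_instance is hypothetical and the sketch is not taken from any PomdpX implementation.

```python
def label_instance(parent, var, instance):
    """Pair each <Instance> token with its variable name.

    The columns are the <Parent> identifiers in their declared order,
    followed by the identifier given in <Var>.
    """
    columns = parent.split() + var.split()
    tokens = instance.split()
    if len(tokens) != len(columns):
        raise ValueError("Instance must have one token per Parent plus Var")
    return dict(zip(columns, tokens))

# Ordering used in Example 10.
labeled = label_instance("action_rover rover_0 rock_0", "rock_1",
                         "amw s0 good good")

# Re-ordering the parents is fine as long as <Instance> is re-ordered too.
reordered = label_instance("rock_0 action_rover rover_0", "rock_1",
                           "good amw s0 good")
```

Both calls describe the same assignment, whereas re-ordering only one of <Parent> or <Instance> would silently attach values to the wrong variables.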

September 16, 2009, at 11:13 AM by 172.18.179.175 -
Changed line 136 from:
     <StateVar vnamePrev="rover 0" vnameCurr="rover 1"
to:
     <StateVar vnamePrev="rover_0" vnameCurr="rover_1"
Changed line 140 from:
     <StateVar vnamePrev="rock 0" vnameCurr="rock 1>"
to:
     <StateVar vnamePrev="rock_0" vnameCurr="rock_1">
Changed line 146 from:
     <ActionVar vname="action rover">
to:
     <ActionVar vname="action_rover">
Changed line 190 from:

P (rover 0|∅)P (rock 0|∅). We will need two <CondProb> tags to specify it fully

to:

P (rover_0|∅)P (rock_0|∅). We will need two <CondProb> tags to specify it fully

Changed line 197 from:
         <Var>rover 0</Var>
to:
         <Var>rover_0</Var>
Changed line 202 from:
         <Var>rock 0</Var>
to:
         <Var>rock_0</Var>
Changed line 219 from:

joint belief of all state variables, P (rover 0, rock 0), with a single <CondProb>

to:

joint belief of all state variables, P (rover_0, rock_0), with a single <CondProb>

Changed line 226 from:
         <Var>rover 0 rock 0</Var>
to:
         <Var>rover_0 rock_0</Var>
Changed lines 238-240 from:
  P (rover 1, rock 1|action rover, rover 0, rock 0) =
  P (rover 1|action rover, rover 0) × P (rock 1|action rover, rover 0, rock 0).
to:
  P (rover_1, rock_1|action_rover, rover_0, rock_0) =
  P (rover_1|action_rover, rover_0) × P (rock_1|action_rover, rover_0, rock_0).
Changed lines 250-251 from:
           <Var>rover 1</Var>
           <Parent>action rover rover 0</Parent>
to:
           <Var>rover_1</Var>
           <Parent>action_rover rover_0</Parent>
Changed lines 255-256 from:
           <Var>rock 1</Var>
           <Parent>action rover rover 0 rock 0</Parent>
to:
           <Var>rock_1</Var>
           <Parent>action_rover rover_0 rock_0</Parent>
Changed line 286 from:
            <Parent>action rover rover 1 rock 1</Parent>
to:
            <Parent>action_rover rover_1 rock_1</Parent>
Changed line 313 from:
            <Parent>action rover rover 0 rock 0</Parent>
to:
            <Parent>action_rover rover_0 rock_0</Parent>
Added line 324:
September 16, 2009, at 11:10 AM by 172.18.179.175 -
Changed lines 320-321 from:
  • <Var> – this identifies the reward variable whose reward function is being

specified. Only identifiers that had been declared as the vname attribute

to:
  • <Var> – this identifies the reward variable whose reward function is being specified. Only identifiers that had been declared as the vname attribute
Changed lines 322-326 from:
  • <Parent> – this identifies the domain of the reward function. All identifiers declared as vnamePrev or vnameCurr attributes of state variables,

vname attribute of action variables or vname attribute of observation variables are allowed here.

  • <Parameter> – specifies the actual values in the function and is described

in detail in Section 2.3.

to:
  • <Parent> – this identifies the domain of the reward function. All identifiers declared as vnamePrev or vnameCurr attributes of state variables, vname attribute of action variables or vname attribute of observation variables are allowed here.
  • <Parameter> – specifies the actual values in the function and is described in detail in Section 2.3.
September 16, 2009, at 11:08 AM by 172.18.179.175 -
Added lines 231-326:

2.2.6 <StateTransitionFunction> Tag

This specifies the transition function T , which in general is the multiplicative result of the individual transition functions of each state variable in the model. Each <CondProb> tag specifies the transition function for each state variable. For our RockSample problem, with reference to Figure 2.2, the overall transition function is:

  P (rover 1, rock 1|action rover, rover 0, rock 0) =
  P (rover 1|action rover, rover 0) × P (rock 1|action rover, rover 0, rock 0).

This is translated to the following in PomdpX. One can see that it is very similar to its equational counterpart, only it has XML tags wrapped around it. We need to provide two CondProb elements, one each for the variable rover and rock.

Example 7. Contents of StateTransitionFunction.


 <StateTransitionFunction>
      <CondProb>
           <Var>rover 1</Var>
           <Parent>action rover rover 0</Parent>
           <Parameter> · · · </Parameter>
      </CondProb>
      <CondProb>
           <Var>rock 1</Var>
           <Parent>action rover rover 0 rock 0</Parent>
           <Parameter> · · · </Parameter>
      </CondProb>
 </StateTransitionFunction>

As described in 2.2.5, the <Var> tag identifies the state variable whose transition function is being specified. In this case, only identifiers declared as the vnameCurr attribute of state variables may be allowed here.

The identifiers within the <Parent> tag identify the conditioning variables in the transition function. They may be identifiers which had been declared as either the vnamePrev or vnameCurr attributes of state variables, or identifiers which had been declared as the vname attribute of action variables (see Section 2.2.4). Once again, we point out the caveat that PomdpX only allows certain combinations of vnamePrev and vnameCurr. One may only use vnameCurr identifiers within the <Parent> tag if the variable is fully observed. We defer the description of <Parameter> tag to Section 2.3 as it is fairly involved.
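The factorization of the transition function can be sketched in Python for the single action amw. The rock factor follows the text (a move does not change the rock). The rover factor is an ASSUMPTION made only for this illustration, since its table is not given here: moving West is taken to shift the rover one cell to the left deterministically, and to leave it in place at s0. All function names are hypothetical.

```python
ROVER = ["s0", "s1", "s2"]
ROCK = ["good", "bad"]

def p_rover(rover_1, action, rover_0):
    """P(rover_1 | action_rover, rover_0), sketched for amw only.

    ASSUMPTION: amw moves the rover one cell left, staying put at s0.
    """
    if action != "amw":
        raise NotImplementedError("only amw is sketched here")
    target = ROVER[max(ROVER.index(rover_0) - 1, 0)]
    return 1.0 if rover_1 == target else 0.0

def p_rock(rock_1, action, rover_0, rock_0):
    """P(rock_1 | action_rover, rover_0, rock_0): a move leaves the rock as is."""
    if action != "amw":
        raise NotImplementedError("only amw is sketched here")
    return 1.0 if rock_1 == rock_0 else 0.0

def p_joint(rover_1, rock_1, action, rover_0, rock_0):
    """Overall transition probability as the product of the two factors."""
    return (p_rover(rover_1, action, rover_0)
            * p_rock(rock_1, action, rover_0, rock_0))

# Total probability over all next states from (s1, bad) under amw.
total = sum(p_joint(r, k, "amw", "s1", "bad") for r in ROVER for k in ROCK)
```

Each <CondProb> element corresponds to one factor function, and the product over all of them gives the full transition function T.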

2.2.7 <ObsFunction> Tag

This specifies the observation function Z, which in general is the multiplicative result of the individual observation functions of each observation variable in the model. Each <CondProb> tag specifies one of these individual observation functions. In the RockSample problem, the probability of an observation is conditional on taking an action and ending in a new state. Thus its parents are action_rover, rover_1 and rock_1, as given in Example 8.

Example 8. Contents of ObsFunction.


 <ObsFunction>
       <CondProb>
            <Var>obs sensor</Var>
            <Parent>action rover rover 1 rock 1</Parent>
            <Parameter> · · · </Parameter>
       </CondProb>
 </ObsFunction>

For each CondProb element, the identifier within the <Var> tags identifies the observation variable whose observation function is being specified. The identifiers within the <Parent> tags identify the conditioning variables in the observation function. Identifiers that appear within the <Var> tags must be identifiers which had been declared as the vname attribute of observation variables. Identifiers that appear within the <Parent> tags must be identifiers which had been declared as the vnameCurr attribute of state variables, or the vname attribute of action variables (see Section 2.2.4). Parameter specifies the actual probabilities in the function and will be described in Section 2.3.

2.2.8 <RewardFunction> Tag

This specifies the reward function R, which in general is the additive result of the individual reward functions of each reward variable in the model. Each <Func> tag specifies one of these individual reward functions. For our RockSample problem, the reward depends on the action taken at the current state, thus its parents are action_rover, rover_0 and rock_0. This is shown in Example 9.

Example 9. Contents of RewardFunction.


 <RewardFunction>
       <Func>
            <Var>reward rover</Var>
            <Parent>action rover rover 0 rock 0</Parent>
            <Parameter> · · · </Parameter>
       </Func>
 </RewardFunction>

Similar to the <CondProb> tag, the <Func> tag has no attributes and requires the following three child tags to be defined:

  • <Var> – this identifies the reward variable whose reward function is being specified. Only identifiers that had been declared as the vname attribute of reward variables may appear here.
  • <Parent> – this identifies the domain of the reward function. All identifiers declared as vnamePrev or vnameCurr attributes of state variables, vname attribute of action variables or vname attribute of observation variables are allowed here.
  • <Parameter> – specifies the actual values in the function and is described in detail in Section 2.3.

September 16, 2009, at 10:41 AM by 172.18.179.175 -
Changed line 1 from:

(::toc::)

to:

(:toc:)

September 16, 2009, at 10:40 AM by 172.18.179.175 -
Changed line 1 from:

(:toc:)

to:

(::toc::)

September 16, 2009, at 10:31 AM by 172.18.179.175 -
Changed lines 207-208 from:

The <CondProb> tag has no attributes and require the following three children

to:

The <CondProb> tag has no attributes and require the following three children

Changed lines 210-212 from:
  • <Var> – identifies the factor being specified. Only identifiers declared as vnamePrev of state variables are allowed here (see Section 2.2.4).
  • <Parent> – the set of conditioning variables. Only identifiers declared as vnamePrev or vnameCurr of state variables are allowed here. The previous statement is actually slightly misleading, as PomdpX allows certain combinations of vnamePrev and vnameCurr identifiers. Referring to Figure 1.1,

we only allow conditioning arrows from xt (fully observed variables) to yt (partially observed variables) and not the other way round. Specifically, a vnameCurr identifier is allowed as parent only if the variable is fully

to:
  • <Var> – identifies the factor being specified. Only identifiers declared as vnamePrev of state variables are allowed here (see Section 2.2.4).
  • <Parent> – the set of conditioning variables. Only identifiers declared as vnamePrev or vnameCurr of state variables are allowed here. The previous statement is actually slightly misleading, as PomdpX allows certain combinations of vnamePrev and vnameCurr identifiers. Referring to Figure 1.1, we only allow conditioning arrows from xt (fully observed variables) to yt (partially observed variables) and not the other way round. Specifically, a vnameCurr identifier is allowed as parent only if the variable is fully
Changed line 213 from:

In addition, the keyword null may be used to signify the absence of any

to:

In addition, the keyword null may be used to signify the absence of any

Changed lines 215-217 from:
  • <Parameter> – specifies the actual probabilities in the factor and is de-

scribed in detail in Section 2.3.

to:
  • <Parameter> – specifies the actual probabilities in the factor and is described in detail in Section 2.3.
Changed lines 218-219 from:

many state variables. We could have alternatively specified b0 as simply the joint belief of all state variables, P (rover 0, rock 0), with a single <CondProb>

to:

many state variables. We could have alternatively specified b0 as simply the joint belief of all state variables, P (rover 0, rock 0), with a single <CondProb>

Added lines 221-231:

Example 6. Initial joint belief specification.


<InitialStateBelief>
     <CondProb>
         <Var>rover 0 rock 0</Var>
         <Parent>null</Parent>
         <Parameter> · · · </Parameter>
     </CondProb>
</InitialStateBelief>
September 16, 2009, at 10:26 AM by 172.18.179.175 -
Changed line 184 from:

2.2.5 <InitialStateBelief> Tag

to:

2.2.5 <InitialStateBelief> Tag

Added line 192:
Added line 194:

Changed lines 210-215 from:
  • <Parent> – the set of conditioning variables. Only identifiers declared as vnamePrev or vnameCurr of state variables are allowed here. The previous

statement is actually slightly misleading, as PomdpX allows certain combi- nations of vnamePrev and vnameCurr identifiers. Referring to Figure 1.1, we only allow conditioning arrows from xt (fully observed variables) to yt (partially observed variables) and not the other way round. Specifically, a vnameCurr identifier is allowed as parent only if the variable is fully

to:
  • <Parent> – the set of conditioning variables. Only identifiers declared as vnamePrev or vnameCurr of state variables are allowed here. The previous statement is actually slightly misleading, as PomdpX allows certain combinations of vnamePrev and vnameCurr identifiers. Referring to Figure 1.1,

we only allow conditioning arrows from xt (fully observed variables) to yt (partially observed variables) and not the other way round. Specifically, a vnameCurr identifier is allowed as parent only if the variable is fully

Deleted line 212:
September 16, 2009, at 10:22 AM by 172.18.179.175 -
Added lines 183-224:

2.2.5 <InitialStateBelief> Tag

This is an optional tag. It specifies the initial belief b0 , and may be omitted if all state variables are fully observed. The PomdpX format allows the initial belief to be specified as multiple multiplicative factors, with each <CondProb> tag specifying one of these factors. From our running RockSample problem, since the initial belief is not conditional on anything, it is factored as b0 = P (rover 0|∅)P (rock 0|∅). We will need two <CondProb> tags to specify it fully as shown below.

Example 5. Contents of InitialStateBelief.

  <InitialStateBelief>
      <CondProb>
         <Var>rover 0</Var>
         <Parent>null</Parent>
         <Parameter> · · · </Parameter>
      </CondProb>
      <CondProb>
         <Var>rock 0</Var>
         <Parent>null</Parent>
         <Parameter> · · · </Parameter>
      </CondProb>
   </InitialStateBelief>

The <CondProb> tag has no attributes and requires the following three child tags:

  • <Var> – identifies the factor being specified. Only identifiers declared as vnamePrev of state variables are allowed here (see Section 2.2.4).
  • <Parent> – the set of conditioning variables. Only identifiers declared as vnamePrev or vnameCurr of state variables are allowed here. The previous statement is actually slightly misleading, as PomdpX allows only certain combinations of vnamePrev and vnameCurr identifiers. Referring to Figure 1.1, we only allow conditioning arrows from xt (fully observed variables) to yt (partially observed variables) and not the other way round. Specifically, a vnameCurr identifier is allowed as parent only if the variable is fully observed. In addition, the keyword null may be used to signify the absence of any conditioning variables.
  • <Parameter> – specifies the actual probabilities in the factor and is described in detail in Section 2.3.

The previous example is somewhat cumbersome to declare if we have too many state variables. We could have alternatively specified b0 as simply the joint belief of all state variables, P (rover 0, rock 0), with a single <CondProb> tag as shown in Example 6.
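The relationship between the factored and the joint form of b0 can be sketched in Python. The numbers are assumptions for illustration: the rover is taken to start at s1 with certainty (following the problem description, which places the rover's initial position at 1) and the rock is taken to be good or bad with equal probability.

```python
from itertools import product

# Factor P(rover_0): ASSUMED to put all mass on s1.
b_rover = {"s0": 0.0, "s1": 1.0, "s2": 0.0}

# Factor P(rock_0): ASSUMED uniform over good and bad.
b_rock = {"good": 0.5, "bad": 0.5}

# The joint belief P(rover_0, rock_0) is the product of the two factors,
# which is what a single <CondProb> with both variables would list.
joint = {
    (rover, rock): b_rover[rover] * b_rock[rock]
    for rover, rock in product(b_rover, b_rock)
}
```

The factored form needs 3 + 2 numbers while the joint form needs 3 × 2, so the gap widens quickly as more state variables are added.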

September 15, 2009, at 02:50 PM by 172.18.179.175 -
Changed line 133 from:

Example 4. Variable declaration. Defining S, A, O, and R variables.

to:

Example 4. Variable declaration. Defining S, A, O, and R variables.

Changed lines 152-182 from:
to:

The possible values that a variable can assume are specified with either the <NumValues> or the <ValueEnum> tag. In the former, we give an integer to indicate the number of values/states for the variable. For instance, in the example, the rover is declared with three possible values. The values are subsequently referenced internally using numerals, starting from 0 and prepended with ‘s’. Hence the states for the rover variable would be s0, s1 and s2. When using <NumValues>, it is up to the user to attach semantic meaning to the values; in our example, s0 denotes the left grid, s1 the center and s2 the right terminal grid.

In the latter, the user will have to manually enumerate all the possible values/states the variable may take on. In our example, the rock has two possible values, it is either good or bad.

The observation and action variables are also declared similarly with the <ObsVar> and <ActionVar> tags respectively. Both require the attribute vname which serves as the identifier for the variable. The possible values that an observation or action can assume can also be specified with either <NumValues> or <ValueEnum>. If <NumValues> is used, ‘o’ and ‘a’ would be prepended to the values of observation and action variables respectively.
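The naming rule for <NumValues> can be summarised in a few lines of Python. This is a sketch of the convention described above; the function name and the PREFIX table are hypothetical.

```python
# Prefix prepended to the numeral for each kind of variable.
PREFIX = {"StateVar": "s", "ObsVar": "o", "ActionVar": "a"}

def enumerate_values(tag, num_values):
    """Return the implicit value names for a variable declared with <NumValues>."""
    if num_values <= 0:
        raise ValueError("NumValues must be a positive integer")
    return [f"{PREFIX[tag]}{i}" for i in range(num_values)]

rover_states = enumerate_values("StateVar", 3)
```

For the rover, declared with <NumValues>3</NumValues>, this yields the names s0, s1 and s2 used throughout the examples.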

In the case of <ValueEnum>, the user will once again need to enumerate all possible values/states manually. In our example, for the action_rover variable, we enumerate all the four possible actions. ‘amw’ is a mnemonic for action move west and ‘ac’ stands for action check and so on.

Finally, reward variables are declared with the <RewardVar> tags which must contain the vname attribute. The vname serves as an identifier for the reward variable. The <RewardVar> is an empty XML tag and no values are specified. Note that we may use the XML shorthand of <RewardVar vname="· · · " /> to close an empty tag here.

September 15, 2009, at 02:34 PM by 172.18.179.175 -
Added line 55:

Example 1. A PomdpX document.

Deleted lines 56-57:

Example 1. A PomdpX document.


Added lines 105-106:

Example 2. Contents of Description.

Deleted lines 107-108:

Example 2. Contents of Description.


Added lines 117-118:

Example 3. Contents of Discount.

Deleted lines 119-120:

Example 3. Contents of Discount.


Added lines 133-150:

Example 4. Variable declaration. Defining S, A, O, and R variables.


<Variable>
     <StateVar vnamePrev="rover 0" vnameCurr="rover 1"
      fullyObs="true">
         <NumValues>3</NumValues>
     </StateVar>
     <StateVar vnamePrev="rock 0" vnameCurr="rock 1>"
         <ValueEnum>good bad</ValueEnum>
     </StateVar>
     <ObsVar vname="obs sensor">
         <ValueEnum>ogood obad</ValueEnum>
     </ObsVar>
     <ActionVar vname="action rover">
         <ValueEnum>amw ame ac as</ValueEnum>
     </ActionVar>
     <RewardVar vname="reward rover" />
</Variable>
September 15, 2009, at 02:30 PM by 172.18.179.175 -
Changed lines 127-128 from:
    Each state variable is declared with the <StateVar> tag. It contains the
to:

Each state variable is declared with the <StateVar> tag. It contains the

September 15, 2009, at 02:30 PM by 172.18.179.175 -
Changed lines 106-111 from:
         Example 2. Contents of Description.
         <Description> RockSample problem for map size 1 x 3.
          Rock is at 0, Rover’s initial position is at 1.
          Exit is at 2.
         </Description>
to:

Example 2. Contents of Description.


 <Description> RockSample problem for map size 1 x 3.
 Rock is at 0, Rover’s initial position is at 1.
 Exit is at 2.
 </Description>
Changed lines 118-119 from:
                      Example 3. Contents of Discount.
                      <Discount> 0.95 </Discount>
to:

Example 3. Contents of Discount.


 <Discount> 0.95 </Discount>
September 15, 2009, at 02:28 PM by 172.18.179.175 -
Added lines 89-129:

The conventional ordering of the child elements is Description, Discount, Variable and thereafter: InitialStateBelief, StateTransitionFunction, ObsFunction and RewardFunction. However this ordering is not strictly re- quired and one may permute their orderings. Description is an optional, short description of the specified model. The other child elements specify the pomdp tuple (S, A, O, T , Z, R, γ) and the initial belief b0 .

In general these elements should all be present, and each can appear only once. ObsFunction may be omitted if there are no observation variables in the model. Similarly, InitialStateBelief may be omitted if all state variables are fully observed (for example, an mdp model). The child elements of pomdpx are described in greater detail in the following subsections.
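
The presence rules above can be sketched as a small check. This is an illustration only, not part of any PomdpX tool: the function name `check_children` and the wording of the messages are hypothetical, and the check looks only at element names, not at their contents.

```python
import xml.etree.ElementTree as ET
from collections import Counter

ALWAYS_EXPECTED = ("Discount", "Variable", "StateTransitionFunction", "RewardFunction")
OPTIONAL = ("Description", "InitialStateBelief", "ObsFunction")


def check_children(xml_text):
    """Return a list of problems found among the children of <pomdpx>."""
    root = ET.fromstring(xml_text)
    counts = Counter(child.tag for child in root)
    problems = []

    # Each child element may appear at most once, in any order.
    for tag in ALWAYS_EXPECTED + OPTIONAL:
        if counts[tag] > 1:
            problems.append(f"{tag} appears {counts[tag]} times")
    for tag in ALWAYS_EXPECTED:
        if counts[tag] == 0:
            problems.append(f"{tag} is missing")

    variable = root.find("Variable")
    state_vars = variable.findall("StateVar") if variable is not None else []
    has_obs_vars = variable is not None and variable.find("ObsVar") is not None

    # ObsFunction may be omitted only when no observation variables are declared.
    if has_obs_vars and counts["ObsFunction"] == 0:
        problems.append("ObsFunction is missing")

    # InitialStateBelief may be omitted only when every state variable is fully observed.
    all_fully_observed = bool(state_vars) and all(
        sv.get("fullyObs", "false") == "true" for sv in state_vars
    )
    if not all_fully_observed and counts["InitialStateBelief"] == 0:
        problems.append("InitialStateBelief is missing")

    return problems
```

An empty list means that none of the rules stated above is violated; it does not replace validation against the schema.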

2.2.2 <Description> Tag

This is an optional tag that one may provide to give a brief description of the specified problem. For example:

         Example 2. Contents of Description.
         <Description> RockSample problem for map size 1 x 3.
          Rock is at 0, Rover’s initial position is at 1.
          Exit is at 2.
         </Description>

2.2.3 <Discount> Tag

This specifies the discount factor γ, which must be a real-valued number. For our RockSample problem, we use a discount factor of 0.95, entered as shown:

                      Example 3. Contents of Discount.
                      <Discount> 0.95 </Discount>
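
As a rough illustration, the Discount element can be read back as a floating-point number. The function name `read_discount` is hypothetical, and the range check is an assumption (the common convention that a discount factor lies between 0 and 1); the text above only requires a real-valued number.

```python
import xml.etree.ElementTree as ET


def read_discount(xml_text):
    """Parse a <Discount> element and return its value as a float."""
    elem = ET.fromstring(xml_text)
    if elem.tag != "Discount":
        raise ValueError(f"expected <Discount>, got <{elem.tag}>")
    # float() ignores the whitespace around the number; an empty element
    # raises ValueError.
    gamma = float(elem.text or "")
    # Assumption: conventional range 0 <= gamma <= 1.
    if not 0.0 <= gamma <= 1.0:
        raise ValueError(f"discount factor {gamma} is outside [0, 1]")
    return gamma


if __name__ == "__main__":
    print(read_discount("<Discount> 0.95 </Discount>"))
```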

2.2.4 <Variable> Tag

The state, action and observation variables, which factorize the state space S, action space A and observation space O, are declared within the Variable element. Reward variables R are also declared here. Example 4 gives the declaration of the variables for the RockSample problem.

    Each state variable is declared with the <StateVar> tag. It contains the following attributes:

  • vnamePrev – identifier for the variable’s start state.
  • vnameCurr – identifier for the variable’s end state.
  • fullyObs – set to true if the variable is fully observed. The default is false. Thus the variable rock in Example 4 is partially observed, as implied by the omission of the fullyObs attribute.
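
The default of fullyObs determines which kind of model a document describes. The sketch below is a hypothetical helper (`classify_model` is not part of PomdpX) that classifies a Variable block by the observability of its state variables alone. It deliberately ignores observation variables and compares the attribute with the literal string "true"; both are simplifying assumptions.

```python
import xml.etree.ElementTree as ET

# State variables of the RockSample example; rock omits fullyObs.
ROCKSAMPLE_VARS = """
<Variable>
    <StateVar vnamePrev="rover 0" vnameCurr="rover 1" fullyObs="true">
        <NumValues>3</NumValues>
    </StateVar>
    <StateVar vnamePrev="rock 0" vnameCurr="rock 1">
        <ValueEnum>good bad</ValueEnum>
    </StateVar>
</Variable>
"""


def classify_model(variable_xml):
    """Return 'mdp', 'momdp' or 'pomdp' based on the fullyObs attributes."""
    root = ET.fromstring(variable_xml)
    # A missing fullyObs attribute means "false", i.e. partially observed.
    flags = [sv.get("fullyObs", "false") == "true" for sv in root.findall("StateVar")]
    if not flags:
        raise ValueError("a model needs at least one state variable")
    if all(flags):
        return "mdp"
    if any(flags):
        return "momdp"
    return "pomdp"


if __name__ == "__main__":
    print(classify_model(ROCKSAMPLE_VARS))
```

For the RockSample variables the rover position is fully observed and the rock is not, so the block is classified as a momdp.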
September 15, 2009, at 02:22 PM by 172.18.179.175 -
Changed lines 83-85 from:
   • version
   • id – optional name for the specified model.
   • xmlns:xsi – defines xsi as the XML Schema namespace.
to:
  • version
  • id – optional name for the specified model.
  • xmlns:xsi – defines xsi as the XML Schema namespace.
  • xsi:noNamespaceSchemaLocation – this is where we put our XML Schema definition, pomdpx.xsd. The PomdpX input should be validated with this schema to ensure well-formedness.

September 15, 2009, at 02:20 PM by 172.18.179.175 -
Changed line 74 from:

A PomdpX document consists of a header and a pomdpx root element which in

to:

A PomdpX document consists of a header and a pomdpx root element which in

September 15, 2009, at 02:16 PM by 172.18.179.175 -
Added lines 72-85:

2.2 File Format Structure

A PomdpX document consists of a header and a pomdpx root element which in turn contains child elements, as shown in Example 1 below. The first line of the document is an XML processing instruction which defines that the document adheres to the XML 1.0 standard and that the encoding of the document is ISO-8859-1. Other encodings such as UTF-8 are also possible.

2.2.1 <pomdpx> Tag

Continuing with the example above, the second line contains the root-element of a PomdpX document—the pomdpx element—which has the following attributes:

   • version
   • id – optional name for the specified model.
   • xmlns:xsi – defines xsi as the XML Schema namespace.
September 15, 2009, at 02:14 PM by 172.18.179.175 -
Added lines 55-57:

Example 1. A PomdpX document.


September 15, 2009, at 02:09 PM by 172.18.179.175 -
Changed lines 60-72 from:
                                · · · </Description>
     <Description>
                                · · · </Discount>
     <Discount>
                                · · · </Variable>
     <Variable>
                                · · · </InitialStateBelief>
     <InitialStateBelief>
     <StateTransitionFunction> · · · </StateTransitionFunction>
                                · · · </ObsFunction>
     <ObsFunction>
                                · · · </RewardFunction>
     <RewardFunction>
to:
      <Description>             · · · </Description>
      <Discount>                · · · </Discount>
      <Variable>                · · · </Variable>
      <InitialStateBelief>      · · · </InitialStateBelief>
      <StateTransitionFunction> · · · </StateTransitionFunction>
      <ObsFunction>             · · · </ObsFunction>
      <RewardFunction>         · · · </RewardFunction>
September 15, 2009, at 02:06 PM by 172.18.179.175 -
Changed lines 51-54 from:

This is a trivial problem but is adequate to showcase the salient features of PomdpX. As with the original version of the problem, the \emph{Sample} action samples the rock at the rover's current location. If the rock is good, the rover receives a reward of 10 and the rock becomes bad. If the rock is bad, it receives a penalty of $-10$. Moving into the terminal area yields a reward of 10. A penalty of $-100$ is imposed for moving off the grid and sampling in a grid where there is no rock. All other moves have no cost or reward. The \emph{Check} action returns a noisy observation from $\mathcal{O}=\{Good, Bad\}$.

to:

Figure 2.2 – Dynamic Bayesian network of the RockSample problem. The

rover’s position is fully observed whereas the rock type is partially observed.

<?xml version="1.0" encoding="ISO-8859-1"?>
<pomdpx version="0.1" id="rockSample"
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xsi:noNamespaceSchemaLocation="pomdpx.xsd">
                                · · · </Description>
     <Description>
                                · · · </Discount>
     <Discount>
                                · · · </Variable>
     <Variable>
                                · · · </InitialStateBelief>
     <InitialStateBelief>
     <StateTransitionFunction> · · · </StateTransitionFunction>
                                · · · </ObsFunction>
     <ObsFunction>
                                · · · </RewardFunction>
     <RewardFunction>
</pomdpx>
September 15, 2009, at 02:00 PM by 172.18.179.175 -
Changed lines 51-54 from:
to:

This is a trivial problem but is adequate to showcase the salient features of PomdpX. As with the original version of the problem, the \emph{Sample} action samples the rock at the rover's current location. If the rock is good, the rover receives a reward of 10 and the rock becomes bad. If the rock is bad, it receives a penalty of $-10$. Moving into the terminal area yields a reward of 10. A penalty of $-100$ is imposed for moving off the grid and sampling in a grid where there is no rock. All other moves have no cost or reward. The \emph{Check} action returns a noisy observation from $\mathcal{O}=\{Good, Bad\}$.

September 15, 2009, at 01:56 PM by 172.18.179.175 -
Changed lines 40-41 from:
to:

Figure 2.1 – The 1 × 3 RockSample problem world.
Changed line 43 from:

of PomdpX. As with the original version of the problem, the Sample action

to:

of PomdpX. As with the original version of the problem, the Sample action

Changed lines 48-49 from:

is no rock. All other moves have no cost or reward. The Check action returns a noisy observation from O = {Good, Bad}.

to:

is no rock. All other moves have no cost or reward. The Check action returns a noisy observation from O = {Good, Bad}.

September 15, 2009, at 01:54 PM by 172.18.179.175 -
Changed line 31 from:

We will be using a modified version of the RockSample problem, first proposed

to:

We will be using a modified version of the RockSample problem, first proposed

Changed line 37 from:

A = {West, East, Sample, Check}. The DBN for the RockSample problem is

to:

A = {West, East, Sample, Check}. The DBN for the RockSample problem is

Changed line 51 from:
to:
September 15, 2009, at 01:53 PM by 172.18.179.175 -
Changed line 51 from:
to:
September 15, 2009, at 01:52 PM by 172.18.179.175 -
Changed line 21 from:

Figure 1.1 – The general model specified in a PomdpX document. Each of
to:

Figure 1.1 – The general model specified in a PomdpX document. Each of
Changed lines 23-24 from:

multiple fully observed (xt ) and partially observed (yt ) state variables.

to:

multiple fully observed (xt ) and partially observed (yt ) state variables.

Changed line 40 from:
to:
September 15, 2009, at 01:51 PM by 172.18.179.175 -
September 15, 2009, at 01:51 PM by 172.18.179.175 -
Changed line 40 from:
to:
September 15, 2009, at 01:51 PM by 172.18.179.175 -
Changed line 21 from:

'Figure 1.1' – The general model specified in a PomdpX document. Each of
to:

Figure 1.1 – The general model specified in a PomdpX document. Each of
Changed line 37 from:

A = {W est, East, Sample, Check}. The DBN for the RockSample problem is

to:

A = {West, East, Sample, Check}. The DBN for the RockSample problem is

Deleted lines 38-39:
September 15, 2009, at 01:47 PM by 172.18.179.175 -
Changed lines 17-18 from:

in Figure 1.1. Each of xt , yt , ot and at represents possibly multiple variables. xt represents fully observed state variables while yt represents partially observed

to:

in Figure 1.1. Each of xt , yt , ot and at represents possibly multiple variables. xt represents fully observed state variables while yt represents partially observed

Changed lines 21-27 from:

<msup> <mfenced> <mi>x</mi> <mo>+</mo> <mi>y</mi></mfenced> <mn>2</mn> </msup>

to:

'Figure 1.1' – The general model specified in a PomdpX document. Each of

xt , yt , ot and at represents multiple variables. The state (st ) is represented as multiple fully observed (xt ) and partially observed (yt ) state variables.

PomdpX Tutorial

The purpose of this section is to provide a tutorial-like approach to using the PomdpX format. We make no assumptions about the user’s familiarity with existing pomdp solvers.

2.1 Example Problem

We will be using a modified version of the RockSample problem, first proposed by Smith and Simmons [2] as our running example to encode into the PomdpX format. It models a rover on an exploration mission and it can achieve rewards by sampling rocks in its immediate area. Consider a map of size 1 × 3 as shown in Figure 2.1, with one rock at the left end and the terminal state at the right end. The rover starts off at the center and its possible actions are A = {W est, East, Sample, Check}. The DBN for the RockSample problem is shown in Figure 2.2.

September 15, 2009, at 12:48 PM by 172.18.178.237 -
Changed lines 21-28 from:

<msup> // Expression with superscript

	 <mfenced> // "base" expression (x+y) 
	  <mi>x</mi> // surrounded by parentheses 
	  <mo>+</mo>
	  <mi>y</mi>
	 </mfenced>
	 <mn>2</mn> // "script" expression 2
	</msup>
to:

<msup> <mfenced> <mi>x</mi> <mo>+</mo> <mi>y</mi></mfenced> <mn>2</mn> </msup>

September 15, 2009, at 12:47 PM by 172.18.178.237 -
Added lines 20-29:

<msup> // Expression with superscript

	 <mfenced> // "base" expression (x+y) 
	  <mi>x</mi> // surrounded by parentheses 
	  <mo>+</mo>
	  <mi>y</mi>
	 </mfenced>
	 <mn>2</mn> // "script" expression 2
	</msup>
September 15, 2009, at 12:41 PM by 172.18.178.237 -
Changed lines 17-18 from:

in Figure 1.1. Each of x^t^ , y^t^ , o^t^ and at represents possibly multiple variables. x^t^ represents fully observed state variables while y^t^ represents partially observed

to:

in Figure 1.1. Each of xt , yt , ot and at represents possibly multiple variables. xt represents fully observed state variables while yt represents partially observed

September 15, 2009, at 12:40 PM by 172.18.178.237 -
Changed lines 17-18 from:

in Figure 1.1. Each of xt , yt , ot and at represents possibly multiple variables. xt represents fully observed state variables while yt represents partially observed

to:

in Figure 1.1. Each of x^t^ , y^t^ , o^t^ and at represents possibly multiple variables. x^t^ represents fully observed state variables while y^t^ represents partially observed

September 15, 2009, at 12:32 PM by 172.18.178.237 -
Changed line 21 from:
to:
September 15, 2009, at 12:32 PM by 172.18.178.237 -
Changed line 32 from:
to:
September 15, 2009, at 12:31 PM by 172.18.178.237 -
Changed lines 21-22 from:
to:
Changed line 32 from:
to:
September 15, 2009, at 12:31 PM by 172.18.178.237 -
Changed line 32 from:
to:
September 15, 2009, at 12:21 PM by 172.18.178.237 -
Changed line 23 from:
    This is a trivial problem but is adequate to showcase the salient features
to:

This is a trivial problem but is adequate to showcase the salient features

September 15, 2009, at 12:21 PM by 172.18.178.237 -
Changed line 32 from:
to:
September 15, 2009, at 12:19 PM by 172.18.178.237 -
Added lines 22-32:
    This is a trivial problem but is adequate to showcase the salient features

of PomdpX. As with the original version of the problem, the Sample action samples the rock at the rover’s current location. If the rock is good, the rover receives a reward of 10 and the rock becomes bad. If the rock is bad, it receives a penalty of −10. Moving into the terminal area yields a reward of 10. A penalty of −100 is imposed for moving off the grid and sampling in a grid where there is no rock. All other moves have no cost or reward. The Check action returns a noisy observation from O = {Good, Bad}.
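
The reward rules described above can be sketched as a plain Python function. This is an illustration under stated assumptions, not the encoding used in the PomdpX file: grid cells are numbered 0 (rock), 1 (start) and 2 (terminal area), the penalty of −100 is read as applying to either moving off the grid or sampling where there is no rock, and the rover is assumed never to act from the terminal area. The name `reward` and the action strings are hypothetical.

```python
ROCK_POS, START_POS, EXIT_POS = 0, 1, 2


def reward(position, rock_is_good, action):
    """Immediate reward for taking `action` at `position` (0 or 1)."""
    if action == "Sample":
        if position != ROCK_POS:
            return -100          # sampling where there is no rock
        return 10 if rock_is_good else -10
    if action == "West":
        # Moving west from the leftmost cell leaves the grid.
        return -100 if position - 1 < 0 else 0
    if action == "East":
        # Moving east from the centre enters the terminal area.
        return 10 if position + 1 == EXIT_POS else 0
    if action == "Check":
        return 0                 # checking has no cost or reward
    raise ValueError(f"unknown action: {action}")


if __name__ == "__main__":
    for act in ("West", "East", "Sample", "Check"):
        print(act, reward(START_POS, True, act))
```

The function covers rewards only; the transition that turns a good rock bad after sampling would belong to a separate state-transition sketch.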

September 15, 2009, at 12:17 PM by 172.18.178.237 -
Changed line 21 from:
to:
September 15, 2009, at 11:24 AM by 172.18.178.201 -
Changed line 21 from:
to:
September 15, 2009, at 10:25 AM by 172.18.178.237 -
Changed lines 3-8 from:

This title will appear in the TOC, which is included in the cookbook.

You can consult the source of the svn-howto wiki for the markups, fancy headers etc. ~ Kok Sung

to:

Overview

PomdpX is an XML file format for describing pomdps (partially observable Markov decision processes), momdps (mixed observability Markov decision processes) [1] and mdps (Markov decision processes) in a factored representation. It allows multiple state, observation, action and reward variables to be specified in a model. The specified model must have at least one state, action and reward variable, while the observation variable is optional. Each state variable must be specified as either partially observed (default) or fully observed. Thus a PomdpX input document can specify a pomdp (all state variables partially observed), a momdp (mixture of partially observed and fully observed state variables), or an mdp (all state variables fully observed, no observation variables) problem. In general, the model can be represented by the dynamic Bayesian network (DBN) in Figure 1.1. Each of xt, yt, ot and at represents possibly multiple variables. xt represents fully observed state variables while yt represents partially observed state variables. (The reward variables are omitted to prevent clutter.)

September 14, 2009, at 04:04 PM by 172.18.178.201 -
Changed line 3 from:

PomdpX Documentation Goes here

to:

This title will appear in the TOC, which is included in the cookbook.

September 14, 2009, at 04:03 PM by 172.18.178.201 -
Changed line 7 from:

You can consult the source of the svn-howto wiki for the markups, fancy headers etc.

to:

You can consult the source of the svn-howto wiki for the markups, fancy headers etc. ~ Kok Sung

September 14, 2009, at 04:03 PM by 172.18.178.201 -
Changed lines 3-18 from:

URL Locations Here

The root svn repository is located at http://bigbird.comp.nus.edu.sg/motion/. It is further sub-divided into 2 other directories, namely, "proj" and "paper".

Project

The directory "proj" will be used to house code developments. Each code development will be an individual repository so that we can keep track of its revision number. For example the project appl can be checked out at, http://bigbird.comp.nus.edu.sg/motion/proj/appl and the momdp code can be checked out at, http://bigbird.comp.nus.edu.sg/motion/proj/momdp.

If one were to ssh into bigbird, the different projects can be found as individual folders named "appl" and "momdp" under /home/motion/proj. This is the mapping that Apache DAV has for the svn repositories.

Paper

The directory "paper" is set up slightly differently from "proj", in that there is only one repository, named trunk, which houses multiple projects. The reason for this setup is that we can then remotely invoke the command 'svn mkdir' without having to log into the bigbird server and issue an svnadmin create. This removes the hassle of having to ssh into the main server.

A side-effect of this setup is that there will be a global revision number of all projects under the trunk repository, but this can be tolerated. As an example, the ISRR paper can be checked out at http://bigbird.comp.nus.edu.sg/motion/paper/trunk/isrr09ges/ and NIPS2009 paper at http://bigbird.comp.nus.edu.sg/motion/paper/trunk/nips2009/.

to:

PomdpX Documentation Goes here

Changed line 7 from:

Note: access to the svn repository is now restricted. All actions, e.g. viewing via web-browser, checking out and committing changes, require user authentication.

to:

You can consult the source of the svn-howto wiki for the markups, fancy headers etc.

Deleted lines 9-213:

Standard Workflow for Fixing Bug (on Trunk)

1: Checkout Code (One time only)

2: Update local copies from SVN server

3: Make changes, fix bugs

4: Update local copies and manually fix conflicts if there is any

5: Commit changes

(Repeat 2-5)

Standard Workflow for Experimenting (on Branches)

1: Checkout Code (One time only)

2: Make a new branch

3: Do whatever with the branch

4: Apply bug fixes from trunk to branch if necessary (Merging)

5: Delete the branch if it is no longer needed

6: Reintegrate the branch to trunk if it is successful

Initial Checkout

To check out a repository, simply issue the command below. In the example, the trunk folder of the auv repository is being checked out.

koksung@goskunk:~/temp$ svn co http://bigbird.comp.nus.edu.sg/motion/proj/auv/trunk/
A trunk/AUVmedium5depths.pomdp
A trunk/AUVmedium5depths.xml
A trunk/AUVPomdpgenerator.py
A trunk/AUVXMLgenerator.py
Checked out revision 1.
koksung@goskunk:~/temp$

Meaning of the Prefix Letters

Below are some of the more common letters that one encounters during the daily usage of svn and their meanings.

  • 'A' -- Addition
  • 'D' -- Deletion
  • 'C' -- Conflict exists with this file
  • 'G' -- Modifications have been merged successfully
  • 'M' -- File has been modified since last svn update
  • 'R' -- Replaced
  • '?' -- Not in repository, svn doesn't know about it

Creating a New Repository

A new code development repository

Suppose you want to create a new svn project repository, let's call it auv. Firstly create 2 folders, "trunk" and "branches" in your auv project folder. Then move all your source files into the trunk folder. Then issue the following commands in the example shown:

koksung@bigbird:~$ su - www-data
$ bash
www-data@bigbird:~$
www-data@bigbird:~$ svnadmin create /home/motion/proj/auv
www-data@bigbird:~$ svn import /home/koksung/myproject/auv/ file:///home/motion/proj/auv/ -m "Initial Import"
Adding auv/trunk
Adding auv/trunk/AUVmedium5depths.pomdp
Adding auv/trunk/AUVmedium5depths.xml
Adding auv/trunk/AUVPomdpgenerator.py
Adding auv/trunk/AUVXMLgenerator.py
Adding auv/branches

Committed revision 1.
www-data@bigbird:~$ exit
$ exit
koksung@bigbird:~$

Explanation of the above commands: Our server is Ubuntu which is a Debian derivative. The user id that accesses the svn repository via the Apache http protocol is given by the default: www-data. So by initially creating the project folder and importing the files as the user www-data, this eliminates the additional step of having to setup proper permissions for group and others.

Some IMPORTANT things to note, you have to issue the command 'bash' to change www-data user's shell from sh to bash. This will set the environment and path properly for you. In issuing the 'svn import' command, please provide the full path (starting from /) of your project for svn to import from. As the user, www-data is not part of your group, you have to ensure that the proper read and execute bits are set for www-data to import.

Your newly created svn repository is now ready for check out at the URL: http://bigbird.comp.nus.edu.sg/motion/proj/auv/
Note: auv is the example project folder here, remember to change auv to the name of your project. Please send me an email if you still encounter permission problems, koksung

A new paper project directory

As the paper directory is set up to have only one repository, the most straightforward way to create a new paper directory is to first check out the trunk of the paper, i.e. http://bigbird.comp.nus.edu.sg/motion/paper/trunk/, and then, in your local copy, issue "svn mkdir <name of paper project>". Lastly, do a svn check-in to commit the creation of this new paper project directory.

Remember, the command to issue is "svn mkdir" and not "mkdir", the latter only creates a local directory whereas the former will direct svn to create a project directory at the paper repository the next time you commit changes.

Deleting a File in a repository

Note that svn's design philosophy is "Make local changes, commit for global effect". This applies to almost every possible action. E.g. to delete a file, you delete your LOCAL copy and then commit to sync the svn server.

Issue the following commands to delete a file. Notice the line: "D AUVPomdpgenerator.py", the letter 'D' means the file is scheduled for deletion and it will take effect the next time you commit changes.

koksung@bigbird:~/temp/auv/trunk$ ls -l
total 16364
-rw-r--r-- 1 koksung koksung 21264 Mar 17 09:13 AUVPomdpgenerator.py
-rw-r--r-- 1 koksung koksung 26111 Mar 17 09:13 AUVXMLgenerator.py
-rw-r--r-- 1 koksung koksung 16219340 Mar 17 09:13 AUVmedium5depths.pomdp
-rw-r--r-- 1 koksung koksung 455664 Mar 17 09:13 AUVmedium5depths.xml
koksung@bigbird:~/temp/auv/trunk$
koksung@bigbird:~/temp/auv/trunk$ svn delete AUVPomdpgenerator.py
D AUVPomdpgenerator.py
koksung@bigbird:~/temp/auv/trunk$ svn commit -m "Deleted AUVPomdpgenerator.py"
Deleting trunk/AUVPomdpgenerator.py

Committed revision 2.
koksung@bigbird:~/temp/auv/trunk$ ls -l
total 16340
-rw-r--r-- 1 koksung koksung 26111 Mar 17 09:13 AUVXMLgenerator.py
-rw-r--r-- 1 koksung koksung 16219340 Mar 17 09:13 AUVmedium5depths.pomdp
-rw-r--r-- 1 koksung koksung 455664 Mar 17 09:13 AUVmedium5depths.xml
koksung@bigbird:~/temp/auv/trunk$

Adding a File to subversion control in an existing repository

The scenario: suppose you have checked-out a repository, you created a file and would like to add it to subversion control to this existing checked-out repository. Follow the steps below. Notice the line:"A aNewFile.txt", the letter 'A' indicates that the file has been made known to svn and it will be added to subversion control during the next commit. Once again, svn's philosophy: "Make local changes, commit for global effect".

koksung@bigbird:~/temp/auv/trunk$ ls -l
total 16344
-rw-r--r-- 1 koksung koksung 26111 Mar 17 09:13 AUVXMLgenerator.py
-rw-r--r-- 1 koksung koksung 16219340 Mar 17 09:13 AUVmedium5depths.pomdp
-rw-r--r-- 1 koksung koksung 455664 Mar 17 09:13 AUVmedium5depths.xml
-rw-r--r-- 1 koksung koksung 19 Mar 17 09:23 aNewFile.txt
koksung@bigbird:~/temp/auv/trunk$ svn add aNewFile.txt
A aNewFile.txt
koksung@bigbird:~/temp/auv/trunk$ svn commit -m "added a new file to subversion control--aNewFile.txt"
Adding trunk/aNewFile.txt
Transmitting file data.
Committed revision 3.
koksung@bigbird:~/temp/auv/trunk$

Deleting an entire repository

Deleting an entire repository is different from deleting a file. For this, we can remote login to the svn server (i.e. bigbird) and remove the entire project folder as the user www-data. This is the most straightforward way.

koksung@bigbird:~$ su - www-data
$ bash
www-data@bigbird:~$ rm -rf /home/motion/proj/auv
www-data@bigbird:~$ exit
$ exit
koksung@bigbird:~$

Update Local Copy from Server

To update your local copy, use svn update

$ svn update

Commit Changes Back to Server

After you have made changes to the local copy, you need to commit them back to the server so that others can see them. Go to the directory that you would like to commit, then:

$ svn commit -m "Fixed a bug"

Branching

Suppose you would like to branch off the project for some private development, this can be easily done in svn---you make a copy of the project in the repository using the svn copy command.

koksung@goskunk:~/temp/trunk$ svn copy http://bigbird.comp.nus.edu.sg/motion/proj/auv/trunk \
> http://bigbird.comp.nus.edu.sg/motion/proj/auv/branches/my-private-branch \
> -m "Creating a private branch"
Committed revision 3.
koksung@goskunk:~/temp/trunk$

Now that you've created a branch of the project, you can check out a new working copy to start using it. The check out command is as mentioned above. There's nothing special about this working copy; it simply mirrors a different directory in the repository. When you commit changes, however, others won't see them when they update, because their working copies are of /auv/trunk.

Merging From Trunk to Branch

Suppose that you have made some bug-fixes on the trunk, and you would like to apply those to your branch too.

$ svn merge http://bigbird.comp.nus.edu.sg/motion/proj/auv/trunk \
> http://bigbird.comp.nus.edu.sg/motion/proj/auv/branches/my-private-branch
$

There may be some conflicts. After resolving these conflicts manually, you need to commit these changes back to server.

$ svn commit -m "Merged bug-fixes from trunk"

Reintegrate Branch back to Trunk

When a branch is mature enough, you may reintegrate the branch back to trunk. The result is to make the trunk identical to the branch.

Note: Remember to merge all bug-fixes from trunk to your branch first. And email everyone before reintegrating branch back to trunk.

$ svn merge --reintegrate http://bigbird.comp.nus.edu.sg/motion/proj/auv/branches/my-private-branch \
> http://bigbird.comp.nus.edu.sg/motion/proj/auv/trunk
$

Resolve possible conflicts, then do a commit

$ svn commit -m "Reintegrate branch back to trunk"

September 14, 2009, at 02:30 PM by 172.18.178.201 -
Changed line 3 from:

URL Locations

to:

URL Locations Here

August 19, 2009, at 08:21 AM by 172.18.178.201 -
Changed lines 110-112 from:

Lastly, do a svn checkin to commit the creation of this new paper project directory.

to:

Lastly, do a svn check-in to commit the creation of this new paper project directory.

Remember, the command to issue is "svn mkdir" and not "mkdir", the latter only creates a local directory whereas the former will direct svn to create a project directory at the paper repository the next time you commit changes.

August 12, 2009, at 02:13 PM by 172.18.178.201 -
Changed line 2 from:

asd

to:
August 12, 2009, at 02:13 PM by 172.18.178.201 -
Changed line 2 from:
to:

asd

August 12, 2009, at 10:03 AM by 172.18.178.201 -
Changed line 52 from:

Initial Checkout

to:

Initial Checkout

Changed line 170 from:

Update Local Copy from Server

to:

Update Local Copy from Server

Changed line 177 from:

Commit Changes Back to Server

to:

Commit Changes Back to Server

Changed line 184 from:

Branching

to:

Branching

Changed line 198 from:

Merging From Trunk to Branch

to:

Merging From Trunk to Branch

Changed line 211 from:

Reintegrate Branch back to Trunk

to:

Reintegrate Branch back to Trunk

August 12, 2009, at 09:59 AM by 172.18.178.201 -
Added lines 1-2:

(:toc:)

August 11, 2009, at 05:22 PM by 172.18.178.201 -
Changed line 215 from:

svn merge --reintegrate http://bigbird.comp.nus.edu.sg/motion/proj/auv/branches/my-private-branch \ \\

to:

$ svn merge --reintegrate http://bigbird.comp.nus.edu.sg/motion/proj/auv/branches/my-private-branch \ \\

August 11, 2009, at 05:21 PM by 172.18.178.201 -
Changed line 216 from:

> http://bigbird.comp.nus.edu.sg/motion/proj/auv/trunk

to:

> http://bigbird.comp.nus.edu.sg/motion/proj/auv/trunk \\

Changed line 222 from:

svn commit -m "Reintegrate branch back to trunk"

to:

$ svn commit -m "Reintegrate branch back to trunk"

August 11, 2009, at 05:20 PM by 172.18.178.201 -
August 11, 2009, at 05:10 PM by 172.18.178.201 -
August 11, 2009, at 11:21 AM by 172.18.178.201 -
Added lines 168-223:

Update Local Copy from Server

To update your local copy, use svn update

$ svn update

Commit Changes Back to Server

After you have made changes to the local copy, you need to commit them back to the server so that others can see them. Go to the directory that you would like to commit, then:

$ svn commit -m "Fixed a bug"

Branching

Suppose you would like to branch off the project for some private development, this can be easily done in svn---you make a copy of the project in the repository using the svn copy command.

koksung@goskunk:~/temp/trunk$ svn copy http://bigbird.comp.nus.edu.sg/motion/proj/auv/trunk \
> http://bigbird.comp.nus.edu.sg/motion/proj/auv/branches/my-private-branch \
> -m "Creating a private branch"
Committed revision 3.
koksung@goskunk:~/temp/trunk$

Now that you've created a branch of the project, you can check out a new working copy to start using it. The check out command is as mentioned above. There's nothing special about this working copy; it simply mirrors a different directory in the repository. When you commit changes, however, others won't see them when they update, because their working copies are of /auv/trunk.

Merging From Trunk to Branch

Suppose that you have made some bug-fixes on the trunk, and you would like to apply those to your branch too.

$ svn merge http://bigbird.comp.nus.edu.sg/motion/proj/auv/trunk \
> http://bigbird.comp.nus.edu.sg/motion/proj/auv/branches/my-private-branch
$

There may be some conflicts. After resolving these conflicts manually, you need to commit these changes back to server.

$ svn commit -m "Merged bug-fixes from trunk"

Reintegrate Branch back to Trunk

When a branch is mature enough, you may reintegrate the branch back to trunk. The result is to make the trunk identical to the branch.

Note: Remember to merge all bug-fixes from trunk to your branch first. And email everyone before reintegrating branch back to trunk.

svn merge --reintegrate http://bigbird.comp.nus.edu.sg/motion/proj/auv/branches/my-private-branch \
> http://bigbird.comp.nus.edu.sg/motion/proj/auv/trunk $

Resolve possible conflicts, then do a commit

svn commit -m "Reintegrate branch back to trunk"

August 11, 2009, at 11:09 AM by 172.18.178.201 -
Added lines 135-165:

Adding a File to subversion control in an existing repository

The scenario: suppose you have checked-out a repository, you created a file and would like to add it to subversion control to this existing checked-out repository. Follow the steps below. Notice the line:"A aNewFile.txt", the letter 'A' indicates that the file has been made known to svn and it will be added to subversion control during the next commit. Once again, svn's philosophy: "Make local changes, commit for global effect".

koksung@bigbird:~/temp/auv/trunk$ ls -l
total 16344
-rw-r--r-- 1 koksung koksung 26111 Mar 17 09:13 AUVXMLgenerator.py
-rw-r--r-- 1 koksung koksung 16219340 Mar 17 09:13 AUVmedium5depths.pomdp
-rw-r--r-- 1 koksung koksung 455664 Mar 17 09:13 AUVmedium5depths.xml
-rw-r--r-- 1 koksung koksung 19 Mar 17 09:23 aNewFile.txt
koksung@bigbird:~/temp/auv/trunk$ svn add aNewFile.txt
A aNewFile.txt
koksung@bigbird:~/temp/auv/trunk$ svn commit -m "added a new file to subversion control--aNewFile.txt"
Adding trunk/aNewFile.txt
Transmitting file data.
Committed revision 3.
koksung@bigbird:~/temp/auv/trunk$

Deleting an entire repository

Deleting an entire repository is different from deleting a file. For this, we can remote login to the svn server (i.e. bigbird) and remove the entire project folder as the user www-data. This is the most straightforward way.

koksung@bigbird:~$ su - www-data
$ bash
www-data@bigbird:~$ rm -rf /home/motion/proj/auv
www-data@bigbird:~$ exit
$ exit
koksung@bigbird:~$

August 11, 2009, at 11:05 AM by 172.18.178.201 -
Changed lines 17-18 from:

Note: access to the svn repository is now restricted. All actions, e.g. viewing via web-browser, checking out and committing changes, require user authentication.

to:

Note: access to the svn repository is now restricted. All actions, e.g. viewing via web-browser, checking out and committing changes, require user authentication.

Changed lines 35-136 from:

(Repeat 2-5)

to:

(Repeat 2-5)

Standard Workflow for Experimenting (on Branches)

1: Checkout Code (One time only)

2: Make a new branch

3: Do whatever with the branch

4: Apply bug fixes from trunk to branch if necessary (Merging)

5: Delete the branch if it is no longer needed

6: Reintegrate the branch to trunk if it is successful
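The branch steps above can be sketched as a few shell helpers. This is only an illustration under stated assumptions: the function names and the run wrapper are made up here, the repository URL reuses the auv example from this page, and the merge commands assume Subversion 1.5 or later (where merge tracking and --reintegrate exist). By default the helpers only print the svn commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Hypothetical helpers for the branch workflow; adjust REPO_URL for your project.
REPO_URL="http://bigbird.comp.nus.edu.sg/motion/proj/auv"

# Print the command; execute it only when DRY_RUN=0.
run() {
    printf '+ %s\n' "$*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Step 2: create the branch as a server-side copy of trunk.
branch_create() {
    run svn copy "$REPO_URL/trunk" "$REPO_URL/branches/$1" -m "Create branch $1"
}

# Step 4: run inside a working copy of the branch to pull in trunk fixes.
branch_sync() {
    run svn merge "$REPO_URL/trunk"
    run svn commit -m "Merge trunk changes into branch"
}

# Step 5: delete a branch that is no longer needed.
branch_delete() {
    run svn delete "$REPO_URL/branches/$1" -m "Remove branch $1"
}

# Step 6: run inside an up-to-date trunk working copy without local changes.
branch_reintegrate() {
    run svn merge --reintegrate "$REPO_URL/branches/$1"
    run svn commit -m "Reintegrate branch $1 back to trunk"
}
```

For example, "branch_create my-private-branch" prints the svn copy command that would create the branch, so the command can be inspected before running it for real with DRY_RUN=0.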

Initial Checkout

To check out a repository, simply issue the command below. In the example, the trunk folder of the auv repository is being checked out.

koksung@goskunk:~/temp$ svn co http://bigbird.comp.nus.edu.sg/motion/proj/auv/trunk/
A trunk/AUVmedium5depths.pomdp
A trunk/AUVmedium5depths.xml
A trunk/AUVPomdpgenerator.py
A trunk/AUVXMLgenerator.py
Checked out revision 1.
koksung@goskunk:~/temp$

Meaning of the Prefix Letters

Below are some of the more common letters that one encounters during the daily usage of svn and their meanings.

  • 'A' -- Addition
  • 'D' -- Deletion
  • 'C' -- Conflict exists with this file
  • 'G' -- Modifications have been merged successfully
  • 'M' -- File has local modifications that have not been committed yet
  • 'R' -- Replaced
  • '?' -- Not in repository, svn doesn't know about it
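As a small aid when reading svn output, the list above can be turned into a lookup function. This is a minimal sketch; the function name svn_letter_meaning is made up for this page, and only the letters listed above are covered.

```shell
#!/bin/sh
# Hypothetical helper: translate an svn prefix letter into its meaning.
svn_letter_meaning() {
    case "$1" in
        A)   echo "Addition" ;;
        D)   echo "Deletion" ;;
        C)   echo "Conflict exists with this file" ;;
        G)   echo "Modifications have been merged successfully" ;;
        M)   echo "File has local modifications" ;;
        R)   echo "Replaced" ;;
        '?') echo "Not in repository, svn doesn't know about it" ;;
        *)   echo "Letter not covered by this list"; return 1 ;;
    esac
}
```

Calling svn_letter_meaning with the first column of a line of svn output, e.g. svn_letter_meaning C, prints the corresponding description.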

Creating a New Repository

A new code development repository

Suppose you want to create a new svn project repository; let's call it auv. First create two folders, "trunk" and "branches", in your auv project folder and move all your source files into the trunk folder. Then issue the commands shown in the following example:

koksung@bigbird:~$ su - www-data
$ bash
www-data@bigbird:~$
www-data@bigbird:~$ svnadmin create /home/motion/proj/auv
www-data@bigbird:~$ svn import /home/koksung/myproject/auv/ file:///home/motion/proj/auv/ -m "Initial Import"
Adding auv/trunk
Adding auv/trunk/AUVmedium5depths.pomdp
Adding auv/trunk/AUVmedium5depths.xml
Adding auv/trunk/AUVPomdpgenerator.py
Adding auv/trunk/AUVXMLgenerator.py
Adding auv/branches

Committed revision 1.
www-data@bigbird:~$ exit
$ exit
koksung@bigbird:~$

Explanation of the above commands: our server runs Ubuntu, which is a Debian derivative. On such systems, the default user id under which Apache accesses the svn repository over http is www-data. Creating the project folder and importing the files as www-data therefore eliminates the additional step of setting up proper permissions for group and others.

Some IMPORTANT things to note: you have to issue the command 'bash' to change the www-data user's shell from sh to bash, which sets the environment and path properly for you. When issuing the 'svn import' command, please provide the full path (starting from /) of the project to import from. As the user www-data is not part of your group, you have to ensure that the proper read and execute bits are set so that www-data can read your files.
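One way to check the permission bits before running 'svn import' is to list everything in the project folder that "others" (which includes www-data) cannot read. The sketch below is illustrative only: the function name is made up here, and it does not look at the parent directories above the project folder, which also need the execute bit for others.

```shell
#!/bin/sh
# Hypothetical helper: print directories lacking o+rx and files lacking o+r
# below the given project folder. Empty output means nothing was flagged.
list_unreadable_by_others() {
    find "$1" \( -type d ! -perm -005 \) -o \( -type f ! -perm -004 \)
}
```

For example, list_unreadable_by_others /home/koksung/myproject/auv prints the offending paths, which can then be fixed with chmod before switching to the www-data user.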

Your newly created svn repository is now ready for check out at the URL: http://bigbird.comp.nus.edu.sg/motion/proj/auv/
Note: auv is the example project folder here, remember to change auv to the name of your project. Please send me an email if you still encounter permission problems, koksung

A new paper project directory

As the paper directory is set up to have only one repository, the most straightforward way to create a new paper directory is to first check out the trunk of the paper repository, i.e. http://bigbird.comp.nus.edu.sg/motion/paper/trunk/, and then issue "svn mkdir <name of paper project>" in your local copy. Lastly, do an svn commit to record the creation of this new paper project directory on the server.
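The three steps for a new paper project can be sketched as follows. The function name, the run wrapper, and the argument order are assumptions made for this illustration. By default the commands are only printed; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Hypothetical helper for creating a new paper project directory.
PAPER_TRUNK_URL="http://bigbird.comp.nus.edu.sg/motion/paper/trunk"

# Print the command; execute it only when DRY_RUN=0.
run() {
    printf '+ %s\n' "$*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# $1 = name of the paper project, $2 = local directory for the working copy
new_paper_project() {
    run svn checkout "$PAPER_TRUNK_URL" "$2"                # check out the paper trunk
    run svn mkdir "$2/$1"                                   # schedule the new directory
    run svn commit "$2" -m "Create paper project directory $1"
}
```

For example, "new_paper_project mypaper paper-trunk" (both names are placeholders) prints the checkout, mkdir and commit commands in the order they would be run.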

Deleting a File in a repository

Note that svn's design philosophy is "Make local changes, commit for global effect". This applies to almost every possible action. E.g. to delete a file, you delete your LOCAL copy and then commit to sync the svn server.

Issue the following commands to delete a file. Notice the line: "D AUVPomdpgenerator.py", the letter 'D' means the file is scheduled for deletion and it will take effect the next time you commit changes.

koksung@bigbird:~/temp/auv/trunk$ ls -l
total 16364
-rw-r--r-- 1 koksung koksung 21264 Mar 17 09:13 AUVPomdpgenerator.py
-rw-r--r-- 1 koksung koksung 26111 Mar 17 09:13 AUVXMLgenerator.py
-rw-r--r-- 1 koksung koksung 16219340 Mar 17 09:13 AUVmedium5depths.pomdp
-rw-r--r-- 1 koksung koksung 455664 Mar 17 09:13 AUVmedium5depths.xml
koksung@bigbird:~/temp/auv/trunk$
koksung@bigbird:~/temp/auv/trunk$ svn delete AUVPomdpgenerator.py
D AUVPomdpgenerator.py
koksung@bigbird:~/temp/auv/trunk$ svn commit -m "Deleted AUVPomdpgenerator.py"
Deleting trunk/AUVPomdpgenerator.py

Committed revision 2.
koksung@bigbird:~/temp/auv/trunk$ ls -l
total 16340
-rw-r--r-- 1 koksung koksung 26111 Mar 17 09:13 AUVXMLgenerator.py
-rw-r--r-- 1 koksung koksung 16219340 Mar 17 09:13 AUVmedium5depths.pomdp
-rw-r--r-- 1 koksung koksung 455664 Mar 17 09:13 AUVmedium5depths.xml
koksung@bigbird:~/temp/auv/trunk$

August 11, 2009, at 10:26 AM by 172.18.178.201 -
Changed lines 1-12 from:

Welcome to PmWiki!

A local copy of PmWiki's documentation has been installed along with the software, and is available via the documentation index.

To continue setting up PmWiki, see initial setup tasks.

The basic editing page describes how to create pages in PmWiki. You can practice editing in the wiki sandbox.

More information about PmWiki is available from http://www.pmwiki.org .

to:

URL Locations

The root svn repository is located at http://bigbird.comp.nus.edu.sg/motion/. It is further sub-divided into 2 other directories, namely, "proj" and "paper".

Project

The directory "proj" will be used to house code developments. Each code development will be an individual repository so that we can keep track of its own revision number. For example, the project appl can be checked out at http://bigbird.comp.nus.edu.sg/motion/proj/appl and the momdp code can be checked out at http://bigbird.comp.nus.edu.sg/motion/proj/momdp.

If one were to ssh into bigbird, the different projects can be found as individual folders named "appl" and "momdp" under /home/motion/proj. This is the mapping that apache dav has for the svn repositories.

Paper

The directory "paper" is set up slightly differently from "proj", in that there is only one repository, named trunk, which houses multiple paper projects. The reason for this setup is that we can then invoke the command 'svn mkdir' remotely, without having to log into the bigbird server and issue an 'svnadmin create'. This removes the hassle of having to ssh into the main server.

A side-effect of this setup is that there is a single global revision number shared by all projects under the trunk repository, but this can be tolerated. As an example, the ISRR paper can be checked out at http://bigbird.comp.nus.edu.sg/motion/paper/trunk/isrr09ges/ and the NIPS2009 paper at http://bigbird.comp.nus.edu.sg/motion/paper/trunk/nips2009/.
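The URL layout described above can be summarised by two small helpers that build the checkout URL for a code project or a paper project. This is only a sketch; the function names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers that build checkout URLs from the layout described above.
MOTION_ROOT="http://bigbird.comp.nus.edu.sg/motion"

# Code projects: one repository per project under proj/.
proj_url() {
    printf '%s\n' "$MOTION_ROOT/proj/$1"
}

# Paper projects: one directory per paper inside the single paper/trunk repository.
paper_url() {
    printf '%s\n' "$MOTION_ROOT/paper/trunk/$1/"
}
```

For example, svn co "$(proj_url momdp)" checks out the momdp code, and svn co "$(paper_url nips2009)" checks out the NIPS2009 paper.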

Note: access to the svn repository is now restricted. All actions, e.g. viewing via web-browser, checking out and committing changes, require user authentication.

Standard Workflow for Fixing Bugs (on Trunk)

1: Checkout Code (One time only)

2: Update local copies from SVN server

3: Make changes, fix bugs

4: Update local copies and manually fix conflicts if there are any

5: Commit changes

(Repeat 2-5)
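The trunk cycle above can be sketched with the corresponding svn commands. The function names and the run wrapper are assumptions made for this illustration, and the URL reuses the auv example from this page. By default the commands are only printed; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Hypothetical helpers for the trunk bug-fix cycle; adjust TRUNK_URL for your project.
TRUNK_URL="http://bigbird.comp.nus.edu.sg/motion/proj/auv/trunk"

# Print the command; execute it only when DRY_RUN=0.
run() {
    printf '+ %s\n' "$*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Step 1 (one time only): check out a working copy into directory $1.
trunk_checkout() {
    run svn checkout "$TRUNK_URL" "$1"
}

# Steps 2 and 4: run inside the working copy, before and after editing.
trunk_update() {
    run svn update
    run svn status    # lines starting with 'C' must be resolved by hand
}

# Step 5: run inside the working copy; $1 is the log message.
trunk_commit() {
    run svn commit -m "$1"
}
```

A typical round is trunk_update, then editing (step 3), then trunk_update again, and finally trunk_commit "describe the fix".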