Planners
The core of SymbolicPlanners.jl is a library of planning algorithms, or Planner
s:
SymbolicPlanners.Planner
— Typeabstract type Planner
Abstract planner type. Once constructed, a planner
can be run on a domain
, initial state
, and a spec
, returning a Solution
. A problem
can be provided instead of a state
and spec
:
planner(domain::Domain, state::State, spec)
planner(domain::Domain, problem::Problem)
Planner
s define the following interface:
SymbolicPlanners.solve
— Functionsolve(planner::Planner, domain::Domain, state::State, spec::Specification)
Solve a planning problem using the specified planner
.
New subtypes of Planner
should implement this method.
SymbolicPlanners.refine!
— Functionrefine!(sol::Solution, planner::Planner, domain::Domain, state::State, spec)
Refine an existing solution (sol
) to a planning problem (in-place). The spec
argument can be provided as a Specification
or as one or more goal Term
s to be satisfied.
SymbolicPlanners.refine
— Functionrefine(sol::Solution, planner::Planner, domain::Domain, state::State, spec)
Refine a copy of an existing solution to a planning problem.
Breadth-First Search
As a baseline uninformed planning algorithm, one can use BreadthFirstPlanner
:
SymbolicPlanners.BreadthFirstPlanner
— TypeBreadthFirstPlanner(;
max_nodes::Int = typemax(Int),
max_time::Float64 = Inf,
save_search::Bool = false,
save_search_order::Bool = save_search,
verbose::Bool = false,
callback = verbose ? LoggerCallback() : nothing
)
Breadth-first search planner. Nodes are expanded in order of increasing distance from the initial state (skipping previously visited nodes).
Returns a PathSearchSolution
or NullSolution
, similar to ForwardPlanner
.
Arguments
max_nodes
: Maximum number of search nodes before termination.max_time
: Maximum time in seconds before planner times out.save_search
: Flag to save the search tree and frontier in the returned solution.save_search_order
: Flag to save the node expansion order in the returned solution.verbose
: Flag to print debug information during search.callback
: Callback function for logging, etc.
Note that BreadthFirstPlanner
will ignore plan metrics or action costs, returning a plan that has the shortest number of steps, but not necessarily the lowest cost.
Forward Heuristic Search
The modern day workhorse for automated planning is forward heuristic search, implemented by ForwardPlanner
.
SymbolicPlanners.ForwardPlanner
— TypeForwardPlanner(;
heuristic::Heuristic = GoalCountHeuristic(),
search_noise::Union{Nothing,Float64} = nothing,
g_mult::Float32 = 1.0f0,
h_mult::Float32 = 1.0f0,
max_nodes::Int = typemax(Int),
max_time::Float64 = Inf,
fail_fast::Bool = false,
refine_method::Symbol = :continue,
reset_node_count::Bool = true,
save_search::Bool = true,
save_search_order::Bool = true,
save_parents::Bool = false,
save_children::Bool = false,
verbose::Bool = false,
callback = verbose ? LoggerCallback() : nothing
)
Forward best-first search planner, which encompasses uniform-cost search, greedy search, and A* search. Each node $n$ is expanded in order of increasing priority $f(n)$, defined as:
\[f(n) = g_\text{mult} \cdot g(n) + h_\text{mult} \cdot h(n)\]
where $g(n)$ is the path cost from the initial state to $n$, and $h(n)$ is the heuristic's goal distance estimate.
Returns a PathSearchSolution
if the goal is achieved, containing a plan that reaches the goal node, and status
set to :success
. If the node or time budget runs out, the solution will instead contain a partial plan to the last node selected for expansion, with status
set to :max_nodes
or :max_time
accordingly.
If save_search
is true, the returned solution will contain the search tree and frontier so far. If save_search
is true and the search space is exhausted return a NullSolution
with status
set to :failure
.
Arguments
heuristic
: Search heuristic that estimates cost of a state to the goal.search_noise
: Amount of Boltzmann search noise (nothing
for deterministic search).g_mult
: Path cost multiplier when computing the $f$ value of a search node.h_mult
: Heuristic multiplier when computing the $f$ value of a search node.max_nodes
: Maximum number of search nodes before termination.max_time
: Maximum time in seconds before planner times out.fail_fast
: Flag to terminate search if the heuristic estimates an infinite cost.refine_method
: Solution refinement method (one of:continue
,:reroot
,:restart
)reset_node_count
: Whether to reset the expanded node count before solution refinement.save_search
: Flag to save the search tree and frontier (needed for refinement).save_search_order
: Flag to save the node expansion order in the solution.save_parents
: Flag to save all parent pointers in search tree (needed for rerooting).save_children
: Flag to save all children pointers in search tree (needed for rerooting).verbose
: Flag to print debug information during search.callback
: Callback function for logging, etc.
Refinement Methods
Setting the refine_method
keyword argument controls the behavior of refine!
when called on a PathSearchSolution
:
:continue
(default): Continues the search by expanding the search tree rooted at the original starting state.save_search
will default totrue
if this method is used.:reroot
: Reroots the search tree at the newly-provided starting state, then continues the search, as in Fringe-Retrieving A* [1].save_search
,save_parents
, andsave_children
will default totrue
if this method is used.:restart
: Restarts the search from the new starting state, throwing away the previous search tree and frontier. This is the only valid refinement method whensave_search
isfalse
.
[1] X. Sun, W. Yeoh, and S. Koenig, “Generalized Fringe-Retrieving A*: Faster moving target search on state lattices,” AAMAS (2010), pp. 1081-1088. https://dl.acm.org/doi/abs/10.5555/1838206.1838352
Several convenience constructors for variants of ForwardPlanner
are provided:
SymbolicPlanners.UniformCostPlanner
— FunctionUniformCostPlanner(; kwargs...)
Uniform-cost search. Nodes with the lowest path cost from the initial state are expanded first (i.e. the search heuristic is not used).
SymbolicPlanners.GreedyPlanner
— FunctionGreedyPlanner(heuristic; kwargs...)
Greedy best-first search, with cycle checking. Nodes with the lowest heuristic value are expanded first (i.e. the cost of reaching them from the initial state is ignored).
SymbolicPlanners.AStarPlanner
— FunctionAStarPlanner(heuristic; kwargs...)
A* search. Nodes with the lowest $f$ value are expanded first. This is guaranteed to produce a cost-optimal solution if the heuristic
is admissible.
SymbolicPlanners.WeightedAStarPlanner
— FunctionWeightedAStarPlanner(heuristic, h_mult; kwargs...)
Weighted A* search, which multiplies the heuristic estimate by h_mult
when computing the $f$ value of a node. Nodes with the lowest $f$ value are expanded first.
Probabilistic variants of forward search are also provided:
SymbolicPlanners.ProbForwardPlanner
— TypeProbForwardPlanner(;
search_noise::Float64 = 1.0,
kwargs...
)
A probabilistic variant of forward best-first search. Instead of always expanding the node with lowest $f$ value in the search frontier, this samples a node to expand according to Boltzmann distribution, where the $f$ value of a frontier node is treated as the unnormalized log probability of expansion.
The temperature for Boltzmann sampling is defined by search_noise
. Higher values lead to more random search, lower values lead to more deterministic search.
Useful for simulating a diversity of potentially sub-optimal plans, especially when paired with a limited max_nodes
budget.
An alias for ForwardPlanner{Float64}
. See ForwardPlanner
for other arguments.
SymbolicPlanners.ProbAStarPlanner
— FunctionProbAStarPlanner(heuristic; search_noise, kwargs...)
A probabilistic variant of A* search. See ProbForwardPlanner
for how nodes are probabilistically expanded.
Backward Heuristic Search
Due to the flexibility of the PDDL.jl API, SymbolicPlanners.jl supports backward search:
SymbolicPlanners.BackwardPlanner
— TypeBackwardPlanner(;
heuristic::Heuristic = GoalCountHeuristic(:backward),
search_noise::Union{Nothing,Float64} = nothing,
g_mult::Float32 = 1.0f0,
h_mult::Float32 = 1.0f0,
max_nodes::Int = typemax(Int),
max_time::Float64 = Inf,
fail_fast::Bool = false,
save_search::Bool = false,
save_search_order::Bool = save_search,
verbose::Bool = false,
callback = verbose ? LoggerCallback() : nothing
)
Heuristic-guided backward (i.e. regression) search planner. Instead of searching forwards, searches backwards from the goal, which is treated as a set of states which satisfy the goal predicates (equivalently, a partial state, because only some predicates and fluents may be specified). Each expanded node also corresponds to a partial state. [1]
As with ForwardPlanner
, each node $n$ is expanded in order of increasing priority $f(n)$, defined as:
\[f(n) = g_\text{mult} \cdot g(n) + h_\text{mult} \cdot h(n)\]
However $g(n)$ is instead defined as the path cost from the goal to the current node $n$, and $h(n)$ is a heuristic estimate of the distance from the initial state. As such, only certain heuristics, such as GoalCountHeuristic
and HSPRHeuristic
can be used with backward search.
Returns a PathSearchSolution
or NullSolution
, similar to ForwardPlanner
.
This planner does not currently support domains with non-Boolean fluents or problems involving constraint specifications.
[1] B. Bonet and H. Geffner, "Planning as Heuristic Search," Artificial Intelligence, vol. 129, no. 1, pp. 5–33, Jun. 2001, https://doi.org/10.1016/S0004-3702(01)00108-4.
Arguments
heuristic
: Search heuristic that estimates cost of a state to the goal.search_noise
: Amount of Boltzmann search noise (nothing
for deterministic search).g_mult
: Path cost multiplier when computing the $f$ value of a search node.h_mult
: Heuristic multiplier when computing the $f$ value of a search node.max_nodes
: Maximum number of search nodes before termination.max_time
: Maximum time in seconds before planner times out.fail_fast
: Flag to terminate search if the heuristic estimates an infinite cost.save_search
: Flag to save the search tree and frontier in the returned solution.save_search_order
: Flag to save the node expansion order in the returned solution.verbose
: Flag to print debug information during search.callback
: Callback function for logging, etc.
The following convenience constructors are provided:
SymbolicPlanners.BackwardGreedyPlanner
— FunctionBackwardGreedyPlanner(heuristic; kwargs...)
Backward greedy search, with cycle checking.
SymbolicPlanners.BackwardAStarPlanner
— FunctionBackwardAStarPlanner(heuristic; kwargs...)
Backward A* search.
As with ForwardPlanner
, probabilistic variants of backward search are provided:
SymbolicPlanners.ProbBackwardPlanner
— TypeProbBackwardPlanner(;
search_noise::Float64 = 1.0,
kwargs...
)
A probabilistic variant of backward search, with the same node expansion rule as ProbForwardPlanner
.
An alias for BackwardPlanner{Float64}
. See BackwardPlanner
for other arguments.
SymbolicPlanners.ProbBackwardAStarPlanner
— FunctionProbBackwardAStarPlanner(heuristic; search_noise, kwargs...)
A probabilistic variant of backward A* search.
Bidirectional Search
A simple implementation of bidirectional search is provided by BidirectionalPlanner
.
SymbolicPlanners.BidirectionalPlanner
— Typeplanner = BidirectionalPlanner(;
forward::ForwardPlanner = ForwardPlanner(),
backward::BackwardPlanner = BackwardPlanner(),
max_nodes::Int = typemax(Int),
max_time::Float64 = Inf,
save_search::Bool = false
)
A bi-directional planner which simulataneously runs a forward search from the initial state and backward search from the goal, succeeding if either search is successful, or if the search frontiers are detected to cross.
Frontier crossing is detected by checking whether the most recently expanded forward node is subsumed by a node in the backward search frontier, or vice versa. Subsumption means that the partial state represented by a backward node is consistent with the complete state represented by forward node.
While the above procedure is not complete (i.e. some crossings will be missed), it represents a trade-off between the cost of testing for subsumption and the benefit of detecting a crossing, in lieu of more sophisticated methods [1].
[1] V. Alcázar, S. Fernández, and D. Borrajo, "Analyzing the Impact of Partial States on Duplicate Detection and Collision of Frontiers," ICAPS (2014), https://doi.org/10.1609/icaps.v24i1.13677
Arguments
forward
: Forward search configuration.backward
: Forward search configuration.max_nodes
: Maximum number of search nodes before termination.max_time
: Maximum time in seconds before planner times out.save_search
: Flag to save the search tree and frontier in the returned solution.
The following convenience constructors are provided:
SymbolicPlanners.BiGreedyPlanner
— FunctionBiGreedyPlanner(f_heuristic, b_heuristic; kwargs...)
Bidirectional greedy best-first search, where f_heuristic
is the forward search heuristic and b_heuristic
is the backward search heuristic. Options specified as
kwargs` are shared by both the backward and forward search.
SymbolicPlanners.BiAStarPlanner
— FunctionBiAStarPlanner(f_heuristic, b_heuristic; kwargs...)
Bidirectional A* search, where f_heuristic
is the forward search heuristic and b_heuristic
is the backward search heuristic. Options specified as
kwargs` are shared by both the backward and forward search.
Policy-Based Planners
Unlike most systems for automated symbolic planning, SymbolicPlanners.jl includes several policy-based planners which return PolicySolution
s.
These planners are especially useful for stochastic environments (even if the planner operates over a determinized version of the true domain), since they compute what action should be performed in each encountered state, rather than just a sequence of ordered actions. They are also useful for real-time planning via reuse of previous solutions.
SymbolicPlanners.RealTimeDynamicPlanner
— TypeRealTimeDynamicPlanner(;
heuristic::Heuristic = GoalCountHeuristic(),
n_rollouts::Int = 50,
max_depth::Int = 50,
rollout_noise::Float64 = 0.0,
action_noise::Float64 = 0.0,
verbose::Bool = false,
callback = verbose ? LoggerCallback() : nothing
)
Planner that uses Real Time Dynamic Programming (RTDP
for short), a form of asynchronous value iteration which performs greedy rollouts from the initial state, updating the value estimates of states encountered along the way [1].
If a heuristic
is provided, the negated heuristic value will be used as an initial value estimate for newly encountered states (since the value of a state in a shortest path problem is the cost to reach the goal), thereby guiding early rollouts.
For admissible (i.e. optimistic) heuristics, convergence to the true value function is guaranteed in the reachable state space after a sufficient number of rollouts.
Returns a TabularPolicy
(wrapped in a BoltzmannPolicy
if action_noise > 0
), which stores the value estimates and action Q-values for each encountered state.
[1] A. G. Barto, S. J. Bradtke, and S. P. Singh, "Learning to Act using Real-Time Dynamic Programming," Artificial Intelligence, vol. 72, no. 1, pp. 81–138, Jan. 1995, https://doi.org/10.1016/0004-3702(94)00011-O.
Arguments
heuristic
: Search heuristic used to initialize the value function.n_rollouts
: Number of rollouts to perform from the initial state.max_depth
: Maximum depth of each rollout.rollout_noise
: Amount of Boltzmann noise during simulated rollouts.action_noise
: Amount of Boltzmann action noise for the returned policy.verbose
: Flag to print debug information during search.callback
: Callback function for logging, etc.
SymbolicPlanners.RealTimeHeuristicSearch
— TypeRealTimeHeuristicSearch(;
heuristic::Heuristic = GoalCountHeuristic(),
n_iters::Int = 1,
max_nodes::Int = 50,
update_method::Symbol = :dijkstra,
search_neighbors::Symbol = :unexpanded,
save_search::Bool = true,
reuse_search::Bool = false,
reuse_paths::Bool = true,
kwargs...
)
RealTimeHeuristicSearch(
planner::ForwardPlanner,
n_iters::Int = 1,
update_method::Symbol = :dijkstra,
search_neighbors::Symbol = :unexpanded,
reuse_search::Bool = true,
reuse_paths::Bool = true,
)
A real time heuristic search (RTHS
) algorithm [1] similar to RTDP
. Instead of greedy rollouts, forward heuristic search is performed from the current state (up to max_nodes
), and value estimates are updated for all states in the interior of the search tree. Search may also be performed from neighboring states by configuring search_neighbors
. A simulated action is then taken to the best neighboring state. This process repeats for n_iters
, with future searches using the updated value function as a more informed heuristic.
Returns a ReusableTreePolicy
, containing a TabularVPolicy
of state value estimates, the most recent PathSearchSolution
produced by forward search (including a search tree if save_search
is true
), and a reusable tree of goal paths if reuse_paths
is true
.
Arguments
planner
: TheForwardPlanner
used for lookahead heuristic search. Any keyword arguments accepted byForwardPlanner
are also keyword arguments forRTHS
(e.g.heuristic
,max_nodes
,save_search
, etc.).n_iters
: Number of iterations to perform. In each iteration, search is followed by a simulated action to the best neighboring state. If a goal or dead end is reached, we restart from the initial state.update_method
: Method used to update value estimates, either via cost differencing from the terminal node (:costdiff
), or via Dijkstra's algorithm (:dijkstra
).search_neighbors
: Controls whether search is additionally performed from all neighbors of the current state (:all
), from neighbors unexpanded by the initial search (:unexpanded
), or none of them (:none
).reuse_search
: Iftrue
, then previous search solutions are reused by subsequent searches in future iterations or calls torefine!
. The latter requiressave_search
to betrue
. Search solutions are reused via the:reroot
refinement method forForwardPlanner
.reuse_paths
: Iftrue
, then every time a path to the goal is found, it is stored in a reusable tree of goal paths, as in Tree Adaptive A* [2]. Future searches will terminate once a previous path is encountered, reducing the cost of search. A consistentheuristic
is required to ensure path optimality.
Update Methods
Setting the update_method
keyword argument controls how value estimates $V(s)$ (or equivalently, cost-to-goal estimates $h(s) = -V(s)$) are updated:
:costdiff
: Value estimates are updated by cost differencing from the terminal node $t$. For each node $s$ in the search tree's interior, we estimate the cost-to-goal $h(s)$ by adding a lower bound on the cost from $s$ to $t$ to the cost-to-goal of $t$:\[h(s) = h(t) + (c(r, t) - c(r, s))\]
where $c(r, s)$ is the cost from the root node $r$ to $s$. This is the update used by Real-Time Adaptive A* [3]. Takes $O(N)$ time, where $N$ is the size of the tree's interior.
:dijkstra
: Value estimates are backpropagated from the search frontier via Dijkstra's algorithm. For each node $s$ in tree's interior, we update the cost-to-goal $h(s)$ by minimizing over all paths to the frontier:\[h(s) = \min_{t \in F} h(t) + c(s, t)\]
This is the update rule by LSS-LRTA* [4], which produces more informed value estimates than
:costdiff
, but takes $O(NB \log NB)$ time. $N$ is the size of the tree's interior and $B$ is the maximal branching factor. Thesave_parents
keyword forForwardPlanner
defaults totrue
when this method is used.
Both of these update methods require a heuristic
that is initially consistent for the updated value estimates to remain consistent.
[1] R. E. Korf, "Real-Time Heuristic Search," Artificial Intelligence, vol. 42, no. 2, pp. 189–211, Mar. 1990, https://doi.org/10.1016/0004-3702(90)90054-4.
[2] C. Hernández, X. Sun, S. Koenig, and P. Meseguer, "Tree Adaptive A*," AAMAS (2011), pp. 123–130. https://dl.acm.org/doi/abs/10.5555/2030470.2030488.
[3] S. Koenig and M. Likhachev, "Real-Time Adaptive A*," AAMAS (2006), pp. 281–288. https://doi.org/10.1145/1160633.1160682.
[4] S. Koenig and X. Sun, "Comparing real-time and incremental heuristic search for real-time situated agents", AAMAS (2009), pp. 313–341. https://dl.acm.org/doi/10.5555/1018410.1018838.
SymbolicPlanners.AlternatingRealTimeHeuristicSearch
— TypeAlternatingRealTimeHeuristicSearch(
planners::RealTimeHeuristicSearch...;
share_values::Bool = true,
share_search::Bool = false,
share_paths::Bool = true,
)
A variant of RealTimeHeuristicSearch
(AlternatingRTHS
or ARTHS
for short) that alternates between different search strategies, corresponding to one or more planners
. For example, ARTHS
can alternate between a RTHS
planner that uses A* search (h_mult = 1.0
) and an RTHS
planner that uses breadth-first search (h_mult = 0.0
), ensuring that the updated region of the state space is both deep and broad.
Returns a MultiSolution
composed of ReusableTreePolicy
sub-solutions. By default, these sub-solutions share and update the same value estimates (share_values = true
) and the same reusable tree of optimal paths to the goal (share_paths = true
), but do not share the same reusable forward search tree (share_search = false
).
SymbolicPlanners.MonteCarloTreeSearch
— TypeMonteCarloTreeSearch(
n_rollouts::Int64 = 50,
max_depth::Int64 = 50,
heuristic::Heuristic = NullHeuristic(),
selector::MCTSNodeSelector = BoltzmannUCBSelector(),
estimator::MCTSLeafEstimator = RandomRolloutEstimator()
end
Planner that uses Monte Carlo Tree Search (MCTS
for short) [1], with a customizable initial value heuristic
, node selector
strategy, and leaf node value estimator
.
Arguments
n_rollouts
: Number of search rollouts to perform.max_depth
: Maximum depth of rollout (including the selection and estimation phases).heuristic
: Initial value heuristic for newly encountered states / leaf nodes.selector
: Node selection strategy for previously visited nodes (e.g. MaxUCB).estimator
: Estimator for leaf node values (e.g. random or policy-based rollouts).
External Planners
Wrappers for several external planners are provided:
SymbolicPlanners.FastDownward
— TypeFastDownward(
search::String = "astar",
heuristic::String = "add",
h_params::Dict{String, String} = Dict(),
max_time::Float64 = 300,
verbose::Bool = false,
log_stats::Bool = true,
fd_path::String = get(ENV, "FD_PATH", ""),
py_cmd::String = get(ENV, "PYTHON", "python")
)
Wrapper for the FastDownward planning system [1]. The planner has to be installed locally for this wrapper to be used. Consult the FastDownward documentation for further explanation of options.
[1] M. Helmert, "The Fast Downward Planning System," Journal of Artificial Intelligence Research, vol. 26, pp. 191–246, Jul. 2006, https://doi.org/10.1613/jair.1705.
Arguments
search
: String specifying search algorithm (e.g. "astar", "ehc").heuristic
: String specifying search heuristic (e.g. "add", "lmcut",).h_params
: Heuristic parameters as a dictionary mapping names to values.max_time
: Maximum time in seconds before planner times out.verbose
: Flag to print planner outputs.log_stats
: Flag to log solution statistics.fd_path
: Path tofast_downward.py
.py_cmd
: Path to Python executable.
SymbolicPlanners.Pyperplan
— TypePyperplan(
search::String = "astar",
heuristic::String = "add",
log_level::String = "info",
log_stats::Bool = true,
max_time::Float64 = 300,
verbose::Bool = false,
py_cmd::String = get(ENV, "PYTHON", "python")
)
Wrapper for the Pyperplan lightweight STRIPS planner [1]. The planner has to be installed locally for this wrapper to be used. Consult the Pyperplan documentation for further explanation of options.
[1] Y. Alkhazraji et al., "Pyperplan." Zenodo, 2020. https://doi.org/10.5281/zenodo.3700819.
Arguments
search
: String specifying search algorithm (e.g. "astar", "gbf").heuristic
: String specifying search heuristic (e.g. "hadd", "hmax",).log_level
: How much information to log when running the planner.log_stats
: Flag to log solution statistics.max_time
: Maximum time in seconds before planner times out.verbose
: Flag to print planner outputs.py_cmd
: Path to Python executable.
SymbolicPlanners.ENHSP
— TypeENHSP(
search::String = "astar",
heuristic::String = "add",
h_mult::Float64 = 1.0,
log_stats::Bool = true,
max_time::Float64 = 300,
verbose::Bool = false,
enhsp_path::String = get(ENV, "ENHSP_PATH", ""),
java_cmd::String = get(ENV, "JAVA", "java")
)
Wrapper for the Expressive Numeric Heuristic Search Planner (ENHSP) [1]. The planner has to be installed locally for this wrapper to be used. Consult the ENHSP documentation for further explanation of options.
[1] E. Scala et al., "ENHSP", https://sites.google.com/view/enhsp/.
Arguments
search
: String specifying search algorithm (e.g. "gbfs", "WAStar").heuristic
: String specifying search heuristic (e.g. "hadd", "aibr",).h_mult
: Heuristic multiplier for weighted A*.log_stats
: Flag to log solution statistics.max_time
: Maximum time in seconds before planner times out.verbose
: Flag to print planner outputs.enhsp_path
: Path toenhsp.jar
.java_cmd
: Path to Java executable.