Planners

The core of SymbolicPlanners.jl is a library of planning algorithms, or Planners:

SymbolicPlanners.PlannerType
abstract type Planner

Abstract planner type. Once constructed, a planner can be run on a domain, initial state, and specification (spec), returning a Solution. A problem can be provided instead of a state and spec:

planner(domain::Domain, state::State, spec)
planner(domain::Domain, problem::Problem)

New planners should define solve and (optionally) refine!.

source
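
For example, a planner can be constructed and run as follows. This is an illustrative sketch: the file paths are placeholders, and HAdd (an additive heuristic shipped with SymbolicPlanners.jl) is just one possible heuristic choice.

using PDDL, SymbolicPlanners

domain = load_domain("domain.pddl")      # hypothetical file paths
problem = load_problem("problem.pddl")

planner = AStarPlanner(HAdd())

# Run on a domain, initial state, and specification...
state = initstate(domain, problem)
spec = MinStepsGoal(PDDL.get_goal(problem))
sol = planner(domain, state, spec)

# ...or directly on a domain and problem.
sol = planner(domain, problem)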

Planners define the following interface:

SymbolicPlanners.solveFunction
solve(planner::Planner, domain::Domain, state::State, spec::Specification)

Solve a planning problem using the specified planner.

New subtypes of Planner should implement this method.

source
SymbolicPlanners.refine!Function
refine!(sol::Solution, planner::Planner, domain::Domain, state::State, spec)

Refine an existing solution (sol) to a planning problem (in-place). The spec argument can be provided as a Specification or as one or more goal Terms to be satisfied.

source
SymbolicPlanners.refineFunction
refine(sol::Solution, planner::Planner, domain::Domain, state::State, spec)

Refine a copy of an existing solution to a planning problem.

source
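
Calling a planner as above is equivalent to calling solve directly, and refine! / refine can then be used to improve the returned solution. A minimal sketch, reusing the planner, domain, state, and spec variables from the previous example:

sol = solve(planner, domain, state, spec)            # same as planner(domain, state, spec)
refine!(sol, planner, domain, state, spec)           # refine the solution in place
new_sol = refine(sol, planner, domain, state, spec)  # refine a copy, leaving sol untouched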

As a baseline uninformed planning algorithm, one can use BreadthFirstPlanner:

SymbolicPlanners.BreadthFirstPlannerType
BreadthFirstPlanner(;
    max_nodes::Int = typemax(Int),
    max_time::Float64 = Inf,
    save_search::Bool = false,
    save_search_order::Bool = save_search,
    verbose::Bool = false,
    callback = verbose ? LoggerCallback() : nothing
)

Breadth-first search planner. Nodes are expanded in order of increasing distance from the initial state (skipping previously visited nodes).

Returns a PathSearchSolution or NullSolution, similar to ForwardPlanner.

Arguments

  • max_nodes: Maximum number of search nodes before termination.

  • max_time: Maximum time in seconds before planner times out.

  • save_search: Flag to save the search tree and frontier in the returned solution.

  • save_search_order: Flag to save the node expansion order in the returned solution.

  • verbose: Flag to print debug information during search.

  • callback: Callback function for logging, etc.

source

Note that BreadthFirstPlanner ignores plan metrics and action costs, returning a plan with the fewest steps, but not necessarily the lowest cost.
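
A usage sketch (reusing the domain and problem variables from the earlier example), setting a node budget and inspecting the returned status:

planner = BreadthFirstPlanner(max_nodes=10_000, save_search=true)
sol = planner(domain, problem)
if sol.status == :success
    # collect(sol) gathers the plan's actions in order
    println("Found a plan with ", length(collect(sol)), " steps.")
end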

The modern-day workhorse for automated planning is forward heuristic search, implemented by ForwardPlanner.

SymbolicPlanners.ForwardPlannerType
ForwardPlanner(;
    heuristic::Heuristic = GoalCountHeuristic(),
    search_noise::Union{Nothing,Float64} = nothing,
    g_mult::Float32 = 1.0f0,
    h_mult::Float32 = 1.0f0,
    max_nodes::Int = typemax(Int),
    max_time::Float64 = Inf,
    fail_fast::Bool = false,
    refine_method::Symbol = :continue,
    reset_node_count::Bool = true,
    save_search::Bool = true,
    save_search_order::Bool = true,
    save_parents::Bool = false,
    save_children::Bool = false,
    verbose::Bool = false,
    callback = verbose ? LoggerCallback() : nothing
)

Forward best-first search planner, which encompasses uniform-cost search, greedy search, and A* search. Each node $n$ is expanded in order of increasing priority $f(n)$, defined as:

\[f(n) = g_\text{mult} \cdot g(n) + h_\text{mult} \cdot h(n)\]

where $g(n)$ is the path cost from the initial state to $n$, and $h(n)$ is the heuristic's goal distance estimate.

Returns a PathSearchSolution if the goal is achieved, containing a plan that reaches the goal node, and status set to :success. If the node or time budget runs out, the solution will instead contain a partial plan to the last node selected for expansion, with status set to :max_nodes or :max_time accordingly.

If save_search is true, the returned solution will contain the search tree and frontier so far. If save_search is false and the search space is exhausted, a NullSolution with status set to :failure is returned.

Arguments

  • heuristic: Search heuristic that estimates cost of a state to the goal.

  • search_noise: Amount of Boltzmann search noise (nothing for deterministic search).

  • g_mult: Path cost multiplier when computing the $f$ value of a search node.

  • h_mult: Heuristic multiplier when computing the $f$ value of a search node.

  • max_nodes: Maximum number of search nodes before termination.

  • max_time: Maximum time in seconds before planner times out.

  • fail_fast: Flag to terminate search if the heuristic estimates an infinite cost.

  • refine_method: Solution refinement method (one of :continue, :reroot, :restart).

  • reset_node_count: Whether to reset the expanded node count before solution refinement.

  • save_search: Flag to save the search tree and frontier (needed for refinement).

  • save_search_order: Flag to save the node expansion order in the solution.

  • save_parents: Flag to save all parent pointers in search tree (needed for rerooting).

  • save_children: Flag to save all children pointers in search tree (needed for rerooting).

  • verbose: Flag to print debug information during search.

  • callback: Callback function for logging, etc.

Refinement Methods

Setting the refine_method keyword argument controls the behavior of refine! when called on a PathSearchSolution:

  • :continue (default): Continues the search by expanding the search tree rooted at the original starting state. save_search will default to true if this method is used.

  • :reroot: Reroots the search tree at the newly-provided starting state, then continues the search, as in Fringe-Retrieving A* [1]. save_search, save_parents, and save_children will default to true if this method is used.

  • :restart: Restarts the search from the new starting state, throwing away the previous search tree and frontier. This is the only valid refinement method when save_search is false.

[1] X. Sun, W. Yeoh, and S. Koenig, “Generalized Fringe-Retrieving A*: Faster moving target search on state lattices,” AAMAS (2010), pp. 1081-1088. https://dl.acm.org/doi/abs/10.5555/1838206.1838352

source
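
For instance, a planner constructed with refine_method = :reroot can reuse its search tree after the agent has moved to a new state. A hedged sketch, reusing the domain, state, and spec variables from the earlier examples (transition is provided by PDDL.jl):

planner = AStarPlanner(HAdd(), refine_method=:reroot)  # save_search, save_parents, save_children default to true
sol = planner(domain, state, spec)

# Suppose the agent executes the first action of the plan, reaching a new state.
new_state = transition(domain, state, collect(sol)[1])

# Reroot the saved search tree at the new state and continue the search from there.
refine!(sol, planner, domain, new_state, spec)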

Several convenience constructors for variants of ForwardPlanner are provided:

SymbolicPlanners.UniformCostPlannerFunction
UniformCostPlanner(; kwargs...)

Uniform-cost search. Nodes with the lowest path cost from the initial state are expanded first (i.e. the search heuristic is not used).

source
SymbolicPlanners.GreedyPlannerFunction
GreedyPlanner(heuristic; kwargs...)

Greedy best-first search, with cycle checking. Nodes with the lowest heuristic value are expanded first (i.e. the cost of reaching them from the initial state is ignored).

source
SymbolicPlanners.AStarPlannerFunction
AStarPlanner(heuristic; kwargs...)

A* search. Nodes with the lowest $f$ value are expanded first. This is guaranteed to produce a cost-optimal solution if the heuristic is admissible.

source
SymbolicPlanners.WeightedAStarPlannerFunction
WeightedAStarPlanner(heuristic, h_mult; kwargs...)

Weighted A* search, which multiplies the heuristic estimate by h_mult when computing the $f$ value of a node. Nodes with the lowest $f$ value are expanded first.

source
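
Each of these configures a ForwardPlanner under the hood and accepts the same keyword arguments. An illustrative sketch (the heuristic choices and budgets are placeholders):

ucs = UniformCostPlanner(max_nodes=50_000)
greedy = GreedyPlanner(FFHeuristic())
astar = AStarPlanner(HAdd(), save_search=true)
wastar = WeightedAStarPlanner(HAdd(), 2.0, max_time=60.0)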

Probabilistic variants of forward search are also provided:

SymbolicPlanners.ProbForwardPlannerType
ProbForwardPlanner(;
    search_noise::Float64 = 1.0,
    kwargs...
)

A probabilistic variant of forward best-first search. Instead of always expanding the node with the lowest $f$ value in the search frontier, this planner samples a node to expand according to a Boltzmann distribution, where the negated $f$ value of a frontier node serves as its unnormalized log probability of expansion.

The temperature for Boltzmann sampling is defined by search_noise. Higher values lead to more random search; lower values lead to more deterministic search.

Useful for simulating a diversity of potentially sub-optimal plans, especially when paired with a limited max_nodes budget.

An alias for ForwardPlanner{Float64}. See ForwardPlanner for other arguments.

source
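
A sketch of sampling a diversity of (potentially sub-optimal) plans by pairing search noise with a limited node budget, reusing the domain, state, and spec variables from the earlier examples (the noise level and budget are arbitrary):

planner = ProbForwardPlanner(search_noise=2.0, heuristic=HAdd(), max_nodes=1000)
plans = []
for i in 1:10
    s = planner(domain, state, spec)
    s.status == :success && push!(plans, collect(s))  # keep only complete plans
end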

Due to the flexibility of the PDDL.jl API, SymbolicPlanners.jl supports backward search:

SymbolicPlanners.BackwardPlannerType
BackwardPlanner(;
    heuristic::Heuristic = GoalCountHeuristic(:backward),
    search_noise::Union{Nothing,Float64} = nothing,
    g_mult::Float32 = 1.0f0,
    h_mult::Float32 = 1.0f0,
    max_nodes::Int = typemax(Int),
    max_time::Float64 = Inf,
    fail_fast::Bool = false,
    save_search::Bool = false,
    save_search_order::Bool = save_search,
    verbose::Bool = false,
    callback = verbose ? LoggerCallback() : nothing
)

Heuristic-guided backward (i.e. regression) search planner. Instead of searching forwards, searches backwards from the goal, which is treated as a set of states which satisfy the goal predicates (equivalently, a partial state, because only some predicates and fluents may be specified). Each expanded node also corresponds to a partial state. [1]

As with ForwardPlanner, each node $n$ is expanded in order of increasing priority $f(n)$, defined as:

\[f(n) = g_\text{mult} \cdot g(n) + h_\text{mult} \cdot h(n)\]

However, $g(n)$ is instead defined as the path cost from the goal to the current node $n$, and $h(n)$ is a heuristic estimate of the distance from the initial state. As such, only certain heuristics, such as GoalCountHeuristic and HSPRHeuristic, can be used with backward search.

Returns a PathSearchSolution or NullSolution, similar to ForwardPlanner.

This planner does not currently support domains with non-Boolean fluents or problems involving constraint specifications.

[1] B. Bonet and H. Geffner, "Planning as Heuristic Search," Artificial Intelligence, vol. 129, no. 1, pp. 5–33, Jun. 2001, https://doi.org/10.1016/S0004-3702(01)00108-4.

Arguments

  • heuristic: Search heuristic that estimates cost of a state to the goal.

  • search_noise: Amount of Boltzmann search noise (nothing for deterministic search).

  • g_mult: Path cost multiplier when computing the $f$ value of a search node.

  • h_mult: Heuristic multiplier when computing the $f$ value of a search node.

  • max_nodes: Maximum number of search nodes before termination.

  • max_time: Maximum time in seconds before planner times out.

  • fail_fast: Flag to terminate search if the heuristic estimates an infinite cost.

  • save_search: Flag to save the search tree and frontier in the returned solution.

  • save_search_order: Flag to save the node expansion order in the returned solution.

  • verbose: Flag to print debug information during search.

  • callback: Callback function for logging, etc.

source
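
A brief sketch of regression search with a backward-compatible heuristic, reusing the domain and problem variables from the earlier examples:

planner = BackwardPlanner(heuristic=GoalCountHeuristic(:backward), max_nodes=10_000)
sol = planner(domain, problem)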

Convenience constructors analogous to those of ForwardPlanner are provided for backward search, including greedy and A* variants, as well as probabilistic variants of backward search.

A simple implementation of bidirectional search is provided by BidirectionalPlanner.

SymbolicPlanners.BidirectionalPlannerType
planner = BidirectionalPlanner(;
    forward::ForwardPlanner = ForwardPlanner(),
    backward::BackwardPlanner = BackwardPlanner(),
    max_nodes::Int = typemax(Int),
    max_time::Float64 = Inf,
    save_search::Bool = false
)

A bidirectional planner which simultaneously runs a forward search from the initial state and a backward search from the goal, succeeding if either search is successful, or if the search frontiers are detected to cross.

Frontier crossing is detected by checking whether the most recently expanded forward node is subsumed by a node in the backward search frontier, or vice versa. Subsumption means that the partial state represented by a backward node is consistent with the complete state represented by the forward node.

While the above procedure is not complete (i.e. some crossings will be missed), it represents a trade-off between the cost of testing for subsumption and the benefit of detecting a crossing, in lieu of more sophisticated methods [1].

[1] V. Alcázar, S. Fernández, and D. Borrajo, "Analyzing the Impact of Partial States on Duplicate Detection and Collision of Frontiers," ICAPS (2014), https://doi.org/10.1609/icaps.v24i1.13677

Arguments

  • forward: Forward search configuration.

  • backward: Backward search configuration.

  • max_nodes: Maximum number of search nodes before termination.

  • max_time: Maximum time in seconds before planner times out.

  • save_search: Flag to save the search tree and frontier in the returned solution.

source

The following convenience constructors are provided:

SymbolicPlanners.BiGreedyPlannerFunction
BiGreedyPlanner(f_heuristic, b_heuristic; kwargs...)

Bidirectional greedy best-first search, where f_heuristic is the forward search heuristic and b_heuristic is the backward search heuristic. Options specified as kwargs are shared by both the backward and forward search.

source
SymbolicPlanners.BiAStarPlannerFunction
BiAStarPlanner(f_heuristic, b_heuristic; kwargs...)

Bidirectional A* search, where f_heuristic is the forward search heuristic and b_heuristic is the backward search heuristic. Options specified as kwargs are shared by both the backward and forward search.

source
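
A sketch of bidirectional A*, pairing a forward heuristic with a backward-compatible one (the specific heuristic choices are placeholders):

planner = BiAStarPlanner(FFHeuristic(), GoalCountHeuristic(:backward))
sol = planner(domain, problem)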

Policy-Based Planners

Unlike most systems for automated symbolic planning, SymbolicPlanners.jl includes several policy-based planners which return PolicySolutions.

These planners are especially useful for stochastic environments (even if the planner operates over a determinized version of the true domain), since they compute what action should be performed in each encountered state, rather than just a sequence of ordered actions. They are also useful for real-time planning via reuse of previous solutions.

SymbolicPlanners.RealTimeDynamicPlannerType
RealTimeDynamicPlanner(;
    heuristic::Heuristic = GoalCountHeuristic(),
    n_rollouts::Int = 50,
    max_depth::Int = 50,
    rollout_noise::Float64 = 0.0,
    action_noise::Float64 = 0.0,
    verbose::Bool = false,
    callback = verbose ? LoggerCallback() : nothing
)

Planner that uses Real Time Dynamic Programming (RTDP for short), a form of asynchronous value iteration which performs greedy rollouts from the initial state, updating the value estimates of states encountered along the way [1].

If a heuristic is provided, the negated heuristic value will be used as an initial value estimate for newly encountered states (since the value of a state in a shortest path problem is the negated cost of reaching the goal), thereby guiding early rollouts.

For admissible (i.e. optimistic) heuristics, convergence to the true value function is guaranteed in the reachable state space after a sufficient number of rollouts.

Returns a TabularPolicy (wrapped in a BoltzmannPolicy if action_noise > 0), which stores the value estimates and action Q-values for each encountered state.

[1] A. G. Barto, S. J. Bradtke, and S. P. Singh, "Learning to Act using Real-Time Dynamic Programming," Artificial Intelligence, vol. 72, no. 1, pp. 81–138, Jan. 1995, https://doi.org/10.1016/0004-3702(94)00011-O.

Arguments

  • heuristic: Search heuristic used to initialize the value function.

  • n_rollouts: Number of rollouts to perform from the initial state.

  • max_depth: Maximum depth of each rollout.

  • rollout_noise: Amount of Boltzmann noise during simulated rollouts.

  • action_noise: Amount of Boltzmann action noise for the returned policy.

  • verbose: Flag to print debug information during search.

  • callback: Callback function for logging, etc.

source
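
A sketch of planning with RTDP and then executing the returned policy in an act-observe loop, reusing the domain, problem, state, and spec variables from the earlier examples (best_action is part of the policy solution interface; transition and satisfy are provided by PDDL.jl; the step limit is arbitrary):

planner = RealTimeDynamicPlanner(heuristic=HAdd(), n_rollouts=100, max_depth=50)
policy = planner(domain, state, spec)

# Repeatedly take the best action for the current state until the goal is reached.
let curr = state
    for _ in 1:100
        satisfy(domain, curr, PDDL.get_goal(problem)) && break
        act = best_action(policy, curr)
        curr = transition(domain, curr, act)
    end
end
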
SymbolicPlanners.RealTimeHeuristicSearchType
RealTimeHeuristicSearch(;
    heuristic::Heuristic = GoalCountHeuristic(),
    n_iters::Int = 1,
    max_nodes::Int = 50,
    update_method::Symbol = :dijkstra,
    search_neighbors::Symbol = :unexpanded,
    save_search::Bool = true,
    reuse_search::Bool = false,
    reuse_paths::Bool = true,
    kwargs...
)

RealTimeHeuristicSearch(
    planner::ForwardPlanner,
    n_iters::Int = 1,
    update_method::Symbol = :dijkstra,
    search_neighbors::Symbol = :unexpanded,
    reuse_search::Bool = true,
    reuse_paths::Bool = true,
)

A real time heuristic search (RTHS) algorithm [1] similar to RTDP. Instead of greedy rollouts, forward heuristic search is performed from the current state (up to max_nodes), and value estimates are updated for all states in the interior of the search tree. Search may also be performed from neighboring states by configuring search_neighbors. A simulated action is then taken to the best neighboring state. This process repeats for n_iters, with future searches using the updated value function as a more informed heuristic.

Returns a ReusableTreePolicy, containing a TabularVPolicy of state value estimates, the most recent PathSearchSolution produced by forward search (including a search tree if save_search is true), and a reusable tree of goal paths if reuse_paths is true.

Arguments

  • planner: The ForwardPlanner used for lookahead heuristic search. Any keyword arguments accepted by ForwardPlanner are also keyword arguments for RTHS (e.g. heuristic, max_nodes, save_search, etc.).

  • n_iters: Number of iterations to perform. In each iteration, search is followed by a simulated action to the best neighboring state. If a goal or dead end is reached, we restart from the initial state.

  • update_method: Method used to update value estimates, either via cost differencing from the terminal node (:costdiff), or via Dijkstra's algorithm (:dijkstra).

  • search_neighbors: Controls whether search is additionally performed from all neighbors of the current state (:all), from neighbors unexpanded by the initial search (:unexpanded), or none of them (:none).

  • reuse_search: If true, then previous search solutions are reused by subsequent searches in future iterations or calls to refine!. The latter requires save_search to be true. Search solutions are reused via the :reroot refinement method for ForwardPlanner.

  • reuse_paths: If true, then every time a path to the goal is found, it is stored in a reusable tree of goal paths, as in Tree Adaptive A* [2]. Future searches will terminate once a previous path is encountered, reducing the cost of search. A consistent heuristic is required to ensure path optimality.

Update Methods

Setting the update_method keyword argument controls how value estimates $V(s)$ (or equivalently, cost-to-goal estimates $h(s) = -V(s)$) are updated:

  • :costdiff: Value estimates are updated by cost differencing from the terminal node $t$. For each node $s$ in the search tree's interior, we estimate the cost-to-goal $h(s)$ by adding a lower bound on the cost from $s$ to $t$ to the cost-to-goal of $t$:

    \[h(s) = h(t) + (c(r, t) - c(r, s))\]

    where $c(r, s)$ is the cost from the root node $r$ to $s$. This is the update used by Real-Time Adaptive A* [3]. Takes $O(N)$ time, where $N$ is the size of the tree's interior.

  • :dijkstra: Value estimates are backpropagated from the search frontier via Dijkstra's algorithm. For each node $s$ in the tree's interior, we update the cost-to-goal $h(s)$ by minimizing over all paths to the frontier:

    \[h(s) = \min_{t \in F} h(t) + c(s, t)\]

    This is the update rule used by LSS-LRTA* [4], which produces more informed value estimates than :costdiff, but takes $O(NB \log NB)$ time, where $N$ is the size of the tree's interior and $B$ is the maximal branching factor. The save_parents keyword for ForwardPlanner defaults to true when this method is used.

Both of these update methods require a heuristic that is initially consistent for the updated value estimates to remain consistent.

[1] R. E. Korf, "Real-Time Heuristic Search," Artificial Intelligence, vol. 42, no. 2, pp. 189–211, Mar. 1990, https://doi.org/10.1016/0004-3702(90)90054-4.

[2] C. Hernández, X. Sun, S. Koenig, and P. Meseguer, "Tree Adaptive A*," AAMAS (2011), pp. 123–130. https://dl.acm.org/doi/abs/10.5555/2030470.2030488.

[3] S. Koenig and M. Likhachev, "Real-Time Adaptive A*," AAMAS (2006), pp. 281–288. https://doi.org/10.1145/1160633.1160682.

[4] S. Koenig and X. Sun, "Comparing real-time and incremental heuristic search for real-time situated agents", AAMAS (2009), pp. 313–341. https://dl.acm.org/doi/10.5555/1018410.1018838.

source
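
A brief construction sketch, using the keyword form shown above (the heuristic, iteration count, and node budget are placeholders):

planner = RealTimeHeuristicSearch(heuristic=HAdd(), n_iters=10, max_nodes=100)
policy = planner(domain, state, spec)
act = best_action(policy, state)
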
SymbolicPlanners.AlternatingRealTimeHeuristicSearchType
AlternatingRealTimeHeuristicSearch(
    planners::RealTimeHeuristicSearch...;
    share_values::Bool = true,
    share_search::Bool = false,
    share_paths::Bool = true,
)

A variant of RealTimeHeuristicSearch (AlternatingRTHS or ARTHS for short) that alternates between different search strategies, corresponding to one or more planners. For example, ARTHS can alternate between an RTHS planner that uses A* search (h_mult = 1.0) and an RTHS planner that uses breadth-first search (h_mult = 0.0), ensuring that the updated region of the state space is both deep and broad.

Returns a MultiSolution composed of ReusableTreePolicy sub-solutions. By default, these sub-solutions share and update the same value estimates (share_values = true) and the same reusable tree of optimal paths to the goal (share_paths = true), but do not share the same reusable forward search tree (share_search = false).

source
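
A sketch of the alternation described above, pairing an A*-like RTHS (h_mult = 1.0) with a breadth-first-like RTHS (h_mult = 0.0), reusing earlier variables:

rths_deep = RealTimeHeuristicSearch(heuristic=HAdd(), h_mult=1.0, max_nodes=50)
rths_broad = RealTimeHeuristicSearch(heuristic=HAdd(), h_mult=0.0, max_nodes=50)
planner = AlternatingRealTimeHeuristicSearch(rths_deep, rths_broad; share_values=true)
policy = planner(domain, state, spec)
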
SymbolicPlanners.MonteCarloTreeSearchType
MonteCarloTreeSearch(
    n_rollouts::Int64 = 50,
    max_depth::Int64 = 50,
    heuristic::Heuristic = NullHeuristic(),
    selector::MCTSNodeSelector = BoltzmannUCBSelector(),
    estimator::MCTSLeafEstimator = RandomRolloutEstimator()
)

Planner that uses Monte Carlo Tree Search (MCTS for short) [1], with a customizable initial value heuristic, node selector strategy, and leaf node value estimator.

Arguments

  • n_rollouts: Number of search rollouts to perform.

  • max_depth: Maximum depth of rollout (including the selection and estimation phases).

  • heuristic: Initial value heuristic for newly encountered states / leaf nodes.

  • selector: Node selection strategy for previously visited nodes (e.g. MaxUCB).

  • estimator: Estimator for leaf node values (e.g. random or policy-based rollouts).

source
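
A brief construction sketch, assuming keyword-style construction with the defaults shown above (the argument values are placeholders):

planner = MonteCarloTreeSearch(n_rollouts=100, max_depth=50, heuristic=GoalCountHeuristic())
policy = planner(domain, state, spec)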

External Planners

Wrappers for several external planners are provided:

SymbolicPlanners.FastDownwardType
FastDownward(
    search::String = "astar",
    heuristic::String = "add",
    h_params::Dict{String, String} = Dict(),
    max_time::Float64 = 300,
    verbose::Bool = false,
    log_stats::Bool = true,
    fd_path::String = get(ENV, "FD_PATH", ""),
    py_cmd::String = get(ENV, "PYTHON", "python")
)

Wrapper for the FastDownward planning system [1]. The planner has to be installed locally for this wrapper to be used. Consult the FastDownward documentation for further explanation of options.

[1] M. Helmert, "The Fast Downward Planning System," Journal of Artificial Intelligence Research, vol. 26, pp. 191–246, Jul. 2006, https://doi.org/10.1613/jair.1705.

Arguments

  • search: String specifying search algorithm (e.g. "astar", "ehc").

  • heuristic: String specifying search heuristic (e.g. "add", "lmcut").

  • h_params: Heuristic parameters as a dictionary mapping names to values.

  • max_time: Maximum time in seconds before planner times out.

  • verbose: Flag to print planner outputs.

  • log_stats: Flag to log solution statistics.

  • fd_path: Path to fast_downward.py.

  • py_cmd: Path to Python executable.

source
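
A usage sketch, assuming Fast Downward is installed locally and FD_PATH points to fast_downward.py (the path below is a placeholder):

ENV["FD_PATH"] = "/path/to/fast_downward.py"   # or set the variable before launching Julia
planner = FastDownward(search="astar", heuristic="lmcut", max_time=60.0)
sol = planner(domain, problem)
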
SymbolicPlanners.PyperplanType
Pyperplan(
    search::String = "astar",
    heuristic::String = "add",
    log_level::String = "info",
    log_stats::Bool = true,
    max_time::Float64 = 300,
    verbose::Bool = false,
    py_cmd::String = get(ENV, "PYTHON", "python")
)

Wrapper for the Pyperplan lightweight STRIPS planner [1]. The planner has to be installed locally for this wrapper to be used. Consult the Pyperplan documentation for further explanation of options.

[1] Y. Alkhazraji et al., "Pyperplan." Zenodo, 2020. https://doi.org/10.5281/zenodo.3700819.

Arguments

  • search: String specifying search algorithm (e.g. "astar", "gbf").

  • heuristic: String specifying search heuristic (e.g. "hadd", "hmax").

  • log_level: How much information to log when running the planner.

  • log_stats: Flag to log solution statistics.

  • max_time: Maximum time in seconds before planner times out.

  • verbose: Flag to print planner outputs.

  • py_cmd: Path to Python executable.

source
SymbolicPlanners.ENHSPType
ENHSP(
    search::String = "astar",
    heuristic::String = "add",
    h_mult::Float64 = 1.0,
    log_stats::Bool = true,
    max_time::Float64 = 300,
    verbose::Bool = false,
    enhsp_path::String = get(ENV, "ENHSP_PATH", ""),
    java_cmd::String = get(ENV, "JAVA", "java")
)

Wrapper for the Expressive Numeric Heuristic Search Planner (ENHSP) [1]. The planner has to be installed locally for this wrapper to be used. Consult the ENHSP documentation for further explanation of options.

[1] E. Scala et al., "ENHSP", https://sites.google.com/view/enhsp/.

Arguments

  • search: String specifying search algorithm (e.g. "gbfs", "WAStar").

  • heuristic: String specifying search heuristic (e.g. "hadd", "aibr").

  • h_mult: Heuristic multiplier for weighted A*.

  • log_stats: Flag to log solution statistics.

  • max_time: Maximum time in seconds before planner times out.

  • verbose: Flag to print planner outputs.

  • enhsp_path: Path to enhsp.jar.

  • java_cmd: Path to Java executable.

source