Draft:Informed search


Last edited by GTrang (talk | contribs) 3 days ago. (Update)


Informed search (also known as heuristic search) is a strategy of searching for solutions in a state space that uses knowledge specific to the given problem. Informed methods usually provide a more efficient search compared to uninformed methods.

Information specific to the problem is formulated as a heuristic function. At each step of the search, the heuristic function evaluates alternatives based on additional information to decide which direction the search should continue^[1].

Heuristic Functions

In the context of state space search, a heuristic function h(n) is defined on the nodes of a search tree as follows:

h(n) = an estimate of the cost of the least expensive path from node n to a goal node.

If n is a goal node, then h(n) = 0.

The node to expand is selected based on the evaluation function

f(n) = an estimate of the cost of the least expensive solution path passing through node n,

f(n) = g(n) + h(n),

where the function g(n) determines the cost of the path already traversed from the start node to node n.

Figure: values of the evaluation functions along the optimal solution
f1(n) = g(n) + h1(n) – inadmissible heuristic
f2(n) = g(n) + h2(n) – admissible, but not consistent
f3(n) = g(n) + h3(n) – consistent heuristic

If the heuristic function h(n) never overestimates the actual minimum cost of reaching the goal (i.e., it is a lower bound of the actual cost), such a function is called admissible.

If the heuristic function h(n) satisfies the condition

h(a) ≤ cost(a, b) + h(b),

where b is a successor of a, then this function is called consistent.

If f(n) = g(n) + h(n) is the evaluation function, and h(n) is a consistent function, then the function f(n) is monotonically non-decreasing along any explored path. Therefore, consistent functions are also called monotonic.
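The monotonicity claim follows in one step from the consistency condition. The derivation below is a sketch that assumes b is a successor of a reached through a, so that g(b) = g(a) + cost(a, b):

```latex
\begin{aligned}
f(b) &= g(b) + h(b) \\
     &= g(a) + \mathrm{cost}(a, b) + h(b) \\
     &\ge g(a) + h(a) \\
     &= f(a)
\end{aligned}
```

The inequality in the third line is exactly the consistency condition h(a) ≤ cost(a, b) + h(b).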

Any consistent function is admissible, but not every admissible function is consistent.
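The difference between the two properties can be checked mechanically on a small explicit graph. The following Python sketch is illustrative only: the graph, the three heuristic tables and the helper names (`true_costs`, `is_admissible`, `is_consistent`) are assumptions made for this example and do not come from the cited sources.

```python
import heapq

def true_costs(graph, goal):
    """Cheapest cost from every node to goal (Dijkstra on the reversed graph)."""
    reverse = {n: [] for n in graph}
    for a, edges in graph.items():
        for b, c in edges.items():
            reverse[b].append((a, c))
    dist = {goal: 0}
    heap = [(0, goal)]
    while heap:
        d, n = heapq.heappop(heap)
        if d > dist.get(n, float("inf")):
            continue  # stale heap entry
        for m, c in reverse[n]:
            nd = d + c
            if nd < dist.get(m, float("inf")):
                dist[m] = nd
                heapq.heappush(heap, (nd, m))
    return dist

def is_admissible(graph, goal, h):
    """h never overestimates the true cost to the goal."""
    best = true_costs(graph, goal)
    return all(h[n] <= best.get(n, float("inf")) for n in graph)

def is_consistent(graph, goal, h):
    """h(goal) = 0 and h(a) <= cost(a, b) + h(b) for every edge (a, b)."""
    return h[goal] == 0 and all(
        h[a] <= c + h[b] for a, edges in graph.items() for b, c in edges.items()
    )

# Hypothetical chain S -> A -> B -> G with unit edge costs.
GRAPH = {"S": {"A": 1}, "A": {"B": 1}, "B": {"G": 1}, "G": {}}
H1 = {"S": 3, "A": 5, "B": 1, "G": 0}  # overestimates at A: inadmissible
H2 = {"S": 3, "A": 1, "B": 1, "G": 0}  # admissible, but h(S) > cost(S, A) + h(A)
H3 = {"S": 3, "A": 2, "B": 1, "G": 0}  # exact costs: consistent
```

In this example H2 never exceeds the true cost, yet it drops by 2 across an edge of cost 1, which is what violates consistency.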

If h₁(n) and h₂(n) are admissible heuristic functions, and the inequality h₁(n) ≥ h₂(n) holds for every node n, then h₁ is a more informed heuristic than h₂, or dominates h₂.

If there are admissible heuristics h₁ and h₂ for the problem, then the heuristic h(n) = max(h₁(n), h₂(n)) is admissible and dominates each of the original heuristics^[1]^[2].
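Both properties of the maximum can be verified directly. In the sketch below, h*(n) is notation introduced here (it is not used elsewhere in this article) for the true minimum cost from n to a goal node:

```latex
\begin{aligned}
h_1(n) \le h^*(n) \ \text{and}\ h_2(n) \le h^*(n)
  &\implies \max\bigl(h_1(n), h_2(n)\bigr) \le h^*(n), \\
\max\bigl(h_1(n), h_2(n)\bigr) &\ge h_i(n), \quad i \in \{1, 2\}.
\end{aligned}
```

The first line gives admissibility, the second gives dominance over each of the original heuristics.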

Comparison of Heuristic Functions

When comparing admissible heuristics, two factors are important: how informed each heuristic is, and the space and time complexity of computing it. A more informed heuristic can reduce the number of nodes expanded, but it may take longer to compute for each node.

The effective branching factor is the average number of successors of a node in the search tree after applying heuristic pruning methods^[1]^[2]. The effective branching factor can be used to judge the quality of the heuristic function used.
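The text above does not give a formula. One common formalization, assumed here, defines the effective branching factor b* as the branching factor of a uniform tree of depth d that contains N + 1 nodes, where N is the number of nodes generated by the search and d is the solution depth: N + 1 = 1 + b* + (b*)² + ... + (b*)^d. The following Python sketch solves this equation numerically by bisection; the function name and parameters are illustrative.

```python
def effective_branching_factor(num_nodes, depth, tol=1e-9):
    """Solve N + 1 = 1 + b + b**2 + ... + b**depth for b by bisection.

    num_nodes: number of nodes generated by the search (N)
    depth:     depth of the solution that was found (d)
    """
    if depth < 1 or num_nodes < depth:
        raise ValueError("need depth >= 1 and num_nodes >= depth")
    target = num_nodes + 1

    def total(b):
        # Number of nodes in a uniform tree with branching factor b.
        return sum(b ** i for i in range(depth + 1))

    lo, hi = 1.0, float(num_nodes)  # total(1) <= target <= total(N)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if total(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# A solution at depth 5 found after generating 52 nodes gives b* of about 1.92.
b_star = effective_branching_factor(52, 5)
```

Under this definition, values close to 1 mean the search went almost straight to the goal.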

An ideal heuristic function (e.g., a lookup table) always returns the exact length of the shortest solution from the given node, so the search tree contains only optimal solutions. The effective branching factor of an ideal heuristic function is close to 1^[1].

Search Problem Examples

Figure: sum of the Manhattan distances of all tiles from their target positions:
h_m(n) = 3+0+0+3+2+4+2+4+1+3+2+2+3+3+2 = 34.
The optimal solution consists of 50 moves.

Permutation puzzles are often used as models for testing search algorithms and heuristic functions. Examples include sliding puzzles of size 3×3 ^[3]^[4], 4×4 ^[5]^[6]^[7], 5×5 ^[8]^[9]^[10], and 6×6 ^[11], Rubik's Cube ^[9]^[12], and the Tower of Hanoi with four rods ^[11]^[13].

In the 15-puzzle, the heuristic h_m, based on Manhattan distance, can be applied. More specifically, for each tile, the Manhattan distance between its current position and its position in the goal state is calculated, and these values are summed.

It can be shown that this heuristic is admissible and consistent: every move costs 1 and shifts a single tile by one cell, so the value of h_m cannot change by more than 1 in a single move.
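The Manhattan-distance heuristic can be sketched in a few lines of Python. The board encoding used here (a flat tuple read row by row, with 0 standing for the blank) and the goal layout are assumptions made for this example; the blank is not counted, as in the caption above.

```python
def manhattan(state, goal, width=4):
    """Sum of Manhattan distances of all tiles from their goal cells.

    state, goal: tuples of length width * width, read row by row;
                 0 denotes the blank, which is not counted.
    """
    goal_pos = {tile: divmod(i, width) for i, tile in enumerate(goal)}
    total = 0
    for i, tile in enumerate(state):
        if tile == 0:
            continue
        row, col = divmod(i, width)
        goal_row, goal_col = goal_pos[tile]
        total += abs(row - goal_row) + abs(col - goal_col)
    return total

# Assumed goal layout: tiles 1..15 in order, blank in the last cell.
GOAL = tuple(range(1, 16)) + (0,)
ONE_MOVE_AWAY = tuple(range(1, 15)) + (0, 15)  # tile 15 slid one cell right
```

For `ONE_MOVE_AWAY` the heuristic returns 1, which here equals the true solution length; in general it is only a lower bound.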

Constructing Heuristic Functions

Relaxed Problem

The heuristic function h_m, used to solve the 15-puzzle, represents a lower bound on the length of the optimal solution. Additionally, h_m(n) is the exact length of the optimal solution for the relaxed version of the puzzle, where tiles can be moved into occupied positions. The original puzzle has the constraint "no more than one tile per cell," which is not present in the relaxed version. A problem with fewer constraints on possible actions is called a relaxed problem; the cost of an optimal solution to the relaxed problem is an admissible heuristic for the original problem^[1], as any solution to the original problem is also a solution to the relaxed problem.

Subproblem

An admissible heuristic may be based on the cost of solving a subproblem of the original problem. Any solution to the main problem is simultaneously a solution to each of its subproblems^[1].

A subproblem of the 15-puzzle could be the problem of placing tiles 1, 2, 3, and 4 in their correct positions. The cost of solving this subproblem is an admissible heuristic for the original problem.

Pattern Databases

Figure: the "fringe" pattern (target configuration of the subproblem)^[6]

Pattern databases^[1] are a type of admissible heuristic based on the idea of storing the exact cost of solving each possible instance of a subproblem^[1]^[6]^[12].

An example of a pattern for the 15-puzzle is the "fringe" pattern^[6]: the definition of the subproblem includes the positions of the seven tiles in the first column and the first row, together with the position of the blank. The number of configurations for this pattern is $\frac{16!}{8!} = 518\,918\,400$. For each configuration, the database contains the minimum number of moves required to transform this configuration into the target configuration of the subproblem. The database is constructed by a backward breadth-first search from the target configuration^[2]^[6].
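The construction can be illustrated at a much smaller scale. The Python sketch below builds a pattern database for a hypothetical pattern on the 3×3 puzzle (the blank plus tiles 1, 2, 3), not for the fringe pattern itself; the state encoding and the function name are assumptions made for this example. Every blank move costs 1, including moves that swap the blank with a tile outside the pattern.

```python
from collections import deque

def build_pattern_db(width, goal_positions):
    """Backward breadth-first search over pattern configurations.

    goal_positions: tuple of cell indices in the target configuration;
                    entry 0 is the blank, the others are the pattern tiles.
    Returns a dict mapping each configuration to its distance from the target.
    """
    dist = {goal_positions: 0}
    queue = deque([goal_positions])
    while queue:
        state = queue.popleft()
        blank = state[0]
        row, col = divmod(blank, width)
        for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            r, c = row + d_row, col + d_col
            if not (0 <= r < width and 0 <= c < width):
                continue
            target = r * width + c
            # The blank moves to `target`; a pattern tile standing there
            # moves to the blank's old cell, other tiles are ignored.
            new_state = tuple(
                target if i == 0 else (blank if pos == target else pos)
                for i, pos in enumerate(state)
            )
            if new_state not in dist:
                dist[new_state] = dist[state] + 1
                queue.append(new_state)
    return dist

# Target: blank in cell 0, tiles 1, 2, 3 in cells 1, 2, 3 of a 3x3 board.
PDB = build_pattern_db(3, (0, 1, 2, 3))
```

The resulting table has 9!/5! = 3024 entries, the small-scale analogue of the 16!/8! count above. Because moves are reversible and have unit cost, distances found by searching backward from the target equal the costs of solving each configuration.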

Search Algorithms

Best-First Search

Best-first search is an approach in which the node for expansion is chosen based on an evaluation function f(n). The node with the lowest value of f(n) is selected for expansion.

A* Search

A* search is the best-known variant of best-first search. Its evaluation function f(n) estimates the cost of the cheapest path to the goal passing through node n:

f(n) = g(n) + h(n), where

g(n) is the cost from the start node to node n,

h(n) is the estimated cost from node n to the goal.

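If h(n) never overestimates the cost to reach the goal (i.e., is admissible), then A* tree search is optimal. A graph-search version that never re-expands a node additionally requires h(n) to be consistent.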
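A minimal Python sketch of A* is given below. It is one possible implementation rather than a reference one: the interface (`is_goal`, `successors`, `h`), the example graph and the heuristic table are assumptions made for this illustration. This variant re-inserts a node into the priority queue whenever a cheaper path to it is found, so it does not rely on consistency.

```python
import heapq
from itertools import count

def a_star(start, is_goal, successors, h):
    """successors(node) yields (next_node, step_cost) pairs.

    Returns (cost, path) for a cheapest path, or None if no goal is reachable.
    """
    tie = count()  # tie-breaker so the heap never compares nodes directly
    open_heap = [(h(start), next(tie), 0, start)]
    best_g = {start: 0}
    parent = {start: None}
    while open_heap:
        _, _, g, node = heapq.heappop(open_heap)
        if g > best_g[node]:
            continue  # stale entry: a cheaper path to node was found later
        if is_goal(node):
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return g, path[::-1]
        for succ, step in successors(node):
            new_g = g + step
            if new_g < best_g.get(succ, float("inf")):
                best_g[succ] = new_g
                parent[succ] = node
                heapq.heappush(open_heap, (new_g + h(succ), next(tie), new_g, succ))
    return None

# Hypothetical directed graph and an admissible heuristic table.
GRAPH = {"S": {"A": 1, "B": 4}, "A": {"B": 2, "G": 6}, "B": {"G": 1}, "G": {}}
H = {"S": 3, "A": 2, "B": 1, "G": 0}

result = a_star("S", lambda n: n == "G", lambda n: GRAPH[n].items(), H.get)
# result == (4, ["S", "A", "B", "G"])
```

The goal test is applied when a node is removed from the queue, not when it is generated; testing at generation time could return the more expensive path S, A, G in this example.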

IDA*

Iterative Deepening A* (IDA*) is the application of the iterative deepening idea in the context of heuristic search.

The uninformed iterative deepening algorithm stops expanding when the search depth d exceeds the current depth limit l. The informed IDA* algorithm stops expanding when the evaluation f(n) of the path cost through the current node n exceeds the current path cost bound.

IDA* needs far less memory than A*, because it stores only the current path, and with a good heuristic it expands comparatively few nodes compared to iterative deepening depth-first search (IDDFS).

Pseudocode

^[14]

 node              current node
 g                 cost from root to node
 f                 estimated cost of the minimum path through node
 h(node)           heuristic estimate of the cost from node to goal
 cost(node, succ)  path cost function
 is_goal(node)     goal test function
 successors(node)  node expansion function

 procedure ida_star(root, cost(), is_goal(), h())
   bound := h(root)
   loop
     t := search(root, 0, bound)
     if t = FOUND then return FOUND
     if t = ∞ then return NOT_FOUND
     bound := t
   end loop
 end procedure

 function search(node, g, bound)
   f := g + h(node)
   if f > bound then return f
   if is_goal(node) then return FOUND
   min := ∞
   for succ in successors(node) do
     t := search(succ, g + cost(node, succ), bound)
     if t = FOUND then return FOUND
     if t < min then min := t
   end for
   return min
 end function
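The pseudocode above translates almost line by line into Python. The sketch below is an illustration under stated assumptions, not the implementation from the cited source: it additionally keeps the current path so that the solution can be returned, skips successors already on that path to avoid cycles, and uses a hypothetical example graph and heuristic table.

```python
import math

def ida_star(root, is_goal, successors, h):
    """successors(node) yields (next_node, step_cost) pairs.

    Returns (cost, path) or None if no goal is reachable.
    """
    found = object()  # sentinel playing the role of FOUND
    path = [root]
    solution_cost = None

    def search(g, bound):
        nonlocal solution_cost
        node = path[-1]
        f = g + h(node)
        if f > bound:
            return f
        if is_goal(node):
            solution_cost = g
            return found
        minimum = math.inf
        for succ, step in successors(node):
            if succ in path:  # cycle check (not in the pseudocode above)
                continue
            path.append(succ)
            t = search(g + step, bound)
            if t is found:
                return found
            if t < minimum:
                minimum = t
            path.pop()
        return minimum

    bound = h(root)
    while True:
        t = search(0, bound)
        if t is found:
            return solution_cost, list(path)
        if t == math.inf:
            return None
        bound = t  # smallest f value that exceeded the previous bound

# Hypothetical directed graph and an admissible heuristic table.
GRAPH = {"S": {"A": 1, "B": 4}, "A": {"B": 2, "G": 6}, "B": {"G": 1}, "G": {}}
H = {"S": 3, "A": 2, "B": 1, "G": 0}

result = ida_star("S", lambda n: n == "G", lambda n: GRAPH[n].items(), H.get)
# result == (4, ["S", "A", "B", "G"])
```

On this graph the first iteration runs with bound 3 and fails, returning 4 as the smallest exceeded f value; the second iteration with bound 4 finds the solution.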

MA*

In progress

SMA*

In progress

RBFS