
Optimal Binary Search Tree using Dynamic Programming

An optimal binary search tree is a binary search tree in which the nodes are arranged on levels so that the total search cost of the tree is minimized.

If the probabilities of searching for the elements of a set are known (for example, from accumulated data about past searches), then the binary search tree (BST) should be built so that the average number of comparisons per search is minimized.

For example, let the elements to be searched be A, B, C and D, with search probabilities 0.1, 0.2, 0.4 and 0.3 respectively.

Let us consider 2 of the 14 possible BSTs containing these keys.
Figure 1
Figure 2

The average number of comparisons is calculated as the sum of level(key) * probability(key) over all keys in the tree.

Let the levels of the tree start from 1, i.e. the root is at level 1.

Therefore, for figure 1 -

  • Average number of comparisons = 1*0.1 + 2*0.2 + 3*0.4 + 4*0.3 = 2.9

For figure 2 -

  • Average number of comparisons = 2*0.1 + 1*0.2 + 2*0.4 + 3*0.3 = 2.1
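The two averages above can be checked with a short Python snippet. The level lists below are an assumption about the exact shapes of the two figures (figure 1 as the chain A-B-C-D, figure 2 rooted at B), chosen so that they reproduce the costs computed in the text:

```python
def average_comparisons(levels, probs):
    """Weighted average number of comparisons: sum of
    level(key) * probability(key), with levels starting at 1."""
    return sum(level * p for level, p in zip(levels, probs))

probs = [0.1, 0.2, 0.4, 0.3]                      # P(A), P(B), P(C), P(D)
fig1 = average_comparisons([1, 2, 3, 4], probs)   # chain A-B-C-D -> 2.9
fig2 = average_comparisons([2, 1, 2, 3], probs)   # rooted at B   -> 2.1
```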

Neither of these two trees is optimal. So how do we find the optimal binary search tree?

In this example, we could find the optimal BST by generating all 14 BSTs with these keys, i.e. an exhaustive-search approach. This is not feasible for a larger number of keys, because the total number of BSTs with n keys equals the nth Catalan number, which grows exponentially. The alternative is to use dynamic programming.
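The dynamic-programming idea can be sketched as follows: the optimal cost of a BST over keys i..j is the minimum, over every choice of root r, of the optimal costs of the left and right subtrees plus the total probability of the range (every key in the range sits one level deeper than its subtree root). This is a minimal Python sketch, assuming successful searches only (function names are illustrative):

```python
from math import comb

def catalan(n):
    """Number of distinct BSTs on n keys: C(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)

def optimal_bst_cost(p):
    """Minimum average number of comparisons for a BST over keys
    with search probabilities p[0..n-1] (successful searches only)."""
    n = len(p)
    # cost[i][j] = optimal cost of a BST over keys i..j (inclusive)
    cost = [[0.0] * n for _ in range(n)]
    for length in range(1, n + 1):            # subproblems by range length
        for i in range(n - length + 1):
            j = i + length - 1
            w = sum(p[i:j + 1])               # total probability of range
            best = float("inf")
            for r in range(i, j + 1):         # try each key as the root
                left = cost[i][r - 1] if r > i else 0.0
                right = cost[r + 1][j] if r < j else 0.0
                best = min(best, left + right)
            cost[i][j] = best + w             # every key drops one level
    return cost[0][n - 1]

n_trees = catalan(4)                                  # 14 for 4 keys
opt = optimal_bst_cost([0.1, 0.2, 0.4, 0.3])          # 1.7, better than 2.1
```

For the running example the optimal cost is 1.7 (with C, the most probable key, at the root), which improves on both trees above. This O(n^3) formulation can be reduced to O(n^2) using Knuth's optimization, but the cubic version matches the standard recurrence.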

Notes for OBST using Dynamic Programming are attached below.

(Pages 1-11 of the notes.)
