Discrete Structures for Computing
Notes 8

------------------------------------------------------------------------
7.3 Divide-and-Conquer Algorithms and Recurrence Relations
------------------------------------------------------------------------

Divide and conquer is a popular and powerful technique for developing
recursive algorithms.

1. divide the problem into smaller subproblems
2. recursively solve the subproblems
3. combine the solutions to the subproblems to obtain a solution to the
   original problem

The formula for the running time is:

    f(n) = a*f(n/b) + g(n)

where
* f is the function that describes the running time of the algorithm
* n is the size of the problem
* a is the number of subproblems (number of recursive calls)
* each subproblem is of size n/b
* g is the function that describes the running time of the combining step

Example: binary search
1. divide the sequence into two halves
2. recursively search in the appropriate half
3. no combining is necessary

    f(n) = f(n/2) + c, where c is some constant

Example: finding the maximum of a sequence
1. divide the sequence into two halves
2. recursively find the maximum of the first half, recursively find the
   maximum of the second half
3. return the larger of the two maxima found in step 2

    f(n) = 2*f(n/2) + c, where c is some constant

Example: mergesort
1. divide the sequence into two halves
2. recursively sort the first half, recursively sort the second half
3. merge the two sorted sequences from step 2

    f(n) = 2*f(n/2) + c*n, where c is some constant

Notice that for all three of these example algorithms, f(1) equals some
constant, i.e., f(1) is O(1).

------

So how can we solve these recursive formulas to get a closed-form
solution, i.e., one that does not define f in terms of itself? Let's
start with simple cases and work up to more complicated ones.

THEOREM: Suppose f is an increasing function such that
* f(n) = f(n/b) + c
* n is a power of b
* b is an integer greater than 1
* c is a positive real number
* f(1) is O(1)
Then f(n) is O(log n).
PROOF: Expand f(n) using its recursive definition a few times to try to
find the pattern.

    f(n) = f(n/b) + c
         = [f((n/b)/b) + c] + c = f(n/b^2) + 2*c
         = f(n/b^3) + 3*c
         ...
         = f(n/b^k) + k*c

Keep going until the argument to f is 1, i.e., until n/b^k = 1, which
means k = log_b n.

         = f(1) + (log_b n)*c
         = O(log n).
QED

Application: The formula for binary search fits this pattern. It is
f(n) = f(n/2) + c, which is O(log n).

THEOREM: Suppose f is an increasing function such that
* f(n) = a*f(n/b) + c
* n is a power of b
* a > 1
* b is an integer greater than 1
* c is a positive real number
* f(1) is O(1)
Then f(n) is O(n^{log_b a}).

PROOF: Expand f(n) using its recursive definition a few times to try to
find the pattern.

    f(n) = a*f(n/b) + c
         = a*[a*f(n/b^2) + c] + c = a^2*f(n/b^2) + (a+1)*c
         = a^2*[a*f(n/b^3) + c] + (a+1)*c = a^3*f(n/b^3) + (a^2+a+1)*c
         ...
         = a^k*f(n/b^k) + (a^{k-1}+a^{k-2}+...+a^2+a+1)*c

Keep going until the argument to f is 1, i.e., until n/b^k = 1, or
k = log_b n.

         = a^{log_b n}*f(1) + c*Sum_{j=0}^{log_b n - 1} a^j

* Using the formula for a geometric progression, the sum becomes
  (a^{log_b n} - 1)/(a - 1).
* a^{log_b n} = n^{log_b a} by rules of logarithms.
* So the whole thing becomes O(n^{log_b a}).
QED

Application: The formula for finding the maximum fits this pattern:
f(n) = 2*f(n/2) + c (i.e., a = 2 and b = 2). So we get that f(n) is
O(n^{log_2 2}) = O(n^1) = O(n).

But notice that neither of the previous two theorems handles the case
of mergesort, where g(n) is not a constant. The next theorem attacks
that problem. It is not totally general, because it requires g(n) to
have a certain form, but it is quite useful in many situations.

MASTER THEOREM: Suppose f is an increasing function such that
* f(n) = a*f(n/b) + c*n^d
* n is a power of b
* a > 1
* b is an integer greater than 1
* c is a positive real number
* d is a nonnegative real number
* f(1) is O(1)
Then f(n) is
  (i)   O(n^d)         if a < b^d
  (ii)  O(n^d log n)   if a = b^d
  (iii) O(n^{log_b a}) if a > b^d

The proof is spelled out in the exercises.
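The three cases of the master theorem are easy to check mechanically. Here is a
small Python sketch (the function name and the output strings are our own,
not from the notes) that classifies a recurrence f(n) = a*f(n/b) + c*n^d by
comparing a with b^d:

```python
import math

def master_theorem(a, b, d):
    """Classify f(n) = a*f(n/b) + c*n^d by the master theorem.

    Returns a string describing the big-O growth of f(n).
    (Illustrative helper only; assumes a > 1, b > 1 integer, d >= 0.)
    """
    if a < b ** d:                 # case (i): combining work dominates
        return f"O(n^{d})"
    elif a == b ** d:              # case (ii): work balanced across levels
        return f"O(n^{d} log n)"
    else:                          # case (iii): recursive calls dominate
        exponent = round(math.log(a, b), 3)
        return f"O(n^{exponent})"

print(master_theorem(2, 2, 0))   # finding maximum -> "O(n^1.0)", i.e. O(n)
print(master_theorem(2, 2, 1))   # mergesort       -> "O(n^1 log n)"
print(master_theorem(5, 2, 0))   # f(n)=5f(n/2)+3  -> "O(n^2.322)"
```

The three sample calls match the applications worked out in these notes:
finding the maximum is O(n), mergesort is O(n log n), and 5*f(n/2) + 3 gives
O(n^{log_2 5}).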
Application: The formula for mergesort fits this pattern:
f(n) = 2*f(n/2) + c*n, where a = 2, b = 2, and d = 1. Let's see which
case of the master theorem holds by checking how a compares to b^d,
i.e., how 2 compares to 2^1. They are equal, so we have case (ii), and
f(n) is O(n^1 log n) = O(n log n).

--------

Another example:
* Suppose f(n) = 5*f(n/2) + 3, f(1) = 7, and f is increasing. Use the
  second theorem with a = 5, b = 2, and c = 3. The result is that
  f(n) = O(n^{log_b a}) = O(n^{log_2 5}).

-------

A more sophisticated divide-and-conquer algorithm from computational
geometry:
* Given a set of n points in the plane (x_1,y_1), ..., (x_n,y_n),
* determine which pair of points are closest to each other.

(Remember that the distance between (x_i,y_i) and (x_j,y_j) is
sqrt((x_i - x_j)^2 + (y_i - y_j)^2).)

First idea for a solution:
* Compute the distance between every pair of points.
* Find the minimum.
The drawback is that the time is O(n^2), since there are n(n-1)/2 pairs
of points.

Let's try divide and conquer. First, we do some preprocessing (only
once).

1. xsort := result of using mergesort to sort the list of points in
   increasing order of x-coordinates
2. ysort := result of using mergesort to sort the list of points in
   increasing order of y-coordinates

Then comes the recursive (divide-and-conquer) part of the algorithm.

1. Divide the points into two halves (assume n is a power of 2) based
   on x-coordinates. Let L be the vertical line that splits the points.
2. Recursively compute the pair of points to the left of L that have
   the minimum distance, d_L, between them. Recursively compute the
   pair of points to the right of L that have the minimum distance,
   d_R, between them.
3. Combining step: Let d = min(d_L, d_R). We have to check whether any
   pair of points, one from the left of L and one from the right of L,
   are at distance less than d. If there is such a pair of points, then
   one must be within distance d of L on the left and the other must be
   within distance d of L on the right.
Use xsort to extract just the points that are within distance d of L
(on either side). So now we are considering all the points in a
vertical strip of width 2d centered on L.

Use ysort to consider the points in the strip in increasing order of
y-coordinate (i.e., start at the bottom and work up). For each point
p = (x,y), consider all the points p' in the strip on the other side of
L whose y-coordinate is in the range y to y+d. Compute the distance
between p and p' and see if it is less than d, and if so, keep track of
it.

Running time of this algorithm:
* The preprocessing is O(n log n).
* Let f(n) be the running time of the rest of the algorithm. Then
  f(n) = 2*f(n/2) + g(n).
* What is g(n)?
  - dividing the points into two halves based on line L takes O(n) time
  - computing d is O(1)
  - extracting the points in the strip is O(n)
  - the number of points in the strip to be considered in increasing
    order of y-coordinate is at most n
  - given a point p in the strip, for each other point p', the
    comparison takes O(1) time
  - *** how many points p' must p be compared against? ***
    Suppose p is on the left. All the points on the right side of L are
    at least d apart from each other, and each quadrant of the d x d
    square being considered on the right side is a (d/2) x (d/2)
    square whose diagonal, d/sqrt(2), is less than d. So each quadrant
    contains at most one point, and p only needs to be compared against
    at most 4 other points (a constant number).
  So g(n) is O(n).

By case (ii) of the master theorem (a = 2, b = 2, d = 1), we find that
f(n) is O(n log n). Since the preprocessing also takes O(n log n) time,
the total time is O(n log n), which is significantly better than the
O(n^2) time of the naive approach.
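The algorithm above can be sketched in Python as follows. This is a minimal
illustration, not the notes' own code: it assumes the points are distinct
tuples, uses Python's built-in sort (also O(n log n)) in place of mergesort,
brute-forces tiny subproblems instead of requiring n to be a power of 2, and
for simplicity compares each strip point against its y-neighbors on both
sides of L (still a constant number per point):

```python
import math

def closest_pair(points):
    """Return the minimum distance between any two of the given points.

    Divide-and-conquer sketch, O(n log n). Assumes >= 2 distinct points.
    """
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    # Preprocessing (once): sort by x-coordinate and by y-coordinate.
    xsort = sorted(points)
    ysort = sorted(points, key=lambda p: p[1])

    def rec(xs, ys):
        n = len(xs)
        if n <= 3:  # base case: brute force over all pairs
            return min(dist(p, q) for i, p in enumerate(xs) for q in xs[i+1:])
        mid = n // 2
        x_line = xs[mid][0]                 # vertical line L
        left = set(xs[:mid])
        ys_left = [p for p in ys if p in left]
        ys_right = [p for p in ys if p not in left]
        d = min(rec(xs[:mid], ys_left), rec(xs[mid:], ys_right))
        # Combining step: points within distance d of L, bottom-up by y.
        strip = [p for p in ys if abs(p[0] - x_line) < d]
        for i, p in enumerate(strip):
            for q in strip[i+1:]:
                if q[1] - p[1] >= d:        # only the y-range [y, y+d) matters
                    break
                d = min(d, dist(p, q))
        return d

    return rec(xsort, ysort)
```

For example, closest_pair([(0,0), (3,4), (1,1), (10,10), (2,2)]) returns
sqrt(2), the distance between (1,1) and (2,2). The inner strip loop does
constant work per point by the quadrant argument above, so the recurrence is
f(n) = 2*f(n/2) + O(n), giving O(n log n) overall.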