Discrete Structures for Computing
Notes 7

------------------------------------------------------------------------
4.4 Recursive Algorithms
------------------------------------------------------------------------

Recursion is a powerful tool for designing algorithms (and programs).

A RECURSIVE algorithm solves a problem by reducing it to an instance
of the same problem with smaller input.

A recursive algorithm must satisfy two properties:
* there are one or more well-defined stopping cases
* each subsequent instance of the problem gets closer to a stopping case.

Example: Recursive algorithm for computing n!

factorial(n):
   if n = 0 then                  // stopping case
      return 1
   else
      return n*factorial(n-1)     // get closer to stopping case
   endif

Example: Recursive algorithm for computing a^n:

power(a,n):
   if n = 0 then                  // stopping case
      return 1
   else
      return a*power(a,n-1)       // get closer to stopping case
   endif

Example: Recursive algorithm for doing linear search

linear_search(i,j,x):             // search for x in A[i..j], where i <= j
   if A[i] = x then               // a stopping case - found x
      return i
   elseif i = j then              // another stopping case - x not found
      return 0
   else
      return linear_search(i+1,j,x)   // get closer to stopping case
   endif

Example: recursive version of binary search

binary_search(i,j,x):             // search for x in sorted A[i..j]
   if i > j then
      return 0                    // a stopping case - nothing left to search
   endif
   m := floor((i+j)/2)            // compute midpoint
   if x = A[m] then
      return m                    // another stopping case - found x
   elseif x < A[m] then
      return binary_search(i,m-1,x)   // search in left half,
                                      // get closer to stopping case
   else  // x > A[m]
      return binary_search(m+1,j,x)   // search in right half,
                                      // get closer to stopping case
   endif

How does this work? It looks like magic, but it is not. In a
programming language implementation, the intermediate values are kept
track of by the run-time system (in the stack frames for the separate
invocations of the recursive procedure/subroutine/method).
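As an illustration of recursive binary search, here is a minimal Python sketch. It is not taken from the book: it assumes 0-based list indices, takes the midpoint of the current range as floor((i+j)/2), and returns None (rather than 0) when x is absent. The parameter order and default arguments are choices made for this sketch only.

```python
def binary_search(a, x, i=0, j=None):
    """Return an index of x in the sorted list a[i..j], or None if absent."""
    if j is None:
        j = len(a) - 1
    if i > j:                  # stopping case: nothing left to search
        return None
    m = (i + j) // 2           # midpoint of the current range
    if a[m] == x:              # stopping case: found x
        return m
    if x < a[m]:
        return binary_search(a, x, i, m - 1)   # search the left half
    return binary_search(a, x, m + 1, j)       # search the right half


data = [1, 2, 4, 6, 7, 8, 9, 10]
print(binary_search(data, 7))   # 4
print(binary_search(data, 5))   # None
```

Each recursive call searches a strictly smaller range, so every call gets closer to one of the two stopping cases.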
Recursive algorithms can be proved correct using *induction*.

Example: Prove that the recursive algorithm for computing a^n is correct.

We will use induction on the exponent n.

Basis step: n = 0. In this case, the algorithm returns 1, which is the
correct answer, since a^0 = 1.

Inductive step:
* Suppose the algorithm returns the correct answer for exponent k,
  which is a^k.
* Prove the algorithm returns the correct answer for exponent k+1,
  which is a^{k+1}.
* When we invoke the procedure with a and k+1, the answer returned is
  a*x, where x is the answer produced by the invocation with arguments
  a and k.
* By the inductive hypothesis, x = a^k.
* Thus the final answer returned is a*a^k, which is a^{k+1}.
QED

Note that strong induction was not needed; all we needed was
correctness for k in order to prove correctness for k+1.
But sometimes we need strong induction.

<< example in book however involves modular powers, which I've been
   skipping >>

Example: recursive algorithm for computing Fibonacci numbers:

Fibonacci(n):
   if n = 0 then                  // a stopping case
      return 0
   elseif n = 1 then              // another stopping case
      return 1
   else
      return Fibonacci(n-1) + Fibonacci(n-2)   // closer to stopping case
   endif

Recursion is not *necessary*; it can always be replaced by an
iterative approach.

Advantages of recursion over iteration:
* can be easier to develop a solution based on recursion
* can have simpler code

Disadvantages of recursion compared to iteration:
* can be less efficient (slower, take more space)

Look at the inefficiencies in the recursive Fibonacci algorithm:

<< Fig 1 on p. 316 >>

It can be shown that the running time is exponential in n.
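To make the repeated work concrete, the following Python sketch counts how many invocations the naive recursive Fibonacci algorithm makes. The helper name fib_calls and the idea of returning the call count alongside the value are assumptions made for this illustration; they are not from the book.

```python
def fib_calls(n):
    """Return (f_n, number of invocations made by the naive recursion)."""
    if n == 0:
        return 0, 1
    if n == 1:
        return 1, 1
    a, calls_a = fib_calls(n - 1)
    b, calls_b = fib_calls(n - 2)
    # One invocation for this call plus all invocations made below it.
    return a + b, calls_a + calls_b + 1


for n in (5, 10, 20):
    value, calls = fib_calls(n)
    print(n, value, calls)
# 5 5 15
# 10 55 177
# 20 6765 21891
```

The call count satisfies the same kind of recurrence as the Fibonacci numbers themselves, which is why it grows exponentially in n.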
A way to avoid this inefficiency (of all the repeated work) is to go
to an iterative algorithm instead:

iterative_Fibonacci(n) :
   if n = 0 then
      return 0
   elseif n = 1 then
      return 1
   else
      x := 0                      // f_{i-2}
      y := 1                      // f_{i-1}
      for i := 2 to n do
         z := x + y               // f_i = f_{i-1} + f_{i-2}
         x := y                   // new f_{i-2} is the old f_{i-1}
         y := z                   // new f_{i-1} is the old f_i
      endfor
      return y
   endif

Use induction to show correctness:

Inspecting the code shows that the correct answers are returned for
n = 0 and n = 1. What about n >= 2?

Make an inductive claim: At the end of iteration i of the for loop,
x = f_{i-1} and y = f_i.

Basis Step: At the end of iteration i = 2, x = 1 = f_1 and y = 1 = f_2.

Inductive Step: Assume for k and show for k+1.
At the beginning of iteration k+1, z is set equal to x + y.
By the inductive hypothesis, x = f_{k-1} and y = f_k.
Thus z is set equal to f_{k+1}.
In the rest of the iteration, x is set equal to y, which is f_k,
and y is set equal to z, which is f_{k+1}.
QED

In particular, at the end of the last iteration (i = n), y = f_n,
which is the value returned.

Running time is O(n) (loop is done n-1 times, loop body takes
constant time).

Merge Sort
----------

mergesort(L):                     // L is the list to be sorted, length is n
   if n <= 1 then                 // base case
      return L
   else
      m := floor(n/2)
      L1 := L[1..m]               // split L into two halves
      L2 := L[m+1..n]
      S1 := mergesort(L1)         // sort the two halves recursively
      S2 := mergesort(L2)
      S := merge(S1,S2)           // merge the two sorted lists
      return S
   endif

How do we merge two sorted lists?

merge(S1,S2) :
   S := new empty list
   while S1 and S2 are both nonempty do
      compare first elements of S1 and S2
      remove smaller one from its list and append to S
   endwhile
   if S1 is not yet empty then
      copy rest of S1 to end of S
   endif
   if S2 is not yet empty then
      copy rest of S2 to end of S
   endif
   return S
end

Do example on input 8, 2, 4, 6, 9, 7, 10, 1

What is the running time of mergesort?

Later in the class we will study techniques for analyzing the time of
recursive algorithms. For now, let's just give a high level idea.

Assume n is a power of 2, say 2^p (so each splitting into half is
perfect).
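Before continuing with the analysis, here is a minimal Python sketch of the mergesort and merge pseudocode above, run on the example input. It makes two assumptions of its own: merge walks index pointers through the two lists instead of removing elements from the front, and a list of length 0 is treated as a base case along with length 1.

```python
def merge(s1, s2):
    """Merge two sorted lists into one sorted list."""
    s = []
    i = j = 0
    while i < len(s1) and j < len(s2):   # both lists still have elements
        if s1[i] <= s2[j]:
            s.append(s1[i])
            i += 1
        else:
            s.append(s2[j])
            j += 1
    s.extend(s1[i:])                     # copy whatever remains of s1
    s.extend(s2[j:])                     # copy whatever remains of s2
    return s


def mergesort(lst):
    """Return a sorted copy of lst."""
    if len(lst) <= 1:                    # base case
        return lst
    m = len(lst) // 2
    return merge(mergesort(lst[:m]), mergesort(lst[m:]))


print(mergesort([8, 2, 4, 6, 9, 7, 10, 1]))
# [1, 2, 4, 6, 7, 8, 9, 10]
```

Every element of s1 and s2 is appended to s exactly once, which is the observation the running-time argument for merge relies on.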
First, let's look at the running time of merge. Note that each
element in each list is "handled" once. So the running time is
proportional to the sum of the lengths of the two input lists.

Fact: Running time of merge algorithm on inputs S1 and S2 is
O(length(S1) + length(S2)).

So if both S1 and S2 are of length k, then running time is O(k).

Now consider the recursive mergesort.

Fact: The time for the invocation of mergesort on a list of size k,
**excluding the time for the two recursive calls**, is O(k).

Why? O(k) time is needed to copy the input list into the two new
lists and then O(k) time is needed to do the merge.

How many recursive calls do we have?

   First level:  1 invocation on a list of size n = 2^p
   Second level: 2 invocations on lists of size n/2 = 2^{p-1}
   Third level:  4 invocations on lists of size n/4 = 2^{p-2}
   Fourth level: 8 invocations on lists of size n/8 = 2^{p-3}
   ...
   Last level:   n invocations on lists of size 1 = 2^{p-p}

So we have p+1 levels in the calling tree, where p = log_2 n.

Add all this up:

   O(n + 2*n/2 + 4*n/4 + 8*n/8 + ... + n*n/n) = O(n log n),

since there are (log_2 n) + 1 terms in the sum and each term is n.

------------------------------------------------------------------------
4.5 Program Correctness
------------------------------------------------------------------------

How can we *prove* that a program is correct (gives the right answer
on all possible inputs)?

Testing is not sufficient, since we can only test on a finite number
of inputs -- maybe an input we didn't test causes an error.

Program verification is the name for proving correctness of programs.
Techniques include
* rules of inference
* mathematical induction

It may never be realistic to prove programs correct, especially large
ones: the proofs are too large to be done by hand, so some work has
been done on computerized checking. But how do you know the verifier
program doesn't have errors??
However, the concepts from program verification can be very useful in
proving that the *algorithm* being implemented by a program is
correct. If you implement an incorrect algorithm, then definitely
your program will be incorrect.

There are two aspects to proving program correctness:

(1) PARTIAL CORRECTNESS: Show that *if* the program terminates, then
    it produces the correct answer.

(2) TERMINATION: Show that the program really does terminate.

Partial Correctness
-------------------

We use two propositions in order to formalize the statement that if
the program terminates then it has the right answer:

* initial assertion p: gives properties of input values
  (assumptions about the initialization)
* final assertion q: gives properties that the output values
  should have

The designer of the program must come up with p and q. They capture
what you think the program is doing or is supposed to do.

A program S is PARTIALLY CORRECT WITH RESPECT TO p and q if whenever
p is true for the input values of S and S terminates, then q is true
for the output values of S.

Notation is p {S} q ("Hoare triple").

Example: Program (segment) S is:

   y := 2
   z := x + y

Let initial assertion p be "x = 1".
Let final assertion q be "z = 3".

Note that p {S} q is true in this case, since x = 1 initially, y is
set to 2 in the first line, and z is set to x + y = 1 + 2 = 3 in the
second line.

Assignment statements are fairly straightforward to deal with.
What about other features of programming languages?
We'll see some ways to deal with
* subprograms
* conditionals (if statements)
* loops

Subprograms:
------------

Suppose program S consists of subprogram S1 followed by subprogram S2,
denoted S = S1; S2.
We have the composition rule:

   p {S1} q      if p is true when S1 starts and S1 terminates,
                 then q is true when S1 terminates
   q {S2} r      if q is true when S2 starts and S2 terminates,
                 then r is true when S2 terminates
   ----------
   p {S} r       if p is true when S starts and S terminates,
                 then r is true when S terminates

Breaking programs up into segments and being able to reason about the
segments separately and then combining them is a powerful tool.

Conditional Statements:
----------------------

Consider a statement of the form

   R = if condition then S

We have the rule:

   (p /\ condition) {S} q     if p and the condition are true when S
                              starts and S terminates, then q is true
                              when S terminates
   (p /\ ~condition) -> q     if p is true and the condition is false,
                              then q is true (doesn't rely on S)
   -------------------------
   p {R} q

Why is this reasonable? If the condition is true, then S is executed,
which may cause q to become true. If the condition is false, then S
is not executed, so it had better be the case that q was already true.

Example: Verify

   if x > y then
      y := x

is correct with initial assertion p = "T" (true) and final assertion
q = "y >= x".

So R is the whole conditional statement, the condition is "x > y",
and S is the body "y := x".

Let's try to use the rule for conditionals. We need to show that two
things are true:

(a) (p /\ condition) {S} q, which in our specific case is
    (T /\ (x > y)) {y := x} (y >= x), and

(b) (p /\ ~condition) -> q, which in our specific case is
    (T /\ ~(x > y)) -> (y >= x).

If (a) and (b) are true, then the rule of inference above regarding
conditionals will imply that p {R} q is true, i.e., the thing we are
trying to prove.

Why is (a) true? First note that (T /\ (x > y)) is equivalent to
(x > y). So we need to argue that if x > y before we execute y := x,
then afterwards, y >= x. This is true since afterwards, y actually
is equal to x.

Why is (b) true? First note that (T /\ ~(x > y)) is equivalent to
~(x > y), i.e., x <= y. Certainly x <= y implies that y >= x.
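The conditional example can also be sanity-checked by running it. The Python sketch below executes R on a small grid of integer inputs and asserts the final assertion q each time. The function name run_conditional and the tested range are assumptions made for this illustration. As noted earlier, testing finitely many inputs is not a proof; it only complements the argument for (a) and (b).

```python
def run_conditional(x, y):
    """Execute R: 'if x > y then y := x' and return the final (x, y)."""
    if x > y:
        y = x
    return x, y


# The initial assertion p is "T", so every input pair is allowed.
for x0 in range(-5, 6):
    for y0 in range(-5, 6):
        x, y = run_conditional(x0, y0)
        assert y >= x          # final assertion q
print("q held on all tested inputs")
```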
--------------

Inference Rule for if-then-else conditional:

Suppose S is of the form

   if condition then S1 else S2

Then we have:

   (p /\ condition) {S1} q
   (p /\ ~condition) {S2} q
   -------------------------
   p {S} q

Example of its use: Verify

   if x < 0 then
      abs := -x
   else
      abs := x
   endif

is correct with respect to initial assertion p = "T" and final
assertion q = "abs = |x|".

We want to use the inference rule above. So we have to show

(a) (p /\ condition) {S1} q, which in our case is
    (T /\ (x < 0)) {abs := -x} (abs = |x|)

and

(b) (p /\ ~condition) {S2} q, which in our case is
    (T /\ ~(x < 0)) {abs := x} (abs = |x|)

Then we can conclude from the inference rule that for all x, at the
end of the code we've computed the absolute value of x.

To show (a): Note that (T /\ (x < 0)) is equivalent to (x < 0).
If x < 0 initially, then abs is set to -x, which is |x|.

To show (b): Note that (T /\ ~(x < 0)) is equivalent to ~(x < 0).
If x is nonnegative initially, then abs is set to x, which is |x|.

Loop Invariants
---------------

Suppose T is the program

   while condition
      S

A LOOP INVARIANT is a statement p that remains true each time S is
executed. I.e., (p /\ condition) {S} p is true.

Inference rule is:

   (p /\ condition) {S} p       so p is true both before and after
                                each iteration
   ------------------------
   p {T} (~condition /\ p)      p is still true when loop is finished

Example: Consider the program segment

   i := 1
   factorial := 1
   while i < n do
      i := i+1
      factorial := factorial*i
   endwhile

"condition" is "i < n" and S consists of the body of the loop
(i := i+1; factorial := factorial*i).

We want to show that this program segment terminates with
factorial = n!, whenever n is a positive integer.

We can prove this with the help of an appropriate loop invariant.

Suppose we knew that the statement

   p = "factorial = i! and i <= n"

was a loop invariant for this program segment. In other words, the
following Hoare triple is true:

   (p /\ condition) {S} p

In our particular case, the triple becomes

   ((factorial = i! and i <= n) /\ (i < n))
      {i := i+1; factorial := factorial*i}
   (factorial = i! and i <= n)

Now let's use this triple to prove correctness.

The (assumed) fact that p is a loop invariant means that the
following triple is true:

   p {while condition S} (~condition /\ p)

i.e.,

   (factorial = i! and i <= n)
      {while condition S}
   (~(i < n) /\ (factorial = i! and i <= n))

In words, if p is true before the while loop begins, then after the
while loop ends, the condition of the while loop is no longer true
but p is still true.

Just before the first iteration of the while loop, the statement p
is true. Why?
* p is the statement "factorial = i! and i <= n".
* The first part is true since i = 1, factorial = 1, and 1 = 1!.
* The second part is true since i = 1 and we are assuming n is a
  positive integer.

Since p is true before the while loop begins, the rule of inference
for loop invariants implies that after the while loop ends, i >= n
and factorial = i! and i <= n.

Thus, i is *equal* to n, and so factorial = n!.

Now, why is p a loop invariant?

To prove this, we have to show that (p /\ condition) {S} p is true,
i.e.,

   ((factorial = i! and i <= n) /\ (i < n))
      {i := i+1; factorial := factorial*i}
   (factorial = i! and i <= n)

is true.

Suppose that (p /\ condition) is true before S (the body of the while
loop). Since condition is true, S is actually executed.

We need a way to distinguish between the values of the variables at
the beginning and at the end of the iteration:
* Let i_b be value of variable i at beginning
* Let i_e be value of variable i at end
* Let factorial_b be value of variable factorial at beginning
* Let factorial_e be value of variable factorial at end

Inspecting the code shows
* i_e = i_b + 1
* factorial_e = factorial_b * i_e

Now let's use the fact that p /\ condition is true.

* i_e = i_b + 1
      < n + 1                    since condition (i_b < n) is true
  so i_e <= n                    since i_e and n are integers

* factorial_e = factorial_b * i_e
              = (i_b!) * (i_b + 1)     since p is true
              = (i_b + 1)!
              = i_e!

So we've shown p is a loop invariant.
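The loop invariant can be turned into run-time checks. The Python sketch below is an illustration only: it uses math.factorial from the standard library as the reference for i!, and the function name factorial_with_invariant is an assumption of this sketch. It asserts p before the loop, after every execution of the body, and asserts ~condition /\ p on exit.

```python
import math


def factorial_with_invariant(n):
    """Compute n! with the loop from the notes, asserting p along the way."""
    assert n >= 1                                      # n is a positive integer
    i = 1
    factorial = 1
    assert factorial == math.factorial(i) and i <= n   # p holds before the loop
    while i < n:
        i = i + 1
        factorial = factorial * i
        assert factorial == math.factorial(i) and i <= n   # p holds after S
    assert not (i < n)                                 # ~condition on exit
    assert i == n                                      # follows from i >= n and i <= n
    return factorial


print(factorial_with_invariant(5))   # 120
```

If any assertion failed for some n, the claimed invariant would be wrong; passing checks are consistent with the proof but do not replace it.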
The last piece of the proof is to show that the while loop actually
terminates. Initially i is set equal to 1, so after n-1 iterations
of the body of the loop, i will equal n and the loop will terminate.

----------

Example: Verify that the following procedure correctly calculates the
product of its arguments (using repeated addition).

multiple(m, n) :
   // S1
   if n < 0 then a := -n
   else a := n
   // S2
   k := 0                         // counter
   x := 0                         // accumulates the product
   // S3
   while k < a do
      x := x + m
      k := k + 1
   endwhile
   // S4
   if n < 0 then product := -x
   else product := x
end

Idea is to break the problem down into the 4 pieces as indicated by
the comments. The overall procedure is the sequential composition of
S1; S2; S3; S4.

Now we have to figure out appropriate propositions p, q, r, s, and t
so that we can show
* p {S1} q
* q {S2} r
* r {S3} s
* s {S4} t
where
* p is an appropriate initial assertion and
* t is an appropriate final assertion

It turns out this can be done if the assertions are:
* p : "m and n are integers"
* q : p /\ (a = |n|)
* r : q /\ (k = 0) /\ (x = 0)
* s : x = m*a /\ a = |n|
* t : product = m*n

A few more details are given in the book (p. 327), including the loop
invariant for the loop in S3 (x = m*k /\ k <= a).
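As a complement to the proof outline, the assertions q, r, s, t and the loop invariant for S3 can be written as run-time checks. The Python sketch below follows the four segments of the procedure; the placement of the assert statements is a choice made for this illustration and is not taken from the book.

```python
def multiple(m, n):
    """Multiply integers m and n by repeated addition, checking assertions."""
    # S1
    if n < 0:
        a = -n
    else:
        a = n
    assert a == abs(n)                          # q
    # S2
    k = 0                                       # counter
    x = 0                                       # accumulates the product
    assert a == abs(n) and k == 0 and x == 0    # r
    # S3
    while k < a:
        assert x == m * k and k <= a            # loop invariant before body
        x = x + m
        k = k + 1
    assert x == m * k and k <= a                # invariant still holds on exit
    assert x == m * a and a == abs(n)           # s
    # S4
    if n < 0:
        product = -x
    else:
        product = x
    assert product == m * n                     # t
    return product


print(multiple(3, -4))   # -12
```

On exit from the loop, k >= a together with the invariant k <= a gives k = a, which is how the invariant x = m*k yields assertion s.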