CPSC 311 Spring 2004 HW 8 Solutions ======================================================================= Ex. 34.1-4 ======================================================================= The dynamic programming algorithm for the 0-1 knapsack problem has running time O(n*W), where n is the number of items and W is the maximum weight that the knapsack can hold. This is *not* a polynomial running time for any reasonable representation of the input. In a reasonable representation, all numeric values (the weights,the values, ec.) will be given in binary (base 2, or perhaps even a higher base). To represent the value W takes log W bits; thus the running time of O(n*W) is *exponential* in the size of the input (although polynomial in the magnitude). ======================================================================= Ex. 34.2-1 ======================================================================= Graph isomorphism is in NP: A candidate solution is a mapping f from the nodes of G1 to the nodes of G2. To verify the candidate solution, check, for all nodes u and v in G1, that there is an edge in G1 from u to v iff there is an edge in G2 from f(u) to f(v). Clearly this can be done in polynomial time. ======================================================================= Ex. 34.5-1 ======================================================================= Subgraph-isomorphism (SI) is NP-complete: (1) SI is in NP: a candidate solution is a subset of the nodes in G2 together with a mapping from nodes in G1 to nodes in the subset. Same verification as for graph isomorphism. (2) Clique is polynomial time transformable to SI: Given a clique input (a graph G and a bound K), convert it into an SI input (two graphs G1 and G2) like this: Let G1 be a clique on K nodes and G2 be G. Clearly this can be done in polynomial time. And obviously G contains a clique of size K if and only if G2 contains a subgraph that is isomorphic to G1. ======================================================================= Prob. 34-1, part (a) ======================================================================= The independent set (IS) decision problem: given a graph G = (V,E) and a bound K, is there a subset V' of V consisting of K nodes such that V' is an independent set? (1) IS is in NP: a candidate solution is a subset V' of the nodes. To verify the candidate solution, check that |V'| = K and that there is no edge between any pair of nodes in V'. This is polynomial time since there are |V'|^2 pairs to check. (2) Clique is polynomial time transformable to IS: Given a Clique input (a graph G and a bound K), convert it into an IS input (a graph G' and a bound K') like this: Let G' be the complement graph of G (there is edge between u and v in G' iff there is no edge between u and v in G) and let K' = K. Clearly this construction can be done in polynomial time. Furthermore, obviously G has a clique of size K iff G' has an independent set of size K' = K (the set of nodes that form a clique in G form an independent set in G'). ======================================================================= Ex. 35.2-2 ======================================================================= Given an instance I of TSP consisting of n cities and Theta(n^2) distances between the pairs of cities, convert into an instance I' of TSP that satisfies the triangle inequality as follows: 1. Find m, the maximum distance 2. For each pair of cities in I with distance d, 3. let the distance between that pair of cities in I' be d+m The time for constructing I' is O(n^2) (polynomial). Claim 1: I' satisfies the triangle inequality. Proof: Choose any three cities, with distances between them a, b, c. a + m <= a + m + b + c, since distances are nonnegative <= m + m + b + c, since m is the maximum distance = (b + m) + (c + m), by rearranging terms. QED Claim 2: Optimal tours are preserved by this transformation. Proof: Suppose optimal tour T for I has cost C. The cost of T in I' is C + n*m. Suppose there is a better tour T' in I', with cost C'. The cost of T' in I is C' - n*m < (C + n*m) - n*m since T' is a better tour than T in I' = C This contradicts T being the optimal tour in I. For the other direction, suppose optimal tour T' for I' has cost C'. The cost of T' in I is C' - n*m. Suppose there is a better tour T in I, with cost C. The cost of T in I' is C + n*m < (C' - n*m) + n*m since T is a better tour than T' in I = C' This contradicts T' being the optimal tour in I'. QED So there is a polynomial time transformation from TSP instances without the triangle inequality to TSP instances with the triangle inequality that preserves optimal tours. However, this does not contradict Theorem 37.3, for this reason: Suppose we construct I' from I, then run a constant ratio (K) approximation algorithm for TSP with triangle inequality. The resulting tour T' for I' has cost C', where C' <= K*opt(I'). The tour T' in I has cost C = C' - n*m. C = C' - n*m <= K*opt(I') - n*m <= K*(opt(I) + n*m) - n*m since best tour in I' is at least as good as using best tour in I with new distances = K*opt(I) + (K-1)*n*m Since K>1, C depends on n and m and is *not* bounded by any constant times opt(I). ======================================================================= Prob. 35-1 ======================================================================= Bin-packing (BP) decision problem: Given a finite set S of objects, each with a size in (0,1), and a bound K, is it possible to pack all the objects into at most K bins, where each bin has size 1? (No breaking of objects allowed!) Subset-sum (SS) decision problem: Given a finite set S of natural numbers and a "target" natural number t, is there a subset of S whose elements sum to t? Set-partition (SP) decision problem: Given a finite set S of natural numbers, is there a partition of S into two (exhaustive and disjoint) subsets such that the sum of the elements in one subset is the same as the sum of those in the other? (a) We'll do this in steps: first, show SS is reducible to SP; then show SP is reducible to BP. SS reducible to SP: Let s_1, ..., s_n, t be any SS input. Transform it into an SP input s_1, ..., s_n, 2S-t, S+t, where S is the sum of s_1 through s_n. Note that for the SP problem, we are looking for a subset whose elements sum to 2S (since the entire set of inputs sums to 4S). If a subset of s_1, ..., s_n that sums to t exists, then that same set of elements, together with 2S-t sums to 2S, and thus the SP input has a partition. If a partition of the SP input exists, then the elements 2S-t and S+t must be on different sides of the partition. In this case, the other elements on the side of the partition containing 2S-t sum to t, and these elements show that the SS input has a subset summing to t. SP reducible to BP: Let s_1, ..., s_n be any SP input. Transform it into a BP input with objects whose sizes are s_1/(S/2), ..., s_n/(S/2), where S is the sum of s_1 through s_n, and with number of bins equal to 2. If a partition of the SP input exists, then each side of the partition sums to S/2, so in the BP input, the corresponding objects' sizes sum to (S/2)/(S/2) = 1. Thus 2 bins suffice for the BP input. If 2 bins suffice for the BP input, then the sizes of the objects in each bin sum to at most 1. Thus the corresponding elements in the SP input sum to at most S/2. Since S is the sum of all the elements in the SP input, there must be two bins and each one must have sizes that sum to exactly 1. Thus the corresponding elements in the SP input sum to exactly S/2, and we have a partition. (b) Number of bins must be at least ceiling(S), since otherwise there wouldn't be room for all the objects. (c) first-fit leaves at most one bin less than half full: Suppose two bins are less than half full. Before allocating the second of these bins, the first of these two bins would have been filled with the object(s) in the second bin. (d) This worst-case scenario happens if each object is just over 1/2 in size. Then each one goes in a separate bin, but almost half of each bin is wasted. To be more formal, suppose in contradiction, that there is case in which the number of bins is more than ceiling(2*S). Then there must be at least two bins that are less than half full (otherwise the sum of the sizes would exceed S, but S is *defined* to be the sum of the sizes). This contradicts part (c). (e) Approximation ratio of 2 for first-fit follows from (b) and (d). (f) To implement first-fit: for each object (in arbitrary order of size) do scan through the currently allocated bins until finding one with enough space left if one isn't found then allocate a new (empty) bin put the object in the bin found (or the new one) endfor Time: in the worst case, each object goes in its own bin, so if there are n objects, then there are n bins. Iteration i of the for loop requires checking i-1 bins. So the total time is O(n^2). Can improve this by keeping a more efficiently searchable data structure of the bins, based on how much room is left in each bin, such as a binary heap. Gives O(n log n) time.