Computational ComplexityWe study how hard or how easy to solve a computation problem. There are class of problems which existed algorithms that are efficient to solve them. Some class of problems are known to be so difficult that they are impossible to solve efficiently. The difficulty of some class of problems is unknown, we do not know any algorithm to solve them efficiently and yet we don't know how to prove it.
Polynomial function
Defining complexity class
Complexity class P
Complexity class NP
NP-completeness
Polynomial reduction
Definition
How to prove that a problem is NP-complete
Example of NP-complete proof
NP-hard
Picture of complexity space
efficiency of algorithm easy/hard table 13-2 Somchaip p 323
The design and analysis of algorithms that we have seen previously focuses on algorithms. We study classes of algorithms that solve problems. There are algorithms which we can analyse. To study the difficulty of problems we can not study the algorithms to solve them. Instead, the focus is on the problems.
Remember that we do not study any concrete algorithm hence our analysis will be only "proving" properties about problems.
There are several approach to find complexity of a problem. I will show three approches, two of which will be just briefly explained and we will concentrate on the last approach:
Using polynomial, all problems the we studied previously are classified as "easy". Examples of hard problems are: Traveling saleman problem (TSP), graph colouring, 0/1 knapsack, Hamilton graph, longest simple path in a graph, satisfiability problem. This group of problems has many important practical applications and yet we do not have any efficient algorithm to solve them. However, by the study of complexity they are known to belong to the same class. If we can find an algorithm to solve one of these problems efficiently then there exist efficient algorithms to solve all of them. This is very significant!
The examples of problems that we can prove to be too difficult to solve by computers are: diophantine equation, halting problem. They are proven to be non-computable.
An algorithm A "accepts" a string x is-in {0, 1}* when given the input x the algorithm outputs A(x) = 1. The language accepted by A is the set L = { x is-in {0, 1}* : A(x) =1 }. A "rejects" x when A(x) = 0.
L accepted by A, however A might not reject x not-in L. For example, the program stuck in a loop. L is said to be "decided" by A if all input is either accepted of rejected by A.
We can informally define a complexity class as a set of languages that an algorithm can determine whether a given string x belongs to language L.
Thm 36.2 P = {L: L is accepted by a polynomial-time algorithm}
(read the proof in CLR p.923)
Please note that this definition does not allow probabilistic algorithm s to be in P.
A proof system for a decision problem X is a set F of pair <x,q>. For any x is-in X there must exist at least one q is-in Q such that <x,q> is-in F; on the other hand no such q exist when x is-not-in X. To convince someone that x is-in X, it suffices to show some q is-in Q such taht <x,q> is-in F. Any q such that <x,q> is-in F is called a "certificate" that x is-in X.
NP is the class of decision problem that haves an "efficient" proof system. In other words, NP is the class of problems that can be verified in polynomial-time.
Def. NP is the class of decision problem X that: q is a certificate, the size of q is at most polynomial in the size of x (an instant of X) and there exists a polynomial-time algorithm that can verify whether or not <x,q> is-in F where F is a proof system for X.
NOTE: NP stands for Nondeterministic Polynomial-time. NP was originally defined as the class of problems that can be solved by a nondeterministic algorithm in polynomial time. Although the definition is diffirent but it is equivalent to the one we presented here.
Whether this class of problem is easy or difficult to solve is still unknown (no proof either way).
The first theorem establishes a relation between P and NP.
Theorem: P is-subset of NP
Consider an arbitrary decision problem X is-in P. The solution can be found. We can make a trivial certificate Q = {0} for any yes-instance and no certificate for no-instance. This certificate can be verified in polynomial time because the solution has been found in polynomial time (X is-in P). QED
The central open question is whether or not the set inclusion is proper.
P = NP ?
What have been accomplished is to establish a class of problems that have similar difficulty called "NP-complete". The implication of NP-complete is this: if there is an efficienty algorithm to solve any one of problems in NP-complete then it would automatically provide efficient algorithms for all of them.
To state the NP-complete class we need the notion of "polynomial reduction".
Let Ax be the algorithm to solve X, Ay be the algorithm to solve Y.
The concept of reduction can be illustrated as the diagram below:
Ax
x --> [ --> f(x) ---> [Ay] --> ] -->
If there exists polynomial time algorithm Ay to solve Y then f(x) plus Ay can solve X in polynomial time. (because f(x) can be computed in polynomial time).
for X <=p Y can be read as "X is not harder to solve than Y".
Example, we prove the polynomial equivalence of two Hamilton cycle problems. Let HAM be the problem of finding a Hamilton cycle in a graph if one exists, we allow it to return an arbritary answer when presented with non-Hamilton graph. Let HAMD be decision problem, deciding whether a given graph is Hamiltonian.
Thm: HAM =p HAMD
Proof: first we prove the direction HAMD <=p HAM.
function HamD(G)
d = Ham(G)
if d defines a Hamilton cycle in G
then return TRUE
else return FALSE
This algorithm solves HAMD correctly provided that Ham solves HAM correctly. It is clear that HAMD takes polynomial time if Ham is counted at unit cost.
now, the other direction HAM <=p HAMD.
Consider each edge in turn. Remove the edge and ask if the graph still be Hamiltonian. The edge is kept only when removing it make the graph non-Hamiltonian. When all is done the remaining graph will contain a Hamilton cycle.
function Ham(G)
if HamD(G) = FALSE then return "no solution"
for each e is-in E
if HamD( <V,E-{e}> ) then V = V -{e}
return unique cycle in remaining G
Ham takes polynomial time if we count each call on HamD at unit cost.
Example of reduction: Let TSPD be decision version of TSP, finding a tour in the graph, with costs on the edge, that visit every nodes once and is a cycle, whose cost is minimum. For the decision version: decide whether or not there exists a valid tour with cost lower than L.
Prove that HAMD <=p TSPD
Proof Let G be a graph with n nodes. f(G) be an instance of TSPD with the cost function:
c(u,v) = 1 if {u,v} is-in V
= 2 otherwise
bound L = n. Any Hamilton cycle in G translates into a tour in H that cost exactly n. If there are no Hamilton cycle in G, any tour in H must use at least one edge of cost 2, thus the total cost must be at least n+1. G is a yes-instance of HAMD iff f(G) = <H,c,L> is a yes-instance of TSPD.
Example: Let SAT-3-CNT be satisfiable problem of Boolean formula
in conjunctive normal form that has at most 3 literals per clause.
Let 3COL be a problem of deciding if a given graph can be 3 colours.
SAT-3-CNT <=p 3COL
Build a graph that correspond to Boolean formula.
Consider first, the literals assignment.
<graph representation of Boolean variable truth assignment>
now consider the clause.
<widget that represents a clause>
Any valid three-colouring of the graph provides a truth assignment for the Boolean formula and vice versa. Thus, the graph can be painted with 3 colours iff the formula is satisfiable. The construction graph can be done efficiently containing 3 + 2t + 6k nodes and 3 + 3t + 12k edges where t is the number of literals and k be the number of clauses in the Boolean formula.
============================================
Def: A decision problem X is NP-complete
if
1) X is-in NP
2) Y <=p X for every problem Y is-in NP
============================================
This definition is one of the most important discovery in computer science. NP-complete can be said to be "the most difficult problem in NP".
Theorem: Let X be an NP-complete problem. A decision problem Y is-in NP if X <=p Y then Y is also NP-complete.
Proof: By definition of NP-complete, the first condition Y is-in
NP. The second condition, consider an arbitrary Z is-in NP.
X is NP-complete and Z is-in NP, therefore Z <=p X. By assumption
X <=p Y, hence by transitive rule of reduction
Z <=p Y which is the necessary condition for Y to be
NP-complete.
Therefore to prove a problem Y to be NP-complete there are two steps:
1) prove Y is-in NP.
2) we use the known NP-complete X and reduce it to the new problem.
X <=p Y.
The first proof of NP-complete came from Cook (1971) with the SAT-CNF (satisfiability of Boolean formula in conjunctive normal form). Its proof is too difficult to be presented here.
Thm: SAT-3-CNF is NP-complete
Proof:
1) SAT-3-CNF is NP (assignment is verifiable)
2) Now we have two NP-complete problems to be used: SAT-CNF and
SAT. We can do either SAT-CNF <=p SAT-3-CNF or
SAT <=p SAT-3-CNF. We choose
SAT-CNF <=p SAT-3-CNF
We must find f such that if B is a formula in SAT-CNF, E = f(B) is
a formula in SAT-3-CNF such that E is satisfiable iff B is satisfiable.
Consider one clause with k literals.
case 1: k <=3 , E = B.
case 2: k = 4 , let B = l1 + l2 + l3 + l4, add a u
E = (l1 + l2 + u)(/u + l3 + l4) where / is negate.
if B is satisfiable, we can assign u such that E is satisfiable.
case 3: k > 4
E = (l1 + l2 + u1)(/u1 + l3 + u2) . . . (/u_k-1, l_k-1, l_k)
all clauses can be transformed the same way.
the additional size of u will be in polynomial of size of original
literals. The transformation can be done in polynomial time.
example: B = (p + /q + r + s)(/r + s)(/p + s + /x + v + /w)
E = (p + /q + u1)(/u1 + r + s)(/r + s)(/p + s + u2)
(/u2 + /x + u3)(/u3 + v + /w)
More reductions
The classic six of NP-complete from Garey and Johnson (1979) is shown below:
SAT <=p 3SAT
3SAT <=p 3DM, 3SAT <=p VC
3DM <= PARTITION
VC <=p HC, VC <=p CLIQUE
where
3SAT is a restricted form of SAT where each clause contains exactly 3 literals. (in CNF)
3DM is a 3 dimensional matching. A set M is-subset W x X x Y wehre W, X, Y are disjoint and have the same number of q elements. Does M contain a matching M' is-subset of M, size of M' is q and no two elements of M' agree in any coordinate. It is generalise of the stable marriage problem (which is 2D, stable marriage has polynomial algorithm).
VC is vertex cover. Is there a vertex cover of size K <= |V| of graph G, a subset V' of V such that |V'| <= K and for each edge {u,v} is-in E, at least one of u and v belong to V'?
CLIQUE Does a graph G contain a clique of size J <= |V| or more, that is, a subset V' of V such that |V'| >= J and every two vertices in V' are joined by an edge in E? ( a clique is a complete graph)
HC Hamilton circuit. Does G contain a Hamilton circuit, that is, an ordering <v1, v2, ... vn> of V in G, where n = |V|, such that {vn,v1} is-in E and {vi, vi+1} is-in E for all i, 1<=i<=n?
PARTITION. A finite set A, and a size s(a) is-in Z+ for each a
is-in A. Is there a subset A' of A such that
(sum a is-in A') s(a) = ( sum a is-in A-A') s(a) ?
Currently there are thousand of problems proven to be NP-complete and the size of this set keep on enlarging.
It is obvious that 3COL <=p COLC (a graph is 3-colourable when it chromatic number is either 0,1,2,3). However COLC is-not-in NP. Knowing that a problem is NP-hard tell us a lot that it is likely to be very difficult. Knowing that the problem is-in NP is not as useful.
Beside NP-hard, there are decision problems that are not is NP. For example, the "exact" problem, can a graph be colour exactly with k colours? We can imagine how to verify that a graph can be k-colourable, however it is unlikely to find a certificate that a graph "can not" be coloured with less/more than k colours. Another example is the unique satisfiability problem, is it true that there is only one truth assignment that make a Boolean formula true?
End NP-complete Last update 21 August 2002