Divide and Conquer

by Somchai Prasitjatrakul    transcribed  by Prabhas Chongstitvatana
27 June 2002
DC
merge sort
quick sort
celebrity problem
Divide and Conquer (DC)  steps consist of
1 divide problem into subproblems
2 solve subproblems (recursively)
3 combine solutions from subproblems
return solution

The time spent on divide, combine usually reciprocal to each other.  If divide is simple, combine is difficult and vice versa.  Total time in solving can be written in recurrent form:
            t(n) = at(n/b) + f(n)

when problem is divide into a subproblems of size b. f(n) contains dividing and combining time.  It also depends on type of data structure.  For example, dividing an array  takes a constant time but dividing a linked list take big-theta(n).

assume students know about sorting. (from data structure)

Examples:

merge sort

1 dividing data into two groups of equal size.  constant time
2 merge-sort both groups
3 combine two groups into one using merge.  big-theta(n)

(using drawing
input 13246875
divide  1324  6875     big-theta(1)
solve    ms    ms      2 t(n/2)
        1234  5678
combine  12345678      big-theta(n)    )

hence t(n) = 2t(n/2) + big-theta(1)

Solving this recurrence (master method) t(n) = big-theta(n lg n)

quicksort

1 dividing data into two groups using partition (size is unknown)
2 quicksort both groups
3 combine two groups.  because it is in-place it takes constant time.

partition is the worst case where it divides into one and the rest.

quicksort takes time:
t(n) = t(1) + t(n-1) + big-theta(1)

t(n) = big-theta(n^2)

therefore worst case running time is big-theta(n^2)

To avoid worst case, use random pivot (choosing pivoting element randomly).  Perform average case analysis.  assume the size of the lefthand group is k. (the righthand group is n-k)

t(n) = 1/(n-1) sum k=1 to n  t(n-k) + big-theta(1)
 . . .
t(n) = big-theta(n lg n)

comparing merge sort and quicksort  you can see the trade off between dividing time and combining time.

More complicate case of divide and conquer

celebrity problem

Def. in a group of person, a person whom everybody knows him/her and he/she doesn't know any person is the celebrity.

The problem can be represented as directed graph with a -> b means 'a knows b' or can be represented as incidence matrix k(i,j)  row and column represent person

where k(i,j) = 1   when i->j
             = 0   otherwise

Example set of person {a,b,c,d,e,f}

    a b c d e f
a   1 0 1 0 0 0
b   0 1 1 1 0 0
c   0 0 1 0 0 0
d   0 0 1 1 1 0
e   1 0 1 0 1 0
f   0 0 1 0 0 1
c knows no one and is known by every one, therefore c is the celebrity.

Find an algorithm to find celebrity

simplest one
1 pick one person, check whether he knows anybody, (scan by row), then
check if everybody knows him/her (scan by row).
2 do this until check everybody.

t(n) = n x big-theta(n) = big-theta(n^2)

(unfinished last update 27 June 2002)