2110712  Problems


1  cardinality of very large sets

X is a finite set with size n.  n is very large that it is impossible to count each element in X.  To find the cardinality of X we use uniform random sampling of elements in X with the following algorithm:

card x
  k = 0
  S = empty
  a = uniform x
  repeat
    k = k + 1
    S = S union {a}
    a = uniform x
  until a is-in S
return k^2/2

Prove that card x is an unbiased estimator of n.  The expected running time of card x  is big-theta( sqrt(n)) if uniform x runs at constant time.
Hint:  n = sqrt m   where m is the number of elements sampled from X before there is a repeat.
It will help if you solve the following two related questions first:

1.1  if the size of a hash table N = m^2.  where m is the number of element in the table.  Show that there is significant collision when assume hash by uniform random.

1.2  There are 25 persons in a room.  What is the chance that two persons have the same birthday?