Skip to content

5-2 Searching an unsorted array

This problem examines three algorithms for searching for a value $x$ in an unsorted array $A$ consisting of $n$ elements.

Consider the following randomized strategy: pick a random index $i$ into $A$. If $A[i] = x$, then terminate; otherwise, continue the search by picking a new random index into $A$. Continue picking random indices into $A$ until you find an index $j$ such that $A[j] = x$ or until every element of $A$ has been checked. This strategy may examine a given element more than once, because it picks from the whole set of indices each time.

a. Write pseudocode for a procedure RANDOM-SEARCH to implement the strategy above. Be sure that your algorithm terminates when all indices into $A$ have been picked.

b. Suppose that there is exactly one index $i$ such that $A[i] = x$. What is the expected number of indices into $A$ that must be picked before $x$ is found and RANDOM-SEARCH terminates?

c. Generalizing your solution to part (b), suppose that there are $k\geq 1$ indices $i$ such that $A[i] = x$. What is the expected number of indices into $A$ that must be picked before $x$ is found and RANDOM-SEARCH terminates? Your answer should be a function of $n$ and $k$.

d. Suppose that there are no indices $i$ such that $A[i] = x$. What is the expected number of indices into $A$ that must be picked before all elements of $A$ have been checked and RAMDON-SEARCH terminates?

Now consider a deterministic linear search algorithm. The algorithm, which we call DETERMINISTIC-SEARCH, searches $A$ for $x$ in order, considering $A[i],A[2],A[3],...,A[n]$ until either it finds $A[i] = x$ or it reaches the end of the array. Assume that all possible permutations of the input array are equally likely.

e. Suppose that there is exactly one index $i$ such that $A[i] = x$. What is the average-case running time of DETERMINISTIC-SEARCH? What is the worst-case running time of DETERMINISTIC-SEARCH?

f. Generalizing your solution to part (e), suppose that there are $k\geq 1$ indices $i$ such that $A[i] = x$. What is the average-case running time of DETERMINISTIC-SEARCH? What is the worst-case running time of DETERMINISTIC-SEARCH? Your answer should be a function of $n$ and $k$.

g. Suppose that there are no indices $i$ such that $A[i] = x$. What is the average-case running time of DETERMINISTIC-SEARCH? What is the worst-case running time of DETERMINISTIC-SEARCH?

Finally, consider a randomized algorithm SCRAMBLE-SEARCH that first randomly permutes the input array and then runs the deterministic linear search given above on the resulting permuted array.

h. Letting $k$ be the number of indices $i$ such that $A[i] = x$, give the worst-case and expected running times of SCRAMBLE-SEARCH for the cases in which $k = 0$ and $k = 1$. Generalize your solution to handle the case in which $k \geq 1$.

i. Which of the three searching algorithms would you use? Explain your answer.

a

RANDOM-SEARCH(x, A, n)
    S = 
    while |S| != n
        i = RAMDOM(1,n)
        if A[i] == x
            return i
        else
            S = S  {i}
    return NIL

b

Let $D$ be the number of indices into $𝐴$ that must be picked before $x$ is found and RANDOM-SEARCH terminates.

$$ \begin{aligned} Pr(D = i) & = (\frac{n-1}{n})^{i-1}\cdot\frac{1}{n}\cr E[D] & = \sum_{i=1}^{n} i Pr(D=i)\cr & = \sum_{i=1}^{n} i (\frac{n-1}{n})^{i-1}\cdot\frac{1}{n}\cr & = n \quad \text{(Geometric distribution)}\cr \end{aligned} $$

c

$$ \begin{aligned} p & = \frac{k}{n}\cr Pr(D = i) & = p(1-p)^i\cr E[D] & = \sum_{i=1}^{n} i Pr(D=i)\cr & = \sum_{i=1}^{n} ip(1-p)^{i-1}\cr & = \frac{1}{p}\cr & = \frac{n}{k}\cr \end{aligned} $$

d

Let $X_i$ be the number of indices into $A$ that need to pick the new $i$th indice. precisely speaking, Let $j$ be the the number of indices into $A$ that pick the new $(i-1)$ indice, $k$ be the the number of indices into $A$ that pick the new $i$ indice, then $X_i = k-j$

$$ \begin{aligned} E[X_i] & = \frac{n}{n-i+1}\cr E\left[\sum_{i=1}^{n}X_i\right] & = \sum_{i=1}^{n}E[X_i]\cr & = \sum_{i=1}^{n}\frac{n}{n-i+1}\cr & = \sum_{i=1}^{n}\frac{n}{i}\cr & = n(\ln n + O(1))\cr \end{aligned} $$

e

worse-case: n; average-case: (n+1)/2(obviouly).

f

worse-case: n-k+1;

average-case:

$$ \begin{aligned} Pr(X=i) & = \frac{P(n-k,i-1)P(k,1)P(n-i,n-i)}{P(n,n)}\cr Pr(X \geq i) & = \prod_{j=1}^{i-1}\frac{n-k-j+1}{n-j+1}\cr & = \frac{P(n-k,i-1)}{P(n,i-1)}\cr & = \frac{P(n-i+1,k)}{P(n,k)}\cr & = \frac{n-i+1\choose k}{n\choose k}\cr E[X] & = \sum_{i=1}^{i=n-k+1} iPr(X=i)\cr & = \sum_{i=1}^{i=n-k+1} Pr(X\geq i)\cr & = \sum_{i=1}^{i=n-k+1} \frac{n-i+1\choose k}{n\choose k}\cr & = \frac{\sum_{i=k}^{n}{i\choose k}}{n\choose k}\cr & = \frac{n+1\choose k+1}{n\choose k}\cr & = \frac{n+1}{k+1}\cr \end{aligned} $$

g

Both are $n$

h

The same as DETERMINISTIC-SEARCH, only we replace "average-case" with "expected".

i

DETERMINISTIC-SEARCH, since it gives best worse-case and average-case running time. DCRAMBLE-SEARCH gives the same running time but with extra cost of linear time to permute the input array. In the same time, DETERMINISTIC-SEARCH has scanned the full array and reported a result.