Tags: algorithm, random, quicksort, asymptotic-complexity

Expected running time vs. worst-case running time


I am studying the randomized-quicksort algorithm. I noticed that the running time of this algorithm is always stated as an "expected running time".

What is the reason for specifying or using the "expected running time"? Why don't we calculate the worst-case or average-case running time?


Solution

  • When we say "expected running time", we are talking about the running time for the average case. We might be talking about an asymptotic upper or lower bound (or both). Similarly, we can talk about asymptotic upper and lower bounds on the running time of the best or worst cases. In other words, the bound is orthogonal to the case.

    In the case of randomized quicksort, people talk about the expected running time, O(n log n), since this makes the algorithm seem better than worst-case Θ(n²) algorithms (which it is, though not asymptotically in the worst case). In other words, randomized quicksort is asymptotically much faster than e.g. Bubblesort for almost all inputs, and people want a way to make that fact clear; so people emphasize the average-case running time of randomized quicksort rather than the fact that it is asymptotically just as bad as Bubblesort in the worst case.
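    To make the discussion concrete, here is a minimal sketch of randomized quicksort in Python (my own illustrative code, not from the question). The uniformly random pivot choice is exactly what makes the O(n log n) bound hold *in expectation* over the algorithm's coin flips, even though adversarially unlucky pivots could still force Θ(n²) work:

    ```python
    import random

    def randomized_quicksort(a):
        """Sort a list by partitioning around a uniformly random pivot.

        Expected running time is O(n log n) over the pivot choices;
        the worst case is still Theta(n^2), but no fixed input can
        force it -- only an unlucky sequence of random pivots can.
        """
        if len(a) <= 1:
            return list(a)
        pivot = random.choice(a)  # the random choice that makes the bound "expected"
        less    = [x for x in a if x < pivot]
        equal   = [x for x in a if x == pivot]
        greater = [x for x in a if x > pivot]
        return randomized_quicksort(less) + equal + randomized_quicksort(greater)

    print(randomized_quicksort([3, 1, 4, 1, 5, 9, 2, 6]))  # → [1, 1, 2, 3, 4, 5, 6, 9]
    ```

    (This out-of-place version is chosen for clarity; textbook in-place partitioning has the same asymptotics.)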

    As pointed out in the comments and in Greg's excellent answer, it is also common to speak of the expected runtime with respect to the sequence of random choices made during the algorithm's execution on a fixed, arbitrary input. This may be more natural, since we think of the inputs as being passively acted upon by an active algorithm. In fact, it is equivalent to averaging over random inputs with an algorithm whose execution does not depend on structural differences among them. Both of these formulations are easier to visualize than the true average over the set of pairs of inputs and random-choice sequences, but you get the same answers no matter which way you approach it.
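    The equivalence of the two formulations can be checked empirically. The sketch below (illustrative code under my own assumptions, counting roughly n − 1 comparisons per partition) measures quicksort's comparison count three ways: a random pivot averaged over coin flips on one fixed, adversarial (already sorted) input; a deterministic first-element pivot averaged over random input permutations; and the first-element pivot on the sorted input, which hits the Θ(n²) worst case:

    ```python
    import random

    def quicksort_comparisons(a, pick_pivot):
        """Count (approximate) element comparisons made by quicksort
        using the supplied pivot-selection rule."""
        if len(a) <= 1:
            return 0
        pivot = pick_pivot(a)
        less = [x for x in a if x < pivot]
        greater = [x for x in a if x > pivot]
        return (len(a) - 1
                + quicksort_comparisons(less, pick_pivot)
                + quicksort_comparisons(greater, pick_pivot))

    n = 200
    sorted_input = list(range(n))
    runs = 50

    # (1) Expectation over the algorithm's random choices, input held fixed.
    random_pivot_avg = sum(
        quicksort_comparisons(sorted_input, random.choice) for _ in range(runs)
    ) / runs

    # (2) Expectation over random inputs, deterministic first-element pivot.
    def first(a):
        return a[0]

    shuffled_avg = 0.0
    for _ in range(runs):
        perm = sorted_input[:]
        random.shuffle(perm)
        shuffled_avg += quicksort_comparisons(perm, first)
    shuffled_avg /= runs

    # (3) The deterministic worst case: first pivot on sorted input.
    worst = quicksort_comparisons(sorted_input, first)

    print(f"random pivot, fixed sorted input : {random_pivot_avg:.0f} comparisons")
    print(f"first pivot,  random inputs      : {shuffled_avg:.0f} comparisons")
    print(f"first pivot,  sorted input       : {worst} comparisons (= n(n-1)/2)")
    ```

    Runs (1) and (2) land near the same ~n log n figure, while run (3) shows the quadratic behavior that randomization makes vanishingly unlikely on any fixed input.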