Tags: java, coding-efficiency, cost-management

What is the cost function for this algorithm?


My question is in the title. After hours of thinking and looking up sites on Google, I came to the conclusion that I'm not quite sure how to solve this problem, or whether my attempt (below the code) is correct. Maybe you guys could give me some advice on how to solve such things faster or in an easier way. Any help is appreciated!


public float[] normalize(float[] seq) {
  int n = seq.length;
  float sum = 0;  // running sum over prefixes of seq
  int cnt = 0;    // total number of elements added into sum
  int i;
  // i doubles each pass (1, 2, 4, ...), so the outer loop runs about log2(n) times.
  for (i = 1; i < n; i = i + i) {
    // The inner loop adds up the first i elements of seq.
    for (int j = 0; j < i; j++) {
      sum = sum + seq[j];
    }
    cnt += i;
  }
  float[] res = new float[n];
  // After the for loop, i is the first power of two that is >= n; walk it back down to 0.
  while (i >= 0) {
    if (i < n) {
      // Divide each element by the average value sum / cnt.
      res[i] = seq[i] / (sum / cnt);
    }
    i--;
  }
  return res;
}

My attempt: the cost function of the algorithm is

T(n) = O(n log n).

The outer loop is executed approximately log2(n) times, because i is doubled in each iteration. The inner loop is executed i times in each pass of the outer loop, and i always stays below n, so each pass costs at most n inner iterations. Together, this results in a runtime of O(n log n).
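
One way to sanity-check that reasoning is to count the iterations directly. Here is a minimal sketch I put together for that (the class name, counters, and test sizes are mine, not part of the original code):

public class LoopCounts {
  public static void main(String[] args) {
    for (int n : new int[] {16, 1_000, 1_000_000}) {
      int outer = 0;
      int longestInner = 0;
      // Same doubling loop header as in normalize.
      for (int i = 1; i < n; i = i + i) {
        outer++;           // one more pass of the outer loop
        longestInner = i;  // the inner loop would run i (< n) times on this pass
      }
      System.out.printf("n=%d: %d outer passes, longest inner pass=%d%n",
          n, outer, longestInner);
    }
  }
}

For n = 1,000,000 this prints 20 outer passes (log2 of a million is about 20), each of which runs the inner loop fewer than n times - the "log(n) passes of at most n" shape behind the O(n log n) bound.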


Solution

  • Yes, it's O(n log n).

    1. for (i = 1; i < n; i = i + i) does indeed have a log(n) performance characteristic, because i doubles on every pass. Not 'approximately' log(n), exactly log(n) - approximation is baked into the notion of performance characteristics anyway.
    2. Then, for each of those passes, you loop j from 0 up to i, and i always stays below n - thus, each pass is at most O(n). Combine the two and you have O(n log n).
    3. The rest of the method is O(n), and constant factors and lower-order terms 'do not count', so O(n log n + n) is still just simplified to O(n log n). (A quick sketch checking that O(n) tail follows this list.)
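
    The tail check from point 3 can be done with another counting sketch (again mine, not part of the question's code): after the for loop exits, i is the first power of two that is >= n, so the while loop runs i + 1 times, which is at most about 2n.

    public class TailCount {
      public static void main(String[] args) {
        int n = 1_000;
        int i;
        for (i = 1; i < n; i = i + i) {
          // same doubling loop as in normalize; the loop body does not affect i
        }
        long whileSteps = 0;
        while (i >= 0) {
          whileSteps++;  // one step per decrement, like the while loop in normalize
          i--;
        }
        System.out.println(whileSteps);  // prints 1025: the for loop exits with i = 1024
      }
    }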

    Such analysis is generally the only way to determine these things. In theory you can actually chart it out; after all, 'this function is O(n log n)' means:

    If you make a 2D line chart, with 'time taken' on the Y axis and 'the value of n' on the X axis, you can actually fill in values: generate some input with n=1, run it a few times, and average the runtime. Let's say it's 5 msec; put a dot at x=1, y=5. Now run it at n=2. Let's say it's 7 msec; put a dot at x=2, y=7. Keep going, then draw a line that roughly fits the dots. VOILA!
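
    Filling in those dots could look something like this (a minimal, hypothetical harness; the sizes, warm-up count, and averaging scheme are arbitrary choices of mine, and a serious benchmark would use a proper tool such as JMH):

    import java.util.Random;

    public class TimingChart {
      public static void main(String[] args) {
        Random rnd = new Random(42);
        for (int n = 1 << 12; n <= 1 << 22; n <<= 2) {
          float[] seq = new float[n];
          for (int k = 0; k < n; k++) {
            seq[k] = rnd.nextFloat();
          }
          for (int w = 0; w < 5; w++) {
            normalize(seq);  // warm-up runs so the JIT has compiled normalize
          }
          int runs = 10;
          long start = System.nanoTime();
          for (int r = 0; r < runs; r++) {
            normalize(seq);
          }
          long avgMicros = (System.nanoTime() - start) / runs / 1_000;
          System.out.printf("n=%d  avg=%d us%n", n, avgMicros);  // one dot: x = n, y = avg
        }
      }

      // The method from the question, unchanged.
      static float[] normalize(float[] seq) {
        int n = seq.length;
        float sum = 0;
        int cnt = 0;
        int i;
        for (i = 1; i < n; i = i + i) {
          for (int j = 0; j < i; j++) {
            sum = sum + seq[j];
          }
          cnt += i;
        }
        float[] res = new float[n];
        while (i >= 0) {
          if (i < n) {
            res[i] = seq[i] / (sum / cnt);
          }
          i--;
        }
        return res;
      }
    }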

    O(n log n) means: if only you look far enough to the right, with ZERO indication of how far you have to look, the curve will look exactly like the far-to-the-right part of the curve described by the math function y = x * log(x), regardless of which base that log is (they all look the same). It's just about what the curve looks like; hence, constant factors are irrelevant, and hence, anything that doesn't 'control' the shape can be omitted. For example, y = x^2 + x and y = x^2 look exactly the same if you look far enough to the right of either curve; this is why O(x^2 + x) is just gobbledygook. The only point of saying it is 'this is what the curve eventually looks like', and that '+ x' has no bearing on the looks; hence, it makes no sense to include it, and hence, proper big-O notation never would.
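
    That '+ x has no bearing' claim is easy to see numerically (a throwaway snippet of mine, not from the answer): the ratio between x^2 + x and x^2 heads to 1 as x grows, so far enough to the right the two curves are indistinguishable.

    public class DominantTerm {
      public static void main(String[] args) {
        for (long x = 10; x <= 10_000_000; x *= 10) {
          double ratio = (double) (x * x + x) / (double) (x * x);
          System.out.printf("x=%,d  (x^2+x)/x^2 = %.7f%n", x, ratio);
        }
      }
    }

    By x = 10,000,000 the ratio is already 1.0000001, i.e. the + x term is invisible.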

    At any rate, this gives you an alternate approach: Actually chart it out and compare what it looks like to a few standard curves. But I doubt this is less work than just reasoning through it.