calgorithmtime-complexityarray-algorithms

I wrote a function in C language that as I would think has O(n^3) time complexity, but as n grows, for some reason it behaves like O(n^2)


I have written a function that makes an array unique, but I don't quite understand how the time complexity works here.

Here's my whole program:

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <math.h>
#include <time.h>
#define SIZE 64000 

void fill(size_t const, int*);
void make_unique(size_t*, int**);
bool pop_back(size_t*, int**);
bool pop_front(size_t*, int**);
bool remove_at(size_t, size_t*, int**);


int main(void) {
    srand(time(NULL));
    size_t size=SIZE;
    int* arr=malloc(sizeof(int[size]));
    fill(size, arr);
    //print(size, arr);

    double time_spent = 0.0;
    clock_t begin = clock();
    make_unique(&size, &arr);
    clock_t end = clock();
    time_spent += (double)(end - begin) / CLOCKS_PER_SEC;

    printf("\nThe elapsed time is %f seconds", time_spent);
    printf("\nArray with no equal elements: \n");
    print(size, arr);
    free(arr);
    return EXIT_SUCCESS;
}


void make_unique(size_t* size, int** arr){
    for(size_t i=0; i<*size; ++i)
        for(size_t j=0; j<*size;){
            if(i==j) {
                ++j;
                continue;
            }
            if((*arr)[i]==(*arr)[j]){
                remove_at(j, size, arr);
                continue;
            }
            ++j;
        }
}

bool pop_back(size_t* size, int** arr){
    if(!*arr||!*size) return false;
    int* new_arr=malloc(sizeof(int[*size-1]));
    --*size;
    for(size_t i=0; i<*size; ++i)
        new_arr[i]=(*arr)[i];
    free(*arr);
    *arr=new_arr;
    return true;
}

bool pop_front(size_t* size, int** arr){
    if(!*arr||!*size) return false;
    int* new_arr=malloc(sizeof(int[*size-1]));
    --*size;
    for(size_t i=0, j=1; i<*size; ++i, ++j)
        new_arr[i]=(*arr)[j];
    free(*arr);
    *arr=new_arr;
    return true;
}

bool remove_at(size_t pos, size_t* size, int** arr){
    if(pos==0) return pop_front(size, arr);
    if(pos==*size-1) return pop_back(size, arr);
    if(!*arr||!*size||pos>=*size) return false;

    int* new_arr=malloc(sizeof(int[*size-1]));
    for(size_t i=0; i<pos; ++i)
        new_arr[i]=(*arr)[i];
    for(size_t i=pos, j=pos+1; j<*size; ++i, ++j)
        new_arr[i]=(*arr)[j];
    --*size;
    free(*arr);
    *arr=new_arr;
    return true;
}

void fill(size_t const size, int arr*){
    for(size_t i=0; i<size; ++i){
        arr[i]=rand()%100;
    }
}

The algorithm itself has quadratic time, but it also uses function remove_at. Is the overall complexity cubic?

The results I got were:

(SIZE    - seconds)
1024 000 - 1076.92
 512 000 -  262.92
 256 000 -   65.90
 128 000 -   16.3
  64 000 -    4.11

It looks like my program has O(n^2) time complexity. Does it?

I compiled my code with clang compiler and ran it in terminal.

By the way, if I have some serious inaccuracies in my code, I would like to read some comments. Or an advice on how to improve the effectiveness of it.

UPD:

My last understanding of general case complexity of my program is: We have (n-d) iterations in outer loop, n/(n-d) iterations in mid loop, and (n+(n-1)+...+(n-d))/d ~ (n-d/2) iterations in inner loop. So the complexity is O((n-d)(n/(n-d))(n+(n-1)+...+(n-d))/d ) = O(n*(n-d/2)) = O(n^2) (d is number of duplicates). It finally seems right to me (if it's not let me know). Thank you all for helping me.


Solution

  • Your code has O(n^2) worst case time complexity, even though it looks like it's O(n^3).

    For every element that's a duplicate of an earlier element, you incur O(n) work to remove it. There's at most n-1 of these elements, so the total cost of this duplicate removal is O(n^2).

    So you spend O(n^2) time finding the duplicates, and O(n^2) time removing the duplicates. That the "remove duplicate" code appears inside the nested loops is irrelevant because it's not called for most loop iterations.