I need to find the minimum length L such that the prefix subarrays A[0:L] and B[0:L] contain M disjoint pairs where an element of A is bigger than an element of B. That is, A[i] > B[j] counts as one pair, but I cannot use A[i] or B[j] again in another pair. Also, both subarrays must start from the beginning of their arrays.
The homework is about AVL trees and heaps, so I guess a solution would involve one of them. (For the example below I used priority_queue from the STL, but I am actually not allowed to use any container from any library; I already have my own min-heap implementation and only used the STL equivalent here for readability.) I am expected to solve the question using an AVL tree or a heap.
P.S. I have found out that the arrays can contain duplicate elements.
The sizes of A and B are the same: N.
I need to find a solution that has time complexity O(N * logN * logM)
#include <iostream>
#include <queue>
#include <span>

int minLenOfSubarrays(std::span<int> aArr, std::span<int> bArr, int M)
{
    const int N = aArr.size();
    int count = 0;
    int L = 0;
    int left = M;
    int right = N;
    while (left <= right) { // O(log N) rounds
        //Heap a; //min-heap (own implementation in the actual homework)
        //Heap b; //min-heap
        int mid = left + (right - left) / 2;
        std::priority_queue<int, std::vector<int>, std::greater<int>> a;
        std::priority_queue<int, std::vector<int>, std::greater<int>> b;
        count = 0;
        for (int j = 0; j < mid; j++) { // O(N log N)
            a.push(aArr[j]);
            b.push(bArr[j]);
        }
        while (!a.empty()) { // O(N log N)
            if (a.top() > b.top()) {
                count++;
                a.pop();
                b.pop();
            }
            else {
                a.pop();
            }
        }
        if (count >= M) {
            L = mid;
            right = mid - 1;
        }
        else {
            left = mid + 1;
        }
    }
    return L;
}
int main(int argc, char* argv[]){
    int aArr[] = {2, 4, 10, 6, 1, 11};
    int bArr[] = {3, 5, 8, 9, 7, 12};
    int M = 3;
    std::cout << minLenOfSubarrays(aArr, bArr, M) << '\n';
}
I have tried the solution above using min-heaps and binary search, but the best complexity I could reach is O(N * logN * logN).
In your binary search, for each mid do this:

1. Get the M largest values of A[0:mid]: push each of the mid values into a min-heap and pop the minimum whenever the heap size exceeds M. That is O(N * logM).
2. Get the M smallest values of B[0:mid] the same way, with a max-heap capped at size M.
3. Read out both collections in ascending order and check that each A-value is bigger than the B-value at the same position, which is O(M * logM).

The binary search contributes the remaining factor logN, giving O(N * logN * logM) overall.
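For illustration, here is one way that per-mid check could look in C++. This is only a sketch: std::priority_queue stands in for your own heap, and enoughPairs, largeA and smallB are made-up names. It assumes mid >= M, which your binary search already guarantees (left starts at M).

#include <algorithm>
#include <queue>
#include <span>
#include <vector>

// Sketch: does the prefix pair A[0:mid], B[0:mid] admit M pairs?
bool enoughPairs(std::span<const int> aArr, std::span<const int> bArr, int mid, int M)
{
    // Min-heap holding the M largest A-values of the prefix: O(mid * logM).
    std::priority_queue<int, std::vector<int>, std::greater<int>> largeA;
    // Max-heap holding the M smallest B-values of the prefix: O(mid * logM).
    std::priority_queue<int> smallB;
    for (int j = 0; j < mid; j++) {
        largeA.push(aArr[j]);
        if ((int)largeA.size() > M) largeA.pop(); // drop the smallest so far
        smallB.push(bArr[j]);
        if ((int)smallB.size() > M) smallB.pop(); // drop the largest so far
    }
    // Compare the two size-M collections in ascending order: O(M * logM).
    std::vector<int> a, b;
    while (!largeA.empty()) { a.push_back(largeA.top()); largeA.pop(); } // ascending
    while (!smallB.empty()) { b.push_back(smallB.top()); smallB.pop(); } // descending
    std::reverse(b.begin(), b.end());                                    // now ascending
    for (int i = 0; i < M; i++)
        if (a[i] <= b[i]) return false; // this pair cannot be matched
    return true;
}

Inside your binary search you would call this instead of pushing all mid elements and then draining the full heaps.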
Bonus: You can make it O(N * logN) by sorting all A-values and all B-values once before the binary search; for each mid you then get the A-values and B-values of steps 1 and 2 by filtering those already sorted values instead. In "executable pseudocode":
from bisect import bisect

A = 2, 4, 10, 6, 1, 11
B = 3, 5, 8, 9, 7, 12
M = 3

# Sort indices by their values
N = len(A)
Ai = sorted(range(N), key=A.__getitem__)
Bi = sorted(range(N), key=B.__getitem__)

# Check whether A[:L] and B[:L] have the desired property
def enough(L):
    largeA = [A[i] for i in Ai if i < L][-M:]
    smallB = [B[i] for i in Bi if i < L][:M]
    return all(a > b for a, b in zip(largeA, smallB))

# Binary search, find the smallest L from M to N that's enough
# (bisect's key parameter requires Python 3.10+)
L = bisect(range(N), False, M, N, key=enough)
print(L)
(This assumes there always is a valid L, otherwise you would've specified what to do if not.)
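For the sample data above this prints 4: the three largest values of A[:4] are 4, 6, 10 and the three smallest of B[:4] are 3, 5, 8, and each pair matches.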
I think we can even achieve O(N*logM + M*logN). Still do the binary search, but don't start from scratch for each new mid. Instead just update a tree of the M largest A-values:

- If mid is larger than previously, insert each new value and remove the smallest after each insertion. Similar for B.
- If mid is smaller than previously, undo the insertions/removals you did to get to the larger mid. So this requires keeping track of what updates you did, at least which elements were removed. It's clear which elements got inserted.

Overall you have O(N) single-value updates, each costing O(logM). And you have O(logN) binary search rounds, each costing O(M) for comparing the M pairs. (This might be inspired by Costantino's answer.)
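A minimal sketch of that bookkeeping, assuming std::multiset as a stand-in for the AVL tree; every name here (Step, grow, shrink, minLen) is made up, and the undo log is just one way to record the updates:

#include <iterator>
#include <optional>
#include <set>
#include <span>
#include <vector>

// One undo-log entry per covered index: what was inserted and what
// (if anything) got evicted from each tree as a result.
struct Step {
    int insertedA, insertedB;
    std::optional<int> evictedA, evictedB;
};

int minLen(std::span<const int> A, std::span<const int> B, int M)
{
    std::multiset<int> largeA;  // M largest A-values of the covered prefix
    std::multiset<int> smallB;  // M smallest B-values of the covered prefix
    std::vector<Step> log;      // log.size() == length of the covered prefix

    auto grow = [&](int j) {    // extend the prefix by index j: O(logM)
        Step s{A[j], B[j], {}, {}};
        largeA.insert(A[j]);
        if ((int)largeA.size() > M) {
            s.evictedA = *largeA.begin();          // evict smallest A-value
            largeA.erase(largeA.begin());
        }
        smallB.insert(B[j]);
        if ((int)smallB.size() > M) {
            s.evictedB = *std::prev(smallB.end()); // evict largest B-value
            smallB.erase(std::prev(smallB.end()));
        }
        log.push_back(s);
    };

    auto shrink = [&]() {       // undo the most recent grow: O(logM)
        Step s = log.back();
        log.pop_back();
        if (s.evictedA) largeA.insert(*s.evictedA);
        largeA.erase(largeA.find(s.insertedA));
        if (s.evictedB) smallB.insert(*s.evictedB);
        smallB.erase(smallB.find(s.insertedB));
    };

    auto enough = [&]() {       // compare the M pairs in ascending order: O(M)
        auto a = largeA.begin();
        auto b = smallB.begin();
        for (int i = 0; i < M; i++, ++a, ++b)
            if (*a <= *b) return false;
        return true;
    };

    int left = M, right = (int)A.size(), best = 0;
    while (left <= right) {
        int mid = left + (right - left) / 2;
        while ((int)log.size() < mid) grow((int)log.size());
        while ((int)log.size() > mid) shrink();
        if (enough()) { best = mid; right = mid - 1; }
        else          { left = mid + 1; }
    }
    return best; // 0 if no prefix works
}

Undoing strictly in reverse order is what makes the erase in shrink safe: at that moment the trees are in exactly the state they had right after the corresponding grow, so the inserted value (or an equal duplicate) is guaranteed to be present.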