cdecoding

Heap Buffer Overflow in findMedianSortedArrays Function in C (LeetCode)


double findMedianSortedArrays(int* nums1, int nums1Size, int* nums2, int nums2Size) {  
    int* asd;  
    bool flag1 = false, flag2 = false;  
    asd = malloc(sizeof(int) * (nums1Size + nums2Size));  
    int br1 = 0, br2 = 0;  

    for (int i = 0; i < (nums1Size + nums2Size); i++) {  
        if (flag1) {  
            asd[i] = nums2[br2];  
            br2++;  
        } else if (flag2) {  
            asd[i] = nums1[br1];  
            br1++;  
        } else if (nums1[br1] >= nums2[br2]) { 
            asd[i] = nums2[br2];  
            br2++;  
            if (br2 == nums2Size) {  
                br2--;  
                flag2 = true;  
            }  
        } else {  
            asd[i] = nums1[br1];  
            br1++;  
            if (br1 == nums1Size) {  
                br1--;  
                flag1 = true;  
            }  
        }  
        printf("%d\n", asd[i]);  
    }  

    if ((nums1Size + nums2Size) % 2 == 0) {  
        double a = (asd[(nums1Size + nums2Size) / 2] + asd[(nums1Size + nums2Size) / 2 - 1]);  
        return a / 2;  
    } else {  
        return asd[(nums1Size + nums2Size) / 2];  
    }  
}  

Test casses are good, but it wont submit.

When I try to submit the code, I get the following error:

==23==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6020000001f0
... SUMMARY: AddressSanitizer: heap-buffer-overflow solution.c:16 in findMedianSortedArrays
...

I suspect the issue is caused by br1 or br2 exceeding the bounds of their respective arrays (nums1 or nums2). I tried adding boundary checks for br1 and br2, but I still couldn't resolve the issue. I also verified the size of the dynamically allocated array asd (it should be nums1Size + nums2Size), and it seems correct. Is there a bug in the logic of merging the arrays or accessing their elements? How can I ensure I don't access out-of-bounds memory for nums1 or nums2? Is there a more efficient way to solve this problem?


Solution

  • } else if (nums1[br1] >= nums2[br2]) { 
    

    In this line, you will access an array out of bounds if its empty.

    You are also leaking memory since you do not free(asd) before returning. You don't even need to do any allocation at all to find the middle element(s). Just loop to the midpoint and store the current low and the previous value (in case of an even amount of elements).

    Example:

    #include <math.h>
    #include <stddef.h>
    
    double findMedianSortedArrays(int* nums1, size_t nums1Size, int* nums2,
                                  size_t nums2Size) {
        size_t totSize = nums1Size + nums2Size;
        if (totSize == 0) return NAN; // not a number
        size_t mid = totSize / 2 + 1;
        int prev, curr = 0; // previous number and current number
    
        // Loop for as long as there are numbers left in both arrays
        // and we've not reached the midpoint
        for (; nums1Size && nums2Size && mid; --mid) {
            prev = curr; // store the previous number
            // store the smallest number in curr and step the
            // pointer in the selected array and decrease its size:
            if (*nums1 < *nums2) {
                curr = *nums1;
                ++nums1;
                --nums1Size;
            } else {
                curr = *nums2;
                ++nums2;
                --nums2Size;
            }
        }
        if (mid--) {
            // We haven't reach the midpoint so one of the arrays must have
            // more elements. Fast forward "mid" steps in that array:
            if (nums1Size) {
                nums1 += mid;
                if (mid) prev = nums1[-1];
                else prev = curr;
                curr = *nums1;
            } else {
                nums2 += mid;
                if (mid) prev = nums2[-1];
                else prev = curr;
                curr = *nums2;
            }
        }
        if (totSize % 2 == 0) return (prev + curr) / 2.0;
        return curr;
    }
    

    Demo