javascripttypescriptalgorithmbinary-search

Binary Search for multiple items within a Range (Log time filter)


I have an Array of Log items, already sorted by their timestamp (number of milliseconds since 1970). Now I want to filter them by a specific time range, so I think of Binary Search, however this variant is different than all variants I knew before as I need to find a range within a range. Note that there may be none or multiple items at the value edges.

I came up with this to reduce one range requirement but still don't know how to get to the first/last edge items:

    filterByTime(min: number, max: number): LogItem[] {
        const items = this.items;
        const len = items.length;

        if (min === undefined && max === undefined) {
            return items.slice(0, len - 1);
        }
        min = min || Number.MIN_VALUE;
        max = max || Number.MAX_VALUE;

        const fn = (a: LogItem) => {
            if (a.time < min) {
                return -1;
            } else if (a.time > max) {
                return 1;
            } else {
                return 0;
            }
        };

        const lowerBound = this.binarySearchBound(fn, 0, len, true);
        if (lowerBound == -1) { return []; }

        const upperBound = this.binarySearchBound(fn, 0, len, false);

        return items.slice(lowerBound, upperBound + 1);
    }

    binarySearchBound(compareFn: (a: LogItem) => number, left: number, right: number, isLowerBound: boolean): number {
        const items = this.items;
        if (items.length == 0) {
            return -1;
        }

        if (isLowerBound && compareFn(items[0]) == 0) {
            return 0;
        } else if (!isLowerBound && compareFn(items[items.length - 1]) == 0) {
            return items.length - 1;
        }

        while (left <= right) {
            const mid = (left + right) / 2;
            const item = this.items[mid];

            const compare = compareFn(item);
            if (compare < 0) {
                left = mid + 1;
            } else if (compare > 0) {
                right = mid - 1;
            } else {
                // What to do now?
            }
        }

        return -1;
    }

Worst case scenario, I can just do a linear search from the edge since I can assume there are not that much items at the edge but surely there is a better way I didn't think of but then I may have to iterate through the whole result set if mid falls in the middle of the result set.

EDIT for adding a note: It's possible for min or max is undefined (and could be both, in which case I can just set an if and return the copy of the whole array). Is it better to just substitute it with MIN_VALUE and MAX_VALUE if they are undefined, or is there a better way to handle that case?


Solution

  • I would suggest the following:

    Here is the suggested implementation with integer data (instead of objects) to keep it simple. In order to have it run in a snippet I also removed the type references:

    function filterByTime(min=Number.MIN_VALUE, max=Number.MAX_VALUE) {
        const fn = (a, b) => a - b; // simplified (should be a.time - b.time)
    
        const lowerBound = this.binarySearchLowerBound(fn, 0, this.items.length, min);
        const upperBound = this.binarySearchUpperBound(fn, lowerBound, this.items.length, max);
    
        return this.items.slice(lowerBound, upperBound);
    }
    
    function binarySearchLowerBound(compareFn, left, right, target) {
        while (left < right) {
            const mid = (left + right) >> 1;
            if (compareFn(this.items[mid], target) < 0) {
                left = mid + 1;
            } else { // Also when equal...
                right = mid;
            }
        }
        return left;
    }
    
    function binarySearchUpperBound(compareFn, left, right, target) {
        while (left < right) {
            const mid = (left + right) >> 1;
            if (compareFn(this.items[mid], target) <= 0) { // Also when equal...
                left = mid + 1;
            } else {
                right = mid;
            }
        }
        return left;
    }
    
    // Demo with simplified data (instead of objects with time property)
    this.items = [1, 2, 2, 2, 3, 4, 4, 5, 5, 5, 6, 7, 8, 8];
    console.log(this.filterByTime(2, 4));
    console.log(this.filterByTime(4, 5));