stringalgorithmdata-structures

Given a binary string, count number of dense sub-strings in O(nlogn) time


Dense sub-string: number of 1's > number of 0's

Brute force approach in C++

#include <iostream>
#include <string>

using namespace std;

int bin_dense_count_bruteforce(string s) {
    int n = s.size();
    int ans = 0;

    for(int i=0; i<=n-1; i++) {
        int statusNow = 0;
        
        for(int j=i; j<=n-1; j++) {
            if(s[j] == '1')
                statusNow++;
            else
                statusNow--;

            if(statusNow>0)
                ans++;
        }
    }

    return ans;
}

int main() {
    string s = "11000101";

    cout<<bin_dense_count_bruteforce(s)<<endl;
    // 7
    // i, j substring
    // 0, 0 1
    // 0, 1 11
    // 0, 2 110
    // 1, 1 1
    // 5, 5 1
    // 5, 7 101
    // 7, 7 1

    return 0;
}

I tried thinking about any approach with unordered_map, like prefix sum like approach or similar way. I have tried searching online and even chatgpt, claude, etc. but no success.


Solution

  • First, a quick search on Google will give you this useful page https://www.geeksforgeeks.org/count-of-substrings-in-a-binary-string-that-contains-more-1s-than-0s/

    If we take the prefix sum pre, the array counting the difference between ones and zeros:

    pre[i+1] = count of 1s in s[0..i] - number of 0s in s[0..i]
    

    The problem is equivalent to:

    find all (i, j) with i < j such as pre[i] < pre[j].
    

    Where i, j can go from 0 to n. Hence, it's similar to counting the inversion. A fenwick tree is able to do it in O(nlog(n)) but it's not the only solution. To prevent indexes to be negative, they can be shift by n.

    vector<int> p(n+1);
    FenwickTree<long long> ft(2*n+1);
    for (int i = 0; i < n; ++i) {
      p[i+1] += (s[i] == '1'?1:-1)+p[i];
      ft.update(n+p[i+1], +1);
    }
    long long ans = 0;
    for (int i = 1; i <= n; ++i) {
      ans += ft.query_greater_than(n+p[i-1]);
      if (i != 1) ft.update(n+p[i-1],-1);
    }
    

    The result might not fit a int as it could be equivalent to n^2.