c++algorithmbit-manipulationgraph-theorydirected-acyclic-graphs

Summation of nodes reachable starting from every node present in given DAG with restriction on number of childs per node


That's a problem from XXX Polish Olympiad in Informatics, stage II, titled "Wspinaczka". Link to original problem statement in Polish.

Let's compress above story into algorithmic problem statement:

Given Directed Acyclic Graph (DAG) g, for every vertex p in this graph g calculate the sum of all reachable nodes if we start traversing from p. The values of nodes are given as array, where i-th value corresponds to i-th vertex.

It's known that when we are at vertex i, then directed edge from node i to node j exists only if j > i.

Also, we are given k, 1 <= k <= 8 and following condition must hold: j - i <= k

There are n nodes, 1 <= n <= 100 000 and m edges 1 <= m <= 800 000

That's how I started:

#include <iostream>
#include <vector>
#include <queue>
#include <algorithm>

using namespace std;

int main()
{
    ios_base::sync_with_stdio(false);
    cin.tie(nullptr);
    cout.tie(nullptr);

    int n, m, k;
    cin >> n >> m >> k;

    // values of nodes
    vector<int> beauty(n);
    for (int i = 0; i < n; ++i)
        cin >> beauty[i];

    vector<vector<int>> adj_list(n);
    vector<vector<int>> reversed_adj_list(n);
    vector<int> outgoing_edges(n);
    for (int i = 0; i < m; ++i)
    {
        int a, b;
        cin >> a >> b;

        --a;
        --b;

        adj_list[a].push_back(b);
        reversed_adj_list[b].push_back(a);
        ++outgoing_edges[a];
    }

    queue<int> q;
    for (int i = 0; i < n; ++i)
    {
        if (outgoing_edges[i] == 0)
        {
            q.push(i);
        }
    }

    vector<long long> dp(n);
    while (!q.empty())
    {
        int node = q.front();
        q.pop();

        dp[node] = beauty[node];
        for (const int child : adj_list[node])
        {
            dp[node] += dp[child];
        }

        for (const int parent : reversed_adj_list[node])
        {
            --outgoing_edges[parent];

            if (outgoing_edges[parent] == 0)
            {
                q.push(parent);
            }
        }
    }

    for (int i = 0; i < n; ++i)
        cout << dp[i] << '\n';
}

So I perform topological sorting (firstly I process n node, then n-1, then n-2 node and so on because we know that there CAN BE edge between node i and j only if j > i).

While this topological sort, I'm calculating dp[i] like this: dp[i] = value[i] + sum_of_values_of_all_childs. The problem is that this overcounts. Why?

See graph below (red numbers are values): enter image description here

For node 1, we are summing up values of all childs and their subgraphs, but both of these childs already have counted node 4. So instead of result 13 when I start at node 1, calculated answer is 14 because I counted 4 node twice.

I have no idea how to avoid this situation, and I didn't make use of small value of k. I know there exists solution which uses dp[node][mask], where mask is up to 2^K, but I'm unable to find it out. The time complexity of this solution is O(n * 2^K + m) (m because of reading input).


Solution

  • Let's process/build the DAG from vertex N to vertex 1. At each point when processing a vertex i, let's store dp[mask]: the sum of values of vertices which can be reached exactly from the masked subset of vertices {i+1...i+k}. Summing over reachable masks from i gives answers to the problem.

    We just need to recalculate the DP after adding vertex i, i.e. calculate dp_new[mask_new] for masks of {i...i+k-1}. Note that each vertex is only counted once in the DP, so we just need to calculate the uniquely given new mask for every dp[mask] and add that sum of weights to updated DP array. For every mask, we know that the first k-1 bits of mask are the last k-1 bits of mask_new, so that's just a shift; all k bits tell us if we're dealing with vertices reachable from i as well (they determine the first bit of new mask), and k-th bit isn't used for anything else. Finally we add the weight of vertex i to dp_new[100...0].

    With O(1) bitwise operations used for all calculations with masks, this has the desired time complexity.