It's similar to the problem of finding the minimal set of vertices in a directed graph from which all vertices can be reached, except that a node must have all its predecessors reached in order to be reached itself.
More formally, let S be a set of nodes belonging to a directed graph G. A node n of G is said to be reachable from S if and only if n belongs to S or if all predecessors of n are reachable from S.
We'll say that S is a generator of G if all nodes of G are reachable from S.
We can be sure that such a generator always exists for every graph (i.e. the set containing all nodes will always work).
The question is to design an algorithm that returns a minimum-length generator of G, given a graph G.
Let's take this graph.
{1,3,6} is a generator because {1} unlocks 2, {2,3} unlocks 5 and {1,3,6} unlocks 4.
Now, if you take S = {1,5}, nodes 3, 4 and 6 will remain unreached. S is therefore not a generator of this graph.
In fact, we can see that, if S is a generator, every cycle of the graph must contain at least one node in S.
This problem is NP hard. That can be seen by reducing the minimum feedback vertex set problem to your reachability problem.
Start with any directed graph G
. Construct a new graph H
by adding a single vertex v
with an edge connecting it to every vertex in G
.
Let S
be any solution to your reachability problem on H
. It is obvious that v
must be in S
because it has no predecessor, and is a predecessor of everything else. It is also clear that if you take S
away from G
, it cannot contain any cycles. And therefore S
must contain a vertex feedback set for G
.
Conversely, any subset S
of H
that contains both v
and a vertex feedback set for G
will solve your reachability problem. (You can prove this by induction on the size of G
minus S
. Just prove that there is always an element with no predecessor in G
, remove that element, then apply the induction hypothesis.)
Therefore a solver for your minimum reachability problem allows us to solve the minimum feedback vertex problem for arbitrary graphs. Since that problem is NP hard, so is your problem.