I am striving to write more readable, declarative programs. So I decided to implement a simple algorithm we currently use. The procedural implementation is the following:
So the datalog variant I came up with looks nice but has two problems:
You can find the complete source here.
It depends on hypothesis and you can easily run it with pytest.
Below the test that fails: If we require resources provided by a previous "rank" or order. It can't find it. I tried to make follows recursive, but then it fails even in the simple examples.
def test_graph_multirequire():
"""Test if the resolver can handle a graph with multiple requires"""
tree = [
[('A'), ('B'), ('C'), ('D'), ('E'), ('F'), ('G'), ()],
[(), ('A'), ('B'), ('C', 'A'), ('D'), ('E'), ('F'), ('G')]
]
run_graph(tree)
def run_graph(tree):
"""Run an example"""
try:
tree_len = len(tree[0])
index = list(range(tree_len))
random.shuffle(index)
for i in index:
+ is_command(i)
for provide in tree[0][i]:
+ provides(i, provide)
for require in tree[1][i]:
+ requires(i, require)
##############################
is_root(X) <= is_command(X) & ~requires(X, Y)
follows(X, Z) <= (
provides(X, Y) & requires(Z, Y) & (X != Z)
)
order(0, X) <= is_root(X)
order(N, X) <= (N > 0) & order(N - 1, Y) & follows(Y, X)
##############################
ordered = []
try:
for x in range(tree_len):
nodes = order(x, N)
if not nodes:
break
ordered.extend([x[0] for x in nodes])
except AttributeError:
ordered = index
assert len(ordered) >= tree_len
print(ordered)
provided = set()
for command in ordered:
assert set(tree[1][command]).issubset(provided)
provided.update(tree[0][command])
finally:
pd.clear()
My questions:
Edit:
pyDatalog (and prolog) is well adapted for such problem. The challenge is to depart from the traditional procedural programming mindset. You may want to search the web for a course on prolog : many principles will apply to pyDatalog too.
Writing a loop in a declarative language involves 3 steps :
1) define a predicate with all the variables that change while looping.
In this case, you want to keep track of a partial plan, what has been produced already, and what is left to be planned :
partial_plan(Planned, Produced, Todo)
For example, partial_plan([0,], ['A',], [1,2,3,4,5,6,7]) is true. To find a plan, you would query :
partial_plan([C0,C1,C2,C3,C4,C5,C6,C7], L, [])
2) describe the base (= simplest) case. Here, the starting point is when all commands still have to be planned.
+ partial_plan([], [], [0,1,2,3,4,5,6,7])
3) describe the iteration rule. Here, you want to pick a command to be done, and whose requirements have already been produced, and add it to Planned :
partial_plan(Planned1, Produced1, Todo1) <= (
# find a smaller plan first
(Planned0 == Planned1[:-1]) &
partial_plan(Planned0, Produced0, Todo0) &
# pick a command to be done, reducing the list of todos,
pick(Command, Todo0, Todo1) &
# verify that it can be done with what has been produced already
subset(requirement[Command], Produced0) &
# add it to the plan
(Planned1 == Planned0 + [Command, ]) &
(Produced1 == Produced0 + result[Command])
)
We now have to define the predicates introduced in the 3rd step.
# pick
pick(Command, Todo0, Todo1) <= (
(X._in(enumerate(Todo0))) & # X has the form [index, value]
(Command == X[1]) &
(Todo1 == Todo0[:X[0]] + Todo0[X[0]+1:]) # remove the command from Todo0
)
# result and requirement are defined as logic functions,
# so that they can be used in expressions
for i in range(len(tree[0])):
result[i] = tree[0][i]
requirement[i] = tree[1][i]
subset checks that all elements of L1 are in L2 (notice the all quantifier). It is also defined as a loop:
subset([], List2) <= (List2 == List2) # base case
subset(L1, L2) <= (
(0 < len(L1)) &
(L1[0]._in(L2)) & # first element in L2
subset(L1[1:], L2) # remainder is a subset of L2
)
You would then have to declare all pyDatalog variables and predicates above, and the 'enumerate', 'result' and 'requirement' functions.
Note: this has not been tested; some minor changes may be needed.