I am wondering how to translate this Prolog code to work in Datalog.
I don't think it will work with Datalog because we are not allowed to use things like nullable(rule(z,[d]))
in Datalog. Also, I do not know if Datalog allows lists. Maybe another option is to write rules as strings, but I do not know if Datalog allows strings and if all needed string functions are available.
% rules for nullable
% If we have the rule X -> Y where X does not appear in Y and each component of Y is nullable, then X is nullable
% We need that X does not appear in Y to avoid circular loops (If X is nullable it would be because of a non-circular definition so we are not omitting any case)
nullable(X) :- variable(X), rule(X,Y), \+ member(X, Y), nullable(Y). % The Y here is a list so we need to define nullable(Y) for lists which is one below
% The empty list is always a nullable list
nullable([]).
% A list is nullable if all of its components are nullable
nullable([X|Y]) :- nullable(X), nullable(Y).
% A rule X -> Y is nullable if Y is nullable
nullable(rule(_,Y)) :- nullable(Y).
Context about the code. This prolog code determines if the context-free grammar is nullable.
This means for all rules (e.g. for production S -> ABC | XYZ, the rules are: [ABC, XYZ ] ) if ANY of them is nullable then the whole rule is nullable. This is equivalent to OrLattice.
eq Rule.getNDecl().nullable() { for (int i = 0; i < getNumProd(); i++) { if (getProd(i).nullable()) return true; } return false; }
This means for all terminals and non-terminals in production (e.g. for a production S -> ABC, symbols are ABC ) if ALL of them is nullable then the whole rule is nullable. This is equivalent to AndLattice.
eq Prod.nullable() { for (int i = 0; i < getNumSymbol(); i++) { if (!getSymbol(i).nullable()) return false; } return true; }
an Epsilon terminal is nullable
eq Terminal.nullable() { return false; }
non-terminal is nullable if its use is nullable
eq NUse.nullable(){ return decl().nullable(); }
Paper (free to download) (page 14-15)
I think the Prolog code is a distraction. Since, as you say, Datalog doesn't support lists and other Prolog terms apart from atoms and variables, trying to go through Prolog doesn't seem helpful.
That said, your Prolog code is also not very well written. The language allows you to write what appears to be a single nullable/1
predicate that deals with symbols, productions, and rules. But that doesn't make it a good idea. You will also make yourself very unhappy by giving variables for lists unhelpful names like Y
. A common convention is to add an s
suffix to variable names for lists. Here is a cleaner version of your Prolog code:
nullable_nonterminal(X) :-
variable(X),
rule(X,Ys),
\+ member(X, Ys),
nullable_nonterminals(Ys).
nullable_nonterminals([]).
nullable_nonterminals([X|Xs]) :-
nullable_nonterminal(X),
nullable_nonterminals(Xs).
nullable_rule(rule(_,Ys)) :-
nullable_nonterminals(Ys).
But again, this won't translate directly to Datalog. And as an additional complication, it introduces manual handling for circularity, which both Datalog and CRAG (if I understand the abstract and code example correctly) do for you.
For the Datalog part, you need some reasonable representation of your data as facts. Consider a rule that you would represent as a Prolog term rule(x, [a, b, c])
. This isn't Datalog, but by giving this rule an arbitrary name (like rule1
) you can represent a series of facts expressing that "the first symbol produced by rule1
is a
", "the second symbol produced by rule1
is b
" and so on:
rule(rule1, 1, a).
rule(rule1, 2, b).
rule(rule1, 3, c).
The fuller version of the CRAG-like system is slightly more complex because rules can have multiple productions, which themselves produce symbols. Here is my Datalog version, translated directly from the example code at https://bitbucket.org/jastadd/crag-artifact/src/master/Nullable.jrag:
nullable_symbol(nothing).
nullable_symbol(Symbol) :-
terminal(Symbol),
nullable_terminal(Symbol).
nullable_symbol(Symbol) :-
nonterminal(Symbol),
nullable_nonterminal(Symbol).
nullable_terminal(_Symbol) :-
false.
nullable_nonterminal(Nonterminal) :-
rule(_Rule, _Index, Nonterminal, Production),
nullable_production(Production).
nullable_rule(Rule) :-
rule(Rule, _Index, _Nonterminal, Production),
nullable_production(Production).
nullable_production(Production) :-
not(some_nonnullable_symbol(Production)).
some_nonnullable_symbol(Production) :-
production(Production, _Index, Symbol),
not(nullable_symbol(Symbol)).
Two notes:
nothing
which is neither a terminal nor a nonterminal and is hard-coded as nullable.Examples:
% Example 1: a non-nullable rule
% r1: x --> a b
% where a and b are terminals.
rule(r1, 1, x, p1).
production(p1, 1, a).
production(p1, 2, b).
nonterminal(x).
terminal(a).
terminal(b).
% Example 2: a nullable rule
% r2: eps --> <nothing>
rule(r2, 1, eps, p2).
production(p2, 1, nothing).
nonterminal(eps).
% Example 3: a non-nullable rule
% r3: y --> eps x eps
rule(r3, 1, y, p3).
production(p3, 1, eps).
production(p3, 2, x).
production(p3, 3, eps).
nonterminal(y).
With this online Datalog the query nullable_rule(Rule)
gives the expected answer:
{
nullable_rule(r2)
}
If for example we comment out the production(p3, 2, x).
that makes r3
non-nullable, the query also correctly recognizes the modified r3
as nullable.