I have been assigned a project to implement a top-down backtracking parser for any grammar which contains only one nonterminal on the RHS of its rewrite rules (e.g S -> aaSb | aaSa | aSa)
So far, I have three methods, including main
, that are used to handle checking the validity of an input string.
My goal is to, using a char[][]
array for the grammar, check each character in the input string against the grammar, and return true
if the string is contained within the grammar.
public class TDBP {
public static void main(String[] args) {
char[][] g = new char[][]
{ {'a', 'a', 'S', 'b'},
{'a', 'a', 'S', 'a'},
{'a', 'S', 'a'},
{'\0'} };
SP(g);
}
public static void SP(char[][] g) {
Scanner s = new Scanner(System.in);
boolean again = true; int pn = 0;
String test;
while(again) {
System.out.print("Next string? ");
test = s.nextLine();
if(S(pn, test, g))
System.out.println("String is in the language");
else
System.out.println("String is not in the language");
if(s.nextLine() == "\n") again = false;
}
s.close();
}
public static boolean S(int pn, String test, char[][] g) {
char[] c = test.toCharArray();
boolean exists = false;
for(int i = pn; i < g.length; i++) {
for(int j = 0; j < g[i].length; j++) {
if(c[j] == 'S')
S(++pn, test, g);
if(c[j] == g[i][j])
exists = true;
}
}
return exists;
}
}
In my algorithm, pn
is an integer to keep track of which production in the grammar I am currently looking at, and to make sure that I don't scan the same grammar twice (e.g a pn
of 1 in above grammar would correspond to aaSa
). Also, I have \0
represent the empty string.
Am I parsing the string correctly?
Thank you!
I am bit rusty on my CS classes: but the following code worked for me:
public static boolean fullCheck(char[] toTest, char[][] g) {
int val = checkOnAll(0, toTest, g);
if (val == toTest.length) {
return true;
}
return false;
}
public static int checkOnAll(int charPos, char[] toTest, char[][] g) {
for(int i = 0; i < g.length; i++) {
int val = checkOne(charPos, toTest, g[i], g);
if (val != -1) {
return val;
}
}
return -1;
}
//also needs length checks
public static int checkOne(int charPos, char[] toTest, char[] rule, char[][] rules) {
for(int j = 0; j < rule.length; j++) {
if (charPos >= toTest.length) {
return -1;
}
if(rule[j] == 'S') {
int value = checkOnAll(charPos, toTest, rules);
if (value == -1) {
return -1;
} else {
charPos = value - 1;
}
} else if (toTest[charPos] != rule[j]) {
return -1;
}
charPos++;
}
return charPos;
}
Instead of the "S" function, use the fullCheck one (give the input as a char array to the new method). I also changed the "g" array to be:
char[][] g = new char[][]
{ {'a', 'a', 'S', 'b'},
{'a', 'a', 'S', 'a'},
{'a', 'S', 'a'},
{} };
(the "\0" gave me trouble with the length checks, the above change was the simplest solution).
I found a few issues in your code, and although I am not totally sure that my own code is bug-free, I thought I'll share anyways: 1. when S returns a "false" inside your recursion, but the value is ignored. 2. the "pn" should be restricted to knowing which the test char we are on not the rule char. 3. even if the value returned is true, you must make sure you checked the entire test string and not just part of it. I did not see you do that. 4. if you have one very long rule but a short input, you might get an array out of bounds exception since you never look at the test string length.
I tried my own code with various inputs, and I have a feeling like I might have missed something, but I could not find it. if you find a problem, please let me know :)
good luck.