javascript parsing parser-generator jison

Jison not assuming correct grammar


I'm creating a grammar in Jison.

This is my jison file:

sgr.jison

/*
AUX VARIABLES
*/
%{
var contratos = "(E1:ENTIDADE)-[C:CONTRATO] -> (E2:ENTIDADE)";
var dataArray = {};

function
translateQuery(dataArray) 
{
    var finalQuery = dataArray["Listar"] + " "
                     + dataArray["Contratos"] + "\n"
                     + dataArray["Onde"] + " "
                     + dataArray["condition"] + "\n"
                     + dataArray["Retornar"] + " "
                     + dataArray["returnAttributes"]
    console.log("\n" + finalQuery)
}

%}


/* description: Parses and executes mathematical expressions. */

/* lexical grammar */
%lex

%%
\s+                     /* skip whitespace */
Listar                  return 'MATCH'
Contratos               return 'CONTRACTS'
Onde                    return 'WHERE'
Retornar                return 'RETURN'
e                       return 'AND'
ou                      return 'OR'

","                     return 'DELIMITER'
";"                     return 'END'

[><>=<==]               return 'MATH_SYMBOL'
[0-9]+\b                return 'VALUE'
[A-Za-z0-9.]+\b         return 'ENTITY_ATTRIBUTE'
["]                     return 'QUOTATION_MARK'






/lex

%start expressions

%% /* language grammar */

expressions :
    regra               
        {
            /*
            ADD SOMETHING 
            ONLY IF NEEDED
            */
        }
    | /* | means OR, so there can be more than one rule here; this is achieved through recursion */
    expressions regra
        {
            /*
            ADD SOMETHING 
            ONLY IF NEEDED
            */
        }
;

 regra: 
    MATCH CONTRACTS
    WHERE condition
    RETURN returnAttributes END
        {
            $$ = $1 + " "
                + $2 + " "
                + $3 + " "
                + $4 + " "
                + $5 + " "
                + $6 + " "
                dataArray[$1] = "MATCH"
                dataArray[$2] = contratos
                dataArray[$3] = "WHERE"
                dataArray["condition"] = $4
                dataArray[$5] = "RETURN"
                dataArray["returnAttributes"] = $6
                /* This function translates the query that was parsed */
                translateQuery(dataArray)
        }
 ;

 condition:
    ENTITY_ATTRIBUTE MATH_SYMBOL
        {
            $$ = $1 +  " "
                + $2
        }
    |
    condition VALUE
        {
            $$ = $1 +  " "
                + $2
        }
    |
    condition QUOTATION_MARK ENTITY_ATTRIBUTE QUOTATION_MARK
        {
                $$ = $1 +  " "
                + $2 + " "
                + $3 + " "
                + $4
        }
    |
    condition AND ENTITY_ATTRIBUTE MATH_SYMBOL VALUE
        {
            $$ = $1 +  " "
                + $2 + " "
                + $3 + " "
                + $4 + " "
                + $5
        }
    |
    condition OR ENTITY_ATTRIBUTE MATH_SYMBOL VALUE
        {
            $$ = $1 +  " "
                + $2 + " "
                + $3 + " "
                + $4 + " "
                + $5
        }
    |
    condition AND ENTITY_ATTRIBUTE MATH_SYMBOL QUOTATION_MARK ENTITY_ATTRIBUTE QUOTATION_MARK
        {
            $$ = $1 +  " "
                + $2 + " "
                + $3 + " "
                + $4 + " "
                + $5 + " "
                + $6 + " "
                + $7
        }
    |
    condition OR ENTITY_ATTRIBUTE MATH_SYMBOL QUOTATION_MARK ENTITY_ATTRIBUTE QUOTATION_MARK
        {
            $$ = $1 +  " "
                + $2 + " "
                + $3 + " "
                + $4 + " "
                + $5 + " "
                + $6 + " "
                + $7
        }
 ;

 returnAttributes:
    ENTITY_ATTRIBUTE
        {
            $$ = $1
        }
    |
    returnAttributes DELIMITER ENTITY_ATTRIBUTE
        {
            $$ = $1 + ""
                + $2 + " "
                + $3
        }
 ;

In my lexical grammar definition I have:

e     return 'AND'
ou    return 'OR'

So whenever "e" or "ou" is found in my test file, I expect "AND" or "OR" respectively to be returned.

The problem is that when I test it, I get "e" and "ou" instead of "AND" and "OR".

Take a look:

This is my testfile:

test.sgr

Listar Contratos
Onde C.preco=1000
Retornar C.Preco, C.NifAdjudicante,C.NifAdjudicataria;


Listar Contratos
Onde C.preco=1000 e E1.name="ESTG"
Retornar C.Preco, C.NifAdjudicante,C.NifAdjudicataria;


Listar Contratos
Onde C.preco=1000 e E1.name="ESTG" e C.TipoProcedimento="ADS"
Retornar C.Preco, C.NifAdjudicante,C.NifAdjudicataria;


Listar Contratos
Onde E1.name="ESTG"
Retornar E1.name,C.Preco,C.NifAdjudicante,C.NifAdjudicataria;


Listar Contratos
Onde E1.name="ESTG" e C.preco=1000 ou C.preco>1000 
Retornar E1.name,C.Preco,C.NifAdjudicante,C.NifAdjudicataria;

The outputs should be:

MATCH (E1:ENTIDADE)-[C:CONTRATO] -> (E2:ENTIDADE)
WHERE C.preco = 1000
RETURN C.Preco, C.NifAdjudicante, C.NifAdjudicataria

MATCH (E1:ENTIDADE)-[C:CONTRATO] -> (E2:ENTIDADE)
WHERE C.preco = 1000 AND E1.name = " ESTG "
RETURN C.Preco, C.NifAdjudicante, C.NifAdjudicataria

MATCH (E1:ENTIDADE)-[C:CONTRATO] -> (E2:ENTIDADE)
WHERE C.preco = 1000 AND E1.name = " ESTG " AND C.TipoProcedimento = " ADS "
RETURN C.Preco, C.NifAdjudicante, C.NifAdjudicataria

MATCH (E1:ENTIDADE)-[C:CONTRATO] -> (E2:ENTIDADE)
WHERE E1.name = " ESTG "
RETURN E1.name, C.Preco, C.NifAdjudicante, C.NifAdjudicataria

MATCH (E1:ENTIDADE)-[C:CONTRATO] -> (E2:ENTIDADE)
WHERE E1.name = " ESTG " AND C.preco = 1000 OR C.preco > 1000
RETURN E1.name, C.Preco, C.NifAdjudicante, C.NifAdjudicataria

However, the actual outputs are:

[image: outputs]

What have I done wrong?


Solution

  • Each token identified by your lexical analyser has two parts: a token type, which is the string you return from the scanner action, and the matched text, which the scanner keeps in its yytext property. The parser initializes the token's semantic value from yytext, and that semantic value is what $1, $2, and so on refer to in a grammar action. (This is not very well described by the documentation.)

    So in this action:

    condition:
        condition OR ENTITY_ATTRIBUTE MATH_SYMBOL VALUE
            {
                $$ = $1 +  " "
                   + $2 + " "
                   + $3 + " "
                   + $4 + " "
                   + $5
            }
    

    the value of $2 is the text matched by the token whose token type is "OR", which is ou. If you want the string "OR" in the output, that is what you should place into the action:

    condition:
        condition OR ENTITY_ATTRIBUTE MATH_SYMBOL VALUE
            {
                $$ = $1 + " OR "
                   + $3 + " "
                   + $4 + " "
                   + $5
            }
    

    (Having said that, I have to say that I think there are better ways of structuring an AST. But if this one works for you, that's cool.)
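
To make the distinction between token type and matched text concrete, here is a minimal sketch in plain JavaScript. It does not use Jison at all: the `rules` table, `tokenize`, `asWritten`, `translated`, and `keywordOutput` are hypothetical names invented for illustration, and the tokenizer only imitates the way a scanner pairs a returned type with the matched text.

```javascript
// Hypothetical stand-in for a lexer: each token carries a type
// (what the rule returns) and the matched text (what ends up in yytext).
const rules = [
  { pattern: /^ou\b/, type: 'OR' },
  { pattern: /^e\b/, type: 'AND' },
  { pattern: /^[A-Za-z0-9.]+/, type: 'ENTITY_ATTRIBUTE' },
];

function tokenize(input) {
  const tokens = [];
  let rest = input.trim();
  while (rest.length > 0) {
    // First matching rule wins, as in a simple scanner.
    const rule = rules.find((r) => r.pattern.test(rest));
    if (!rule) throw new Error('Unexpected input: ' + rest);
    const text = rest.match(rule.pattern)[0];
    tokens.push({ type: rule.type, text });
    rest = rest.slice(text.length).trim();
  }
  return tokens;
}

// Output literals for keyword token types whose text differs from the source.
const keywordOutput = { AND: 'AND', OR: 'OR' };

// Concatenating the matched text reproduces the behaviour in the question:
// the "OR" token contributes "ou", not "OR".
const asWritten = (tokens) => tokens.map((t) => t.text).join(' ');

// Emitting a literal chosen by token type gives the intended output.
const translated = (tokens) =>
  tokens.map((t) => keywordOutput[t.type] || t.text).join(' ');

const tokens = tokenize('a ou b e c');
console.log(asWritten(tokens));  // a ou b e c
console.log(translated(tokens)); // a OR b AND c
```

In the real grammar the same idea applies per rule: the action decides what string a keyword token contributes, rather than relying on the text the scanner happened to match.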