bisonstm32flex-lexer

How to setup flex and bison for low memory systems?


I like to build a simple protocol parser by the help of flex and bison. It parses commands like "reset", "led on", "set uart 115200,N,8". The parser should run on a stm32f4 MCU(48Mhz, 256KBytes RAM).

At the MCU runs the Zephyr OS which organizes timing and memory. After setting the stack size of the executing thread a bit higher (around 5kBytes) I can call yyparse without an memory exception. Now I am facing a new problem. Seems that flex runs out of memory:

out of dynamic memory in yyensure_buffer_stack()!
out of dynamic memory in yy_create_buffer()!
out of dynamic memory in yy_create_buffer()!
flex scanner jammed!
fatal flex scanner internal error--end of buffer missed!

Any ideas how to overcome that problem?

My flexer setup is:

%option warn noyywrap nodefault
%option 8Bit batch never-interactive
%{
#include <stdlib.h>
#include <stdio.h>

// stack size
#define YYMAXDEPTH 50


#define YY_NO_UNISTD_H
#include "notification_par.h"

#define MY_YY_INPUT(buf,result,max_size) \
    { \
    size_t n; \
    my_yyInput(buf, &n, max_size); \
    result = n; \
    } \

#define YY_INPUT MY_YY_INPUT
#define YY_DECL int yylex()

#define YY_FATAL_ERROR(msg) my_fatal_error( msg )

%}

and the Bison one:

%{
#define YY_NO_UNISTD_H

#include <stdio.h>
#include <stdlib.h>

extern int yylex();
extern int yyparse();
extern FILE* yyin;
void yyerror(const char* s);


#include <misc/printk.h>
%}

%union {
    char val[64];
}

my simple function to read the chars from the uart3 queue:

void my_yyInput( char *buffer, size_t * numBytesRead, size_t bLn ) {
    size_t i;
    *numBytesRead = 0;
    for (i = 0; i < bLn; i++) {
        if ( 0 == k_msgq_get(&uart3.rxQ, buffer, MSEC(100))) {
            buffer++;
            (*numBytesRead)++;
        } else {
            // make sure that we read at least one char
            if (*numBytesRead) return;
        }
    }
}

Solution

  • One important note before I get to the bit about saving memory. You override yy_fatal_error, presumably to avoid the use of fprintf. That's fine, but it's important to note that yy_fatal_error is declared noreturn, and the flex scanner does not expect it to return. Your implementation must obey this rule; it should call exit or whatever the equivalent in the Zephyr OS; otherwise, the scanner will attempt to continue after a fatal error, leading to more fatal errors, or segfaults, or other undesirable consequences. (Also, it's a little wierd to add the exclamation point. It makes it look like you're using a non-standard version of flex.)

    I'm actually a little suprised by the error out of dynamic memory in yyensure_buffer_stack()!, because that error is produced at a point where only a single pointer is being allocated, which presumably is a total of four bytes on your system. That suggests that malloc doesn't work at all, which will prove to be a problem since flex scanners cannot work without some dynamic storage allocation. (If malloc is not available, it is possible to provide an alternative function.)

    After the scanner allocates four bytes to store a pointer to a newly-created buffer state object, it proceeds to allocate the buffer state itself (about forty bytes) and then the buffer. The buffer is (usually) 16378 bytes (plus two extra bytes for a total of 16380), which is quite a lot for a machine with only 256k of memory. That's the part you will want to focus on.

    In practice, it's pretty rare to need a buffer that big, particularly since the scanner can usually reallocate the buffer if it needs to. The buffer needs to be a bit larger than the largest single token you expect to parse; although for interactive programs it can be helpful for it to be long enough to hold a single line of input. The default setting is helpful for compilers which need to deal with long comments and strings constants, but for your application it's probably sufficient to use a much smaller buffer.

    There are two basic approaches to reducing the buffer size. The easy one is to just define a macro in your prolog:

    #define YY_BUF_SIZE 200
    

    Then you can use the normal YY_INPUT mechanism to read input, as in the code in your sample.

    While that's simple, it's not the most efficient. Reading characters one at a time (or a few at a time) leads to lots of rescanning, and also the kernel queue mechanism is going to add a bit of overhead, none of which is necessary. Assuming that your protocol messages are unequivocally terminated with a newline (that is, every message is terminated by a newline and every newline terminates a message) then a much more efficient solution is to accumulate a line of input into your own buffer, which you can do directly from the serial port, and then parse the entire line from the buffer.

    The one thing you need to remember is that flex buffers must be terminated with two NUL bytes. So your input buffer needs to be two bytes longer than the longest protocol message you're expecting to be able to handle.

    The result might look something like this:

    /* Assume there is an ISR which accumulates characters into line until a newline
     * is received, updating line_read for each character. It needs to stop reading
     * when it reaches MAX_PROTOCOL_SIZE.
     */
    
    #define MAX_PROTOCOL_SIZE 254
    char line[MAX_PROTOCOL_SIZE + 2];
    int  line_read = 0;
    
    /* When the new line character is read, this function is called. Here I assume
     * that the ISR did not bother to NUL-terminate the input.
     */
    void process_line(char* buffer, int buflen) {
      buffer[buflen] = buffer[buflen + 1] = 0; /* Terminate with two NULs */
      YY_BUFFER_STATE flex_buf = yy_scan_buffer(buffer, buflen + 2);
      yyparse();
      yy_delete_buffer(flex_buf);
      yylex_destroy();  /* See note */
    }
    

    The call to yylex_destroy will free() the dynamic allocations (which, as mentioned, are tiny: a single pointer to a buffer state and the buffer state itself, which is about 40 bytes. You could reuse these allocations; the advantage of calling yylex_destroy is that it reinitializes the flex state, which can be useful if the parse failed without reading the entire input.

    The buffer itself is never malloc'd nor free'd, since it is your buffer. It's never refilled, either, so you can define YY_INPUT to do nothing or throw an error.