memory-management heap-memory forth gforth

How do you keep track of all the strings allocated in Forth and free them at the right time?


I see a lot of Forth code just doing s" Hello " s" world" s+ like it's nothing, but now that I think about it, this actually allocates three pointers and loses two of them to the great nothingness. The same problem goes for most uses of slurp-file.

But if I have to stash every single string address I allocate in a temporary location so I can free it later, like s" foo" over >r ( ... do something ... ) r> free throw, I'm gonna lose my mind. What is the best practice here?

I don't see a lot of Forth code around that takes memory allocation into account; the stack-oriented style seems to encourage a kind of "fire and forget" mode.

The practical problem

I'm working on a web server that serves HTML files. The request is stored in a reusable pad, but the response is a mix of slurped files and string concatenations.

Which means that if I let the server run on the internet for a while, and let the various soup of robots you find there play with it, I might leak a significant amount of memory just telling them to go away.

The question

So I'm turning to the vibrant Forth community around here to ask for the best practice.

Should I:

  1. Chase down every memory allocation in my program and make sure each one is eventually freed
  2. Let the program run and restart it once a limit has been reached
  3. Use the gforth garbage collector extension
  4. Set aside a big block of memory dedicated to each request and free everything at once at the end of the response

(1) is a scenario in my worst nightmares
(2) is the lazy way, but not that bad
(3) I looked at the code and it seems overkill for me
(4) is what I'd really like to go for, but is a bit ambitious

Bonus: What I'd do if I had to implement solution (4)

Is this a good strategy? Am I missing something?


Solution

  • Short answer: you keep a list of them.

    s" does not always allocate new memory

    I was wrong in my interpretation of how s" works. At interpretation time (while gforth is reading your file, or at the interactive prompt), it does allocate memory so that you get a string on the stack. But when s" is compiled into a word, the string is laid down in dictionary space with allot at compile time, and nothing new is allocated at run time.

    Gforth 0.7.3
    see s" 
      34 parse save-mem ;            \ interpret
      34 parse  POSTPONE SLiteral ;  \ compile
    
    see save-mem 
      swap >r dup allocate throw swap 2dup r> -rot move ;
    
    see SLiteral 
      tuck 2>r  POSTPONE AHEAD here 2r> mem, align >r  POSTPONE THEN r>  POSTPONE Literal  POSTPONE Literal ;
    
    see mem, 
      here over allot swap move ;
    

    POSTPONE AHEAD compiles a forward branch around the inlined string data: the data is laid down only once, while the definition is being compiled, and at run time execution jumps over it, going directly to the part which pushes the address and length on the stack.

    This means strings inlined in code are compiled in place and don't need to be freed.

    s" This string should be freed"  \ on the heap
    
    : hello  ( -- addr u )
      s" This one should not." ;     \ in dictionary space
    
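    A quick way to convince yourself of this (a throwaway test, assuming the gforth 0.7.3 behaviour shown above): a compiled s" pushes the same dictionary address on every call, while an interpreted s" goes through save-mem and returns a fresh heap address each time.

    : greet  ( -- addr u )
      s" hello" ;

    greet drop greet drop = .  \ prints -1: both calls return the same address, nothing to free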

    s" is implementation defined

    Some Forths reuse the same buffer for all their s" calls, while others give you access to two or three strings at a time before the next call overwrites existing data.

    So you should not take an s" string for granted: copy it if you want to keep it, as shown below.
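
    If you are on gforth, save-mem (seen above) already does exactly that copy. A minimal portable sketch of the same idea, using only standard words (keep-string is a name made up for this example):

    : keep-string  ( addr u -- addr' u )   \ copy a transient string to the heap
      dup allocate throw                   \ addr u addr'
      swap 2dup 2>r                        \ addr addr' u  (R: addr' u)
      move                                 \ copy u bytes from addr to addr'
      2r> ;                                \ addr' u  (the caller must free addr' eventually)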

    How to keep track of all strings allocated

    The main issue is therefore not the use of s", but rather s+ and slurp-file, which both call allocate internally.

    I solved it using a so-called "free list": every time I use s+ or slurp-file, I keep a reference to the returned pointer and store it in a linked list so it can be freed later.

    The code

    \ a simple linked-list keeping track of allocated strings
    
    variable STRBUF-POINTER  \ the current head of the list
    0 STRBUF-POINTER !
    
    struct
      cell% field strbuf-prev  \ previous entry
      cell% field strbuf-addr  \ the string allocated
    end-struct strbuf%
    
    : add-strbuf  ( addr -- )
      strbuf% %alloc >r
      ( addr )         r@ strbuf-addr !
      STRBUF-POINTER @ r@ strbuf-prev !
      r> STRBUF-POINTER ! ;                \ become the new head
    
    : (?free)  ( addr -- )
      dup if free throw else drop then ;
    
    : free-strbuf  ( -- )   \ walk up the list and free strings
      begin
        STRBUF-POINTER @
      while
        STRBUF-POINTER @ >r
        r@ strbuf-addr @ (?free)           \ free the string
        r@ strbuf-prev @ STRBUF-POINTER !  \ prev becomes new head
        r> (?free)                         \ free the struct itself
      repeat ;
    

    Usage

    : my-s+  ( $1 $2 -- $3 )
      s+ over add-strbuf ;
    
    : my-slurp-file  ( $path -- $content )
      slurp-file over add-strbuf ;
    
    : main-process  ( -- )
      begin
        listen  \ wait for client request
        ( ... use my-s+ and my-slurp-file ... )
        send-response
        free-strbuf   \ we free everything we used
      again 
      ;
    

    This solution was enough to drastically reduce memory usage in my case. But you might want to improve it further by implementing regions: instead of creating a new linked-list element for every string, keep track of a few big reusable buffers, like I mentioned in solution (4). A rough sketch of that idea follows.
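
    For the record, a minimal sketch of what such a region could look like, assuming one fixed-size region that is reset after every response (all the names below, region-size, REGION, REGION-HERE, region-alloc, region-reset, >region and region-s+, are made up for this example):

    65536 constant region-size             \ pick a size that fits your largest response
    create REGION region-size allot        \ one big reusable buffer
    variable REGION-HERE  REGION REGION-HERE !

    : region-alloc  ( u -- addr )          \ bump-allocate u bytes inside the region
      REGION-HERE @ swap over +            \ old-here new-here
      dup REGION region-size + u> abort" region overflow"
      REGION-HERE ! ;

    : region-reset  ( -- )                 \ "free" everything at once
      REGION REGION-HERE ! ;

    : >region  ( addr u -- addr' u )       \ copy a string into the region
      dup region-alloc swap 2dup 2>r move 2r> ;

    : region-s+  ( a1 u1 a2 u2 -- a3 u3 )  \ concatenate two strings inside the region
      2swap >region 2swap >region          \ consecutive bump allocations are contiguous
      nip + ;

    With something like that in place, my-s+ becomes region-s+, my-slurp-file copies the slurped buffer with >region and frees the heap original right away, and free-strbuf at the end of main-process shrinks to a single region-reset.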