While trying to find out how Forth manages the dictionary (and memory in general), I came across this page. Being familiar with C, I have no problem with the concept of pointers, and I assume I understood everything correctly. However, at the end of the page are several exercises, and here I noticed something strange.
Exercise 9.4, assuming DATE has been defined as a VARIABLE, asks what the difference is between
DATE .
and
' DATE .
and exercise 9.5 does the same using the user variable BASE.
According to the supplied answers, both phrases will give the same result (also with BASE). Trying this with Win32Forth, however, gives results with a difference of 4 bytes (1 cell). Here is what I did:
here . 4494668 ok
variable x ok
x . 4494672 ok
' x . 4494668 ok
Creating another variable gives a similar result:
variable y ok
y . 4494680 ok
' y . 4494676 ok
Thus, it looks like each variable gets not just one cell (for the value), but two cells. The variable itself points to where the actual value is stored, and retrieving the contents at the execution token (using ' x ?) gives 0040101F for both variables.
For exercise 9.5, my results are:
base . 195F90 ok
' base . 40B418 ok
These are not even close to each other. The answer for this exercise does however mention that the results can depend on how BASE is defined.
Returning to normal variables, my main question thus is: why are two cells reserved per variable?
Additionally:
BASE)?
EDIT1: Okay, so Forth also stores a header for each variable, and using ' gives you the address of this header. From my tests I would then conclude the header uses just one cell, which does not correspond to all the information the header should contain. Secondly, according to the exercise, retrieving the address of a variable should give the same result in both cases, which appears to contradict the existence of a header altogether.
My gut feeling is that this is all very implementation-specific. If so, what happens in Win32Forth, and what should happen according to the exercise?
This is roughly what a definition looks like in the dictionary using a traditional memory layout. Note that implementations may well diverge from this, sometimes a lot. In particular, the order of the fields may be different.
Link to previous word (one cell)
Flags (a few bits)
Name length (one byte, less a few bits)
Name string (variable)
Code field (one cell)
Parameter field (variable)
Everything except the code and parameter fields is considered the header. The code field usually comes right before the parameter field.
Ticking a word with ' gives you an XT, or execution token. This can be anything the implementation fancies, but in many cases it's the address of the code field.
Executing a word created with CREATE or VARIABLE gives you the address of the parameter field.
This is probably why in Win32Forth, the two addresses differ by 4 bytes, or one cell. I don't know why the answers to the exercises state there should be no difference.
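As a quick check of the relationship between the two addresses, standard Forth provides >BODY, which converts the execution token of a word defined with CREATE into its parameter field address. The following is a minimal sketch, assuming an ANS-compatible system. The name z is made up for illustration, and the actual addresses printed will differ between systems:

```forth
create z  1 cells allot   \ z gets a parameter field of one cell
z .                       \ address of the parameter field
' z .                     \ execution token; its value is implementation-defined
' z >body .               \ parameter field address, derived from the xt
' z >body  z  = .         \ prints a true flag (-1): both routes reach the same address
```

On a system where the execution token is the code field address, the first two numbers differ by the size of the code field, which would match the one-cell difference seen in Win32Forth. The standard only guarantees >BODY for words defined with CREATE. Applying it to a word defined with VARIABLE works on many systems but is not guaranteed.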
Assuming BASE is a user variable, it probably works like this: Every task has its own user area in which user variables are allocated. All user variables know their specific offset inside this area. Ticking BASE gives you its XT, which is the same for all tasks. Executing BASE computes an address by adding its offset to the base of the user area.
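The mechanism can be modelled in standard Forth. This is only a sketch of the idea, not how Win32Forth actually implements user variables. The names task-base, my-user, task1, task2 and my-base are hypothetical, and a real system would normally keep the user area pointer in a register or a system variable rather than in an ordinary VARIABLE:

```forth
variable task-base                \ hypothetical: base address of the current task's user area

: my-user ( offset "name" -- )    \ define a word that remembers its offset
  create ,
  does> ( -- addr )  @ task-base @ + ;   \ address = user area base + stored offset

create task1  16 cells allot      \ two pretend user areas
create task2  16 cells allot

0 cells my-user my-base           \ a user variable at offset 0

task1 task-base !  10 my-base !   \ task1's copy holds 10
task2 task-base !  16 my-base !   \ task2's copy holds 16

task1 task-base !  my-base @ .    \ prints 10 (with the number base set to decimal)
task2 task-base !  my-base @ .    \ prints 16
' my-base .                       \ the xt is the same whichever task is active
```

Here executing my-base returns a different address depending on the active user area, while ' my-base always returns the same execution token. That would explain why the two numbers printed for BASE are not even close to each other.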