garbage-collectioncomputer-sciencestack-overflowlow-levelknuth

Knuth GC stack overflow prevention algorithm - how does it work?


I'm reading the Garbage Collection. Algorithmis for Automatic Dynamic Memory Management book, and trying to write a blog post about it. I'm concentrating on mark-sweep chapter, where authors discuss using auxiliary stacks, to avoid mark-sweep to overflow the 'normal' stack. So far so good.

In the subchapter dedicated to the stack overflow of the auxiliary stack, there'a a direct quote:

Knuth handles overflow by treating the marking stack circularly, stacking pointers to nodes modulo h, where h is the fixed size of the stack. This means that older branch-point information on the stack is overwritten when the stack index grows larger than h. Consequently, the stack may become empty before the marking process is complete as the live graph below a node whose pointer on the mark stack has been overwritten may not have been marked.

As far as I understand it, GC process goes through the objects starting from the roots, adding pointers to the auxiliary stack. That's easy. However, this stack has a specific size (which is to prevent the overflow). When the situation occurs, that adding new pointers to this stack will be greater than the limit (resulting in SO), existing entries are discarded to make room for the new ones. That's how I understand this.

My problem is with this sentence:

Consequently, the stack may become empty before the marking process is complete as the live graph below a node whose pointer on the mark stack has been overwritten may not have been marked.

English is not my native language, and topic discussed here is not trivial, however I just need an answer - how does this mechanism work in the light of the above sentence? Why stack should be EMPTIED, and not keeping constant <= h amount of entries?

PS. In the second book of Jones - 'The Garbage Collection Handbook' there's no word about the above topics, so unfortunately I cannot compare the phrasing/preesntation.


Solution

  • I think it's saying exactly what you think it does. There is a stack of size 3, and I write A, B, C, and D onto it. The D overwrites the A. Because of this, the A never gets looked at.

    Depending on how you implement the limited-length stack, after you write D, the stack length may be 3 or it may be 4. But in either case, when you get back to the stack length being 0, you've still not looked at A.