tcl

Updating nested dictionary keys in a manner that results in minimal garbage collection?


I'm working with a nested dictionary (openDocuments) and want to update the level created by dict set openDocuments $docId D_itemAt {1 {} 2 {} 0 {} e {}}, where the four keys (1, 2, 0, e) under D_itemAt are the D_itemType values used in the two versions of the procedure below. Please excuse my ugly testing code.

My question is: is there a difference between assigning a new dictionary to each D_itemType, as in the first version, and using dict with to update the values of the keys, as in the second version?

By difference I mean in terms of memory usage/turnover. For example, does dict with just update the values held in the memory locations of the keys, resulting in no garbage collection, whereas setting each D_itemType to a new dictionary results in garbage collection of the old dictionary?

It's not very important, perhaps, but this procedure will be executed many, many times as a document is composed and I wondered if one version results in less "work" for Tcl in the background than the other.

Thank you for considering my question.

proc UpdateOpenDocsItemAt { data } {
  upvar 1 openDocuments openDocuments
  upvar 1 docId docId
  foreach {\
      D_itemType\
      D_dataKey\
      D_prevKey\
      D_nextKey\
      D_bufferId\
      D_charStart\
      D_charLength\
      D_charBegin } $data {
    dict set openDocuments $docId D_itemAt $D_itemType\
       [list D_dataKey $D_dataKey\
             D_prevKey $D_prevKey\
             D_nextKey $D_nextKey\
             D_bufferId $D_bufferId\
             D_charStart $D_charStart\
             D_charLength $D_charLength\
             D_charBegin $D_charBegin]
  }
}

proc UpdateOpenDocsItemAt { data } {
  upvar 1 openDocuments openDocuments
  upvar 1 docId docId
  foreach {\
      itemType\
      dataKey\
      prevKey\
      nextKey\
      bufferId\
      charStart\
      charLength\
      charBegin } $data {
      dict with openDocuments $docId D_itemAt $itemType {
         chan puts stdout "$itemType $dataKey $prevKey $nextKey $bufferId $charStart $charLength $charBegin"
         set D_dataKey $dataKey
         set D_prevKey $prevKey
         set D_nextKey $nextKey
         set D_bufferId $bufferId
         set D_charStart $charStart
         set D_charLength $charLength
         set D_charBegin $charBegin
      }
  }
}

Solution

  • The dict with and dict update commands exist to let you make your own dictionary manipulators. They make it easier to do things like inserting characters into the middle of a value for a key or incrementing the value for a key within a key, etc. (They were made because we can't syntactically have compound keys that reach deep inside, not without restricting the space of keys in ways that were counter to the design goals of dictionaries.)

    Conceptually, a dict with (with a non-empty body) is like a dict get, extract, eval, repackage (as if with dict create or maybe dict replace), and then a dict set. There are some minor optimizations in how the sequence of keys on the outside is managed and how the set of keys is carried over, but it really isn't super-smart about anything. In particular, the extract stage isn't very smart because it can't know what the keys in the inner dictionary really are; you should know that as the script author, but the compiler doesn't. No traces are used; the write-back happens exactly at the end of the evaluation of the body.

    A consequence of this is that the values put into the variables will have reference counts greater than 1 (as there'll be at least one reference from the dictionary and a separate one from the local variable) and so won't be eligible for in-place updates.

    In your case, because you're only writing to the inner dictionary, you're probably better off not using dict with. Use dict with when you are thinking "I need to read from inside, fiddle with things in a complicated/custom way, and write back". Or use the version with an empty body to do a simple read-into-variables from an inner dictionary; we skip the write-back in that case as an optimization.
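
To make the get, extract, eval, repackage, set description concrete, here is a minimal sketch that writes those steps out by hand and compares the result with the real dict with and with a plain dict set. The sample data (doc1, the values 10, 9 and 42) and the variable names original, byHand, viaWith and viaSet are invented for illustration. The hand-written steps are a simplification: they ignore the case where the body unsets a variable, and they are not meant to mirror how Tcl implements dict with internally.

```tcl
# Hypothetical sample data: key names follow the question, values are invented.
set docId doc1
set original [dict create $docId [dict create D_itemAt \
    [dict create 1 {D_dataKey 10 D_prevKey 9} 2 {} 0 {} e {}]]]

# 1. Roughly what a non-empty [dict with] does, written out by hand.
set byHand $original
set inner [dict get $byHand $docId D_itemAt 1]      ;# get
set names [dict keys $inner]
foreach k $names { set $k [dict get $inner $k] }    ;# extract into variables
set D_prevKey 42                                    ;# eval the "body"
foreach k $names { dict set inner $k [set $k] }     ;# repackage
dict set byHand $docId D_itemAt 1 $inner            ;# set (write back)

# 2. The same update through the real [dict with].
set viaWith $original
dict with viaWith $docId D_itemAt 1 { set D_prevKey 42 }

# 3. Write-only update: build the inner dictionary and store it in one step.
set viaSet $original
dict set viaSet $docId D_itemAt 1 [dict create D_dataKey 10 D_prevKey 42]

# 4. Read-only use: an empty body just loads the keys into variables.
unset D_dataKey D_prevKey
dict with viaSet $docId D_itemAt 1 {}
puts "D_dataKey=$D_dataKey D_prevKey=$D_prevKey"
```

All three updated dictionaries end up with the same inner dictionary for item type 1, and original is left unchanged because Tcl values are copied on write. For a write-only update like the one in the question, step 3 does the least work at the script level; step 4 is the case where dict with is a convenient way to read an inner dictionary into variables.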