multithreading, com, marshalling, domdocument, apartments

FreeThreadedDOMDocument, Neutral Apartments and Free-Threaded Marshaler


As MSDN states:

If you are writing a single threaded application (or a multi-threaded application where only one thread accesses a DOM at one time), use the rental threaded model (Msxml2.DOMDocument.3.0 or Msxml2.DOMDocument.6.0). If you are writing an application where multiple threads will simultaneously access a DOM, use the free threaded model (Msxml2.FreeThreadedDOMDocument.3.0 or Msxml2.FreeThreadedDOMDocument.6.0).

Is there any connection between FreeThreadedDOMDocument, neutral apartments and the free-threaded marshaler? I looked in OleView and found that FreeThreadedDOMDocument's threading model is Both. As far as I understand, neutral apartment objects are supported with a free-threaded marshaler. Does it mean that FreeThreadedDOMDocument doesn't use a free-threaded marshaler, and that calling it free-threaded is a bit confusing?

What is the implementation difference between COM classes that are marked as Free, Both or Neutral? As far as I understand they all must be thread-safe, so why is there a difference? Is it correct that Neutral should support a free-threaded marshaler?


Solution

  • There are multiple questions here.

    TL;DR:

    Neutral objects:

    Free threaded objects:

    Is there any connection between FreeThreadedDOMDocument, neutral apartments and free-threaded marshaler?

    TL;DR: The FreeThreadedDOMDocument's threading model is "Both", so it's tied to the apartment where it's activated (created). It aggregates the free threaded marshaler, so it's a free threaded object.

    The FreeThreadedDOMDocument is a COM class whose objects aggregate the free threaded marshaler. This marshaler provides a raw pointer whenever marshaling in-process (i.e. IMarshal::MarshalInterface with dwDestContext set to MSHCTX_INPROC).

    I'll define a free threaded object as an object which aggregates the free threaded marshaler.

    A free threaded object's threading model should be specified as "Neutral", or "Both" before Windows 2000, so it can be created and used in any thread, avoiding context switches.

    If its threading model is specified as "Both", the object's lifetime is tied to the apartment where it was created. For instance, if an STA thread terminates, all free threaded objects created within that apartment are either destroyed or no longer valid.
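
    To make the definition above concrete, here is a minimal sketch of an object that aggregates the free threaded marshaler. The class name FreeThreadedThing is hypothetical and the object exposes nothing but IUnknown and the borrowed IMarshal; it assumes a Windows SDK toolchain and linking against ole32 and uuid. A real class would more likely get this from ATL or a similar framework.

    ```cpp
    #include <windows.h>
    #include <objbase.h>
    #include <atomic>
    #include <new>

    // Hypothetical example class: a COM object whose IMarshal comes from
    // the aggregated free threaded marshaler (FTM).
    class FreeThreadedThing : public IUnknown {
    public:
        // Returns an object with a reference count of 1, or nullptr on failure.
        static FreeThreadedThing* Create() {
            FreeThreadedThing* p = new (std::nothrow) FreeThreadedThing();
            if (!p) return nullptr;
            // Passing the object as the controlling unknown aggregates the FTM.
            HRESULT hr = CoCreateFreeThreadedMarshaler(p, &p->ftm_);
            if (FAILED(hr)) { p->Release(); return nullptr; }
            return p;
        }

        STDMETHODIMP QueryInterface(REFIID riid, void** ppv) override {
            if (!ppv) return E_POINTER;
            *ppv = nullptr;
            if (IsEqualIID(riid, IID_IUnknown)) {
                *ppv = static_cast<IUnknown*>(this);
                AddRef();
                return S_OK;
            }
            if (IsEqualIID(riid, IID_IMarshal) && ftm_) {
                // Hand out the FTM's IMarshal; its AddRef/Release delegate to us.
                return ftm_->QueryInterface(riid, ppv);
            }
            return E_NOINTERFACE;
        }

        STDMETHODIMP_(ULONG) AddRef() override { return ++refs_; }

        STDMETHODIMP_(ULONG) Release() override {
            ULONG left = --refs_;
            if (left == 0) delete this;
            return left;
        }

    private:
        FreeThreadedThing() = default;
        ~FreeThreadedThing() { if (ftm_) ftm_->Release(); }

        std::atomic<ULONG> refs_{1};
        IUnknown* ftm_ = nullptr;  // inner (non-delegating) unknown of the FTM
    };
    ```

    Once IMarshal is answered this way, in-process marshaling of the object yields a raw pointer, so its methods can be entered from any thread and must synchronize on their own.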

    As far as I understand neutral apartment objects are supported with a free-threaded marshaler.

    No. Proxies to neutral objects are a bit lighter than other in-process proxies: such a proxy only sets up a COM context, never incurs full marshaling, and avoids thread switching.

    Does it mean that FreeThreadedDOMDocument doesn't use a free-threaded marshaler and it is called a bit confusing as free-threaded?

    No, the FreeThreadedDOMDocument does use the free threaded marshaler.

    Historically, there were already free threaded objects before Microsoft provided its own support for them (due to popularity, and probably because most free threaded marshalers out there were flaky), and the Neutral apartment appeared only in Windows 2000.

    As such, instances of FreeThreadedDOMDocument are free threaded because they aggregate the free threaded marshaler, and the lifetime of each instance is tied to the apartment where it was created. Usually, there's little impact, but with e.g. a thread-pool of STA threads, the effect is observed more often, because STAs come and go as the owning threads terminate (either normally or to reclaim resources) and get created. For instance, classic ASP uses STA threads by default.

    PS: I've mentioned the following subject in another answer, but I believe the content is a bit different as the questions are different too.

    Here are the current threading model values:

    For any apartment that doesn't exist, COM creates it if needed.

    There are several peculiarities here:

    The main STA is the first created STA. It only matters for classes with an unspecified threading model.

    There can be several STAs, but there is at most one MTA and one NA.

    While there is an active MTA, any thread that has not initialized COM is implicitly in the MTA, even without calling CoInitializeEx(NULL, COINIT_MULTITHREADED). Such a thread doesn't affect the MTA's lifetime at all, though, meaning the MTA may be destroyed while the thread is using it. Since this is scarcely documented and pretty much unreliable, you shouldn't rely on it.

    Implicitly created apartments are called host STA and host MTA. You have no control over them (except by cheating with CoUninitialize while in that apartment; note: don't actually do this). In fact, if you activate "Apartment" objects outside of an STA or outside of an NA running over an STA, they'll be activated in a host STA. For further confusion, this might also be the main STA if the host STA was the first STA to be initialized.

    All COM threads that support host apartments are background threads, so they don't keep your application from exiting.

    You have no control whatsoever over the NA, other than creating it when activating a neutral object. You cannot enter it directly, but you can create your own neutral object with a method that runs a callback in the context of the neutral apartment. This callback could be a free threaded object.

    What is the implementation difference between COM classes that marked as Free, Both or Neutral?

    COM classes with the threading model declared as "Free" will result in objects that belong to the MTA. Such objects may assume that the threads they run in don't have to pump window messages. Essentially, they may block.

    Free threaded objects and neutral objects must be prepared to run under any apartment. For free threaded objects, it should be obvious why: they bypass any context marshaling, so methods execute in any thread. For neutral objects, what varies is the kind of apartment that was active when the call arrived, which can be checked through CoGetApartmentType.

    In either case, you should use COM's utility functions, like CoWaitForMultipleHandles, instead of WaitForMultipleObjects[Ex], which blocks without pumping messages and is unacceptable in STAs, or MsgWaitForMultipleObjects[Ex], which accesses the window message queue, probably creating it implicitly, and is usually unacceptable in MTAs.

    You can check the apartment type by yourself and choose to use the proper Win32 waiting functions or to use a polling strategy which waits and pumps messages with timeouts in the STA, in case you're waiting on something other than handles or if you require a specific waiting logic.
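
    As a rough sketch of both options, the helpers below check whether the calling thread is expected to pump messages and wait on a single handle through COM. The names MustPumpMessages and WaitForHandle are made up for this example; CoGetApartmentType and CoWaitForMultipleHandles are the real APIs (the former needs Windows 7 or later), and linking against ole32 is assumed.

    ```cpp
    #include <windows.h>
    #include <objbase.h>

    // Hypothetical helper: true when the calling thread runs in an STA, or in
    // the NA entered from an STA thread, so it is expected to pump messages.
    inline bool MustPumpMessages() {
        APTTYPE type;
        APTTYPEQUALIFIER qualifier;
        if (FAILED(CoGetApartmentType(&type, &qualifier))) {
            return false;  // e.g. COM is not initialized on this thread
        }
        if (type == APTTYPE_STA || type == APTTYPE_MAINSTA) {
            return true;
        }
        return type == APTTYPE_NA &&
               (qualifier == APTTYPEQUALIFIER_NA_ON_STA ||
                qualifier == APTTYPEQUALIFIER_NA_ON_MAINSTA);
    }

    // Hypothetical helper: waits on one handle and lets COM choose a wait
    // that suits the current apartment. Returns S_OK when the handle is
    // signaled and RPC_S_CALLPENDING when the timeout elapses first.
    inline HRESULT WaitForHandle(HANDLE handle, DWORD timeoutMs) {
        DWORD index = 0;
        // Flags value 0 is the default behavior: wait for any handle, not alertable.
        return CoWaitForMultipleHandles(0, timeoutMs, 1, &handle, &index);
    }
    ```

    A method of a free threaded or neutral object could call WaitForHandle without knowing which apartment it was entered from, and fall back to MustPumpMessages only when it needs custom waiting logic.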

    The most striking difference between free threaded objects and neutral objects is marshaling of other COM objects.

    While using neutral objects, incoming and outgoing interface pointers are automatically marshaled. For instance, you can store incoming interface pointers in fields.

    While using free threaded objects, incoming and outgoing interface pointers are not marshaled at all, meaning either you get raw pointers to objects in the same apartment, or you get proxies to objects in other apartments. These proxies are tied to the current apartment too.

    For instance, an incoming raw pointer means you're getting an object that belongs to the current apartment, so you'll have to marshal it if you intend to store a reference to the object.

    An incoming proxy means you're getting a proxy to an object in another apartment, but this proxy is tied to the current apartment. You can't store this proxy either. Specifically, notwithstanding standard proxy/stubs' apartment verification, STA proxies may have thread affinity. You must marshal it as well. But don't worry, marshaling a proxy will not stack marshaling; when you unmarshal again, you'll get a proxy to the object, not a proxy to a proxy.

    When a free threaded object must store an interface pointer, it must always do so through manual marshaling, and when it must call methods through this interface pointer, it must first unmarshal it manually.

    Usually, the Global Interface Table (GIT; another misleading name, it's actually an in-process table) is used for this purpose.
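
    A minimal sketch of that pattern follows, assuming a Windows SDK toolchain and linking against ole32 and uuid. The helper names StoreInGit, LoadFromGit and RemoveFromGit are invented for this example; IGlobalInterfaceTable and CLSID_StdGlobalInterfaceTable are the real COM names. A free threaded object would keep the DWORD cookie in a field instead of the interface pointer itself.

    ```cpp
    #include <windows.h>
    #include <objbase.h>

    // Hypothetical helper: obtains the process-wide Global Interface Table.
    inline HRESULT GetGit(IGlobalInterfaceTable** git) {
        return CoCreateInstance(CLSID_StdGlobalInterfaceTable, nullptr,
                                CLSCTX_INPROC_SERVER, IID_IGlobalInterfaceTable,
                                reinterpret_cast<void**>(git));
    }

    // Marshals an interface pointer into the GIT. The resulting cookie can be
    // stored and used from any apartment in the process.
    inline HRESULT StoreInGit(IUnknown* object, REFIID riid, DWORD* cookie) {
        IGlobalInterfaceTable* git = nullptr;
        HRESULT hr = GetGit(&git);
        if (FAILED(hr)) return hr;
        hr = git->RegisterInterfaceInGlobal(object, riid, cookie);
        git->Release();
        return hr;
    }

    // Unmarshals a pointer that is valid in the caller's apartment. The caller
    // must Release it after use and should not cache it across calls.
    inline HRESULT LoadFromGit(DWORD cookie, REFIID riid, void** object) {
        IGlobalInterfaceTable* git = nullptr;
        HRESULT hr = GetGit(&git);
        if (FAILED(hr)) return hr;
        hr = git->GetInterfaceFromGlobal(cookie, riid, object);
        git->Release();
        return hr;
    }

    // Removes the entry once the stored reference is no longer needed.
    inline HRESULT RemoveFromGit(DWORD cookie) {
        IGlobalInterfaceTable* git = nullptr;
        HRESULT hr = GetGit(&git);
        if (FAILED(hr)) return hr;
        hr = git->RevokeInterfaceFromGlobal(cookie);
        git->Release();
        return hr;
    }
    ```

    In use, a method that receives an interface pointer would call StoreInGit right away, and every later method that needs the object would call LoadFromGit, use the pointer, and Release it before returning.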

    As far as I understand they all must be thread-safe, why is the difference?

    Regarding thread-safety, there's no difference.

    But as I explained in the previous question, there's a huge difference when storing interface pointers, and a subtle difference regarding object activation and lifetime.

    Is it correct that Neutral should support a free-threaded marshaler?

    The free threaded marshaler effectively ignores the apartment, so it's the methods' responsibility to behave, synchronize and/or lock correctly. So no apartment needs to support the free threaded marshaler; it's the free threaded object that must support every apartment.

    It's possible to aggregate the free threaded marshaler in objects with any threading model, including "Neutral".

    If you find that the context set up by the neutral apartment marshaler is somehow a bottleneck, then you may consider using the free threaded marshaler, at the cost of manually marshaling stored interface pointers. If not, just use the neutral apartment.