wcf persistence workflow workflow-foundation-4 workflowservice

Broken WF4 workflow rehydration


Consider a WF4 project running in IIS, with a single workflow definition (xamlx) and a SqlWorkflowInstanceStore for persistence. Instead of hosting the xamlx directly, we host a WorkflowServiceHostFactory, which spins up a dedicated WorkflowServiceHost on a separate endpoint for every customer.

This ran fine for a while, until we needed a new version of the workflow definition, so alongside Flow.xamlx we now have Flow1.xamlx. All interactions with the workflow service are wrapped in business logic that is smart enough to identify the required version, so this homebrew versioning works fine for newly started workflows (on both Flow.xamlx and Flow1.xamlx).

However, workflows started before this change fail to reactivate (on a POST, the service host throws an UnknownMessageReceived exception). Since WF isn't very verbose about WHY it can't reactivate a workflow (wrong version, instance not found, lock, etc.), we attached SQL Profiler to the database.

It turns out the 'WorkflowServiceType' the WorkflowServiceHost uses in its queries is different from the stored instances' WorkflowServiceType. Likely this is why it fails to detect the persisted instance.

Since I'm pretty sure I instantiate the same xamlx, I can't understand where this value is coming from. What parameters go into the calculation of this Guid, does the environment (such as the site name) matter, and what can I do to reactivate the workflow?


Solution

  • In the end I decompiled System.Activities.DurableInstancing. The only setter for WorkflowHostType on SqlWorkflowInstanceStore was in ExtractWorkflowHostType:

    private void ExtractWorkflowHostType(IDictionary<XName, InstanceValue> commandMetadata)
    {
        InstanceValue instanceValue;
        if (commandMetadata.TryGetValue(WorkflowNamespace.WorkflowHostType, out instanceValue))
        {
            XName xName = instanceValue.Value as XName;
            if (xName == null)
            {
                throw FxTrace.Exception.AsError(new InstancePersistenceCommandException(SR.InvalidMetadataValue(WorkflowNamespace.WorkflowHostType, typeof(XName).Name)));
            }
            byte[] bytes = Encoding.Unicode.GetBytes(xName.ToString());
            base.Store.WorkflowHostType = new Guid(HashHelper.ComputeHash(bytes));
            this.fireRunnableInstancesEvent = true;
        }
    }
    

    I couldn't clearly disentangle the calling code path, so I had to find out at runtime by attaching WinDbg/SOS to IIS and breaking on HashHelper.ComputeHash.

    I was able to retrieve the XName that goes into the hash calculation. Its local name is the service file name, and its namespace is [sitename]/[path]/.

    In the end the WorkflowHostType calculation comes down to:

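    // XName.ToString() yields "{namespace}localName", and that string is what gets hashed.
    var xName = XName.Get("Flow.xamlx.svc", "/examplesite/WorkflowService/1/");
    var bytes = Encoding.Unicode.GetBytes(xName.ToString());
    // HashHelper is internal to the framework, so this line only mirrors the decompiled code.
    var workflowHostType = new Guid(HashHelper.ComputeHash(bytes));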
    

    Bottom line: apparently workflows can only be rehydrated when the service file name, site name and path are all identical (case sensitive) to what they were when the workflows were started.
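
The calculation above cannot be compiled as written, because HashHelper is not a public type. The following is a rough sketch for reproducing the value outside the framework, for example to compare it against what SQL Profiler shows. It assumes that HashHelper.ComputeHash produces a plain MD5 digest; that is an assumption, not something confirmed in this answer, so check the result against a known persisted instance before relying on it. The class name WorkflowHostTypeSketch and the method Compute are made up for illustration.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;
using System.Xml.Linq;

// Hypothetical helper, not part of the framework.
static class WorkflowHostTypeSketch
{
    // serviceFile: local name of the XName, e.g. "Flow.xamlx.svc"
    // sitePath:    namespace of the XName, e.g. "/examplesite/WorkflowService/1/"
    public static Guid Compute(string serviceFile, string sitePath)
    {
        XName xName = XName.Get(serviceFile, sitePath);

        // Same input as in the decompiled code: the UTF-16 bytes of "{namespace}localName".
        byte[] bytes = Encoding.Unicode.GetBytes(xName.ToString());

        // ASSUMPTION: MD5 stands in for the internal HashHelper.ComputeHash.
        using (MD5 md5 = MD5.Create())
        {
            // An MD5 digest is 16 bytes, which is the length the Guid(byte[]) constructor requires.
            return new Guid(md5.ComputeHash(bytes));
        }
    }

    static void Main()
    {
        // Same file and path: prints the same Guid on every run.
        Console.WriteLine(Compute("Flow.xamlx.svc", "/examplesite/WorkflowService/1/"));

        // Only the casing of the file name differs, yet the Guid is different.
        Console.WriteLine(Compute("flow.xamlx.svc", "/examplesite/WorkflowService/1/"));
    }
}
```

Because the input is hashed as a raw string, any difference in casing, site name or path gives an unrelated Guid, which fits the behaviour described above.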