coldfusion, struct, race-condition, intermittent

Possible race condition creating Structs in ColdFusion


I've been seeing intermittent errors in a couple of systems I've been working on. Both use the same methodology (though not the same code), which leads me to believe the problem may be linked to creating and using structs in the same request. Is it possible there's a race condition?

The scenario is this: We're on an e-commerce system, looking at a product, or in some cases a list of products. The code in question is designed to return the images associated with each product, in a struct that we can use for display of said images.

At the beginning of the request, the code looks for database records associated with the item in question. These records represent images for the product(s). These records are returned in a single CFQuery call (or more accurately, a call to a function which returns the results of a CFQuery call, shaped into a struct containing various info).

The code then loops through the supplied image struct, and adds various information to a Local struct. Later in the request we use the data in the struct to display the images in our <img> tags. We also populate the <img> tag with data- attributes for use with JavaScript.

In the case that any particular image was not correctly returned by the query - usually because the physical file is missing - we use a generic placeholder image. This is done by putting the struct creation inside a try/catch block.

Importantly: this works.

Very intermittently, however, when we refer to a node in the struct we've just created, we find that it does not exist and CF throws an error. This happens maybe 1% of the time, and on reloading the same page everything works perfectly.

I've had this same problem on multiple systems, on multiple servers, on different versions of ColdFusion (8 and 10, to be specific), and using completely different code to achieve similar results. The first system I saw this issue on used FileExists to check that the image file was available, so I thought the problem was probably caused by a filesystem bottleneck. I tried many ways around this and eventually eliminated the check altogether in the new system, but the problem persists.

The only thing I can think of is that when creating a struct and then using it later in the same request, a race condition can occur, whereby I refer to a node in the struct before it has finished being created. I'm not using threading here though, so I can't really see how that's possible... I'm out of other ideas.

Some code is below to show what I'm doing, but given that the same issue arises on completely different systems, I think it's the methodology rather than the code that has a problem.

<!--- Get product images --->
<cfset Local.stProductImages = Application.cfcParts.getPartImages(
        l_iItemID = Arguments.pid
) />


<!--- Loop through images --->
<cfloop list="#ListSort(structKeyList(Local.stProductImages['item_' & Arguments.pid]), 'text')#" index="i">
    <cftry>
        <cfset Local['ImageURL_' & i & '_Large']    = Local.stProductImages['item_' & Local.arguments.pid][i].large_watermarked.URL />
        <cfcatch>
            <cfset Local['ImageURL_' & i & '_Large']    = Application.com.Images.getMissingImages().large />
        </cfcatch>
    </cftry>                        
    <cftry>
        <cfset Local['ImageURL_' & i & '_Med']      = Local.stProductImages['item_' & Local.arguments.pid][i].med.URL />
        <cfcatch>
            <cfset Local['ImageURL_' & i & '_Med']      = Application.com.Images.getMissingImages().med />
        </cfcatch>
    </cftry>                        
    <cftry>
        <cfset Local['ImageURL_' & i & '_Small']        = Local.stProductImages['item_' & Local.arguments.pid][i].small.URL />
        <cfcatch>
            <cfset Local['ImageURL_' & i & '_Small']        = Application.com.Images.getMissingImages().small />
        </cfcatch>
    </cftry>                        

    <img class          = "altProdImg<cfif i EQ 'image_03'> endImage</cfif>" 
        src             = "#Local['ImageURL_' & i & '_Small']#" 
        image           = "#i#" 
        alt             = ""
        data-imgsmall   = "#Local['ImageURL_' & i & '_Small']#"
        data-imgmed     = "#Local['ImageURL_' & i & '_Med']#"
        data-imglarge   = "#Local['ImageURL_' & i & '_Large']#"
        data-imgnum     = "#i#"
        data-pid        = "#Arguments.pid#"
    />
</cfloop>

The error occurs in the <img> tag, when referring to a node created in the preceding code. It looks something like this:

Element ImageURL_image_02_Large is undefined in a Java object of type class coldfusion.runtime.LocalScope.

But only very occasionally... I'll reload and it'll work perfectly every time.

So... sorry for the epic length of question, but can anybody see how this could occur?


Solution

  • Answer from comments...

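    The behaviour you describe is symptomatic of not var scoping. If this code runs inside a function on a component that is shared between requests (for example, one cached in the Application scope), the unscoped loop index i lives in the component's Variables scope, so a concurrent request can change it between the cfset that writes the key and the <img> tag that reads it. The fix might be as simple as using index="local.i" in the cfloop tag (on ColdFusion 9 and later, where local is a built-in scope, you only need the scoping when writing the variable).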
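
    As a minimal sketch of the var-scoping fix, assuming the loop sits inside a function: the function name renderImageTags and the argument stImages are hypothetical, not part of the original code. It declares the index with var rather than using index="local.i", so it should also compile on ColdFusion 8, where var declarations have to come first in the function body.

    ```cfml
    <cffunction name="renderImageTags" access="public" returntype="void" output="true">
        <!--- stImages is a hypothetical struct keyed by image name, e.g. image_01 --->
        <cfargument name="stImages" type="struct" required="true" />

        <!--- Function-local variables: each call gets its own copy,
              so concurrent requests cannot overwrite each other's index --->
        <cfset var i = "" />
        <cfset var imageURL = "" />

        <cfloop list="#ListSort(StructKeyList(Arguments.stImages), 'text')#" index="i">
            <cfset imageURL = Arguments.stImages[i].small.URL />
            <img src="#imageURL#" data-imgnum="#i#" alt="" />
        </cfloop>
    </cffunction>
    ```

    Without the two var lines, i and imageURL would be written to the component's Variables scope and shared by every request that calls the function on the same instance.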


    Side note: A relatively easy way to check whether you're in a function, without going through the code, is to throw an error (e.g. <cfthrow message="where am i?" />) and then check the stack trace. If you see entries like coldfusion.runtime.UDFMethod or $funcSTUFF.runFunction(filename:line), you know you're inside a function (even when the template you're in shows no sign of it).
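
    If an uncaught error is too disruptive, one variation on that check is to catch the exception and log its stack trace instead. This is only a sketch: the log name whereami is a hypothetical choice.

    ```cfml
    <cftry>
        <cfthrow message="where am i?" />
        <cfcatch type="any">
            <!--- cfcatch.stackTrace holds the Java stack trace as a string.
                  "whereami" is a hypothetical log name; the entry is written to
                  whereami.log in the ColdFusion logs directory. --->
            <cflog file="whereami" type="information" text="#cfcatch.stackTrace#" />
        </cfcatch>
    </cftry>
    ```

    Writing to a log rather than the page means the result is still visible when the enclosing function suppresses output.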