
What is the SHCIDS_ALLFIELDS flag of IShellFolder.CompareIDs trying to be?


Short Version

What does the SHCIDS_ALLFIELDS flag of IShellFolder.CompareIDs mean?

Long Version

In Windows 95, Microsoft introduced the shell. Rather than assuming the computer is made up of files and folders, the shell treats it as an abstract namespace of items.

And in order to accommodate things that are not files and folders (e.g. network printers, Control Panel, my Android phone):

PIDLs are opaque blobs; each blob only makes sense to the folder that generated it.

In order to extend (or use) the shell namespace, you implement (or call) an IShellFolder interface.

One of the methods of IShellFolder is used to ask a namespace extension to compare two ID lists (PIDLs):

IShellFolder::CompareIDs method

Determines the relative order of two file objects or folders, given their item identifier lists.

HRESULT CompareIDs(
      [in] LPARAM             lParam,
      [in] PCUIDLIST_RELATIVE pidl1,
      [in] PCUIDLIST_RELATIVE pidl2
);

For many years, the LPARAM was documented to nearly always be 0. From shlobj.h c. 1999:

// IShellFolder::CompareIDs(lParam, pidl1, pidl2)
//   This function compares two IDLists and returns the result. The shell
//  explorer always passes 0 as lParam, which indicates "sort by name".
//  It should return 0 (as CODE of the scode), if two id indicates the
//  same object; negative value if pidl1 should be placed before pidl2;
//  positive value if pidl2 should be placed before pidl1.

And so you compared two ID Lists - whatever it meant to compare them, and we were done.
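The "CODE of the scode" wording means the ordering travels in the low 16 bits of the returned HRESULT. As a rough sketch, the helpers below mirror what the SDK macros ResultFromShort, HRESULT_CODE and SUCCEEDED do. The type and function names (hresult_t, result_from_short, compare_result, succeeded) are portable stand-ins made up for this illustration so it compiles without the Windows headers; they are not the SDK definitions themselves.

```cpp
#include <cassert>
#include <cstdint>

// Portable stand-in for HRESULT (a 32-bit signed value on Windows).
using hresult_t = std::int32_t;

// Mirrors SUCCEEDED(hr): the severity (sign) bit is clear.
inline bool succeeded(hresult_t hr)
{
    return hr >= 0;
}

// Mirrors ResultFromShort: success severity, facility 0, and the short
// reinterpreted as an unsigned 16-bit code in the low word.
inline hresult_t result_from_short(short s)
{
    return static_cast<hresult_t>(static_cast<std::uint16_t>(s));
}

// Mirrors the usual caller idiom (short)HRESULT_CODE(hr): take the low
// 16 bits and read them back as a signed value. Only meaningful when
// the comparison itself succeeded.
inline short compare_result(hresult_t hr)
{
    assert(succeeded(hr));
    return static_cast<short>(static_cast<std::uint16_t>(hr & 0xFFFF));
}
```

Note that a "negative" comparison result is still a success code: the sign lives in the 16-bit code, not in the HRESULT's severity bit.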

Windows 2000 added additional sorting option flags

Starting with Version 5 of the shell, the upper 16 bits of the LPARAM can contain additional flags to control how the IShellFolder should handle sorting.
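A minimal sketch of how an implementation might split the LPARAM into its column and flag parts. The numeric values are the ones the Windows SDK defines for SHCIDS_ALLFIELDS, SHCIDS_CANONICALONLY, SHCIDS_BITMASK and SHCIDS_COLUMNMASK; the local constant names, the CompareRequest struct and decode_lparam are hypothetical, introduced only so the example builds without the Windows headers.

```cpp
#include <cassert>
#include <cstdint>

// Values as defined in the Windows SDK, repeated under local names.
constexpr std::uint32_t kAllFields     = 0x80000000u; // SHCIDS_ALLFIELDS
constexpr std::uint32_t kCanonicalOnly = 0x10000000u; // SHCIDS_CANONICALONLY
constexpr std::uint32_t kBitMask       = 0xFFFF0000u; // SHCIDS_BITMASK
constexpr std::uint32_t kColumnMask    = 0x0000FFFFu; // SHCIDS_COLUMNMASK

// Both flags live entirely in the upper 16 bits.
static_assert(((kAllFields | kCanonicalOnly) & ~kBitMask) == 0,
              "flags must not overlap the column bits");

// Hypothetical decoded view of the lParam.
struct CompareRequest {
    unsigned column;     // low 16 bits: column to sort by
    bool canonicalOnly;  // caller wants the cheapest consistent ordering
    bool allFields;      // caller wants pidl identity, cached fields included
};

// LPARAM is pointer-sized on Windows; std::intptr_t stands in for it here.
inline CompareRequest decode_lparam(std::intptr_t lParam)
{
    const auto bits = static_cast<std::uint32_t>(lParam);
    return CompareRequest{
        bits & kColumnMask,
        (bits & kCanonicalOnly) != 0,
        (bits & kAllFields) != 0,
    };
}
```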

From ShObjIdl.idl c. the Windows 8.1 SDK:

// IShellFolder::CompareIDs lParam flags
// *these should only be used if the folder supports IShellFolder2*
//
// SHCIDS_ALLFIELDS
//
// only be used in conjunction with SHCIDS_CANONCALONLY or column 0.
// This flag requests that the folder test for *pidl identity*, that is
// "are these pidls logically the same". This implies that cached fields
// in the pidl that would distinguish them should be tested.
// Without this flag, you are comparing the *object* s the pidls refer to.
//
// SHCIDS_CANONICALONLY
//
// This indicates that the sort should be *the most efficient sort possible*, the implication
// being that the result will not be displayed to the UI: the SHCIDS_COLUMNMASK portion
// of the lParam can be ignored. (Before we had SHCIDS_CANONICALONLY
// we assumed column 0 was the "efficient" sort column.)

Note the important points here:

As Raymond Chen pointed out, it's the moral equivalent of a Unicode ordinal comparison.

The header file even notes that we used to just assume column 0 was the "fastest" sort. But now we will use a flag to say "use the fastest sort available":

Before we had SHCIDS_CANONICALONLY we assumed column 0 was the "efficient" sort column.

It also notes that you can ignore the lower 16 bits of LPARAM (i.e. the column), because we don't care - we're using the most efficient one.

A lot of this is mirrored in the official documentation:

SHCIDS_CANONICALONLY

Version 5.0. When comparing by name, compare the system names but not the display names. When this flag is passed, the two items are compared by whatever criteria the Shell folder determines are most efficient, as long as it implements a consistent sort function. This flag is useful when comparing for equality or when the results of the sort are not displayed to the user. This flag cannot be combined with other flags.

But with SHCIDS_ALLFIELDS we start to run off the rails

The header file notes that AllFields can only be used together with CanonicalOnly (or with column 0):

only be used in conjunction with SHCIDS_CANONCALONLY or column 0.

But the SDK documentation says that CanonicalOnly must appear alone:

This flag cannot be combined with other flags.

So which is it?

We could decide that the header file is wrong, that the SDK documentation is canon, and do what it says.

But what is AllFields saying?

There is some concept that AllFields is trying to ask for, but it is obscured behind the documentation.

Compare all the information contained in the ITEMIDLIST structure, not just the display names.

ItemIDList doesn't contain a display name, it contains an ItemIDList. Are they trying to say I should only look at the contents of the pidl blob?

In what situation could two references to the same file have different names, sizes, file times, attributes, etc?

The SDK examples do something different

The Windows SDK Explorer Data Provider Shell Extension sample (github) seems to act as though the CanonicalOnly and AllFields flags could appear together:

HRESULT CFolderViewImplFolder::CompareIDs(LPARAM lParam, PCUIDLIST_RELATIVE pidl1, PCUIDLIST_RELATIVE pidl2)
{
   HRESULT hr;
   if (lParam & (SHCIDS_CANONICALONLY | SHCIDS_ALLFIELDS))
   {
      // First do a "canonical" comparison, meaning that we compare with the intent to determine item
      // identity as quickly as possible.  The sort order is arbitrary but it must be consistent.
      // (Abridged here: declarations, error handling and cleanup are omitted.)
      _GetName(pidl1, &psz1);
      _GetName(pidl2, &psz2);
      hr = ResultFromShort(StrCmp(psz1, psz2));

      // If we've been asked to do an all-fields comparison, test for any other fields that
      // may be different in an item that shares the same identity.  For example if the item
      // represents a file, the identity may be just the filename but the other fields contained
      // in the idlist may be file size and file modified date, and those may change over time.
      // In our example let's say that "level" is the data that could be different on the same item.
      if ((ResultFromShort(0) == hr) && (lParam & SHCIDS_ALLFIELDS))
      {
         //...
      }
   }
   else
   {
      //...Compares by the column number in LOWORD of LPARAM
   }
   return hr;
}

So we have completely conflicting documentation, headers, and samples:

SHCIDS_ALLFIELDS

What is it trying to ask

Windows always assumed that column 0 was the fast column. This may have been because Windows shell API authors assumed that a PIDL's ItemID would always contain the name inside the pidl opaque blob.

This is reinforced by the fact that the shell STRRET structure lets you point to a string inside your pidl.

Bonus Reading: The kooky STRRET structure

And so at some point they added an express flag that says:

And that makes sense for the canonical flag:

But then what does the SDK example mean when they talk about the All Fields option:

If we've been asked to do an all-fields comparison, test for any other fields that may be different in an item that shares the same identity. For example:

  • if the item represents a file, the identity may be just the filename
  • but the other fields contained in the idlist may be file size and file modified date, and those may change over time.

If two PIDLs represent the same file, what is the point in comparing their size, date, etc? I already told you they were the same file, so what are you asking me for with the All Fields flag? Why can't I just do a binary compare of the blobs? Why won't the shell? What does CompareIDs do that

MemCmp(pidl1, pidl2)

doesn't?

What does it want me to do if SHCIDS_ALLFIELDS is passed? Should I hit the underlying data store to query all fields I know of?

Is CompareIDs used to compare IDs, or is it used to compare objects?

I wondered if the purpose of CompareIDs was to absolutely not hit the underlying data store (e.g. hard disk, phone over USB, Mapi), and only compare based on what you have on-hand in the pidl.

That makes sense for two reasons:

And so perhaps SHCIDS_CANONICALONLY means:

Is that the case?

Bonus Question

What does it mean to "sort" two itemID lists?

The SDK example does a switch based on each column, and looks up the values for every column. Does that mean I have to load a video over a network just to read its audio sample rate?


Solution

  • The SDK example is basically correct (it depends on the pidl contents). if (lParam & (SHCIDS_CANONICALONLY | SHCIDS_ALLFIELDS)) is obviously the same as if ((lParam & SHCIDS_CANONICALONLY) || (lParam & SHCIDS_ALLFIELDS)), but that does not tell us whether the two flags can be combined. The answer to that is: I don't know, but I don't see why not.

    Only members of the Microsoft shell team know the true answer but we can speculate.

    Win95 basically had 4 standard fields. You can see them in the documentation for the older IShellDetails interface:

    File system folders have a large standard set of information fields. The first four fields are standard for all file system folders.

    Index | Title
    -------------
    0       Name
    1       Size
    2       Type
    3       Date Modified
    

    File system folders may support a number of additional fields. However, they are not required to do so and the column indexes assigned to these fields may vary.

    Each virtual folder has its own unique set of information fields. Typically, the item's display name is in column zero, but the order and content of the available fields depend on the implementation of the particular folder object.

    Then in Windows 2000 things changed when support for shell extension column handlers was added. This was the basis for the property system powering Vista's stacking support etc., and the column index is the poor man's mapping to/from the PROPERTYKEY for the item's properties (PROPERTYKEY was known as SHCOLUMNID back then).

    SHCIDS_CANONICALONLY:

    The important piece here is CANONICAL.

    MSDN says

    When comparing by name, compare the system names but not the display names.

    The shell is not consistent in its use of the term "display name", but what this actually means is: compare the parse name, not the name you see in Explorer.

    For example, a folder view might contain "foo" and "foo" files but in reality they are "foo.jpg" and "foo.png" but the "hide file extensions" feature hides the true names.

    The IShellFolder implementation knows which property (column) from its pidl is unique for each item in its folder and should use that to compare.

    SHCIDS_ALLFIELDS:

    This just means that you want to compare all supported columns until you find a difference.

    It can be implemented as:

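    HRESULT hr = ResultFromShort(0); // Equal until a column says otherwise
    for (UINT i = 0; i < mycolumncount; ++i)
    {
      hr = CompareIDs(i, pidl1, pidl2);
      if (hr && SUCCEEDED(hr)) break; // Stop at the first column that differs
    }
    return hr;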
    

    Bonus Question

    SHCIDS_CANONICALONLY does not care what you compare, it can be localized/customized or not. Storing localized data in a pidl is a bad idea so in most cases it is not.

    Other columns are generally not compared as localized data either. Ideally your comparison function is lower level than your display code and localized strings are only returned when you have to return a string to the caller.

    There are two consumers of item properties:

    IShellFolder::GetDisplayNameOf retrieves the "main column" where SHGDN_NORMAL is the localized/customized name and SHGDN_FORPARSING is often the same as the property compared by SHCIDS_CANONICALONLY.

    Implementation Example

    typedef struct { UINT16 cb; WCHAR name[99]; UINT size; bool isFolder; } MYITEM;
    enum { COL_NAME = 0, COL_SIZE, COLCOUNT, COLCANONICAL = COL_NAME };
    
    MYITEM* GetDataPtr(PCUIDLIST_RELATIVE pidl) { ... }
    bool IsFolder(MYITEM*p) { ... }
    
    void GetForDisplay_Name(WCHAR*buf, MYITEM*p)
    {
      lstrcpy(buf, p->name);
      SHGetSetSettings(...);
      if (!ss.fShowExtensions && !IsFolder(p)) PathRemoveExtension(buf); // Assuming p->name is a "filenameish" property.
    }
    
    void GetForDisplay_Size(WCHAR*buf, MYITEM*p)
    {
      // Localized size string returned by GetDetailsOf, not used by CompareIDs
    }
    
    HRESULT CompareIDs(LPARAM lParam, PCUIDLIST_RELATIVE pidl1, PCUIDLIST_RELATIVE pidl2)
    {
      HRESULT hr = E_FAIL; // Bad column
      MYITEM *p1 = GetDataPtr(pidl1), *p2 = GetDataPtr(pidl2); // A real implementation must validate items
    
      if (lParam & (SHCIDS_CANONICALONLY | SHCIDS_ALLFIELDS))
      {
        hr = ResultFromShort(StrCmp(p1->name, p2->name));
    
        if ((ResultFromShort(0) == hr) && (lParam & SHCIDS_ALLFIELDS))
        {
          for (UINT i = 0; i < COLCOUNT; ++i)
          {
            // if (COLCANONICAL == i) continue; // This optimization might be valid, depending on the difference between an item's canonical and display name
            hr = CompareIDs(i, pidl1, pidl2);
            if (hr && SUCCEEDED(hr)) break;
          }
        }
    
        return hr;
      }
    
      WCHAR b1[99], b2[99];
      switch(LOWORD(lParam))
      {
      case COL_NAME:
        GetForDisplay_Name(b1, p1);
        GetForDisplay_Name(b2, p2);
        return ResultFromShort(StrCmp(b1, b2));
      case COL_SIZE:
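        // Compare rather than subtract: a UINT difference truncated to a short can flip sign or become 0
        return ResultFromShort((p1->size > p2->size) - (p1->size < p2->size));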
      }
      return hr;
    }
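
To make the three modes easier to compare side by side, here is a portable sketch of the reading given above: the per-column comparison uses what the user sees, the canonical comparison uses the unique parse name, and the all-fields comparison keeps going through the cached columns until one differs. Everything in it (Item, three_way, display_name, compare_column, compare_canonical, compare_all_fields) is hypothetical and stands in for a real pidl layout and the Windows types; it is an illustration of the idea, not the shell's implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the data a folder might cache in its pidl.
struct Item {
    std::wstring parseName;  // unique within the folder, e.g. L"foo.jpg"
    std::uint32_t size;      // cached field; may be stale in an older pidl
};

// Three-way helper: returns -1, 0 or 1 without the overflow and
// truncation problems of subtracting unsigned values.
template <typename T>
int three_way(const T& a, const T& b)
{
    return (a > b) - (a < b);
}

// Name shown to the user when extensions are hidden (simplified).
inline std::wstring display_name(const Item& it)
{
    const auto dot = it.parseName.rfind(L'.');
    return dot == std::wstring::npos ? it.parseName
                                     : it.parseName.substr(0, dot);
}

enum Column : unsigned { ColName = 0, ColSize = 1, ColCount = 2 };

// Per-column comparison, as a view would request for sorting.
inline int compare_column(unsigned column, const Item& a, const Item& b)
{
    assert(column < ColCount);
    switch (column) {
    case ColName: return three_way(display_name(a), display_name(b));
    case ColSize: return three_way(a.size, b.size);
    }
    return 0;
}

// Canonical comparison: identity only, using the unique parse name.
inline int compare_canonical(const Item& a, const Item& b)
{
    return three_way(a.parseName, b.parseName);
}

// All-fields comparison: identity first, then every cached column
// until one differs.
inline int compare_all_fields(const Item& a, const Item& b)
{
    int r = compare_canonical(a, b);
    for (unsigned c = 0; r == 0 && c < ColCount; ++c)
        r = compare_column(c, a, b);
    return r;
}
```

With these definitions, foo.jpg and foo.png compare equal on the name column when extensions are hidden but differ canonically, while two snapshots of foo.jpg with cached sizes 10 and 70000 are canonically equal yet differ in an all-fields comparison. The size difference in that last pair does not fit in 16 bits, which is why the sketch compares the sizes instead of subtracting them.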