
How are Short File Names generated in Windows?


I am currently using the following P/Invoke signature to get the short filename of a regular Windows file:

[DllImport("kernel32.dll", CharSet = CharSet.Auto)]
public static extern int GetShortPathName([MarshalAs(UnmanagedType.LPTStr)] string path,
                                          [MarshalAs(UnmanagedType.LPTStr)] StringBuilder shortPath,
                                          int shortPathLength);

This works without any problems, but I noticed something rather peculiar.
I know that Windows uses the following short filename convention:

Cut the name to 6 characters (without extension)
Append the tilde (~)
Append an unsigned integer number which indicates the match index (starting with 1)
Append the original file extension

Thus, the file C:\abcdefghijklmn.txt should be accessible under the short name C:\abcdef~1.txt (which works perfectly fine).

Now the strange part: I recently performed a small search inside my music directory for specific audio files. This was the result:

.\Rammstein & Tatu - Moscow.mp3
.\Rammstein - Asche zu Asche.mp3
.\Rammstein - Der Meister.mp3
.\Rammstein - Du Hast.mp3
.\Rammstein - Eifersucht.mp3
.\Rammstein - Feuer Frei.mp3
.\Rammstein - Führe Mich.mp3
.\Rammstein - Haifisch.mp3
...

And the same search in short notation:

.\RA8E17~1.MP3
.\RA23A6~1.MP3
.\RAMMST~1.MP3
.\RA0CAE~1.MP3
.\RAMMST~2.MP3
.\RAMMST~3.MP3
.\RAMMST~4.MP3
.\RA6BAA~1.MP3
...

My question is: why does Windows generate such "random" prefixes before the tilde (like RA23A6 or RA0CAE)?


Solution

  • Microsoft does not document this, but Wikipedia does:

    8.3 filename:

    Although there is no compulsory algorithm for creating the 8.3 name from an LFN, Windows uses the following convention:

    1. If the LFN is 8.3 uppercase, no LFN will be stored on disk at all.

    • Example: TEXTFILE.TXT

    2. If the LFN is 8.3 mixed case, the LFN will store the mixed-case name, while the 8.3 name will be an uppercased version of it.

    • Example: TextFile.Txt becomes TEXTFILE.TXT.

    3. If the filename contains characters not allowed in an 8.3 name (including space which was disallowed by convention though not by the APIs) or either part is too long, the name is stripped of invalid characters such as spaces and extra periods. Other characters such as + are changed to the underscore _, and uppercased. The stripped name is then truncated to the first 6 letters of its basename, followed by a tilde, followed by a single digit, followed by a period ., followed by the first 3 characters of the extension.

    • Example: TextFile1.Mine.txt becomes TEXTFI~1.TXT (or TEXTFI~2.TXT, should TEXTFI~1.TXT already exist). ver +1.2.text becomes VER_12~1.TEX.

    4. Beginning with Windows 2000, if at least 4 files or folders already exist with the same initial 6 characters in their short names, the stripped LFN is instead truncated to the first 2 letters of the basename (or 1 if the basename has only 1 letter), followed by 4 hexadecimal digits derived from an undocumented hash of the filename, followed by a tilde, followed by a single digit, followed by a period ., followed by the first 3 characters of the extension.

    • Example: TextFile.Mine.txt becomes TE021F~1.TXT.
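
To make rules 3 and 4 concrete, here is a minimal C# sketch of the name shapes they describe. It is an approximation of the quoted convention, not Windows' actual implementation. The class and method names (ShortNameSketch, Rule3, Rule4) are hypothetical, and the set of characters mapped to _ is an assumption, since the quoted text only names +. Non-ASCII characters such as ü are simply uppercased here, which may not match what Windows does. Because the hash is undocumented, Rule4 takes it as a parameter instead of computing it.

```csharp
using System;
using System.Linq;

public static class ShortNameSketch
{
    // Assumed set of characters replaced by '_'; the quoted text only names '+'.
    private const string Replaced = "+,;=[]";

    // Drops spaces and periods, maps the assumed characters to '_', uppercases the rest.
    private static string Clean(string s) => new string(s
        .Where(c => c != ' ' && c != '.')
        .Select(c => Replaced.IndexOf(c) >= 0 ? '_' : char.ToUpperInvariant(c))
        .ToArray());

    // Builds <basename prefix><middle>~<index>[.<ext>], splitting at the last period.
    private static string Build(string longName, int keep, string middle, int index)
    {
        int lastDot = longName.LastIndexOf('.');
        string baseName = Clean(lastDot > 0 ? longName.Substring(0, lastDot) : longName);
        string ext = Clean(lastDot > 0 ? longName.Substring(lastDot + 1) : "");
        if (baseName.Length > keep) baseName = baseName.Substring(0, keep);
        if (ext.Length > 3) ext = ext.Substring(0, 3);
        string result = baseName + middle + "~" + index;
        return ext.Length > 0 ? result + "." + ext : result;
    }

    // Rule 3: first 6 basename characters, tilde, index, first 3 extension characters.
    public static string Rule3(string longName, int index = 1) =>
        Build(longName, 6, "", index);

    // Rule 4: first 2 basename characters plus 4 hex digits of a caller-supplied hash.
    public static string Rule4(string longName, ushort hash, int index = 1) =>
        Build(longName, 2, hash.ToString("X4"), index);
}
```

With this sketch, ShortNameSketch.Rule3("TextFile1.Mine.txt") yields TEXTFI~1.TXT and ShortNameSketch.Rule3("ver +1.2.text") yields VER_12~1.TEX, matching the quoted examples. ShortNameSketch.Rule4("TextFile.Mine.txt", 0x021F) yields TE021F~1.TXT, where 0x021F is just the hash value taken from the example above. This also fits the listing in the question: four files share the RAMMST prefix (~1 to ~4), and the others fall back to the RA plus four hex digits form.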

    As Joey mentioned, the undocumented hash of the filename has been reverse engineered.
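
Since the hash is undocumented and 8.3 name generation can be disabled on a volume, a predicted short name may differ from what is actually on disk. The reliable approach is to ask Windows. Below is a hedged sketch that wraps the GetShortPathName signature from the question. The wrapper class ShortPathQuery and its method GetShortPath are hypothetical names, and SetLastError = true is added here so that failures can be reported. It only works on Windows.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;
using System.Text;

public static class ShortPathQuery
{
    // Signature from the question, with SetLastError added (an addition of this sketch).
    [DllImport("kernel32.dll", CharSet = CharSet.Auto, SetLastError = true)]
    private static extern int GetShortPathName(string path, StringBuilder shortPath, int shortPathLength);

    public static string GetShortPath(string longPath)
    {
        // With no buffer, the return value is the required buffer size in characters,
        // including the terminating null; 0 signals failure.
        int required = GetShortPathName(longPath, null, 0);
        if (required == 0) throw new Win32Exception(Marshal.GetLastWin32Error());

        var buffer = new StringBuilder(required);
        int written = GetShortPathName(longPath, buffer, buffer.Capacity);
        if (written == 0) throw new Win32Exception(Marshal.GetLastWin32Error());
        return buffer.ToString();
    }
}
```

Calling ShortPathQuery.GetShortPath on each file in the music directory would reproduce the second listing from the question. If a file has no short name, the function returns the long path unchanged.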