c++getopenfilename

OPENFILENAME dialog returns Asian letters instead of file path


I'm trying to create a function that gets a filepath from the OPENFILENAME dialog. My code looks something like this.

wstring src;

bool open()
{
    const string title  = "Select a File";

    wchar_t filename[MAX_PATH];

    OPENFILENAMEA ofn;
    ZeroMemory(&filename, sizeof(filename));
    ZeroMemory(&ofn, sizeof(ofn));

    ofn.lStructSize     = sizeof(ofn);
    ofn.hwndOwner       = NULL; 
    ofn.lpstrFilter     = "Music (.mp3)\0*.mp3\0All\0*.*\0";
    ofn.lpstrFile       = LPSTR(filename);
    ofn.nMaxFile        = MAX_PATH;
    ofn.lpstrTitle      = title.c_str();
    ofn.Flags           = OFN_DONTADDTORECENT | OFN_FILEMUSTEXIST;

    if (GetOpenFileNameA(&ofn))
    {
        src = filename;    //<----------Save filepath in global variable
        return true;
    }
    return false;
}

Upon placing a breakpoint at the commented row, I could inspect the value of 'src' as well as 'filename', which at this point were, to me, unidentifiable letters of asian origin. Why does this happen? Is this a conversion issue?

EDIT:

Thanks to a quick reply and a few comments, the code is now fully functional. Thanks Hans Passant for a very direct solution, and also huge thanks to Cody Gray for rewriting the function, explaining the mistake, as well as giving me a lesson in how it should be handled. As I am still taking my first steps to learn the winapi, this information will serve me well in future programs.


Solution

  • This is a classic bug caused by mixing ANSI and Unicode character types. Its hallmark is the appearance of random Asian-origin characters in strings, exactly as you describe.

    A quick glance at your code immediately reveals the problem. You're calling the ANSI versions of a Win32 function—GetOpenFileNameA—and using the ANSI versions of the data structures—OPENFILENAMEA—yet your filename array consists of wide characters (wchar_t). In Win32, the A-suffixed types are always ANSI and require the use of 1-byte char types. The W-suffixed types are always Unicode and require the use of 2-byte wchar_t types. You cannot mix the two without explicit conversion, e.g. using MultiByteToWideChar and/or WideCharToMultiByte.

    Note that the compiler would have caught this type mismatch error if you hadn't used explicit casts to shut it up.

    In the modern era, you should always use the W-suffixed APIs because you always want to support Unicode. The world is not ASCII, and the Windows ANSI encodings became obsolete when everyone switched to Windows NT and Windows 98/ME were dead and buried.

    So rewrite your code as follows (I've also taken the liberty to change some other idioms and clean up the code in other ways, as well, like using C++ language constructs to zero memory):

    std::wstring src;
    
    bool open()
    {
        const std::wstring title = L"Select a File";
        std::wstring filename(MAX_PATH, L'\0');
    
        OPENFILENAMEW ofn = { };
        ofn.lStructSize     = sizeof(ofn);
        ofn.hwndOwner       = NULL; 
        ofn.lpstrFilter     = L"Music (.mp3)\0*.mp3\0All\0*.*\0";
        ofn.lpstrFile       = &filename[0];  // use the std::wstring buffer directly
        ofn.nMaxFile        = MAX_PATH;
        ofn.lpstrTitle      = title.c_str();
        ofn.Flags           = OFN_DONTADDTORECENT | OFN_FILEMUSTEXIST;
    
        if (GetOpenFileNameW(&ofn))
        {
            src = filename;    //<----------Save filepath in global variable
            return true;
        }
        return false;
    }
    

    To avoid the need to explicitly type the W each time, ensure that both UNICODE and _UNICODE are defined globally for your project. The best way to do this is using the project properties, where you can predefine these symbols. Otherwise, you can define them at the top of your precompiled header. This ensures that the W-suffixed variants of functions and types are always used, even when you omit the suffix. So you could simply say GetOpenFileName and OPENFILENAME. Macros in the Windows header files handle the resolution.

    The advantage of this is it causes the compiler to generate a type mismatch error if you forget to prefix a wide-string literal with the L prefix, which is a common mistake if you're not yet in the habit of doing so. (Of course, this assumes that you also get out of the bad habit of using explicit casts to silence compiler warnings, because you can easily defeat this safety net by doing so.)