
Windows Speech SAPI: How to list attributes for voices?


I used a Stack Overflow answer to iterate through all installed Windows text-to-speech voices, but I am having a hard time extracting the attributes of each one, such as gender, language, and name.

I am assuming there is a way to extract the properties that can be used to find voices, e.g. gender=female;language=409.

if ( FAILED( ::CoInitialize( NULL ) ) )
    return 1;

HRESULT hr = S_OK;

CComPtr<ISpVoice> cpVoice; //Will send data to ISpStream
CComPtr<ISpStream> cpStream; //Will contain IStream
CComPtr<IStream> cpBaseStream; //raw data
ISpObjectToken* cpToken( NULL ); //Will set voice characteristics

GUID guidFormat;
WAVEFORMATEX* pWavFormatEx = nullptr;

hr = cpVoice.CoCreateInstance( CLSID_SpVoice );

CComPtr<ISpObjectTokenCategory> cpSpCategory = NULL;
if ( SUCCEEDED( hr = SpGetCategoryFromId( SPCAT_VOICES, &cpSpCategory ) ) )
{
    CComPtr<IEnumSpObjectTokens> cpSpEnumTokens;
    if ( SUCCEEDED( hr = cpSpCategory->EnumTokens( NULL, NULL, &cpSpEnumTokens ) ) )
    {
        CComPtr<ISpObjectToken> pSpTok;
        while ( SUCCEEDED( hr = cpSpEnumTokens->Next( 1, &pSpTok, NULL ) ) )
        {
            // do something with the token here; for example, set the voice
            //pVoice->SetVoice( pSpTok, FALSE );
            WCHAR *pID = NULL;
            hr = pSpTok->GetId( &pID );
            // This succeeds, pID is "HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\Voices\Tokens\TTS_MS_EN-US_DAVID_11.0" 
            WCHAR *pName = NULL;
            pSpTok->GetStringValue( L"name", &pName );
            // pName, pGender and pLanguage are all null
            WCHAR *pGender = NULL;
            pSpTok->GetStringValue( L"gender", &pGender );
            WCHAR *pLanguage = NULL;
            pSpTok->GetStringValue( L"language", &pLanguage );
            LONG index = 0;
            WCHAR *key = NULL;
            while ( SUCCEEDED( hr = pSpTok->EnumKeys( index, &key ) ) )
            {
                // Gets some elements
                WCHAR* pValue = NULL;
                pSpTok->GetStringValue( key, &pValue );
                // Loops once, key value is "Attributes"
                index++;
            }
            index = 0;
            while ( SUCCEEDED( hr = pSpTok->EnumValues( index, &key ) ) )
            {
                // Loops many times, but none of these have what I need
                WCHAR* pValue = NULL;
                pSpTok->GetStringValue( key, &pValue );
                index++;
            }
            // NOTE:  IEnumSpObjectTokens::Next will *overwrite* the pointer; must manually release
            pSpTok.Release();
        }
    }
}

I am new to Windows C++ development, so my apologies.


Solution

  • After you obtain the ISpObjectToken that represents the voice, retrieve its “Attributes” subkey and then query the subkey’s values:

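    CComPtr<ISpObjectToken> pSpTok;
    // Next returns S_FALSE once the enumeration is exhausted, so compare against S_OK.
    while (cpSpEnumTokens->Next(1, &pSpTok, NULL) == S_OK)
    {
        // The descriptive values live under the token's "Attributes" subkey.
        CComPtr<ISpDataKey> cpSpAttributesKey;
        if (SUCCEEDED(hr = pSpTok->OpenKey(L"Attributes", &cpSpAttributesKey)))
        {
            // CSpDynamicString frees the returned string when it goes out of scope.
            CSpDynamicString dstrName;
            cpSpAttributesKey->GetStringValue(L"Name", &dstrName);
            CSpDynamicString dstrGender;
            cpSpAttributesKey->GetStringValue(L"Gender", &dstrGender);
            // dstrName: Microsoft David Desktop
            // dstrGender: Male
        }
        // Release before the next call to Next, which overwrites the pointer.
        pSpTok.Release();
    }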
    

    I’m using CSpDynamicString so that the memory allocated for the strings is automatically freed. You can choose to use WCHAR* instead, but then you’d be responsible for calling CoTaskMemFree yourself.

    I also fixed another bug in your original code: The result of cpSpEnumTokens->Next should be compared to S_OK, not passed to SUCCEEDED, because Next returns S_FALSE to indicate the enumeration is complete. S_FALSE is a success result, so using SUCCEEDED causes an infinite loop.
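The difference is easy to see without SAPI at all. The sketch below uses stand-in names (`Hr`, `kOk`, `kFalse`, `Succeeded`, `FakeEnumerator`), all hypothetical, that mimic the winerror.h values (S_OK is 0, S_FALSE is 1, failure codes are negative) and the contract of Next, so it compiles on any platform. The iteration cap in the second function exists only so that the demonstration terminates.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for the Windows definitions; the values match winerror.h.
using Hr = std::int32_t;
constexpr Hr kOk = 0;      // same value as S_OK
constexpr Hr kFalse = 1;   // same value as S_FALSE
constexpr bool Succeeded(Hr hr) { return hr >= 0; }  // same test as SUCCEEDED

// Hypothetical enumerator: yields itemCount items, then reports kFalse
// forever, as an exhausted enumerator's Next would.
class FakeEnumerator {
public:
    explicit FakeEnumerator(int itemCount) : remaining_(itemCount) {
        assert(itemCount >= 0);
    }
    Hr Next() {
        if (remaining_ == 0) return kFalse;  // nothing left to fetch
        --remaining_;
        return kOk;
    }
private:
    int remaining_;
};

// Correct loop: stops at the first result that is not kOk.
int CountWithStrictCheck(FakeEnumerator e) {
    int count = 0;
    while (e.Next() == kOk) ++count;
    return count;
}

// Buggy loop: kFalse also counts as success, so only the cap stops it.
int CountWithSucceeded(FakeEnumerator e, int maxIterations) {
    int count = 0;
    while (Succeeded(e.Next()) && count < maxIterations) ++count;
    return count;
}
```

With three items, `CountWithStrictCheck` returns 3, while `CountWithSucceeded` keeps going until it reaches whatever cap is passed in.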

    Reference: Object Tokens and Registry Settings White Paper - §5.3 Inspecting Underlying Keys of a Token
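
The question also mentions the language attribute. It can be read from the same "Attributes" key under L"Language", and as the question's own language=409 example suggests, the value is a hexadecimal language ID (0x409 is US English). The sketch below is a platform-independent, hypothetical helper, `ParseLanguageAttribute`, that turns such a string into numbers. It assumes that multiple IDs, if present, are separated by semicolons.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical helper (not a SAPI function): splits a Language attribute
// such as L"409;9" into numeric language IDs. Fields that are empty or not
// valid hexadecimal are skipped.
std::vector<unsigned long> ParseLanguageAttribute(const std::wstring& value)
{
    std::vector<unsigned long> ids;
    std::size_t start = 0;
    while (start <= value.size())
    {
        std::size_t end = value.find(L';', start);
        if (end == std::wstring::npos)
            end = value.size();
        assert(end >= start);

        const std::wstring field = value.substr(start, end - start);
        if (!field.empty())
        {
            wchar_t* stop = nullptr;
            const unsigned long id = std::wcstoul(field.c_str(), &stop, 16);
            // Accept the field only if every character was consumed.
            if (stop != field.c_str() && *stop == L'\0')
                ids.push_back(id);
        }
        start = end + 1;  // continue after the separator
    }
    return ids;
}
```

The resulting IDs can then be compared against a LANGID of interest, for example 0x409.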