Tags: node.js, google-cloud-platform, text-to-speech, google-text-to-speech, ssml

Only getting the audio of the last request when doing multiple requests at once using Google's Text to Speech API


When I send multiple requests at once using Promise.all, I only seem to get the audioContent of the last request to resolve.

I'm synthesizing large texts and need to split them into parts to stay within the API's character limit.

I had this working before, so I know it should work, but it stopped working recently.

I'm doing exactly the same thing with Amazon's Polly, and there it works. The code is identical apart from the client and the request options.

So that made me think maybe it's a library thing? Or a Google service issue?

I'm using the latest version of: https://github.com/googleapis/nodejs-text-to-speech

export const googleSsmlToSpeech = async (
  index: number,
  ssmlPart: string,
  type: SynthesizerType,
  identifier: string,
  synthesizerOptions: GoogleSynthesizerOptions,
  storageUploadPath: string
) => {
  let extension = 'mp3';

  if (synthesizerOptions.audioConfig.audioEncoding === 'OGG_OPUS') {
    extension = 'opus';
  }

  if (synthesizerOptions.audioConfig.audioEncoding === 'LINEAR16') {
    extension = 'wav';
  }

  synthesizerOptions.input.ssml = ssmlPart;

  const tempLocalAudiofilePath = `${appRootPath}/temp/${storageUploadPath}-${index}.${extension}`;

  try {
    // Make sure the path exists, if not, we create it
    await fsExtra.ensureFile(tempLocalAudiofilePath);

    // Perform the Text-to-Speech request
    const [response] = await client.synthesizeSpeech(synthesizerOptions);

    // Write the binary audio content to a local file
    await fsExtra.writeFile(tempLocalAudiofilePath, response.audioContent, 'binary');

    return tempLocalAudiofilePath;
  } catch (err) {
    throw err;
  }
};
/**
 * Synthesizes the SSML parts into separate audio files
 */
export const googleSsmlPartsToSpeech = async (
  ssmlParts: string[],
  type: SynthesizerType,
  identifier: string,
  synthesizerOptions: GoogleSynthesizerOptions,
  storageUploadPath: string
) => {
  const promises: Promise<string>[] = [];

  ssmlParts.forEach((ssmlPart: string, index: number) => {
    promises.push(googleSsmlToSpeech(index, ssmlPart, type, identifier, synthesizerOptions, storageUploadPath));
  });

  const tempAudioFiles = await Promise.all(promises);

  // No sort needed: Promise.all resolves to results in the same order as the input promises (0, 1, 2, ...)

  return tempAudioFiles;
};

The above code creates multiple files with the correct names and index numbers; however, they all contain the same audio, namely the audio response that resolved the fastest.

824163ed-b4d9-4830-99da-6e6f985727e2-0.mp3
824163ed-b4d9-4830-99da-6e6f985727e2-1.mp3
824163ed-b4d9-4830-99da-6e6f985727e2-2.mp3

Replacing the Promise.all with a simple for loop makes it work, but it takes much longer because each request waits for the previous one to resolve. I know Promise.all can work, because I had it working before, and I would like to see it working again.

  const tempAudioFiles = [];
  for (let i = 0; i < ssmlParts.length; i++) {
    tempAudioFiles[i] = await googleSsmlToSpeech(i, ssmlParts[i], type, identifier, synthesizerOptions, storageUploadPath);
  }

I just can't seem to get it to work anymore with a Promise.all.


Solution

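  • Got it working. The library seems to do things differently than I thought. Creating a separate copy of synthesizerOptions for each request using Object.assign did the trick, since all requests were otherwise sharing (and overwriting) the same options object.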

    Working code: https://github.com/googleapis/nodejs-text-to-speech/issues/210#issuecomment-487832411

    ssmlParts.forEach((ssmlPart: string, index: number) => {
      const synthesizerOptionsCopy = Object.assign({}, synthesizerOptions);
      promises.push(googleSsmlToSpeech(index, ssmlPart, type, identifier, synthesizerOptionsCopy, storageUploadPath));
    });
    
    // Inside googleSsmlToSpeech()
    const ssmlPartSynthesizerOptions = Object.assign(synthesizerOptions, {
      input: {
        ssml: ssmlPart
      }
    });
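
A minimal sketch of why the copy helps, assuming the failure came from every call writing its SSML into the same options object before any request was actually sent. Everything here is a hypothetical stand-in rather than part of the library: `SynthesisRequest` approximates the request shape, `fakeSynthesize` replaces `client.synthesizeSpeech`, and `synthesizeShared` / `synthesizeCopied` mimic the original and the fixed `googleSsmlToSpeech`.

```typescript
// Hypothetical stand-in for the request shape; not the library's real types.
interface SynthesisRequest {
  input: { ssml: string };
  audioConfig: { audioEncoding: string };
}

// Hypothetical fake client: returns a string naming the SSML it was given.
const fakeSynthesize = async (request: SynthesisRequest): Promise<string> =>
  `audio(${request.input.ssml})`;

// Mirrors the question's pattern: write into the shared object, await, then send.
const synthesizeShared = async (part: string, shared: SynthesisRequest): Promise<string> => {
  shared.input.ssml = part; // runs synchronously for every part before any request is sent
  await Promise.resolve(); // stands in for `await fsExtra.ensureFile(...)`
  return fakeSynthesize(shared); // by now input.ssml holds the last part
};

// Mirrors the fix: build a fresh request object per part, leaving the shared one untouched.
const synthesizeCopied = async (part: string, shared: SynthesisRequest): Promise<string> => {
  const request: SynthesisRequest = Object.assign({}, shared, { input: { ssml: part } });
  await Promise.resolve();
  return fakeSynthesize(request);
};

const demo = async (): Promise<{ shared: string[]; copied: string[] }> => {
  const parts = ['a', 'b', 'c'];
  const options: SynthesisRequest = { input: { ssml: '' }, audioConfig: { audioEncoding: 'MP3' } };
  const shared = await Promise.all(parts.map((part) => synthesizeShared(part, options)));
  const copied = await Promise.all(parts.map((part) => synthesizeCopied(part, options)));
  return { shared, copied };
};

demo().then(({ shared, copied }) => {
  console.log(shared); // [ 'audio(c)', 'audio(c)', 'audio(c)' ]
  console.log(copied); // [ 'audio(a)', 'audio(b)', 'audio(c)' ]
});
```

Note that Object.assign only makes a shallow copy: nested objects such as audioConfig are still shared between the copies. That is fine here because only input is replaced per request, but a nested value that changes per request would need its own copy as well.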