swift, string, amazon-web-services, character-encoding, amazon-polly

AWS Polly - Highlighting special characters


I am using the AWS Polly service for text-to-speech, but if the text contains special characters, it returns the wrong start and end offsets.

For example, if the text is "Böylelikle", it returns: {"time":6,"type":"word","start":0,"end":11,"value":"Böylelikle"}

But it should start at 0 and end at 10.

I've searched the AWS documentation, and it says the start and end values are offsets in bytes, not characters.

My question is: how can I convert these byte offsets to character offsets?

My code is:

builder.continueOnSuccessWith { (awsTask: AWSTask<NSURL>) -> Any? in
    if builder.error == nil {
        if let url = awsTask.result {
            do {
                let txtData = try Data(contentsOf: url as URL)
                if let txtString = String(data: txtData, encoding: .utf8) {
                    let lines = txtString.components(separatedBy: .newlines)
                    for line in lines {
                        let jsonData = Data(line.utf8)
                        let pollyVoiceSentence = try JSONDecoder().decode(PollyVoiceSentence.self, from: jsonData)
                        voiceSentences.append(pollyVoiceSentence)
                    }
                }
            } catch {
                print("Could not parse TXT file")
            }
        }
    } else {
        print("ParseJSON: \(builder.error!)")
    }
    completionHandler(voiceSentences)
    return nil
}

And to highlight words:

let start = pollyVoiceSentence.start
let end = pollyVoiceSentence.end
let voiceRange = NSRange(location: start, length: end - start)

print("RANGE: \(voiceRange) - Word: \(pollyVoiceSentence.value)")

Thanks.


Solution

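  • It looks like they are giving you UTF-8 byte offsets, so for this word the end value matches String.utf8.count. Swift strings are Unicode, and a character such as "ö" takes more than one byte in UTF-8, so byte offsets and character offsets drift apart as soon as the text contains non-ASCII characters.

    You can read the official docs here: Strings and Characters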

    There are a ton of useful details there. I would like to highlight the following for your use case: [image]

    Here's how it looks for your input as well: [image]
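As a rough illustration of the difference for the word from the question, the sketch below compares the three counts Swift exposes on a string. The escape `\u{F6}` is simply "ö" written explicitly, an assumption made here so the example does not depend on how the source file is normalized.

```swift
let word = "B\u{F6}ylelikle"   // "Böylelikle"

// User-perceived characters (grapheme clusters).
let characterCount = word.count
// Bytes in the UTF-8 encoding; "ö" occupies two bytes here.
let utf8Count = word.utf8.count
// UTF-16 code units, which is the unit NSRange and NSString use.
let utf16Count = word.utf16.count

print(characterCount, utf8Count, utf16Count)   // prints "10 11 10"
```

The UTF-8 count of 11 matches the end value Polly reported, while the character count is the 10 the question expected.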

    What you can do in your case is:

    1. Decode the PollyVoiceSentence the way you do today.
    2. Create an extension on PollyVoiceSentence to account for the difference between byte offsets and character offsets.
    3. Account for all words in a sentence, because any multi-byte character in an earlier word shifts the start of every subsequent word.
    4. Don't use the start and end values from the JSON directly as character or NSRange offsets: they count UTF-8 bytes, whereas String counts characters and NSRange counts UTF-16 code units.
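The steps above could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the `PollyVoiceSentence` fields are guessed from the JSON in the question, `utf16Range(in:)` is a hypothetical helper name, and it assumes `start` and `end` are UTF-8 byte offsets into the full text that was sent to Polly, with `end` exclusive. Because the conversion is done against the whole text, multi-byte characters in earlier words are accounted for automatically.

```swift
import Foundation

// Hypothetical model mirroring the speech-mark JSON shown in the question.
struct PollyVoiceSentence: Decodable {
    let time: Int
    let type: String
    let start: Int   // assumed: UTF-8 byte offset into the full input text
    let end: Int     // assumed: UTF-8 byte offset, exclusive
    let value: String
}

extension PollyVoiceSentence {
    /// Converts the UTF-8 byte offsets into an NSRange (UTF-16 code units)
    /// within `text`. Returns nil if the offsets are out of bounds or fall
    /// in the middle of a multi-byte character.
    func utf16Range(in text: String) -> NSRange? {
        let utf8 = text.utf8
        guard start >= 0, start <= end, end <= utf8.count else { return nil }

        let lower = utf8.index(utf8.startIndex, offsetBy: start)
        let upper = utf8.index(utf8.startIndex, offsetBy: end)

        // Reject offsets that do not land on a Unicode scalar boundary.
        guard lower.samePosition(in: text.unicodeScalars) != nil,
              upper.samePosition(in: text.unicodeScalars) != nil else { return nil }

        return NSRange(lower..<upper, in: text)
    }
}

let spokenText = "B\u{F6}ylelikle oldu"   // "Böylelikle oldu"
let mark = PollyVoiceSentence(time: 6, type: "word", start: 0, end: 11,
                              value: "B\u{F6}ylelikle")

if let range = mark.utf16Range(in: spokenText) {
    // For this input the range has location 0 and length 10.
    print("RANGE: \(range) - Word: \(mark.value)")
}
```

The resulting NSRange can be used for highlighting in place of `NSRange(location: start, length: end - start)`. Returning nil rather than trapping is a design choice here, so that a malformed speech mark skips one highlight instead of crashing.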