I am using the AWS Polly service for text-to-speech. If the text contains special characters, the speech marks it returns have the wrong start and end values.
For example, if the text is "Böylelikle", it returns: {"time":6,"type":"word","start":0,"end":11,"value":"Böylelikle"}
But the word should start at 0 and end at 10.
I've searched the AWS documentation, and it says the start and end values are offsets in bytes, not characters.
My question is: how can I convert these byte offsets to character offsets?
My code is:
builder.continueOnSuccessWith { (awsTask: AWSTask<NSURL>) -> Any? in
    if builder.error == nil {
        if let url = awsTask.result {
            do {
                let txtData = try Data(contentsOf: url as URL)
                if let txtString = String(data: txtData, encoding: .utf8) {
                    // Each non-empty line of the file is one JSON speech mark.
                    let lines = txtString.components(separatedBy: .newlines)
                    for line in lines where !line.isEmpty {
                        let jsonData = Data(line.utf8)
                        let pollyVoiceSentence = try JSONDecoder().decode(PollyVoiceSentence.self, from: jsonData)
                    }
                }
            } catch {
                print("Could not parse TXT file")
            }
        }
    } else {
        print("ParseJSON: \(builder.error!)")
    }
    return nil
}
And to highlight words:
let start = pollyVoiceSentence.start
let end = pollyVoiceSentence.end
let voiceRange = NSRange(location: start, length: end - start)
print("RANGE: \(voiceRange) - Word: \(pollyVoiceSentence.value)")
It looks like they are providing you String.utf8.count for the word. Swift supports Unicode, and not every character fits in a single UTF-8 byte: "ö", for example, is encoded as two bytes, which is why the end offset is 11 rather than 10.
You can read the official docs here: Strings and Characters
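There are a ton of useful details there. The part most relevant to your use case is its discussion of Unicode representations: a string's Character, UTF-8, and UTF-16 views can each report a different length for the same text.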
Here's how it looks for your input as well -
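As a small illustration (a sketch, not output taken from Polly), the three views of the word from the question can be compared directly. The "ö" is written as the escape \u{00F6} so that the precomposed form is used regardless of how the source file is saved:

```swift
// "Böylelikle" with a precomposed "ö" (U+00F6).
let word = "B\u{00F6}ylelikle"

// Number of user-visible characters (extended grapheme clusters).
let characterCount = word.count
// Number of bytes in the UTF-8 encoding; "ö" takes two bytes.
let utf8ByteCount = word.utf8.count
// Number of UTF-16 code units, which is the unit NSRange counts in.
let utf16UnitCount = word.utf16.count

print(characterCount, utf8ByteCount, utf16UnitCount)  // prints "10 11 10"
```

The UTF-8 count of 11 matches the "end" value in the speech mark, while the range you want for highlighting has length 10.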
What you can do in your case is:
1. Keep decoding the JSON the way you are today.
2. Extend PollyVoiceSentence to account for this character count issue.
3. Don't rely on the raw start & end provided by the JSON, because byte offsets clearly don't fit well with Swift's String API.
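If the byte offsets are still needed for highlighting, one possible way to account for them is to translate them through the string's UTF-8 view instead of using them as character positions. This is only a sketch: the helper name characterRange(forUTF8Start:end:in:) is hypothetical (it is not part of the AWS SDK), and it assumes that `text` is exactly the plain-text string that was sent to Polly.

```swift
import Foundation

/// Hypothetical helper: converts a UTF-8 byte range, as reported in a
/// speech mark, into an NSRange over `text`.
/// Returns nil if the offsets are out of bounds or do not land on a
/// character boundary.
func characterRange(forUTF8Start start: Int, end: Int, in text: String) -> NSRange? {
    let utf8 = text.utf8
    guard start >= 0, start <= end,
          // Walk the UTF-8 view by byte count, refusing to run past the end.
          let lower = utf8.index(utf8.startIndex, offsetBy: start, limitedBy: utf8.endIndex),
          let upper = utf8.index(utf8.startIndex, offsetBy: end, limitedBy: utf8.endIndex),
          // Map the byte positions back to positions in the String itself.
          let lowerBound = lower.samePosition(in: text),
          let upperBound = upper.samePosition(in: text)
    else { return nil }
    // NSRange measures in UTF-16 code units, which is what text views expect.
    return NSRange(lowerBound..<upperBound, in: text)
}

let text = "B\u{00F6}ylelikle"
if let range = characterRange(forUTF8Start: 0, end: 11, in: text) {
    print(range.location, range.length)  // prints "0 10"
}
```

In the highlighting code, the result would take the place of NSRange(location: start, length: end - start). Returning nil instead of crashing on bad offsets is a deliberate choice here, since the offsets come from an external service.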