iphone, objective-c, ios, nsstring, nsstringencoding

How to remove the last Unicode symbol from an NSString


I have implemented a custom keyboard associated with a text field, so when the user presses the delete button, I remove the last character from the string, and manually update the current text field text.

NSRange range = NSMakeRange(currentTextFieldString.length-1, 1);
[currentTextFieldString replaceCharactersInRange:range withString:@""];

So far so good.

Now, the problem is that the user can also enter some special Unicode symbols. These are not always 1 byte; they can be 2 bytes too. When the delete button is pressed, I have to remove the entire symbol, but if I follow the above approach, the user has to press the delete button twice.

Here, if I do:

NSRange range = NSMakeRange(currentTextFieldString.length-2, 2);
[currentTextFieldString replaceCharactersInRange:range withString:@""];

it works fine, but then the normal characters, which are just 1 byte, get deleted two at a time.

How to handle such scenarios?

Thanks in advance.

EDIT:

It is strange that if I switch to the iPhone keyboard, it handles both cases appropriately. There must be some way to do it; I am missing something, but I cannot figure out what.


Solution

  • Here's the problem. NSStrings are encoded using UTF-16. Many common Unicode glyphs take up only one unichar (a 16-bit unsigned value). However, some glyphs take up two unichars (a surrogate pair). Even worse, some glyphs can be composed or decomposed, e.g. é might be one Unicode code point or it might be two: an e followed by a combining acute accent. This makes it quite difficult to do what you want, namely delete one "character", because it is hard to tell how many unichars it takes up.

    Fortunately, NSString has a method that helps with this: -rangeOfComposedCharacterSequenceAtIndex:. What you need to do is get the index of the last unichar, run this method on it, and the returned NSRange will tell you where to delete from. It goes something like this (not tested):

    NSString *myNewString = myString; // an empty string is left unchanged
    if ([myString length] > 0) {
        NSUInteger lastCharIndex = [myString length] - 1;
        NSRange rangeOfLastChar = [myString rangeOfComposedCharacterSequenceAtIndex:lastCharIndex];
        myNewString = [myString substringToIndex:rangeOfLastChar.location];
    }
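Since the question edits a mutable string in place, the same idea can be sketched as a small helper. This is only a sketch: the function name DeleteLastComposedCharacter is hypothetical (not from the original posts), and it assumes the text field's contents live in an NSMutableString, as the replaceCharactersInRange: calls in the question suggest.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (name is an assumption, not from the original post):
// removes the last composed character sequence from a mutable string in place.
static void DeleteLastComposedCharacter(NSMutableString *text) {
    if (text.length == 0) {
        return; // nothing to delete; also avoids unsigned underflow of length - 1
    }
    // Ask for the full sequence that contains the final unichar. For a
    // surrogate pair or a base letter plus combining mark, the returned
    // range covers more than one unichar.
    NSRange lastChar = [text rangeOfComposedCharacterSequenceAtIndex:text.length - 1];
    [text deleteCharactersInRange:lastChar];
}
```

Called from the delete-button handler as DeleteLastComposedCharacter(currentTextFieldString), this should remove a plain letter, a surrogate pair, or an e followed by a combining accent with a single press.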