
How to index a String in Rust


I am attempting to index a string in Rust, but the compiler throws an error. My code (Project Euler problem 4, playground):

fn is_palindrome(num: u64) -> bool {
    let num_string = num.to_string();
    let num_length = num_string.len();

    for i in 0 .. num_length / 2 {
        if num_string[i] != num_string[(num_length - 1) - i] {
            return false;
        }
    }
    
    true
}

The error:

error[E0277]: the trait bound `std::string::String: std::ops::Index<usize>` is not satisfied
 --> <anon>:7:12
  |
7 |         if num_string[i] != num_string[(num_length - 1) - i] {
  |            ^^^^^^^^^^^^^
  |
  = note: the type `std::string::String` cannot be indexed by `usize`

Is there a reason why String cannot be indexed? How can I access the data then?


Solution

  • Yes: a Rust String cannot be indexed with an integer. Rust strings are encoded as UTF-8 internally, so the meaning of such an index would be ambiguous, and people would misuse it. Byte indexing is fast but almost always incorrect: when the text contains non-ASCII characters, a byte index can land in the middle of a character, which is really bad for text processing. Char indexing is not free, because UTF-8 is a variable-length encoding and you have to walk the string from the start to find the required code point.
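    To make the byte/char distinction concrete, here is a minimal sketch. The helper names byte_and_char_len and byte_prefix are made up for illustration; the calls they wrap (len, chars, get) are standard str methods.

```rust
/// Returns (length in bytes, number of code points) for `s`.
/// (Hypothetical helper, for illustration only.)
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

/// Returns the first `n` bytes of `s` as a string slice, or `None` if
/// byte offset `n` is out of range or falls inside a multi-byte character.
/// (Hypothetical helper, for illustration only.)
fn byte_prefix(s: &str, n: usize) -> Option<&str> {
    s.get(..n)
}
```

    For "héllo", where 'é' takes two bytes, byte_and_char_len returns (6, 5). byte_prefix("héllo", 2) is None because offset 2 lands inside 'é', while byte_prefix("héllo", 3) is Some("hé").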

    If you are certain that your strings contain ASCII characters only, you can use the as_bytes() method on &str which returns a byte slice, and then index into this slice:

    let num_string = num.to_string();
    
    // ...
    
    let b: u8 = num_string.as_bytes()[i];
    let c: char = b as char;  // for an ASCII byte, this is the character as a Unicode code point
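    Applied to the function from the question, the byte-slice approach could look like the sketch below. It assumes nothing beyond the fact that the decimal digits produced by to_string() are ASCII.

```rust
/// Checks whether the decimal representation of `num` reads the same
/// forwards and backwards.
fn is_palindrome(num: u64) -> bool {
    let num_string = num.to_string();
    // Decimal digits are ASCII, so each byte is exactly one character.
    let bytes = num_string.as_bytes();
    let len = bytes.len();

    for i in 0..len / 2 {
        if bytes[i] != bytes[len - 1 - i] {
            return false;
        }
    }
    true
}
```

    The only change from the original code is that the comparison indexes the byte slice instead of the String.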
    

    If you do need to index code points, you have to use the chars() iterator:

    num_string.chars().nth(i).unwrap()
    

    As I said above, this requires walking the iterator from the start up to the i-th code point.
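    If unwrap() is not acceptable, or if you only need to compare both ends of the string, something like the following sketch may be enough. The function names nth_char and is_palindrome_chars are hypothetical; chars(), nth(), rev() and eq() are standard iterator methods.

```rust
/// Returns the `i`-th code point of `s`, or `None` if `s` has fewer
/// code points. Each call walks the string from the start.
/// (Hypothetical helper, for illustration only.)
fn nth_char(s: &str, i: usize) -> Option<char> {
    s.chars().nth(i)
}

/// Compares the code points of `s` with the same code points in reverse
/// order, so no indexing is needed at all.
/// (Hypothetical helper, for illustration only.)
fn is_palindrome_chars(s: &str) -> bool {
    s.chars().eq(s.chars().rev())
}
```

    When you need many lookups into the same string, collecting once with s.chars().collect::<Vec<char>>() and indexing the vector avoids walking the string on every access, at the price of extra memory.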

    Finally, in many cases of text processing, it is actually necessary to work with grapheme clusters rather than with code points or bytes. With the help of the unicode-segmentation crate, you can index into grapheme clusters as well:

    use unicode_segmentation::UnicodeSegmentation;
    
    let string: String = ...;
    string.graphemes(true).nth(i).unwrap()
    

    Naturally, grapheme cluster indexing has the same cost as indexing by code point: the string has to be traversed from the start up to the requested cluster.