calgorithmstring-algorithm

Inplace string replacement in C


Write a function

void inplace(char *str, 
             const char pattern, 
             const char* replacement, 
             size_t mlen)

Input:
str: a string ending with \0. the input indicates that we need an inplace algorithm.

pattern: a letter.

replacement: a string.

mlen: the size of the memory holds the string str starts from the beginning of the memory and that mlen should be larger than strlen(str)


The final result is still pointed by str.

Note that all occurrence of pattern should be replaced.

For example,

helelo\0...........

Here "helelo" is the string to replace with '\0' at the end. After '\0' there are still L valid bytes. We want to replace "e" by "123".

A simple approach works like this, we go through str, when a pattern is matched, we shift all the rest with the place to fill the replacement string, then replace the pattern by the replacement.

If the original string is with length n and contains only e, we need (n-1) + (n-2) + ... + 1 shifts.

Is there an algorithm that scans the string with only one pass and constant memory cost?


Solution

  • I think two passes is the minimum. On the first pass, count the number of characters that will be replaced. Given that count and the length of the replacement string, you can compute the length of the final string. (And you should verify that it's going to fit into the buffer.)

    On the second pass, you scan the string backwards (starting at the last character), copying characters to their final location. When you encounter the search character, copy the replacement string to that location.

    In your example, the increase in length would be 2. So you would

    copy str[5] which is '\0' to str[7]
    copy str[4] which is 'o' to str[6]
    copy str[3] which is 'l' to str[5]
    copy str[2] which is 'l' to str[4]
    at str[1] you find the 'e' so str[3]='3' str[2]='2' str[1]='1'
    

    At this point the output index is the same as the input index, so you can break the loop.


    As @chux pointed out in the comments, the cases where the replacement string is either empty, or has exactly one character, can be handled with a single forward pass through the string. So the code should handle those cases separately.