To solve an exercise from the K&R C book, I am trying to write a function strstrci(), a case-insensitive version of strstr():

    char * strstrci (char *s, char *p)
    {
        int i, j;
        for (i = 0; *(s + i) != '\0'; i++)
        {
            if (*(s + i) == *p || *(s + i) == *p + 32 || *(s + i) == *p - 32)
            {
                for (j = 1; *(s + i + j) == *(p + j) || *(s + i + j) == *(p + j) + 32 || *(s + i + j) == *(p + j) - 32; j++)
                {
                    if (*(p + j) == '\0')
                        return (s + i);
                }
            }
        }
        return NULL;
    }

The function seems to work well for all cases except when the pattern is the last part of the input string. For example:

    string:  On a wall he sat
    pattern: sat

In such a case it returns NULL. Please point out the mistake.

There are multiple problems:

- The function arguments should have type const char *.
- The function does not behave like strstr: an empty string pointed to by p should be found at the beginning of s, even if s points to an empty string.
- Using 32 is ASCII specific, which may be OK, but it is not correct for all characters. Your function will erroneously match @, [, \, ] and ^ with respectively `, {, |, } and ~, among other pairs that you indiscriminately consider equivalent.
- i and j should have type size_t to allow for arbitrarily long strings.
- The inner loop only reaches the test *(p + j) == '\0' if the characters match, hence the function only matches the needle at the end of the haystack string.
- The pointer syntax *(p + j) == '\0' is cumbersome and less readable than the equivalent array syntax p[j] == '\0'.

Here is a modified version using the standard header <ctype.h>:
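
    #include <ctype.h>
    #include <stddef.h>   /* size_t, NULL */

    char *strstrci(const char *s, const char *p) {
        /* an empty needle matches at the start of the haystack */
        if (*p == '\0')
            return (char *)s;
        for (; *s; s++) {
            if (tolower((unsigned char)*s) == tolower((unsigned char)*p)) {
                size_t i;
                for (i = 1;; i++) {
                    /* reached the end of the needle: full match */
                    if (p[i] == '\0')
                        return (char *)s;
                    /* mismatch, or end of the haystack: try the next start */
                    if (tolower((unsigned char)s[i]) != tolower((unsigned char)p[i]))
                        break;
                }
            }
        }
        return NULL;
    }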
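
To try out the points above, here is a small sketch with a few checks. It assumes an ASCII character set and the default "C" locale. The names naive_eq, ci_eq and check_strstrci are hypothetical helpers made up for this illustration; strstrci is repeated only so that the block compiles on its own.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* The +/- 32 comparison from the question, isolated for illustration
   (hypothetical helper, assumes ASCII). */
int naive_eq(char a, char b) {
    return a == b || a == b + 32 || a == b - 32;
}

/* The comparison used by the modified version (hypothetical helper). */
int ci_eq(char a, char b) {
    return tolower((unsigned char)a) == tolower((unsigned char)b);
}

/* Same function as above, repeated so this sketch is self-contained. */
char *strstrci(const char *s, const char *p) {
    if (*p == '\0')
        return (char *)s;
    for (; *s; s++) {
        if (tolower((unsigned char)*s) == tolower((unsigned char)*p)) {
            size_t i;
            for (i = 1;; i++) {
                if (p[i] == '\0')
                    return (char *)s;
                if (tolower((unsigned char)s[i]) != tolower((unsigned char)p[i]))
                    break;
            }
        }
    }
    return NULL;
}

/* A few checks, including the example from the question. */
void check_strstrci(void) {
    const char *h = "On a wall he sat";

    /* '@' (64) and '`' (96) differ by 32, so the naive test equates them */
    assert(naive_eq('@', '`'));
    assert(!ci_eq('@', '`'));
    assert(ci_eq('S', 's'));

    assert(strstrci(h, "SAT") == h + 13);   /* needle at the end */
    assert(strstrci(h, "WALL") == h + 5);   /* needle in the middle */
    assert(strstrci(h, "") == h);           /* empty needle */
    assert(strstrci("", "") != NULL);       /* empty haystack and needle */
    assert(strstrci(h, "sit") == NULL);     /* no match */
    assert(strstrci("a@b", "`") == NULL);   /* no false punctuation match */
}
```

Calling check_strstrci() from main should complete silently if the assumptions hold. Note that tolower depends on the current locale, so results for non-ASCII characters may differ after a call to setlocale.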