I tried to write my own implementation of the strchr() method.
It now looks like this:
char *mystrchr(const char *s, int c) {
while (*s != (char) c) {
if (!*s++) {
return NULL;
}
}
return (char *)s;
}
The last line originally was
return s;
But this didn't work because s is const. I found out that there needs to be this cast (char *), but I honestly don't know what I am doing there :( Can someone explain?
I believe this is actually a flaw in the C Standard's definition of the strchr()
function. (I'll be happy to be proven wrong.) (Replying to the comments, it's arguable whether it's really a flaw; IMHO it's still poor design. It can be used safely, but it's too easy to use it unsafely.)
Here's what the C standard says:
char *strchr(const char *s, int c);
The strchr function locates the first occurrence of c (converted to a char) in the string pointed to by s. The terminating null character is considered to be part of the string.
Which means that this program:
#include <stdio.h>
#include <string.h>
int main(void) {
const char *s = "hello";
char *p = strchr(s, 'l');
*p = 'L';
return 0;
}
even though it carefully defines the pointer to the string literal as a pointer to const
char
, has undefined behavior, since it modifies the string literal. gcc, at least, doesn't warn about this, and the program dies with a segmentation fault.
The problem is that strchr()
takes a const char*
argument, which means it promises not to modify the data that s
points to -- but it returns a plain char*
, which permits the caller to modify the same data.
Here's another example; it doesn't have undefined behavior, but it quietly modifies a const
qualified object without any casts (which, on further thought, I believe has undefined behavior):
#include <stdio.h>
#include <string.h>
int main(void) {
const char s[] = "hello";
char *p = strchr(s, 'l');
*p = 'L';
printf("s = \"%s\"\n", s);
return 0;
}
Which means, I think, (to answer your question) that a C implementation of strchr()
has to cast its result to convert it from const char*
to char*
, or do something equivalent.
This is why C++, in one of the few changes it makes to the C standard library, replaces strchr()
with two overloaded functions of the same name:
const char * strchr ( const char * str, int character );
char * strchr ( char * str, int character );
Of course C can't do this.
An alternative would have been to replace strchr
by two functions, one taking a const char*
and returning a const char*
, and another taking a char*
and returning a char*
. Unlike in C++, the two functions would have to have different names, perhaps strchr
and strcchr
.
(Historically, const
was added to C after strchr()
had already been defined. This was probably the only way to keep strchr()
without breaking existing code.)
strchr()
is not the only C standard library function that has this problem. The list of affected function (I think this list is complete but I don't guarantee it) is:
void *memchr(const void *s, int c, size_t n);
char *strchr(const char *s, int c);
char *strpbrk(const char *s1, const char *s2);
char *strrchr(const char *s, int c);
char *strstr(const char *s1, const char *s2);
(all declared in <string.h>
) and:
void *bsearch(const void *key, const void *base,
size_t nmemb, size_t size,
int (*compar)(const void *, const void *));
(declared in <stdlib.h>
). All these functions take a pointer to const
data that points to the initial element of an array, and return a non-const
pointer to an element of that array.