I am trying to call Ruby's regex engine from C code:
#include <ruby.h>
#include "ruby/re.h"
int main(int argc, char** argv) {
char string[] = "regex";
ruby_setup();
rb_reg_regcomp(string);
return 0;
}
I compiled the newest version of Ruby myself (commit 0b303c683007598a31f2cda3d512d981b278f8bd) and linked my program against it. It compiles with this warning:
fuzzer.c: In function ‘main’:
fuzzer.c:10:17: warning: passing argument 1 of ‘rb_reg_regcomp’ makes integer from pointer without a cast [-Wint-conversion]
10 | rb_reg_regcomp(string);
| ^~~~~~
| |
| char *
In file included from fuzzer.c:4:
/home/cyberhacker/Asioita/Hakkerointi/Rubyregex/ruby/build/output/include/ruby-3.3.0+0/ruby/re.h:36:28: note: expected ‘VALUE’ {aka ‘long unsigned int’} but argument is of type ‘char *’
36 | VALUE rb_reg_regcomp(VALUE str);
I think that is because the "VALUE" type in the Ruby source code is a generic handle that can refer to an object of any type. When I run the program, I get a segfault with this backtrace:
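To make the warning concrete, here is a minimal sketch in plain C with no Ruby involved. It assumes VALUE behaves like a pointer-sized unsigned integer, which is what the compiler note ("aka 'long unsigned int'") suggests on this platform. The names handle_t, as_handle and handle_to_ptr are hypothetical stand-ins, not part of the Ruby API.

```c
#include <stdint.h>

/* Hypothetical stand-in for Ruby's VALUE: an unsigned integer wide
 * enough to hold a pointer. This is an assumption for illustration,
 * not the real typedef. */
typedef uintptr_t handle_t;

/* Mimics the implicit conversion the compiler warns about: the
 * address of the char buffer itself becomes the "handle". */
static handle_t as_handle(const char *s)
{
    return (handle_t)s;
}

/* Converting back yields the same address, so a callee that expects
 * an object handle would read raw character bytes as if they were an
 * object header. */
static const char *handle_to_ptr(handle_t h)
{
    return (const char *)h;
}
```

This matches the backtrace: str=140737488346082 in frame #5 is 0x7fffffffdbe2, which lies close to the stack address shown for err (0x7fffffffdb30). The function appears to have received the address of the local char array rather than a Ruby object.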
Program received signal SIGSEGV, Segmentation fault.
rb_enc_dummy_p (enc=enc@entry=0x0) at ../encoding.c:181
181 return ENC_DUMMY_P(enc) != 0;
(gdb) where
#0 rb_enc_dummy_p (enc=enc@entry=0x0) at ../encoding.c:181
#1 0x000055555569bd00 in rb_reg_initialize (obj=obj@entry=140737345038080, s=0xc62000007ffff78a <error: Cannot access memory at address 0xc62000007ffff78a>, len=-4574812796478291968, enc=enc@entry=0x0, options=options@entry=0, err=err@entry=0x7fffffffdb30 "", sourcefile=0x0, sourceline=0) at ../re.c:3198
#2 0x00005555556a11c8 in rb_reg_initialize_str (sourceline=0, sourcefile=0x0, err=0x7fffffffdb30 "", options=0, str=140737488346082, obj=140737345038080) at ../include/ruby/internal/core/rstring.h:516
#3 rb_reg_init_str (options=0, s=140737488346082, re=140737345038080) at ../re.c:3299
#4 rb_reg_new_str (options=0, s=140737488346082) at ../re.c:3291
#5 rb_reg_regcomp (str=140737488346082) at ../re.c:3373
#6 0x0000555555584648 in main () at ../include/ruby/internal/encoding/encoding.h:418
I tried fiddling with the type of the string I pass to the function, but nothing seemed to work. The expected behaviour is that the program runs successfully.
Can someone help? Thanks in advance!
After a bit of digging, I figured out that you need to convert the C string to a Ruby string and then pass that to the function. I was confused because the documentation says: "Ruby’s String kinda corresponds to C’s char*."
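#include <ruby.h>
#include "ruby/re.h"
int main(int argc, char** argv) {
    VALUE x;
    char string[] = "regex";
    /* The VM must be set up before any Ruby object is created. */
    if (ruby_setup() != 0) return 1;
    /* Wrap the C string in a Ruby String; rb_reg_regcomp expects a VALUE. */
    x = rb_str_new_cstr(string);
    rb_reg_regcomp(x);
    return ruby_cleanup(0);
}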