rust ffi

Why does converting a `String` to a `*const i8` and back cause it to change its data?


I am working on FFI and am confused about converting a *const i8 back into a string. My code is as follows:

use std::ffi::{CStr, CString};

#[repr(C)]
#[derive(Debug, Clone, Copy)]
pub enum MyValue {
    String(*const i8),
}

fn main() {
    let content = vec!["Hello", "world"];
    let mut container: Vec<MyValue> = vec![];

    // Generate data
    for item in content {
        let c_item = CString::new(item).unwrap();
        container.push(MyValue::String(c_item.as_ptr() as *const i8));

        let my_var = *container.last().unwrap();
        // println!("Before-Pointer > value: {:?}", my_var);
        let MyValue::String(my_var_address) = my_var else {
            panic!("Error")
        };

        // println!("Before-Pointer > address: {:?}", my_var_address);
        let real_content = unsafe { CStr::from_ptr(my_var_address) };
        println!("Before-Pointer > real_value: {:?}", real_content);
    }

    println!("====================");
    
    // Convert vector to vector pointer
    let container_ptr = container.as_ptr();
    // Get value
    for i in 0..container.len() as isize {
        let my_var_p = unsafe { *container_ptr.offset(i) };
        // println!("After-Pointer > value: {:?}", my_var_p);
        let MyValue::String(my_var_address_ptr) = my_var_p else {
            panic!("Error")
        };

        // println!("After-Pointer > address: {:?}", my_var_address_ptr);
        let real_content_ptr = unsafe { CStr::from_ptr(my_var_address_ptr) };
        println!("After-Pointer > real_value: {:?}", real_content_ptr);
    }
}

The output is as follows:

Before-Pointer > real_value: "Hello"
Before-Pointer > real_value: "world"
====================
After-Pointer > real_value: "\x10\xa6\x92\x04\xf5\x01"
After-Pointer > real_value: "\x10\xa6\x92\x04\xf5\x01"

I am confused as to why the conversion results are different when using the same method.


Solution

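  • c_item is dropped at the closing brace of the first loop body, at the end of each iteration. The pointer stored in container therefore dangles, and any later use of it is a use-after-free, which is undefined behavior (UB). The "Before-Pointer" reads only work because they happen while c_item is still alive.

    You can replace as_ptr, which doesn't take ownership of the CString, with into_raw, which does. You are then responsible for tracking who owns the strings: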

    container.push(MyValue::String(c_item.into_raw() as *const i8));
    

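    To avoid leaking the memory, each pointer must eventually be passed back to CString::from_raw, for example in a Drop implementation for MyValue. Note that MyValue derives Copy, and a Copy type cannot implement Drop, so you would have to remove the Copy derive or free the strings from an owning wrapper instead. Be very careful never to store strings that are not owned or managed by Rust, because from_raw must only be called on pointers obtained from CString::into_raw.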
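
    As a rough illustration of the into_raw / from_raw round trip, here is a minimal sketch. The helpers make_values, read_value and free_values are hypothetical names introduced for this example and are not part of the original code. It uses std::ffi::c_char (available since Rust 1.64) in place of i8 so that it also compiles on targets where c_char is unsigned. Because MyValue derives Copy, it cannot implement Drop, so the sketch frees the strings with an explicit function instead:

```rust
use std::ffi::{c_char, CStr, CString};

#[repr(C)]
#[derive(Debug, Clone, Copy)]
pub enum MyValue {
    String(*const c_char),
}

/// Hypothetical helper: builds the values and hands ownership of each
/// C string out of Rust with `into_raw`, so nothing is freed at the
/// end of the closure.
fn make_values(items: &[&str]) -> Vec<MyValue> {
    items
        .iter()
        .map(|item| {
            let c_item = CString::new(*item).expect("no interior NUL bytes");
            MyValue::String(c_item.into_raw() as *const c_char)
        })
        .collect()
}

/// Hypothetical helper: copies the pointed-to C string into a Rust `String`.
fn read_value(value: &MyValue) -> String {
    let MyValue::String(ptr) = *value;
    // SAFETY: `ptr` came from `CString::into_raw` and has not been freed yet.
    let c_str = unsafe { CStr::from_ptr(ptr) };
    c_str.to_string_lossy().into_owned()
}

/// Hypothetical helper: reclaims every string exactly once.
fn free_values(values: Vec<MyValue>) {
    for value in values {
        let MyValue::String(ptr) = value;
        // SAFETY: each pointer was produced by `CString::into_raw` in
        // `make_values` and is passed to `from_raw` only here.
        drop(unsafe { CString::from_raw(ptr as *mut c_char) });
    }
}

fn main() {
    let container = make_values(&["Hello", "world"]);

    // Same raw-pointer walk as in the question; the strings are still alive.
    let container_ptr = container.as_ptr();
    for i in 0..container.len() {
        // SAFETY: `i` is in bounds of `container`.
        let value = unsafe { *container_ptr.add(i) };
        println!("After-Pointer > real_value: {:?}", read_value(&value));
    }

    free_values(container);
}
```

    With the strings kept alive this way, the second loop should print "Hello" and "world" instead of garbage. Since MyValue is Copy, nothing stops a copy of a pointer from outliving free_values, so in real code it is probably safer to keep the pointers inside a single owning type.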