cstring rust ffi bytebuffer

First byte is suddenly zeroed while converting to CString


I want to make an FFI call. To do so, I need to convert a Rust string into a zero-terminated C string in the proper encoding. I wrote the code below, but somehow the first byte gets replaced with zero. Why does this happen, and how do I fix it?

use encoding::all::WINDOWS_1251;
use encoding::{EncoderTrap, Encoding};
use std::ffi::{c_char, CString};

fn str_to_c_str(input: &str) -> *const c_char {
    let enc_input: Vec<u8> = WINDOWS_1251.encode(input, EncoderTrap::Strict).expect("TODO: panic message");
    println!("enc_input {:?}", enc_input);
    let cstring: CString = CString::new(enc_input).expect("TODO: panic message");
    let s1 = unsafe { std::slice::from_raw_parts(cstring.as_ptr(), 4) };
    println!("str_to_c_str {:?}", s1);
    cstring.as_ptr()
}

fn main() {
    unsafe {
        let data = str_to_c_str("ABC");
        let s2 = unsafe { std::slice::from_raw_parts(data, 4) };
        println!("main {:?}", s2);
    }
}

Here is the actual output:

enc_input [65, 66, 67]
str_to_c_str [65, 66, 67, 0]
main [0, 66, 67, 0]

I expect the last line to be:

main [65, 66, 67, 0]

[dependencies]
libc = "0.2.0"
encoding = "0.2"
byteorder = "1.4.3"

Added

If I rewrite my code like this, would it be valid? For example, if the compiler sees that there is no "valid" use of data after 1, and CString has no custom deallocator or its deallocation has no visible side effect, can the compiler move data's deallocation from 2 to 1 as an optimisation?

use encoding::all::WINDOWS_1251;
use encoding::{EncoderTrap, Encoding};
use std::ffi::{c_char, CString};

unsafe fn ffi_call(arg: *const c_char) -> () {
    let s1 = unsafe { std::slice::from_raw_parts(arg, 4) };
    println!("str_to_c_str {:?}", s1);
}

fn str_to_c_str(input: &str) -> CString {
    let enc_input: Vec<u8> = WINDOWS_1251.encode(input, EncoderTrap::Strict).expect("TODO: panic message");
    println!("enc_input {:?}", enc_input);
    let cstring: CString = CString::new(enc_input).expect("TODO: panic message");
    cstring
}

fn main() {
    let data = str_to_c_str("ABC");
    // 1
    unsafe {
        ffi_call(data.as_ptr());
    }
    // 2
}

Solution

  • If I rewrite my code like this, would it be valid? For example, if the compiler sees that there is no "valid" use of data after 1, and CString has no custom deallocator or its deallocation has no visible side effect, can the compiler move data's deallocation from 2 to 1 as an optimisation?

    No, it is not allowed to, precisely because that would break your program.

    As long as your code has no undefined behaviour (UB), you don't have to worry about compiler optimizations. If an optimization breaks a program that is free of UB, that is a bug in the compiler (and such bugs are pretty rare).
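
As for the original version: the CString is dropped when str_to_c_str returns, so the pointer it hands back is dangling and reading through it in main is UB. The zeroed first byte is consistent with that, since the standard library's Drop for CString overwrites the first byte with 0 so that this kind of use-after-free does not appear to work by accident. Here is a minimal sketch of the pattern from the rewritten code, assuming the pointer is only needed for the duration of the call. The WINDOWS_1251 encoding step is left out so the snippet needs only std, to_c_string is a hypothetical name, and ffi_call is a stand-in for the real foreign function:

```rust
use std::ffi::{c_char, CStr, CString};

// Hypothetical stand-in for the real FFI function: reads the NUL-terminated
// string behind `arg` and returns its bytes, terminator included.
unsafe fn ffi_call(arg: *const c_char) -> Vec<u8> {
    // SAFETY: the caller guarantees `arg` points to a live, NUL-terminated buffer.
    unsafe { CStr::from_ptr(arg) }.to_bytes_with_nul().to_vec()
}

// Returns the owning CString instead of a raw pointer, so the caller decides
// how long the buffer lives. `encoded` stands for the encoder's output
// (an assumption: the encoding step itself is omitted here).
fn to_c_string(encoded: Vec<u8>) -> CString {
    CString::new(encoded).expect("input must not contain an interior NUL byte")
}

fn main() {
    let data = to_c_string(b"ABC".to_vec());
    // `data` is dropped at the end of `main`, after the call returns,
    // so the pointer stays valid for the whole call.
    let seen = unsafe { ffi_call(data.as_ptr()) };
    println!("ffi_call saw {:?}", seen); // prints: ffi_call saw [65, 66, 67, 0]
}
```

If the C side needs to keep the pointer after the call returns, this pattern is not enough. In that case CString::into_raw hands ownership over, and CString::from_raw takes it back later so the buffer can be freed.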