rust dbf dbase

Reading .dbf file with Rust throws invalid character error


I am new to Rust and am building a POC that converts a .dbf file to CSV. I am reading the .dbf file with the Rust library dbase.

The issue is that when I create a sample .dbf file using dbfview, the code works fine. But when I use the .dbf file that I will be working with in real use, I get the following error:

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: InvalidFieldType('M')', src/libcore/result.rs:999:5

Here is the code I am using, taken from the given link.

use dbase::FieldValue;
let records = dbase::read("tests/data/line.dbf").unwrap();
for record in records {
    for (name, value) in record {
        println!("{} -> {:?}", name, value);
        match value {
            FieldValue::Character(string) => println!("Got string: {}", string),
            FieldValue::Numeric(value) => println!("Got numeric value of  {}", value),
            _ => {}
        }
    }
}

I think the ^M shows the character appended by Windows. What can I do to handle this error and read the file successfully? Any help would be much appreciated.


Solution

  • The short answer to your question is no, you will not be able to read this file with dbase-rs (or any current library) and you'll most likely have to rework this file to not contain a memo field.


    A deep dive into the DBF file format

    The InvalidFieldType error points at a structural feature of the file that your library cannot handle - a Memo field. We're going to deep-dive into the file to figure out why that is, and whether there is anything we can do to fix it.

    This is the header definition:

    [Image: DBF header layout]

    Of particular importance is byte 28 (offset 0000010, byte 0C), which is a bitmask of table flags. Most notably, bit 0x01 marks a table that has an associated .cdx file, and bit 0x02 marks a table that contains a memo field.

    With a value of 0x03, your file both has an associated .cdx file and contains a memo field. Since we already know that dbase-rs does not handle memo fields, the memo is looking increasingly likely to be the culprit.
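The flag check described above can be sketched in a few lines of Rust. This is only an illustration that uses the standard library, not anything from dbase-rs: the `TableFlags` struct and the `table_flags` function are hypothetical names, and the sketch assumes the header layout discussed here, with the flags in byte 28.

```rust
/// Two of the table flags stored in byte 28 of the header
/// (hypothetical helper type, not part of dbase-rs).
#[derive(Debug, PartialEq)]
struct TableFlags {
    has_cdx: bool,  // bit 0x01: an associated .cdx file exists
    has_memo: bool, // bit 0x02: the table contains a memo field
}

/// Decode the flags byte from the raw header bytes.
/// Returns `None` when fewer than 29 bytes are available.
fn table_flags(header: &[u8]) -> Option<TableFlags> {
    let flags = *header.get(28)?;
    Some(TableFlags {
        has_cdx: (flags & 0x01) != 0,
        has_memo: (flags & 0x02) != 0,
    })
}
```

To try it on a real file, read the first 32 bytes (for example with `std::fs::File` and `Read::read_exact`) and pass them to `table_flags`. For a flags byte of 0x03, both fields come back `true`.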

    Let's keep looking. From here on, the file lists its fields, and each field descriptor is 32 bytes long.

    Here are your fields:

    [Image: the field descriptors in your file]

    Within each descriptor, bytes 0-10 contain the field name and byte 11 is the type. Because the library you wanted to use can only parse certain field types, byte 11 is the only one we really care about.
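If you want to check the field types of a file yourself before handing it to a library, the descriptor layout above is simple enough to walk by hand. The following is a minimal sketch using only the standard library. `FieldInfo` and `field_infos` are hypothetical names, and the sketch assumes a 32-byte file header followed by 32-byte descriptors that end at a 0x0D terminator byte.

```rust
/// Name and type code of one field descriptor
/// (hypothetical helper type, not part of dbase-rs).
#[derive(Debug, PartialEq)]
struct FieldInfo {
    name: String,
    type_code: char,
}

/// Walk the 32-byte field descriptors that follow the 32-byte file header.
/// `bytes` must start at the beginning of the file. Parsing stops at the
/// 0x0D terminator or when fewer than 32 bytes remain.
fn field_infos(bytes: &[u8]) -> Vec<FieldInfo> {
    const HEADER_LEN: usize = 32;
    const DESCRIPTOR_LEN: usize = 32;
    const TERMINATOR: u8 = 0x0D;

    let mut fields = Vec::new();
    let rest = bytes.get(HEADER_LEN..).unwrap_or(&[]);
    for chunk in rest.chunks_exact(DESCRIPTOR_LEN) {
        if chunk[0] == TERMINATOR {
            break;
        }
        // Bytes 0-10 hold the NUL-padded field name.
        let name_bytes = &chunk[..11];
        let end = name_bytes.iter().position(|&b| b == 0).unwrap_or(11);
        let name = String::from_utf8_lossy(&name_bytes[..end]).into_owned();
        // Byte 11 holds the single-character type code.
        fields.push(FieldInfo { name, type_code: chunk[11] as char });
    }
    fields
}
```

Calling `field_infos` on the bytes of the file and printing each `type_code` shows at a glance whether a type such as `M` is present.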

    In order of appearance by what the library can parse:

    The last field is the problematic one. Looking into the library itself, this field type is not supported and will therefore yield an Error, which you are trying to unwrap(). This is the source of your error.

    There are three ways around it: