Error when trying to read binary file in VHDL

I am trying to write a VHDL simulation module that reads a binary file containing 16-bit input samples. The code is as follows:

library ieee; 
use IEEE.std_logic_1164.all; 
use IEEE.numeric_std.all; 
use std.textio.all;

entity read_binary_file is 
    generic ( clk_per : time := 8 ns ); 
end read_binary_file;

architecture behav of read_binary_file is

    subtype sample is signed(15 downto 0);
    
    type sample_file is file of sample;
    
    file input_file : sample_file;
    
    signal input : signed(15 downto 0);

begin

    read_file : process

    variable input_data : sample;

    begin
        
        file_open(input_file, ".\HPGex2_Na22\data_chA_r00020.bin", read_mode);

        while not endfile(input_file) loop

            read(input_file, input_data);

            input <= input_data;
            
            wait for clk_per;

        end loop;

        if endfile(input_file) then
            file_close(input_file);
            wait;
        end if;

    end process;
end behav;

For the following commands I am getting this output:

ghdl -a "--std=08" ./read_binary_file/read_binary_file.vhdl

ghdl -e "--std=08" read_binary_file

ghdl -r "--std=08" read_binary_file --stop-time="1000ms" --vcd=wave.vcd --disp-time

.\read_binary_file.exe:internal error: file: IO error

.\read_binary_file.exe:error: simulation failed

I've noticed that the error does not occur if, in the definition of the sample type, I replace "signed (WIDTH downto 0)" with "integer", "bit" or any type that is not vector-like (std_logic_vector, signed, unsigned etc.).

Solution

The issue is that the implementation-defined storage format for sample is not compatible with the contents of data_chA_r00020.bin. The binary representation ghdl expects for type unresolved_signed (the base type of subtype sample in the -2008 numeric_std package) includes a ghdl-specific header (#GHDL-BINARY-FILE-0.0[e].) followed by, for each element, a byte holding the position value of the enumerated type std_ulogic. Your .bin file wouldn't match that.

There are two general ways to cure the issue:

You could externally convert your binary file to something compatible with TEXTIO.
You can read the binary file as a sequence of characters. Type CHARACTER is a binary representation with all 256 values of its byte representation defined in its declaration.

The second option depends on several things.

ISO/IEC 8859-1:1998 (Information technology, 8-bit single-byte coded graphic character sets, Part 1: Latin alphabet No. 1) defines the values for the enumerated type CHARACTER, which is required to implement all 256 values. ghdl implements all predefined or IEEE enumerated types as bytes. Because there is no guarantee that another (vendor) implementation uses bytes, there can be a portability issue. However, it's likely that any implementation of CHARACTER will use a byte to hold its value. There are no plans to support bigger character sets.

Implementation details are not defined by the VHDL standard (IEEE Std 1076-2008). Historically you could implement Ada or VHDL on any computer platform, including those using binary-coded decimal arithmetic, because arithmetic operations have their mathematical meaning. Implementations don't reveal how many bits, or which structures, they use to hold values in simulation kernel space.

VHDL does allow overloading of subprograms. Here a READ procedure overload reads two characters, converts their position values to a binary representation and returns a signed value. Care is needed when taking two consecutive characters (here bytes) in order and constructing a 16-bit sample value: the result depends on which byte of each pair holds bits 7 downto 0 and which holds bits 15 downto 8 of subtype sample. That byte order is target platform dependent.

The character bytes to sample subtype conversion takes place in a READ procedure overload:

library ieee; 
use IEEE.std_logic_1164.all; 
use IEEE.numeric_std.all; 
-- use std.textio.all;

entity read_binary_file is 
    generic ( clk_per : time := 8 ns ); 
end read_binary_file;

architecture behav of read_binary_file is
    subtype sample is signed(15 downto 0);
    -- type sample_file is file of sample;   
    type sample_file is file of CHARACTER;
    file input_file : sample_file;
    signal input : signed(15 downto 0);
    
    -- OVERLOAD added:
    procedure READ (file f: sample_file; ivalue: out  sample) is
        variable retval: unsigned (sample'range);
        variable read_char:  CHARACTER;
    begin
        read (f, read_char);
    -- LITTLE ENDIAN:
        retval( 7 downto 0) := to_unsigned(CHARACTER'POS(read_char), 8);
        read (f, read_char);
        retval(15 downto 8) := to_unsigned(CHARACTER'POS(read_char), 8);
        ivalue := signed(retval);
    end procedure;
    
begin

read_file: 
    process
        variable input_data : sample;
    begin
        
        -- file_open(input_file, ".\HPGex2_Na22\data_chA_r00020.bin", read_mode);  -- ORIGINAL 
        file_open(input_file, "./x29.bin", read_mode);
        
        while not endfile(input_file) loop
            read(input_file, input_data);
            input <= input_data;
            wait for clk_per;
        end loop;

        if endfile(input_file) then
            file_close(input_file);
            wait;
        end if;

    end process;
end behav;

Note this doesn't depend on TEXTIO. TEXTIO can't be used to convert binary representations: it is line oriented and doesn't guarantee that the end-of-line indication is returned as part of a read byte stream.
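The READ overload above assembles the low byte first (little-endian). If the .bin file turns out to store the most significant byte first, the two slice assignments swap. A minimal sketch of that variant follows, not tested against your data; the package name sample_io_be, the file type name char_file and the procedure name read_be are hypothetical and only serve to make the example self-contained:

```vhdl
library ieee;
use ieee.numeric_std.all;

package sample_io_be is
    subtype sample is signed(15 downto 0);
    type char_file is file of character;

    -- Reads two consecutive bytes, most significant byte first (big-endian),
    -- and returns them as one 16-bit signed sample.
    procedure read_be (file f: char_file; ivalue: out sample);
end package;

package body sample_io_be is
    procedure read_be (file f: char_file; ivalue: out sample) is
        variable retval:    unsigned(sample'range);
        variable read_char: character;
    begin
        read(f, read_char);
        -- BIG ENDIAN: first byte read is the high byte
        retval(15 downto 8) := to_unsigned(character'pos(read_char), 8);
        read(f, read_char);
        retval( 7 downto 0) := to_unsigned(character'pos(read_char), 8);
        ivalue := signed(retval);
    end procedure;
end package body;
```

Like the little-endian version, this assumes the file holds an even number of bytes; the second read fails if the file ends in the middle of a sample.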

The example above demonstrates character (byte) to 16-bit representation conversion.

Depending on any header in the .bin file, you might need to skip to the first datum. You could provide a hex dump of the first part of your .bin file and describe how to find the first couple of data values and what their values are. An example of skipping beyond any header could then be provided in this or another answer.
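As a rough sketch of that idea, assuming the header has a known, fixed length in bytes (which I can't tell without seeing the file), the header can be read and discarded one character at a time before the first sample is read. The package name bin_header, the type name char_file and the procedure name skip_bytes are hypothetical:

```vhdl
package bin_header is
    type char_file is file of character;

    -- Discards up to 'count' bytes from f; stops early at end of file.
    procedure skip_bytes (file f: char_file; count: in natural);
end package;

package body bin_header is
    procedure skip_bytes (file f: char_file; count: in natural) is
        variable discard: character;
    begin
        for i in 1 to count loop
            exit when endfile(f);   -- header longer than the file: give up quietly
            read(f, discard);
        end loop;
    end procedure;
end package body;
```

It would be called once, right after file_open and before the while loop, with the header length of your particular .bin format as the second argument.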

To show that the .bin file doesn't match the binary format of a file of type sample, I wrote the values read in to a file named outfile:

library ieee; 
use IEEE.std_logic_1164.all; 
use IEEE.numeric_std.all; 
-- use std.textio.all;

entity read_binary_file is 
    generic ( clk_per : time := 8 ns ); 
end read_binary_file;

architecture behav of read_binary_file is
    subtype sample is signed(15 downto 0);
    -- type sample_file is file of sample;   
    type sample_file is file of CHARACTER;
    type save_file is file of sample;
    file output_file: save_file;
    file input_file : sample_file;
    signal input : signed(15 downto 0);
    
    -- OVERLOAD added:
    procedure READ (file f: sample_file; ivalue: out  sample) is
        variable retval: unsigned (sample'range);
        variable read_char:  CHARACTER;
    begin
        read (f, read_char);
    -- LITTLE ENDIAN:
        retval( 7 downto 0) := to_unsigned(CHARACTER'POS(read_char), 8);
        read (f, read_char);
        retval(15 downto 8) := to_unsigned(CHARACTER'POS(read_char), 8);
        ivalue := signed(retval);
    end procedure;
    
begin

read_file: 
    process
        variable input_data : sample;
    begin
        
        -- file_open(input_file, ".\HPGex2_Na22\data_chA_r00020.bin", read_mode);  -- ORIGINAL 
        file_open(input_file, "./x29.bin", read_mode);
        file_open(output_file, "./outfile", write_mode);
        
        while not endfile(input_file) loop
            read(input_file, input_data);
            write(output_file, input_data);
            input <= input_data;
            wait for clk_per;
        end loop;

        if endfile(input_file) then
            file_close(input_file);
            file_close(output_file);  -- close the output file as well
            wait;
        end if;

    end process;
end behav;

And the contents were viewed with a hex editor:

The red underlines (under the first 6 + 2 values, then under the next 8 values) correspond to bytes in the x29.bin input file. Here each group of 8 elements of the subtype sample (either 15 downto 8 or 7 downto 0, in left-to-right order) has the position value of every element recorded; the elements of the base type unresolved_signed are std_ulogic. In these values a '0' has a position value of 02 while a '1' has a position value of 03. By comparing the hex values from the x29.bin hex dump with the element (bit) values in the output hex dump you can see that they describe the same data values.
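The 02 and 03 values come straight from the declaration order of std_ulogic in package std_logic_1164 ('U', 'X', '0', '1', 'Z', 'W', 'L', 'H', '-'). A small stand-alone sketch that reports the position value of each element value; the entity name ulogic_pos is hypothetical:

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity ulogic_pos is
end ulogic_pos;

architecture behav of ulogic_pos is
begin
    process
    begin
        -- Walk the enumeration in declaration order and report each position.
        for v in std_ulogic loop
            report "std_ulogic'pos(" & std_ulogic'image(v) & ") = " &
                   integer'image(std_ulogic'pos(v));
        end loop;
        wait;
    end process;
end behav;
```

The reports for '0' and '1' give positions 2 and 3, matching the 02 and 03 bytes in the hex dump of the file ghdl wrote.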

It also shows that the subtype declaration for sample is expected to stay unchanged over the life of a binary file holding values of the type.

Writing an external conversion program that converts an original .bin file to a file of type sample (so that the VHDL code as originally posted in the question could be used) would depend on knowing the datum size (16 bits, 15 downto 0) and the byte endianness of the .bin file, as well as any header information to skip, which in turn depends on the ABI the .bin file adheres to. Knowing the number of data values (e.g. words) the .bin file contains can be helpful as well.

Such an external conversion program would be type (subtype) dependent. For more complex composite types it would be unique to the type, and would take more implementation effort than reading in some recognized .bin file format, which could itself carry ABI implications such as alignment to cache lines.

There are tools that convert binary images to tables of values, where the table part could likely be converted readily to a form compatible with TEXTIO reading. While anything to do with file I/O is at risk of being non-portable between implementations by different vendors, TEXTIO is somewhat portable across implementations compliant with the VHDL standard.
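For the TEXTIO route, a minimal sketch of the reading side, assuming the external conversion produced one decimal integer per line (for example -123). The entity name read_text_samples and the file name ./samples.txt are hypothetical:

```vhdl
library ieee;
use ieee.numeric_std.all;
use std.textio.all;

entity read_text_samples is
    generic ( clk_per : time := 8 ns );
end read_text_samples;

architecture behav of read_text_samples is
    file input_file : text;
    signal input : signed(15 downto 0);
begin

read_file:
    process
        variable l     : line;
        variable value : integer;
        variable ok    : boolean;
    begin
        file_open(input_file, "./samples.txt", read_mode);

        while not endfile(input_file) loop
            readline(input_file, l);
            read(l, value, ok);      -- ok is false for blank or malformed lines
            if ok then
                input <= to_signed(value, input'length);
                wait for clk_per;
            end if;
        end loop;

        file_close(input_file);
        wait;
    end process;
end behav;
```

Values outside the 16-bit signed range would be truncated by to_signed, so the conversion tool needs to emit values the sample subtype can hold.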