c++ hdf5 data-conversion binary-data

HDF5 simple dataset only writing half of the elements


This is a small piece of a larger data conversion utility. It reads an old format into memory, then writes the in-memory data to an HDF5-based output format. Below is a function, WriteFloatDataset, that writes a float dataset; the first few lines show how it is called. It writes the first half of the lines correctly (always exactly half), then garbage for the rest. The specification calls for floats rather than doubles. I have a double equivalent of this function and a character string version as well, and both work like a charm.

Update: the code below has been updated with the fix and other suggestions.

...
    float* fArray = new float[ ppArrayCount ];
    for ( i = 0; i < ppArrayCount; i++ ) fArray[ i ] = this->ppArray[ i ].maop;
    MyDataset::WriteFloatDataset( fileid, dataspace, "/data/", "MAOP", fArray );
...


void MyDataset::WriteFloatDataset( 
   hid_t fileid,
   hid_t dataspace,
   const char* group,
   const char* name,
   float data[] )
{
   // HDF5 file and datagroup sized to arrayCountx1 should be open prior to calling
   // This line was the problem: sizeof( H5T_NATIVE_FLOAT ) is the size of an hid_t (8 here), while sizeof( float ) is 4
   //   hid_t datatype = H5Tcreate( H5T_COMPOUND, sizeof( H5T_NATIVE_FLOAT ) );
   hid_t datatype = H5Tcreate( H5T_COMPOUND, sizeof( float ) );
   H5Tinsert( datatype, name, 0, H5T_NATIVE_FLOAT );

   char datasetName[ TEMP_BUF_LEN ];
   
   strcpy_s( datasetName, TEMP_BUF_LEN, group );
   strcat_s( datasetName, TEMP_BUF_LEN, name );

   hid_t dataset = H5Dcreate( fileid, datasetName, datatype, dataspace, H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
   H5Dwrite( dataset, datatype, dataspace, dataspace, H5P_DEFAULT, ( ( const void* )data ) );
   
   H5Dclose( dataset );
   H5Tclose( datatype );
}

[Screenshot: properties of the float dataset]

As mentioned, the very similar function below works fine, so the problem is something about the data type.

void POF110Dataset::WriteDoubleDataset(
   hid_t fileid,
   hid_t dataspace,
   const char* group,
   const char* name,
   double data[] )
{
   // HDF5 file and datagroup sized to arrayCountx1 should be open prior to calling
   hid_t datatype = H5Tcreate( H5T_COMPOUND, sizeof( double ) );
   H5Tinsert( datatype, name, 0, H5T_NATIVE_DOUBLE );

   char datasetName[ TEMP_BUF_LEN ];

   strcpy_s( datasetName, TEMP_BUF_LEN, group );
   strcat_s( datasetName, TEMP_BUF_LEN, name );

   hid_t dataset = H5Dcreate( fileid, datasetName, datatype, dataspace, H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT );
   H5Dwrite( dataset, datatype, dataspace, dataspace, H5P_DEFAULT, ( ( const void* )data ) );

   H5Dclose( dataset );
   H5Tclose( datatype );
}

[Screenshot: properties of the double dataset]

It occurred to me to try H5T_IEEE_F32LE instead of H5T_NATIVE_FLOAT. If that fixes it, I will answer my own question. Otherwise, any advice is appreciated.


Solution

  • This is wrong:

    hid_t datatype = H5Tcreate( H5T_COMPOUND, sizeof( H5T_NATIVE_FLOAT ) );
    

    specifically the sizeof( H5T_NATIVE_FLOAT ). You are supposed to give HDF5 the size of the compound struct. Instead, you are passing the size of hid_t, which is the type of the H5T_NATIVE_FLOAT global constant.

    In the double case it accidentally works because sizeof(double) == sizeof( H5T_NATIVE_DOUBLE ) on your system.

    You can also see the mistake in the screenshots when you look at the Storage field. Both datasets report 64464 bytes and have the same logical dimensions, even though one should store 32-bit values and the other 64-bit values.

    Change it to sizeof(float) and sizeof(double), respectively, if that is all you want to store. Normally, H5Tcreate with H5T_COMPOUND is given the size of a struct, like this:

    struct CompoundType {
        float x;
        int y;
    };
    hid_t datatype = H5Tcreate( H5T_COMPOUND, sizeof( CompoundType ) );
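The size mismatch and the Storage figure can be checked with plain arithmetic. The sketch below does not call HDF5 at all. `HandleType` is a hypothetical stand-in for `hid_t`, assumed to be 8 bytes wide as reported in the question. The element count is inferred as 64464 / 8, on the assumption that the Storage field counts only raw element data. `storageBytes` and the `k...` constants are illustrative names, not part of any library.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for HDF5's hid_t; assumed to be 8 bytes wide,
// matching the size reported in the question.
using HandleType = std::int64_t;

// Same layout as the example struct in the answer.
struct CompoundType {
    float x;
    int y;
};

// Raw bytes needed to store `count` elements of `elementSize` bytes each.
constexpr std::size_t storageBytes(std::size_t count, std::size_t elementSize) {
    return count * elementSize;
}

// Element count inferred from the screenshots (assumption: 64464 / 8).
constexpr std::size_t kCount = 64464 / sizeof(double);

// Passing the size of the handle: 8 bytes per element, the same storage
// as the double dataset.
constexpr std::size_t kWrongFloatBytes = storageBytes(kCount, sizeof(HandleType));

// Passing sizeof(float): 4 bytes per element, half the storage.
constexpr std::size_t kRightFloatBytes = storageBytes(kCount, sizeof(float));

// For a real compound type, the size to pass is that of the whole struct,
// which may include padding chosen by the compiler.
constexpr std::size_t kCompoundBytes = storageBytes(kCount, sizeof(CompoundType));
```

With these assumptions, the buggy float dataset needs exactly as many bytes as the double dataset, which matches the identical Storage values in the two screenshots. The float buffer passed in only holds half that many bytes, so the remainder of what gets written comes from memory past the end of the array.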