python numpy hdf5 h5py nastran

Replace table in HDF5 file with a modified table


I have an existing HDF5 file with multiple tables. I want to modify this HDF5 file: in one of the tables I want to drop some rows entirely, and modify values in the remaining rows.

I tried the following code:

import h5py
import numpy as np

with h5py.File("my_file.h5", "r+") as f:
    # Get array
    table = f["/NASTRAN/RESULT/ELEMENTAL/STRESS/QUAD4_COMP_CPLX"]
    arr = np.array(table)
    
    # Modify array
    arr = arr[arr[:, 1] == 2]
    arr[:, 1] = 1

    # Write array back
    table[...] = arr

This code however results in the following error when run:

Traceback (most recent call last):

  File "C:\_Work\test.py", line 10, in <module>
    arr[arr[:, 1] == 2]

IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed

So one of the problems seems to be that the NumPy array arr that I've created is not two-dimensional. However, I'm not sure exactly how to create a two-dimensional array from the HDF5 table (or whether that is even the best approach here).

Would anyone here be able to help put me on the right path?

Edit

Output from h5dump on my dataset is as follows

HDF5 "C:\_Work\my_file.h5" {
DATASET "/NASTRAN/RESULT/ELEMENTAL/STRESS/QUAD4_COMP_CPLX" {
   DATATYPE  H5T_COMPOUND {
      H5T_STD_I64LE "EID";
      H5T_STD_I64LE "PLY";
      H5T_IEEE_F64LE "X1R";
      H5T_IEEE_F64LE "Y1R";
      H5T_IEEE_F64LE "T1R";
      H5T_IEEE_F64LE "L1R";
      H5T_IEEE_F64LE "L2R";
      H5T_IEEE_F64LE "X1I";
      H5T_IEEE_F64LE "Y1I";
      H5T_IEEE_F64LE "T1I";
      H5T_IEEE_F64LE "L1I";
      H5T_IEEE_F64LE "L2I";
      H5T_STD_I64LE "DOMAIN_ID";
   }
   DATASPACE  SIMPLE { ( 990 ) / ( H5S_UNLIMITED ) }
   ATTRIBUTE "version" {
      DATATYPE  H5T_STD_I64LE
      DATASPACE  SIMPLE { ( 1 ) / ( 1 ) }
   }
}
}

Solution

  • This answer is specifically focused on OP's request in comments to "throw away all rows where the value for PLY is not 2. Then in the remaining rows change the value for PLY from 2 to 1".

    The procedure is relatively straightforward once you know the tricks. The steps are summarized here, with matching comments in the code:

    1. Create the stress dataset object (but don't read it into an array).
    2. Rename/move the original dataset to a saved name (not required, but good practice).
    3. Create a new stress array from the rows where PLY==2. This is the most sophisticated step: np.nonzero() returns the indices of the rows that match the condition stress_ds['PLY']==2, and those indices are then used to read just those rows from the dataset.
    4. Change PLY from 2 to 1 in all rows of the new array.
    5. Save the new array to a dataset with the original name.
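
Steps 3 and 4 can be tried in isolation on a small in-memory structured array, without touching any HDF5 file. The sketch below is only an illustration: the three-field dtype is a hypothetical cut-down version of the compound type shown in the h5dump output, and the row values are made up. It also shows why the original attempt failed: a compound dataset reads back as a 1-dimensional structured array, so columns are addressed by field name rather than by a second index.

```python
import numpy as np

# Hypothetical miniature of the compound type: only three of the fields
# from the h5dump output are used, and the values are invented.
dtype = np.dtype([("EID", "<i8"), ("PLY", "<i8"), ("X1R", "<f8")])
stress_arr = np.array(
    [(101, 1, 0.5), (101, 2, 1.5), (102, 1, 2.5), (102, 2, 3.5)],
    dtype=dtype,
)

# A structured array has a single axis (the rows). The fields are not a
# second axis, which is why arr[:, 1] raises IndexError.
print(stress_arr.ndim, stress_arr.shape)

# np.nonzero returns a tuple holding the indices of the matching rows.
rows = np.nonzero(stress_arr["PLY"] == 2)

# Indexing with those row indices returns a copy containing only those rows.
mod_stress_arr = stress_arr[rows]

# Assigning to a field updates that field in every remaining row.
mod_stress_arr["PLY"] = 1
print(mod_stress_arr)
```

Because the indexing returns a copy, stress_arr itself is left unchanged; only mod_stress_arr carries the new PLY values.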

    Code below:

    import h5py
    import numpy as np

    ds_path = '/NASTRAN/RESULT/ELEMENTAL/STRESS/QUAD4_COMP_CPLX'

    with h5py.File('quad4_comp_cplx_test.h5', 'r+') as h5f:
        # 1. Create stress dataset object (no need to read it into an array)
        stress_ds = h5f[ds_path]
        print(stress_ds.shape)

        # 2. Rename/move original dataset to a saved name;
        #    stress_ds still refers to the same (now renamed) dataset
        h5f.move(ds_path, ds_path + '_save')

        # 3. Read only the rows where PLY==2 from the dataset
        mod_stress_arr = stress_ds[np.nonzero(stress_ds['PLY'] == 2)]
        print(mod_stress_arr.shape)

        # 4. Change PLY from 2 to 1 in all rows
        mod_stress_arr['PLY'] = 1

        # 5. Save the modified array to a dataset with the original name
        h5f.create_dataset(ds_path, data=mod_stress_arr)
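
One thing to be aware of with the approach above: create_dataset(..., data=...) makes a new fixed-size dataset, so the unlimited maximum dimension and the "version" attribute shown in the h5dump output are not carried over unless you set them again. If those matter to whatever reads the file, a possible alternative is to shrink the existing dataset in place with Dataset.resize(). The following is a minimal sketch under stated assumptions, not a tested recipe for real Nastran output: filter_ply_in_place is a hypothetical helper name, the three-field dtype and the data are invented stand-ins, and the demo uses a throwaway in-memory file. It relies on the dataset being resizable, which the H5S_UNLIMITED dataspace in the h5dump output suggests is the case here.

```python
import h5py
import numpy as np


def filter_ply_in_place(ds, keep_ply=2, new_ply=1):
    """Keep rows whose PLY equals keep_ply, set their PLY to new_ply,
    and shrink the dataset to the remaining rows (hypothetical helper)."""
    arr = ds[()]                        # read the whole table into memory
    kept = arr[arr["PLY"] == keep_ply]  # boolean mask returns a copy
    kept["PLY"] = new_ply
    ds.resize((kept.shape[0],))         # requires a resizable (chunked) dataset
    ds[...] = kept                      # overwrite the remaining rows
    return kept.shape[0]


# Demo on an in-memory file with made-up data; nothing is written to disk.
dtype = np.dtype([("EID", "<i8"), ("PLY", "<i8"), ("X1R", "<f8")])
data = np.array(
    [(101, 1, 0.5), (101, 2, 1.5), (102, 1, 2.5), (102, 2, 3.5)],
    dtype=dtype,
)

with h5py.File("demo.h5", "w", driver="core", backing_store=False) as h5f:
    # maxshape=(None,) makes the dataset resizable along its only axis
    ds = h5f.create_dataset("QUAD4_COMP_CPLX", data=data, maxshape=(None,))
    ds.attrs["version"] = 1

    n_rows = filter_ply_in_place(ds)

    result = ds[()]
    version = ds.attrs["version"]
    maxshape = ds.maxshape

print(n_rows, result, version, maxshape)
```

Unlike the move-and-recreate approach, this variant keeps no backup copy of the original table, so it is worth copying the file first.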