I have selected specific HDF5 datasets and want to copy them to a new HDF5 file. I could find some tutorials on copying between two existing files, but what if you have just created a new file and want to copy datasets into it? I thought the approach below would work, but it doesn't. Are there any simple ways to do this?
>>> dic_oldDataset['old_dataset']
<HDF5 dataset "old_dataset": shape (333217,), type "|V14">
>>> new_file = h5py.File('new_file.h5', 'a')
>>> new_file.create_group('new_group')
>>> new_file['new_group']['new_dataset'] = dic_oldDataset['old_dataset']
RuntimeError: Unable to create link (interfile hard links are not allowed)
Answer 1 (using h5py):
This creates a simple structured array to populate the first dataset in the first file. That dataset is then read back into my_array and copied into the second file.
import h5py
import numpy as np

# Build a small structured array and write it as a dataset in the first file.
arr = np.array([(1, 'a'), (2, 'b')],
               dtype=[('foo', int), ('bar', 'S1')])
print(arr.dtype)

h5file1 = h5py.File('test1.h5', 'w')
h5file1.create_dataset('/ex_group1/ex_ds1', data=arr)
print(h5file1)

# Read the dataset back into memory as a NumPy array...
my_array = h5file1['/ex_group1/ex_ds1'][:]

# ...and write it out as a new dataset in the second file.
h5file2 = h5py.File('test2.h5', 'w')
h5file2.create_dataset('/exgroup2/ex_ds2', data=my_array)
print(h5file2)

h5file1.close()
h5file2.close()
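Answer 1 routes the data through NumPy, which is fine for small datasets. h5py also lets you copy objects directly between two open files with the copy method available on File and Group objects, which avoids reading the data into memory and also copies attributes. Below is a minimal sketch only; the file names 'old_file.h5' / 'new_file.h5' are made up for illustration, while the dataset and group names are taken from the question.

import h5py

# Sketch: copy a dataset straight from one open file into another.
# 'old_file.h5' and 'new_file.h5' are assumed example file names.
with h5py.File('old_file.h5', 'r') as old_file, \
     h5py.File('new_file.h5', 'a') as new_file:
    new_file.require_group('new_group')  # create the group if it does not already exist
    # copy() works across files, so the "interfile hard links" error is avoided.
    old_file.copy('old_dataset', new_file['new_group'], name='new_dataset')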