r dataset rhdf5

How can a dataset in an HDF5 file have unlimited dimensions using the rhdf5 package?


Although this may look similar to an existing question (is-it-possible-to-update-dataset-dimensions-in-hdf5-file-using-rhdf5-in-r), the two are not exactly the same.

The rhdf5 documentation says that the maximum dimensions of a dataset can be set when the dataset is created with h5createDataset(), using the maxdims parameter. But what if we don't know the dimensions beforehand? For example, the dataset might grow continuously, so we have no idea what maximum dimensions it will eventually reach.

The answer to the question mentioned above says that this can be done with the help of a dataspace and HDF5 constants.

Can anyone explain how HDF5 constants and a dataspace can be used to do this?


Solution

  • While experimenting with h5createDataset(), I found a way to do this:

    > library(rhdf5)
    
    > fid <- H5Fcreate('test.h5')
    
    > h5createGroup(fid,'1')
    [1] TRUE
    
    > h5createDataset(fid,'1/1',dims = c(2,2,2),maxdims = c(Inf,Inf,Inf))
    [1] TRUE
    Warning message:
      In H5Screate_simple(dims, maxdims) :
      NAs introduced by coercion to integer range
    
    > arr <- array(c(1:8),c(2,2,2))
    
    > h5write(arr,fid,'1/1')
    
    > h5read(fid,'1/1')
    , , 1
    
         [,1] [,2]
    [1,]    1    3
    [2,]    2    4
    
    , , 2
    
         [,1] [,2]
    [1,]    5    7
    [2,]    6    8
    

    Please correct me if I have made a mistake or if there is a better way to do this.
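
The warning in the transcript above suggests that Inf is not converted cleanly when the dataspace is created. A possibly cleaner alternative, sketched here on the assumption that the installed rhdf5 is recent enough to export H5Sunlimited(), is to pass that constant as maxdims and then grow the dataset with h5set_extent(). The file path and the dataset name "counts" are made up for this illustration, and the chunk size is an arbitrary choice (HDF5 requires chunked storage for extensible datasets).

```r
library(rhdf5)

# Hypothetical temporary file, used only for this sketch
path <- tempfile(fileext = ".h5")
h5createFile(path)

# H5Sunlimited() stands for the HDF5 constant H5S_UNLIMITED.
# Extensible datasets must be chunked, so a chunk size is given explicitly.
h5createDataset(path, "counts",
                dims = c(2, 2, 2),
                maxdims = rep(H5Sunlimited(), 3),
                chunk = c(2, 2, 2),
                storage.mode = "integer")

# Write the initial 2 x 2 x 2 block
h5write(array(1:8, c(2, 2, 2)), path, "counts")

# Grow the third dimension from 2 to 4, then fill the new slices
h5set_extent(path, "counts", c(2, 2, 4))
h5write(array(9:16, c(2, 2, 2)), path, "counts",
        index = list(NULL, NULL, 3:4))

out <- h5read(path, "counts")
dim(out)
```

Only the dimensions that actually need to grow have to be unlimited; the others can keep a finite maxdims value.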