I am fairly new to Tensorflow and I am having trouble with Dataset. I work on Windows 10, and the Tensorflow version is 2.6.0 used with CUDA. I have 2 numpy arrays that are X_train and X_test (already split). The train is 5Gb and the test is 1.5Gb. The shapes are:
X_train: (259018, 30, 30, 3), <class 'numpy.ndarray'>
Y_train: (259018, 1), <class 'numpy.ndarray'>
I create Datasets using the following code:
dataset_train = tf.data.Dataset.from_tensor_slices((X_train , Y_train)).batch(BATCH_SIZE)
And BATCH_SIZE = 32.
But I cannot create a Dataset, I get the following error:
2021-09-02 15:26:35.429930: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-09-02 15:26:35.772235: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 3495 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3060 Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.6
2021-09-02 15:26:36.414627: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 2700000000 exceeds 10% of free system memory.
2021-09-02 15:26:47.146977: W tensorflow/core/common_runtime/bfc_allocator.cc:457] Allocator (GPU_0_bfc) ran out of memory trying to allocate 607.1KiB (rounded to 621824)requested by op _EagerConst
If the cause is memory fragmentation maybe the environment variable 'TF_GPU_ALLOCATOR=cuda_malloc_async' will improve the situation.
Current allocation summary follows.
Current allocation summary follows.
2021-09-02 15:26:47.147299: I tensorflow/core/common_runtime/bfc_allocator.cc:1004] BFCAllocator dump for GPU_0_bfc
2021-09-02 15:26:47.147383: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (256): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-09-02 15:26:47.147514: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (512): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-09-02 15:26:47.147636: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (1024): Total Chunks: 1, Chunks in use: 1. 1.2KiB allocated for chunks. 1.2KiB in use in bin. 1.0KiB client-requested in use in bin.
2021-09-02 15:26:47.147761: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (2048): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-09-02 15:26:47.147905: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (4096): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-09-02 15:26:47.148040: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (8192): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-09-02 15:26:47.148157: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (16384): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-09-02 15:26:47.148276: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (32768): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-09-02 15:26:47.148402: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (65536): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-09-02 15:26:47.148518: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (131072): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-09-02 15:26:47.148645: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (262144): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-09-02 15:26:47.148786: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (524288): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-09-02 15:26:47.148918: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (1048576): Total Chunks: 1, Chunks in use: 1. 1.91MiB allocated for chunks. 1.91MiB in use in bin. 1.91MiB client-requested in use in bin.
2021-09-02 15:26:47.149079: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (2097152): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-09-02 15:26:47.149212: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (4194304): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-09-02 15:26:47.149342: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (8388608): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-09-02 15:26:47.149477: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (16777216): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-09-02 15:26:47.164471: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (33554432): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-09-02 15:26:47.164619: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (67108864): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-09-02 15:26:47.164765: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (134217728): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-09-02 15:26:47.164884: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (268435456): Total Chunks: 2, Chunks in use: 2. 3.41GiB allocated for chunks. 3.41GiB in use in bin. 3.30GiB client-requested in use in bin.
2021-09-02 15:26:47.164982: I tensorflow/core/common_runtime/bfc_allocator.cc:1027] Bin for 607.2KiB was 512.0KiB, Chunk State:
2021-09-02 15:26:47.165040: I tensorflow/core/common_runtime/bfc_allocator.cc:1040] Next region of size 3665166336
2021-09-02 15:26:47.165106: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at b0e200000 of size 2700000000 next 1
2021-09-02 15:26:47.165159: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at baf0ebb00 of size 1280 next 2
2021-09-02 15:26:47.165208: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at baf0ec000 of size 2000128 next 3
2021-09-02 15:26:47.165250: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at baf2d4500 of size 963164928 next 18446744073709551615
2021-09-02 15:26:47.165297: I tensorflow/core/common_runtime/bfc_allocator.cc:1065] Summary of in-use Chunks by size:
2021-09-02 15:26:47.165341: I tensorflow/core/common_runtime/bfc_allocator.cc:1068] 1 Chunks of size 1280 totalling 1.2KiB
2021-09-02 15:26:47.165382: I tensorflow/core/common_runtime/bfc_allocator.cc:1068] 1 Chunks of size 2000128 totalling 1.91MiB
2021-09-02 15:26:47.165426: I tensorflow/core/common_runtime/bfc_allocator.cc:1068] 1 Chunks of size 963164928 totalling 918.54MiB
2021-09-02 15:26:47.165470: I tensorflow/core/common_runtime/bfc_allocator.cc:1068] 1 Chunks of size 2700000000 totalling 2.51GiB
2021-09-02 15:26:47.165514: I tensorflow/core/common_runtime/bfc_allocator.cc:1072] Sum Total of in-use chunks: 3.41GiB
2021-09-02 15:26:47.165558: I tensorflow/core/common_runtime/bfc_allocator.cc:1074] total_region_allocated_bytes_: 3665166336 memory_limit_: 3665166336 available bytes: 0 curr_region_allocation_bytes_: 7330332672
2021-09-02 15:26:47.165633: I tensorflow/core/common_runtime/bfc_allocator.cc:1080] Stats:
Limit: 3665166336
InUse: 3665166336
MaxInUse: 3665166336
NumAllocs: 4
MaxAllocSize: 2700000000
Reserved: 0
PeakReserved: 0
LargestFreeBlock: 0
2021-09-02 15:26:47.165771: W tensorflow/core/common_runtime/bfc_allocator.cc:468] *************************************************************************************************xxx
Traceback (most recent call last):
File "C:/Users/headl/Documents/github projects/datascience/DL_model_deep_insight.py", line 100, in <module>
dataset_train, dataset_test = prepare_tf_dataset(path_to_x_train, config.y_train_combined,
File "C:/Users/headl/Documents/github projects/datascience/DL_model_deep_insight.py", line 28, in prepare_tf_dataset
dataset_test = tf.data.Dataset.from_tensor_slices((X_test , Y_test)).batch(BATCH_SIZE)
File "C:\Users\headl\Documents\virtual_env\datascience\lib\site-packages\tensorflow\python\data\ops\dataset_ops.py", line 685, in from_tensor_slices
return TensorSliceDataset(tensors)
File "C:\Users\headl\Documents\virtual_env\datascience\lib\site-packages\tensorflow\python\data\ops\dataset_ops.py", line 3844, in __init__
element = structure.normalize_element(element)
File "C:\Users\headl\Documents\virtual_env\datascience\lib\site-packages\tensorflow\python\data\util\structure.py", line 129, in normalize_element
ops.convert_to_tensor(t, name="component_%d" % i, dtype=dtype))
File "C:\Users\headl\Documents\virtual_env\datascience\lib\site-packages\tensorflow\python\profiler\trace.py", line 163, in wrapped
return func(*args, **kwargs)
File "C:\Users\headl\Documents\virtual_env\datascience\lib\site-packages\tensorflow\python\framework\ops.py", line 1566, in convert_to_tensor
ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
File "C:\Users\headl\Documents\virtual_env\datascience\lib\site-packages\tensorflow\python\framework\tensor_conversion_registry.py", line 52, in _default_conversion_function
return constant_op.constant(value, dtype, name=name)
File "C:\Users\headl\Documents\virtual_env\datascience\lib\site-packages\tensorflow\python\framework\constant_op.py", line 271, in constant
return _constant_impl(value, dtype, shape, name, verify_shape=False,
File "C:\Users\headl\Documents\virtual_env\datascience\lib\site-packages\tensorflow\python\framework\constant_op.py", line 283, in _constant_impl
return _constant_eager_impl(ctx, value, dtype, shape, verify_shape)
File "C:\Users\headl\Documents\virtual_env\datascience\lib\site-packages\tensorflow\python\framework\constant_op.py", line 308, in _constant_eager_impl
t = convert_to_eager_tensor(value, ctx, dtype)
File "C:\Users\headl\Documents\virtual_env\datascience\lib\site-packages\tensorflow\python\framework\constant_op.py", line 106, in convert_to_eager_tensor
return ops.EagerTensor(value, ctx.device_name, dtype)
tensorflow.python.framework.errors_impl.InternalError: Failed copying input tensor from /job:localhost/replica:0/task:0/device:CPU:0 to /job:localhost/replica:0/task:0/device:GPU:0 in order to run _EagerConst: Dst tensor is not initialized.
Process finished with exit code 1
There seems to be a problem of running out of GPU memory, and indeed, when I follow this process in the Windows task manager I can see a peak in GPU usage just before the script dies. I tried to use only some part of the X_train. I can create a Dataset up to X_train[:240000]. When I add more rows after that, the error appears. I thought that the Tensorflow Dataset is a generator that was supposed to take care of the memory problem, along with batches? Also, reducing the batch size did not have any effect. I also tried to do the suggested 'TF_GPU_ALLOCATOR=cuda_malloc_async' but it didn't work neither.
What can I do to load the whole data?
Thank you very much in advance!
That's working as designed. from_tensor_slices is really only useful for small amounts of data. Dataset is designed for large datasets that need to be streamed from disk.
The hard way but ideal way to do this would be to write your numpy array data to TFRecords then read them in as a dataset via TFRecordDataset. Here's the guide.
https://www.tensorflow.org/tutorials/load_data/tfrecord
The easier way but less performant way to do this would be Dataset.from_generator. Here is a minimal example:
>>> ds = tf.data.Dataset.from_generator(lambda: np.arange(100), output_signature=tf.TensorSpec(shape=(), dtype=tf.int32))
>>> for d in ds:
... print(d)
...
tf.Tensor(0, shape=(), dtype=int32)
tf.Tensor(1, shape=(), dtype=int32)
...