My goal is to run a 1D convolutional neural network in Julia on an NVIDIA GPU, using Flux and CUDA. I managed to build a model and train it on the CPU, but it will not run on the GPU. Here is a minimal reproducible example (the CPU path works, the GPU path fails):
import Pkg
Pkg.activate("env")
using CUDA
using Random
using Flux
# Show info about CUDA
CUDA.versioninfo()
CUDA.device()
# Make training data and labels
train = rand(Float32, 720, 1, 100000)
labels = rand(Bool, 100000)
hot_labels = Flux.onehotbatch(labels, 0:1)
# Make the 1D CNN model
model = Chain(
    Conv((16,), 1 => 16, relu, stride = 1, pad = 0),
    MaxPool((2,)),
    Conv((8,), 16 => 32, relu, stride = 1, pad = 0),
    MaxPool((2,)),
    Flux.flatten,
    # 720 -> 705 (Conv 16) -> 352 (MaxPool 2) -> 345 (Conv 8) -> 172 (MaxPool 2); 172 * 32 channels = 5504
    Dense(5504 => 512, relu),
    Dropout(0.4),
    Dense(512 => 256, relu),
    Dropout(0.4),
    Dense(256 => 2, relu),
    softmax
)
optimizer = Flux.setup(Adam(), model)
# Copy the model and data to GPU
gpu_model = gpu(model)
gpu_train = gpu(train)
println("Try to use the cpu model with cpu data")
output = model(train)
println("Success")
println("Try to use the gpu model with gpu data")
gpu_output = gpu_model(gpu_train)
println("Success")
exit(0)
The output is this:
CUDA runtime 12.1, artifact installation
CUDA driver 12.1
NVIDIA driver 530.30.2
CUDA libraries:
- CUBLAS: 12.1.3
- CURAND: 10.3.2
- CUFFT: 11.0.2
- CUSOLVER: 11.4.5
- CUSPARSE: 12.1.0
- CUPTI: 18.0.0
- NVML: 12.0.0+530.30.2
Julia packages:
- CUDA.jl: 4.3.2
- CUDA_Driver_jll: 0.5.0+1
- CUDA_Runtime_jll: 0.6.0+0
- CUDA_Runtime_Discovery: 0.2.2
Toolchain:
- Julia: 1.9.1
- LLVM: 14.0.6
- PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5, 7.0, 7.1, 7.2, 7.3, 7.4, 7.5
- Device capability support: sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75, sm_80, sm_86
2 devices:
0: NVIDIA RTX A6000 (sm_86, 47.533 GiB / 47.988 GiB available)
1: NVIDIA RTX A6000 (sm_86, 47.533 GiB / 47.988 GiB available)
Try to use the cpu model with cpu data
Success
Try to use the gpu model with gpu data
ERROR: LoadError: CUDNNError: CUDNN_STATUS_NOT_SUPPORTED (code 9)
Stacktrace:
[1] throw_api_error(res::cuDNN.cudnnStatus_t)
@ cuDNN ~/.julia/packages/cuDNN/3J08S/src/libcudnn.jl:11
[2] check
@ ~/.julia/packages/cuDNN/3J08S/src/libcudnn.jl:21 [inlined]
[3] cudnnPoolingForward
@ ~/.julia/packages/CUDA/pCcGc/lib/utils/call.jl:26 [inlined]
[4] #cudnnPoolingForwardAD#679
@ ~/.julia/packages/cuDNN/3J08S/src/pooling.jl:90 [inlined]
[5] cudnnPoolingForwardAD
@ ~/.julia/packages/cuDNN/3J08S/src/pooling.jl:89 [inlined]
[6] #cudnnPoolingForwardWithDefaults#678
@ ~/.julia/packages/cuDNN/3J08S/src/pooling.jl:57 [inlined]
[7] #cudnnPoolingForward!#677
@ ~/.julia/packages/cuDNN/3J08S/src/pooling.jl:36 [inlined]
[8] cudnnPoolingForward!
@ ~/.julia/packages/cuDNN/3J08S/src/pooling.jl:36 [inlined]
[9] maxpool!(y::CuArray{Float32, 4, CUDA.Mem.DeviceBuffer}, x::CuArray{Float32, 4, CUDA.Mem.DeviceBuffer}, pdims::PoolDims{2, 2, 2, 4, 2})
@ NNlibCUDA ~/.julia/packages/NNlibCUDA/C6t0p/src/cudnn/pooling.jl:16
[10] maxpool!
@ ~/.julia/packages/NNlibCUDA/C6t0p/src/cudnn/pooling.jl:54 [inlined]
[11] maxpool(x::CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}, pdims::PoolDims{1, 1, 1, 2, 1}; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ NNlib ~/.julia/packages/NNlib/Fg3DQ/src/pooling.jl:119
[12] maxpool
@ ~/.julia/packages/NNlib/Fg3DQ/src/pooling.jl:114 [inlined]
[13] MaxPool
@ ~/.julia/packages/Flux/n3cOc/src/layers/conv.jl:697 [inlined]
[14] macro expansion
@ ~/.julia/packages/Flux/n3cOc/src/layers/basic.jl:53 [inlined]
[15] _applychain(layers::Tuple{Conv{1, 2, typeof(relu), CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, MaxPool{1, 2}, Conv{1, 2, typeof(relu), CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, MaxPool{1, 2}, typeof(Flux.flatten), Dense{typeof(relu), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dropout{Float64, Colon, CUDA.RNG}, Dense{typeof(relu), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dropout{Float64, Colon, CUDA.RNG}, Dense{typeof(relu), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, typeof(softmax)}, x::CuArray{Float32, 3, CUDA.Mem.DeviceBuffer})
@ Flux ~/.julia/packages/Flux/n3cOc/src/layers/basic.jl:53
[16] (::Chain{Tuple{Conv{1, 2, typeof(relu), CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, MaxPool{1, 2}, Conv{1, 2, typeof(relu), CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, MaxPool{1, 2}, typeof(Flux.flatten), Dense{typeof(relu), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dropout{Float64, Colon, CUDA.RNG}, Dense{typeof(relu), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dropout{Float64, Colon, CUDA.RNG}, Dense{typeof(relu), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, typeof(softmax)}})(x::CuArray{Float32, 3, CUDA.Mem.DeviceBuffer})
@ Flux ~/.julia/packages/Flux/n3cOc/src/layers/basic.jl:51
[17] top-level scope
@ ~/julia/test.jl:49
The newest CUDA version is on the PATH; cuDNN isn't. All Julia packages were re-installed after these changes. Any ideas how to fix this?
This isn't really a Julia problem. This error is raised when there isn't enough GPU memory for the batch size you are trying to run. I suggest lowering your batch size or, in your case, the number of training data points, since you are passing all 100,000 samples through the model in a single batch.
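If you want to keep all 100,000 samples, you can stream them through the model in smaller chunks instead of one giant batch. Here is a minimal sketch using Flux.DataLoader, reusing the gpu_model and train arrays from your question; the batch size of 1024 is just an assumption, shrink it further if it still fails:
using Flux, CUDA
batchsize = 1024  # hypothetical value; tune it to your GPU memory
# Iterate over the last dimension of `train` in chunks of `batchsize`
loader = Flux.DataLoader(train; batchsize = batchsize, shuffle = false)
# Move one mini-batch at a time to the GPU, run the forward pass,
# and copy each 2×batchsize probability matrix back to the CPU
outputs = [cpu(gpu_model(gpu(batch))) for batch in loader]
# Concatenate the per-batch results into a single 2×100000 matrix
probs = reduce(hcat, outputs)
The same pattern should work for training: iterate (x, y) pairs from Flux.DataLoader((train, hot_labels); batchsize = batchsize, shuffle = true) and compute the gradient and optimizer update per mini-batch instead of on the whole array at once.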