I have a ResNet model that I am working with. I originally trained the model on batches of images. Now that it is trained, I want to run inference on a single image (224×224 with 3 color channels). However, when I pass the image to my model via model(imgs[:, :, :, 2]), I get:
DimensionMismatch("Rank of x and w must match! (3 vs. 4)")
Stacktrace:
[1] DenseConvDims(x::Array{Float32, 3}, w::Array{Float32, 4}; kwargs::Base.Iterators.Pairs{Symbol, Any, NTuple{4, Symbol}, NamedTuple{(:stride, :padding, :dilation, :groups), Tuple{Tuple{Int64, Int64}, Tuple{Int64, Int64}, Tuple{Int64, Int64}, Int64}}})
@ NNlib ~/.julia/packages/NNlib/P9BhZ/src/dim_helpers/DenseConvDims.jl:58
[2] (::Conv{2, 2, typeof(identity), Array{Float32, 4}, Vector{Float32}})(x::Array{Float32, 3})
@ Flux ~/.julia/packages/Flux/ZnXxS/src/layers/conv.jl:162
...
For reference, imgs[:, :, :, 2] gives:
224×224×3 Array{Float32, 3}:
[:, :, 1] =
0.4 0.419608 0.482353 0.490196 … 0.623529 0.611765 0.627451
0.423529 0.478431 0.513726 0.486275 0.65098 0.65098 0.65098
0.419608 0.47451 0.541176 0.54902 0.682353 0.670588 0.639216
0.52549 0.529412 0.568627 0.564706 0.588235 0.592157 0.572549
0.556863 0.541176 0.513726 0.505882 0.603922 0.635294 0.654902
0.486275 0.490196 0.521569 0.537255 … 0.635294 0.654902 0.65098
0.529412 0.513726 0.533333 0.537255 0.603922 0.596078 0.596078
0.521569 0.52549 0.505882 0.513726 0.580392 0.576471 0.572549
...
Any idea what I am missing here? Does the model require the same dimensions during inference that it was trained on? Is there a way to check this, to make sure I am giving the correct input dimensions?
Update: I realized that I need to pass in the number of images (which in this case is one), so I did:
img1 = cat(imgs[:, :, :, 1]; dims = ndims(imgs[:, :, :, 1]) + 1)
model(img1)
which works as expected. I'll leave this question open in case anyone has an answer to the original question about checking input dims.
As you discovered, NNlib.jl (the library that implements convolution for Flux) expects the input to have a batch dimension. If you are passing a single image through, you can pad it with a singleton batch dimension. There are several ways to achieve this.
First, the problem:
julia> size(x[:, :, :, 1])
(3, 3, 16)
The input (e.g. a color image) should be 4D (width × height × channels × batch).
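You can see this with a freshly constructed Conv layer, which only accepts 4D input regardless of batch size (a minimal sketch; the 3 => 16 layer here is illustrative, not your trained ResNet):
julia> using Flux

julia> layer = Conv((3, 3), 3 => 16);  # a 2D conv; its weight array is 4D

julia> size(layer(rand(Float32, 224, 224, 3, 1)))  # 4D input: works
(222, 222, 16, 1)

julia> layer(rand(Float32, 224, 224, 3))  # 3D input: the error from the question
ERROR: DimensionMismatch("Rank of x and w must match! (3 vs. 4)")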
One option to pad a batch dimension is to use a range for the index:
julia> size(x[:, :, :, 1:1])
(3, 3, 16, 1)
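This works because indexing with the range 1:1 keeps a size-1 dimension, whereas indexing with the integer 1 drops it.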
Another option, if you are given a single image as a 3D array, is to use Flux.unsqueeze:
julia> size(Flux.unsqueeze(x[:, :, :, 1], ndims(x)))
(3, 3, 16, 1)
julia> all(Flux.unsqueeze(x[:, :, :, 1], ndims(x)) .== x[:, :, :, 1])
true
We pass unsqueeze the dimension we want to pad as an argument. In our case, this should be the last dimension of x (i.e. the batch dimension), which we can get in Julia with ndims(x).
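As for checking the expected input dimensions: a Conv layer's weight array is always 4D with layout (kernel width, kernel height, input channels, output channels), so you can read the required channel count straight off the first layer. A minimal sketch (the (7, 7), 3 => 64 stem below is just a typical ResNet first layer, assumed here for illustration):
julia> layer = Conv((7, 7), 3 => 64);  # hypothetical ResNet stem

julia> size(layer.weight)  # (kernel W, kernel H, input channels, output channels)
(7, 7, 3, 64)

So a model starting with this layer expects width × height × 3 × batch input. If your Flux version provides it, Flux.outputsize(model, (224, 224, 3, 1)) will also propagate a shape through the whole model without actually running it, which is a cheap way to confirm that an input size is accepted.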