
No decrease in loss after lots of training


I was trying to train a convolutional network, but it is not improving: the loss is not decreasing. The train! function also terminates much more quickly than usual. Below is minimal code that reproduces the problem.

using Flux

data = rand(200, 100, 1, 50)
label = rand([0.0, 1.0], 1, 50)

model = Chain(
    Conv((3,3), 1=>5, pad=(1,1)),
    MaxPool((2,2)),
    Conv((3,3), 5=>5, pad=(1,1)),
    MaxPool((2,2)),
    Conv((3,3), 5=>5, pad=(1,1)),
    MaxPool((2,2)),
    x -> reshape(x, :, size(x, 4)),
    x -> σ.(x),
    Dense(1500, 100),
    Dense(100, 1)
)

model(data)

loss = Flux.mse
opt = Descent(0.1)
param = params(model)

loss(model(data), label)  #=> 0.3492440767136241

Flux.train!(loss, param, zip(data, label), opt)

loss(model(data), label)  #=> 0.3492440767136241

Solution

  • The first argument to Flux.train! needs to be a function which accepts a data point, runs the model, and returns the loss. Its training loop looks something like this:

    for dtup in zip(data, label)
        gradient(() -> loss(dtup...), params)
        ...
    end
    

    But the function loss you provide doesn't call the model at all; it just compares the data point to the label directly. A corrected sketch is given below.

    There is more to fix here, though. What's being iterated over is tuples of single numbers, starting with zip(data, label) |> first, which I don't think is what you want. Maybe you wanted Flux.DataLoader to iterate over batches of images? A sketch of that is also given below.
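
    Here is a minimal sketch of the first fix, training on the whole dataset as a single batch. The name loss2 is just for illustration, chosen because loss is already bound to Flux.mse above and so cannot be redefined as a method:

    # Illustrative name; this loss actually runs the model
    # before comparing the output to the labels.
    loss2(x, y) = Flux.mse(model(x), y)

    # A dataset of one (input, label) tuple; train! splats each
    # element of the collection into the loss function.
    Flux.train!(loss2, params(model), [(data, label)], opt)

    loss2(data, label)  # should now change after each call to train!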
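
    And a sketch of the DataLoader idea (batchsize=10 is an arbitrary choice here; DataLoader slices the last dimension of each array, so each batch is a 200×100×1×10 block of images with the matching 1×10 slice of labels):

    batches = Flux.DataLoader((data, label), batchsize=10, shuffle=true)

    Flux.train!(loss2, params(model), batches, opt)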