The following lines of code appear in the Levenberg-Marquardt implementation in the optimization package "Optim":
DtD = diagm(Float64[max(x, MIN_DIAGONAL) for x in sum(J.^2,1)])
delta_x = ( J'*J + sqrt(lambda)*DtD ) \ -J'*fcur
However, my question has nothing to do with the algorithm or anything specific to the package. I guess it has more to do with linear algebra and factorization in base Julia.
If I have a full matrix J, the following works:
In [3]: n = 5
J = rand(n,n)
fcur = rand(n)
DtD = diagm(Float64[max(x, 1e-6) for x in sum(J.^2,1)])
( J'*J + sqrt(100)*DtD ) \ -J'*fcur
Out [3]: 5-element Array{Float64,1}:
-0.0740316
-0.0643279
-0.0665033
-0.10568
-0.0599613
However, if J is sparse, I get an error:
In [4]: J = sparse(J)
DtD = diagm(Float64[max(x, 1e-6) for x in sum(J.^2,1)])
( J'*J + sqrt(100)*DtD ) \ -J'*fcur
Out [4]: ERROR: `A_ldiv_B!` has no method matching A_ldiv_B!(::Cholesky{Float64}, ::SparseMatrixCSC{Float64,Int64})
while loading In[4], in expression starting on line 3
in \ at linalg/generic.jl:233
So as far as I understand (with my limited knowledge of Julia as a beginner), this error occurs because Julia tries to compute ( J'*J + sqrt(100)*DtD ) \ -J'
first. My first question is: how can I find out which path Julia takes when evaluating the above code? I am aware of @which,
but I don't know how to use it to get to A_ldiv_B!, as this should start with \(A,B)
and then somehow end up at A_ldiv_B!:
In [6]: a = ( J'*J + sqrt(100)*DtD ); b = -J'; @which a\b
Out [6]: \(A::Union(SubArray{T,2,A<:DenseArray{T,N},I<:(Union(Range{Int64},Int64)...,)},DenseArray{T,2}),B::Union(SubArray{T,2,A<:DenseArray{T,N},I<:(Union(Range{Int64},Int64)...,)},SubArray{T,1,A<:DenseArray{T,N},I<:(Union(Range{Int64},Int64)...,)},DenseArray{T,2},DenseArray{T,1})) at linalg/dense.jl:409
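One version-independent way to see the first step of that path, without running anything, is to quote the expression and inspect how the parser groups it: `\` and `*` have equal precedence and associate left to right, so the solve is applied to the matrix -J' before fcur enters at all. A minimal sketch (the exact Expr layout shown is from current Julia; older versions lower ' differently, but the grouping is the same):

```julia
# Quote the expression to see how the parser groups it (nothing is evaluated).
ex = :( A \ -J' * fcur )

# The outermost call is `*`, i.e. the expression is (A \ -J') * fcur:
println(ex.args[1])          # prints: *
# ...and the inner call is the solve, with the matrix -J' on the right:
println(ex.args[2].args[1])  # prints: \
```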
Also note that:
In [7]: typeof(a)
Out [7]: Array{Float64,2}
In [8]: typeof(b)
Out [8]: Array{Float64,2}
This makes it even more confusing, as there is no Cholesky type anywhere in the above. My second question is: how does the Cholesky type appear? The error message says: A_ldiv_B!
has no method matching A_ldiv_B!(::Cholesky{Float64}, ::SparseMatrixCSC{Float64,Int64})
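As a sketch of where the Cholesky comes from (shown here in current Julia, where A_ldiv_B! has become ldiv!): for a square dense matrix, `\` first calls factorize on the left-hand side, and since J'*J + sqrt(100)*DtD is symmetric positive definite, factorize returns a Cholesky object, which is then handed to the in-place solver together with the right-hand side:

```julia
using LinearAlgebra

# A small symmetric positive definite matrix, standing in for
# J'*J + sqrt(100)*DtD from the question.
A = [4.0 1.0; 1.0 3.0]

# Dense `\` factorizes its left-hand side first; for a symmetric
# positive definite matrix, factorize chooses a Cholesky factorization.
F = factorize(A)
println(typeof(F) <: Cholesky)   # prints: true

# The factorization object is what gets passed to the in-place solver;
# the error above fires because the right-hand side it receives is sparse.
b = [1.0, 2.0]
x = F \ b
println(A * x ≈ b)               # prints: true
```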
Another interesting point I found by accident is that if the sparse matrix is of size (2,2), the above error does not occur:
In [9]: n = 2
J = sparse(rand(n,n))
fcur = rand(n)
DtD = diagm(Float64[max(x, 1e-6) for x in sum(J.^2,1)])
( J'*J + sqrt(100)*DtD ) \ -J'*fcur
Out [9]: 2-element Array{Float64,1}:
-0.0397989
-0.052129
Finally, I could solve this problem by putting -J'*fcur
in parentheses, which seems to be the author's intention anyway. But I am very confused. Any thoughts are greatly appreciated. Thanks!
In [12]: n = 5
J = sparse(rand(n,n))
fcur = rand(n)
DtD = diagm(Float64[max(x, 1e-6) for x in sum(J.^2,1)])
( J'*J + sqrt(100)*DtD ) \ (-J'*fcur)
Out [12]: 5-element Array{Float64,1}:
-0.0536266
-0.0692286
-0.0673166
-0.0616349
-0.0559813
It can be a bit tricky to figure out exactly which code path is taken when you run into code that uses substitutions during parsing, as is the case for '. You could try
julia> ( J'*J + sqrt(100)*DtD ) \ -J'fcur
to see another substitution taking place.
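For example, J'fcur contains no operator at all, yet it parses as the multiplication J' * fcur. A quick check by quoting (the form shown is what current Julia produces; the grouping is the same in 0.x versions):

```julia
# `J'fcur` has no explicit operator, but the parser inserts a multiplication:
ex = :( J'fcur )
println(ex.args[1])   # prints: *   -- i.e. the expression is J' * fcur
```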
I don't know if it really answers your question, but I'll note three things about your example.
1. Without the parentheses, the expression parses as
(( J'*J + sqrt(100)*DtD ) \ ctranspose(-J))*fcur
so the solve is applied to the matrix -J' first, and the result is then multiplied by fcur.
2. We haven't implemented sparse right-hand sides in solves, because even when b in Ax=b is sparse, the result is typically not sparse, so the gain from exploiting the sparseness of b is modest.
3. The "right" way of solving the problem is to add the parentheses around (-J'*fcur), because then the computation is a sparse matrix-vector multiplication followed by a matrix-vector solve, instead of a matrix-matrix solve followed by a dense matrix-vector multiplication.
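Concretely, the parenthesized version first forms the dense vector -J'*fcur with one sparse matrix-vector product, and then performs a single solve with a vector right-hand side. A sketch in current Julia syntax (note: sum(J.^2, 1) is now sum(J.^2, dims=1) and diagm takes a 0 => v pair; the question's 0.3-era code differs only in spelling):

```julia
using LinearAlgebra, SparseArrays

n = 5
J = sparse(rand(n, n))
fcur = rand(n)

# The question's DtD, written in current syntax.
DtD = diagm(0 => [max(x, 1e-6) for x in vec(sum(J.^2, dims=1))])

A = J' * J + sqrt(100) * DtD   # sparse + dense promotes to a dense matrix
rhs = -(J' * fcur)             # sparse matrix-vector product -> dense Vector
x = A \ rhs                    # one solve with a vector right-hand side

println(A * x ≈ rhs)           # prints: true
```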