QR decomposition successful but has high errors

I have two implementations of QR decomposition in a Fortran code. One is a custom QR decomposition function:

  ! QR decomposition solving for vector B in AX = B
  subroutine qr_eqsystem(A, B, X, M, N, q, r, error)
  
  implicit none
  !CALL qr_eqsystem(A, C(:,i,j), B, GNDINT, NEQ, IERR)
  integer, intent(in) :: m, n
  real, intent(in) :: A(m, n), B(m)
  
  real, intent(out) :: x(n),q(m, n),r(n, n)
  real, intent(out) :: error
  
  !locals
  real :: aux
  integer :: i, j, k
  
  do i = 1,n
    do k = 1,m
      q(k, i) = A(k, i)
    enddo
  enddo
  
  do i = 1,n
    do j = 1,n
      r(j, i) = 0.0
    enddo
  enddo
  
  do i = 1,n
    do j = 1,i-1
      r(j, i) = 0.0
      do k = 1,m
        r(j, i) = r(j, i) + q(k, j)*A(k, i)
      enddo
      
      do k = 1,m
        q(k, i) = q(k, i) - r(j, i)*q(k,j)
      enddo
    enddo
    
    r(i, i) = 0.0
    do k = 1,m
      r(i, i) = r(i, i) + q(k, i)*q(k, i)
    enddo
    r(i, i) = sqrt(r(i, i))
    
    if(r(i,i).ne.0.0)then
      do k = 1,m
        q(k, i) = q(k, i) / r(i, i)
      enddo
    endif
    
  enddo
  
  do i = 1,n
    x(i) = 0.0
    do j = 1,m
      x(i) = x(i) + q(j, i)*B(j)
    enddo
  enddo
  
  if(r(n, n).ne.0.0)then
    x(n) = x(n) / r(n,n)
  endif
  
  do i = n-1,1,-1
    do j = i+1,n
      x(i) = x(i) - r(i, j) * x(j)
    enddo
    
    if(r(i, i).ne.0.0)then
       x(i) = x(i) / r(i, i)
    endif
  enddo
  
  error = 0
  
  ! calculate AX and compare with B as an error, report total raw error
  error = sum(abs(matmul(A,X)-B))
  endsubroutine qr_eqsystem

The second one uses Intel Fortran's built-in LAPACK with dgeqrf, dormqr, and dtrtrs:

  q = A
  call dgeqrf(totPoints,ncol,q,totPoints,tau,work,1280,info)
  Call dormqr('L','T',totPoints,1,ncol,q,totPoints,tau,C,
#               totPoints,work,1280,info)
  x = C
  Call dtrtrs('U','N','N',ncol,1,q,totPoints,x,totPoints,info)
  error = sum(abs(matmul(A,x(1:ncol))-C))

The above codes solve for x in the system of equations Ax = B, where the size of A is 125x21, B is 125x1, and x is 21x1. During the simulation, the matrix A is relatively constant, as it is constructed from spatial coordinates, but vector B evolves over time, where the magnitude of the elements in B and the difference between the largest and smallest elements grow larger over time. Both codes are implemented in a Fortran code for debugging, where the custom implementation is the main solver, and when the error exceeds a high value, such as 10.0, the code calls the Intel subroutine to check.

The values of error in both codes return very low values (<1e-5) at the start of the simulation, but error increases to high magnitudes as the simulation continues and doesn't seem to have a saturated value, and I don't understand why it happens. I checked the decomposed Q and R matrices: Q^T*Q returns an identity matrix, and Q*R gives back A, within reasonable accuracy. From what I understand of the QR decomposition approach, since A is successfully decomposed, x should always be solvable, since it just involves a simple Q^T*B calculation, and the back substitution of Rx = Q^T*B to solve x should always be possible since all values are real and my variables are sufficiently large (8-byte with flag real-size:64) to not cause significant floating point errors.

Can anyone give me some more insight on why the QR decomposition of Ax = B might give a large error when evaluating A*xp - B, where xp is the solved x from the process?

Additionally, I checked with MATLAB's x = A\B. Interestingly, both the Intel Fortran dgeqrf and MATLAB's solutions give higher errors than the first custom implementation, with MATLAB's process giving the largest error, despite the decomposed Q and R matrices from all 3 approaches are virtually identical to each other.

Code for the A matrix. This is written in fixed form Fortran with extension .for. gridCoords is simply generated by mapping a 5x5x5 grid of points in range [-1 1] to a range corresponding to an object with size on the magnitude of 1e-5, such as [-2.5e-5 2.5e-5] for an object with 50 micron in diameter.

  subroutine defineA(ix,iy,iz,A,ipt)
  
  implicit none
  
  integer, intent(in) :: ix,iy,iz,ipt
  real, intent(out) :: A(totPoints,ncol)
  
  if (iInterp.ge.1) then
    A(ipt,1) = 1
    A(ipt,2) = gridCoords(1,ix)
    A(ipt,3) = gridCoords(2,iy)
    A(ipt,4) = gridCoords(3,iz)
    A(ipt,5) = gridCoords(1,ix)*gridCoords(2,iy)
    A(ipt,6) = gridCoords(1,ix)*gridCoords(3,iz)
    A(ipt,7) = gridCoords(2,iy)*gridCoords(3,iz)
    A(ipt,8) = gridCoords(1,ix)*gridCoords(2,iy)*gridCoords(3,iz)
    if (iInterp.ge.2) then
      A(ipt,9) = gridCoords(1,ix)*gridCoords(1,ix)
      A(ipt,10) = gridCoords(2,iy)*gridCoords(2,iy)
      A(ipt,11) = gridCoords(3,iz)*gridCoords(3,iz)
      A(ipt,12) = gridCoords(1,ix)*gridCoords(1,ix)*gridCoords(2,iy)
      A(ipt,13) = gridCoords(1,ix)*gridCoords(1,ix)*gridCoords(3,iz)
      A(ipt,14) = gridCoords(1,ix)*gridCoords(2,iy)*gridCoords(2,iy)
      A(ipt,15) = gridCoords(2,iy)*gridCoords(2,iy)*gridCoords(3,iz)
      A(ipt,16) = gridCoords(1,ix)*gridCoords(3,iz)*gridCoords(3,iz)
      A(ipt,17) = gridCoords(2,iy)*gridCoords(3,iz)*gridCoords(3,iz)
      A(ipt,18) = gridCoords(1,ix)*gridCoords(1,ix)
 #                        *gridCoords(2,iy)*gridCoords(2,iy)
      A(ipt,19) = gridCoords(1,ix)*gridCoords(1,ix)
 #                        *gridCoords(3,iz)*gridCoords(3,iz)
      A(ipt,20) = gridCoords(2,iy)*gridCoords(2,iy)
 #                        *gridCoords(3,iz)*gridCoords(3,iz)
      A(ipt,21) = gridCoords(1,ix)*gridCoords(1,ix)
 #                        *gridCoords(2,iy)*gridCoords(3,iz)
      A(ipt,22) = gridCoords(1,ix)*gridCoords(2,iy)
 #                        *gridCoords(2,iy)*gridCoords(3,iz)
      A(ipt,23) = gridCoords(1,ix)*gridCoords(2,iy)
 #                        *gridCoords(3,iz)*gridCoords(3,iz)
      A(ipt,24) = gridCoords(1,ix)*gridCoords(1,ix)
 #                        *gridCoords(2,iy)*gridCoords(2,iy)
 #                        *gridCoords(3,iz)
      A(ipt,25) = gridCoords(1,ix)*gridCoords(1,ix)
 #                        *gridCoords(2,iy)*gridCoords(3,iz)
 #                        *gridCoords(3,iz)
      A(ipt,26) = gridCoords(1,ix)*gridCoords(2,iy)
 #                        *gridCoords(2,iy)*gridCoords(3,iz)
 #                        *gridCoords(3,iz)
      A(ipt,27) = gridCoords(1,ix)*gridCoords(1,ix)
 #                        *gridCoords(2,iy)*gridCoords(2,iy)
 #                        *gridCoords(3,iz)*gridCoords(3,iz)
      if (iInterp.ge.3) then
        A(ipt,28)=gridCoords(1,ix)*gridCoords(1,ix)*gridCoords(1,ix)
        A(ipt,29)=gridCoords(2,iy)*gridCoords(2,iy)*gridCoords(2,iy)
        A(ipt,30)=gridCoords(3,iz)*gridCoords(3,iz)*gridCoords(3,iz)
        
        A(ipt,31)=gridCoords(1,ix)*gridCoords(1,ix)*gridCoords(1,ix)
 #               *gridCoords(2,iy)
        A(ipt,32)=gridCoords(1,ix)*gridCoords(1,ix)*gridCoords(1,ix)
 #               *gridCoords(3,iz)
        A(ipt,33)=gridCoords(1,ix)
 #               *gridCoords(2,iy)*gridCoords(2,iy)*gridCoords(2,iy)
        
        A(ipt,34)=gridCoords(2,iy)*gridCoords(2,iy)*gridCoords(2,iy)
 #               *gridCoords(3,iz)
        A(ipt,35)=gridCoords(1,ix)
 #               *gridCoords(3,iz)*gridCoords(3,iz)*gridCoords(3,iz)
        A(ipt,36)=gridCoords(2,iy)
 #               *gridCoords(3,iz)*gridCoords(3,iz)*gridCoords(3,iz)
        
        A(ipt,37)=gridCoords(1,ix)*gridCoords(1,ix)*gridCoords(1,ix)
 #               *gridCoords(2,iy)
 #               *gridCoords(3,iz)
        A(ipt,38)=gridCoords(1,ix)
 #               *gridCoords(2,iy)*gridCoords(2,iy)*gridCoords(2,iy)
 #               *gridCoords(3,iz)
        A(ipt,39)=gridCoords(1,ix)
 #               *gridCoords(2,iy)
 #               *gridCoords(3,iz)*gridCoords(3,iz)*gridCoords(3,iz)
        
        A(ipt,40)=gridCoords(1,ix)*gridCoords(1,ix)*gridCoords(1,ix)
 #               *gridCoords(2,iy)*gridCoords(2,iy)
        A(ipt,41)=gridCoords(1,ix)*gridCoords(1,ix)*gridCoords(1,ix)
 #               *gridCoords(3,iz)*gridCoords(3,iz)
        A(ipt,42)=gridCoords(1,ix)*gridCoords(1,ix)
 #               *gridCoords(2,iy)*gridCoords(2,iy)*gridCoords(2,iy)
        
        A(ipt,43)=gridCoords(2,iy)*gridCoords(2,iy)*gridCoords(2,iy)
 #               *gridCoords(3,iz)*gridCoords(3,iz)
        A(ipt,44)=gridCoords(1,ix)*gridCoords(1,ix)
 #               *gridCoords(3,iz)*gridCoords(3,iz)*gridCoords(3,iz)
        A(ipt,45)=gridCoords(2,iy)*gridCoords(2,iy)
 #               *gridCoords(3,iz)*gridCoords(3,iz)*gridCoords(3,iz)
        
        A(ipt,46)=gridCoords(1,ix)*gridCoords(1,ix)*gridCoords(1,ix)
 #               *gridCoords(2,iy)*gridCoords(2,iy)
 #               *gridCoords(3,iz)
        A(ipt,47)=gridCoords(1,ix)*gridCoords(1,ix)*gridCoords(1,ix)
 #               *gridCoords(2,iy)
 #               *gridCoords(3,iz)*gridCoords(3,iz)
        A(ipt,48)=gridCoords(1,ix)
 #               *gridCoords(2,iy)*gridCoords(2,iy)*gridCoords(2,iy)
 #               *gridCoords(3,iz)*gridCoords(3,iz)
        
        A(ipt,49)=gridCoords(1,ix)*gridCoords(1,ix)
 #               *gridCoords(2,iy)*gridCoords(2,iy)*gridCoords(2,iy)
 #               *gridCoords(3,iz)
        A(ipt,50)=gridCoords(1,ix)*gridCoords(1,ix)
 #               *gridCoords(2,iy)
 #               *gridCoords(3,iz)*gridCoords(3,iz)*gridCoords(3,iz)
        A(ipt,51)=gridCoords(1,ix)
 #               *gridCoords(2,iy)*gridCoords(2,iy)
 #               *gridCoords(3,iz)*gridCoords(3,iz)*gridCoords(3,iz)
        
        A(ipt,52)=gridCoords(1,ix)*gridCoords(1,ix)*gridCoords(1,ix)
 #               *gridCoords(2,iy)*gridCoords(2,iy)
 #               *gridCoords(3,iz)*gridCoords(3,iz)
        A(ipt,53)=gridCoords(1,ix)*gridCoords(1,ix)
 #               *gridCoords(2,iy)*gridCoords(2,iy)*gridCoords(2,iy)
 #               *gridCoords(3,iz)*gridCoords(3,iz)
        A(ipt,54)=gridCoords(1,ix)*gridCoords(1,ix)
 #               *gridCoords(2,iy)*gridCoords(2,iy)
 #               *gridCoords(3,iz)*gridCoords(3,iz)*gridCoords(3,iz)
        
        A(ipt,55)=gridCoords(1,ix)*gridCoords(1,ix)*gridCoords(1,ix)
 #               *gridCoords(2,iy)*gridCoords(2,iy)*gridCoords(2,iy)
        A(ipt,56)=gridCoords(1,ix)*gridCoords(1,ix)*gridCoords(1,ix)
 #               *gridCoords(3,iz)*gridCoords(3,iz)*gridCoords(3,iz)
        A(ipt,57)=gridCoords(2,iy)*gridCoords(2,iy)*gridCoords(2,iy)
 #               *gridCoords(3,iz)*gridCoords(3,iz)*gridCoords(3,iz)
        
        A(ipt,58)=gridCoords(1,ix)*gridCoords(1,ix)*gridCoords(1,ix)
 #               *gridCoords(2,iy)*gridCoords(2,iy)*gridCoords(2,iy)
 #               *gridCoords(3,iz)
        A(ipt,59)=gridCoords(1,ix)*gridCoords(1,ix)*gridCoords(1,ix)
 #               *gridCoords(2,iy)
 #               *gridCoords(3,iz)*gridCoords(3,iz)*gridCoords(3,iz)
        A(ipt,60)=gridCoords(1,ix)
 #               *gridCoords(2,iy)*gridCoords(2,iy)*gridCoords(2,iy)
 #               *gridCoords(3,iz)*gridCoords(3,iz)*gridCoords(3,iz)
        
        A(ipt,61)=gridCoords(1,ix)*gridCoords(1,ix)*gridCoords(1,ix)
 #               *gridCoords(2,iy)*gridCoords(2,iy)*gridCoords(2,iy)
 #               *gridCoords(3,iz)*gridCoords(3,iz)
        A(ipt,62)=gridCoords(1,ix)*gridCoords(1,ix)*gridCoords(1,ix)
 #               *gridCoords(2,iy)*gridCoords(2,iy)
 #               *gridCoords(3,iz)*gridCoords(3,iz)*gridCoords(3,iz)
        A(ipt,63)=gridCoords(1,ix)*gridCoords(1,ix)
 #               *gridCoords(2,iy)*gridCoords(2,iy)*gridCoords(2,iy)
 #               *gridCoords(3,iz)*gridCoords(3,iz)*gridCoords(3,iz)
        
        A(ipt,64)=gridCoords(1,ix)*gridCoords(1,ix)*gridCoords(1,ix)
 #               *gridCoords(2,iy)*gridCoords(2,iy)*gridCoords(2,iy)
 #               *gridCoords(3,iz)*gridCoords(3,iz)*gridCoords(3,iz)
      endif
    endif
  endif
  
  endsubroutine defineA

Solution

I have two implementations of QR decomposition... The above codes solve for x in the system of equations Ax = B, where the size of A is 125x21, B is 125x1, and x is 21x1...

Note that the number of equations in your system exceeds the number of variables, so there is no exact solution. And the solution of your Ax = B system, the resulting solution of the Rx = Q^T*B system, will be such an x for which the minimum functional (B - Ax)*(B - Ax) is achieved. In other words, it is a solution to the least squares problem.

The values of error in both codes return very low values (<1e-5) at the start of the simulation, but error increases to high magnitudes as the simulation continues and doesn't seem to have a saturated value, and I don't understand why it happens.

In the previous question, you wrote: "The QR decomposition is used to solve the coefficients used in the quadratic interpolation, adapted from trilinear...". I don't really understand how you have a function with 21 coefficients: f(x,y,z) = a_0 + a_1x + a_2y + a_3*z+ ....

But, it seems quite likely that during the simulation process, the objective function tends to a form that cannot be well represented by the function f(x,y,z) with 21 coefficients. Therefore, the solution of your system starts to have a big residual.

Additionally, I checked with MATLAB's x = A\B. Interestingly, both the Intel Fortran dgeqrf and MATLAB's solutions give higher errors than the first custom implementation, with MATLAB's process giving the largest error, despite the decomposed Q and R matrices from all 3 approaches are virtually identical to each other.

Since you are comparing error = sum(abs(matmul(A,X)-B)), but qr_eqsystem(), Intel Fortran dgeqrf and MATLAB, minimize error2 = DOT_PRODUCT(matmul(A,X)-B, matmul(A,X)-B), this comparison says nothing about the quality of these three solutions. It is quite possible that a more accurate solution to the least squares problem (in the L2 norm) has a worse residual in the L1 norm (the sum of the absolute values of deviations).

If there is no physical reason to use the L1 norm, it is better to use the L2 norm.

During the simulation, the matrix A is relatively constant, as it is constructed from spatial coordinates...

If A is a constant, then cond(A) = cond(R) ∼ max(abs(diagonal(R)))/min(abs(diagonal(R))) it's also a constant. Accordingly, if the problem is solved well at the start of the simulation, then the reason is most likely not the accuracy of the matrix, its decomposition or its solution.

Summary, since you write "relatively constant", it is still worth looking at the change in cond(A) during simulation.

If cond(A) > 10¹⁰, at the beginning or during the simulation, then there may be a problem in the accuracy of the solution and/or perhaps a coordinate problem of the function f(x,y,z);
If cond(A) is always < 10⁶, then the problem is definitely that the shape of your f(x,y,z) does not match the model. Then it needs to be replaced with something else, maybe with multidimensional B-splines, maybe with something more well-suited to the physics of the problem.