In Haskell, ridge regression can be expressed as:
import Numeric.LinearAlgebra
createReadout :: Matrix Double → Matrix Double → Matrix Double
createReadout a b = oA <\> oB
where
μ = 1e-4
oA = (a <> (tr a)) + (μ * (ident $ rows a))
oB = a <> (tr b)
However, this operation is very memory expensive. Here is a minimalistic example that requires more than 2GB on my machine and takes 3 minutes to execute.
import Numeric.LinearAlgebra
import System.Random
createReadout :: Matrix Double -> Matrix Double -> Matrix Double
createReadout a b = oA <\> oB
where
mu = 1e-4
oA = (a <> (tr a)) + (mu * (ident $ rows a))
oB = a <> (tr b)
teacher :: [Int] -> Int -> Int -> Matrix Double
teacher labelsList cols' correctRow = fromBlocks $ f <$> labelsList
where ones = konst 1.0 (1, cols')
zeros = konst 0.0 (1, cols')
rows' = length labelsList
f i | i == correctRow = [ones]
| otherwise = [zeros]
glue :: Element t => [Matrix t] -> Matrix t
glue xs = fromBlocks [xs]
main :: IO ()
main = do
let n = 1500 -- <- The constant to be increased
m = 10000
cols' = 12
g <- newStdGen
-- Stub data
let labels = take m . map (`mod` 10) . randoms $ g :: [Int]
a = (n >< (cols' * m)) $ take (cols' * m * n) $ randoms g :: Matrix Double
teachers = zipWith (teacher [0..9]) (repeat cols') labels
b = glue teachers
print $ maxElement $ createReadout a b
return ()
$ cabal exec ghc -- -O2 Test.hs
$ time ./Test
./Test 190.16s user 5.22s system 106% cpu 3:03.93 total
The problem is to increase the constant n, at least to n = 4000, while RAM is limited by 5GB. What is minimal space that matrix inversion operation requires in theory? How can this operation be optimized in terms of space? Can ridge regression be efficiently replaced with a cheaper method?
Simple Gauss-Jordan elimination only takes space to store the input and output matrices plus constant auxiliary space. If I'm reading correctly, the matrix oA
you need to invert is n
x n
so that's not a problem.
Your memory usage is completely dominated by storing the input matrix a
, which uses at least 1500 * 120000 * 8 = 1.34 GB. n = 4000
would be 4000 * 120000 * 8 = 3.58 GB which is over half of your space budget. I don't know what matrix library you are using or how it stores its matrices, but if they are on the Haskell heap then GC effects could easily account for another factor of 2 in space usage.