opencljuliablasjulia-gpu

OpenCL BLAS in julia


I am really amazed by the julia language, having implemented lots of machine learning algorithms there for my current project. Even though julia 0.2 manages to get some great results out of my 2011 MBA outperforming all other solutuions on similar linux hardware (due to vecLib blas, i suppose), I would certainly like more. I am in a process of buying radeon 5870 and would like to push my matrix operations there. I use basically only simple BLAS operations such as matmul, additios and transpositions. I use julia's compact syntax A' * B + C and would certainly like to keep it.

Is there any way (or pending milestone) I can get those basic operations execute on GPU? I have like 2500x2500 single precision matrices so I expect significant speedup.


Solution

  • I don't believe that GPU integration into the core of Julia is planned at this time. One of the key issues is that there is substantial overhead moving the data to and from the GPU, making a drop-in replacement for BLAS operations infeasible.

    I expect that most of the progress in this area will actually come from the package ecosystem, in particular packages under the JuliaGPU organization. I see there is a CLBLAS package in there.