machine-learning gpu caffe nvidia azure-dsvm

Caffe-powered, GPU-enabled Microsoft Azure VM


I'm trying to build a VM for model training in Azure. I found the Data Science Virtual Machine for Linux (Ubuntu), which seems to be a suitable candidate.

Unfortunately, after I spun up the VM and installed the Caffe prerequisites, I wasn't able to run the tests. I get the following error on make runtest (make all and make test completed without errors):

NVIDIA: no NVIDIA devices found
Cuda number of devices: 0
Setting to use device 0
Current device id: 0
Current device name: 
Note: Randomizing tests' orders with a seed of 97204 .
[==========] Running 2041 tests from 267 test cases.
[----------] Global test environment set-up.
[----------] 11 tests from AdaDeltaSolverTest/3, where TypeParam = caffe::GPUDevice<double>
[ RUN      ] AdaDeltaSolverTest/3.TestAdaDeltaLeastSquaresUpdateWithHalfMomentum
NVIDIA: no NVIDIA devices found
E0715 02:24:32.097311 59355 common.cpp:114] Cannot create Cublas handle. Cublas won't be available.
NVIDIA: no NVIDIA devices found
E0715 02:24:32.103780 59355 common.cpp:121] Cannot create Curand generator. Curand won't be available.
F0715 02:24:32.103914 59355 test_gradient_based_solver.cpp:80] Check failed: error == cudaSuccess (30 vs. 0)  unknown error
*** Check failure stack trace: ***
    @     0x7f77a463f5cd  google::LogMessage::Fail()
    @     0x7f77a4641433  google::LogMessage::SendToLog()
    @     0x7f77a463f15b  google::LogMessage::Flush()
    @     0x7f77a4641e1e  google::LogMessageFatal::~LogMessageFatal()
    @           0x7115e3  caffe::GradientBasedSolverTest<>::TestLeastSquaresUpdate()
    @           0x7122af  caffe::AdaDeltaSolverTest_TestAdaDeltaLeastSquaresUpdateWithHalfMomentum_Test<>::TestBody()
    @           0x8e6023  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @           0x8df63a  testing::Test::Run()
    @           0x8df788  testing::TestInfo::Run()
    @           0x8df865  testing::TestCase::Run()
    @           0x8e0b3f  testing::internal::UnitTestImpl::RunAllTests()
    @           0x8e0e63  testing::UnitTest::Run()
    @           0x466ecd  main
    @     0x7f77a111c830  __libc_start_main
    @           0x46e589  _start
    @              (nil)  (unknown)
Makefile:532: recipe for target 'runtest' failed
make: *** [runtest] Aborted (core dumped)

Is it possible to spin up a virtual machine in Azure suitable for GPU-enabled machine learning using Caffe?

All the details about the VM are here: [image]


Solution

  • The Data Science Virtual Machine (DSVM) for Ubuntu already has Caffe installed in /opt/caffe. To use it on a GPU, create a VM with a K80 GPU by choosing one of the NC sizes. (Be sure to choose HDD as the storage type, or the NC sizes will not appear.) Caffe will then be available out of the box.

    Also note that PyCaffe is available. At a terminal:

    source activate root
    

    Python will then have PyCaffe available.
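
As a quick sanity check after creating the VM, something like the following sketch could confirm that a GPU is actually visible and that PyCaffe can be found before running the tests. It assumes Python 3 and that nvidia-smi is on the PATH when the NVIDIA driver is installed; the helper names gpu_visible and pycaffe_available are made up for illustration, and device index 0 is an assumption.

```python
import importlib.util
import shutil
import subprocess


def gpu_visible():
    """Return True if nvidia-smi is on PATH and lists at least one GPU."""
    smi = shutil.which("nvidia-smi")
    if smi is None:
        return False
    # "nvidia-smi -L" prints one line per detected GPU, each starting with "GPU ".
    result = subprocess.run(
        [smi, "-L"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.returncode == 0 and any(
        line.startswith("GPU ") for line in result.stdout.splitlines()
    )


def pycaffe_available():
    """Return True if the caffe module can be located (without importing it)."""
    return importlib.util.find_spec("caffe") is not None


if __name__ == "__main__":
    has_gpu = gpu_visible()
    has_caffe = pycaffe_available()
    print("GPU visible:", has_gpu)
    print("PyCaffe importable:", has_caffe)
    if has_gpu and has_caffe:
        import caffe

        caffe.set_device(0)   # first GPU; index 0 is an assumption
        caffe.set_mode_gpu()  # switch Caffe from CPU to GPU mode
        print("Caffe is set to GPU mode on device 0")
```

If the first line prints False, the VM most likely is not one of the GPU (NC) sizes, which would match the "no NVIDIA devices found" message in the question.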