c++openclnvidiasyclspir

SYCL exception caught: Error: [ComputeCpp:RT0101] Failed to create kernel ((Kernel Name: SYCL_class_multiply))


I cloned https://github.com/codeplaysoftware/computecpp-sdk.git and modified the computecpp-sdk/samples/accessors/accessors.cpp file.

I just added std::cout << "SYCL exception caught: " << e.get_cl_code() << '\n';.

See the fully modified code:

/***************************************************************************
 *
 *  Copyright (C) 2016 Codeplay Software Limited
 *  Licensed under the Apache License, Version 2.0 (the "License");
 *  you may not use this file except in compliance with the License.
 *  You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 *  For your convenience, a copy of the License has been included in this
 *  repository.
 *
 *  Unless required by applicable law or agreed to in writing, software
 *  distributed under the License is distributed on an "AS IS" BASIS,
 *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 *  See the License for the specific language governing permissions and
 *  limitations under the License.
 *
 *  Codeplay's ComputeCpp SDK
 *
 *  accessor.cpp
 *
 *  Description:
 *    Sample code that illustrates how to make data available on a device
 *    using accessors in SYCL.
 *
 **************************************************************************/

#include <CL/sycl.hpp>

#include <iostream>

using namespace cl::sycl;

int main() {
  /* We define the data to be passed to the device. */
  int data = 5;

  /* The scope we create here defines the lifetime of the buffer object, in SYCL
   * the lifetime of the buffer object dictates synchronization using RAII. */
  try {
    /* We can also create a queue that uses the default selector in
     * the queue's default constructor. */
    queue myQueue;

    /* We define a buffer in order to maintain data across the host and one or
     * more devices. We construct this buffer with the address of the data
     * defined above and a range specifying a single element. */
    buffer<int, 1> buf(&data, range<1>(1));

    myQueue.submit([&](handler& cgh) {
      /* We define accessors for requiring access to a buffer on the host or on
       * a device. Accessors are are like pointers to data we can use in
       * kernels to access the data. When constructing the accessor you must
       * specify the access target and mode. SYCL also provides the
       * get_access() as a buffer member function, which only requires an
       * access mode - in this case access::mode::read_write.
       * (make_access<>() has a second template argument which defaults
       * to access::mode::global) */
      auto ptr = buf.get_access<access::mode::read_write>(cgh);

      cgh.single_task<class multiply>([=]() {
        /* We use the subscript operator of the accessor constructed above to
         * read the value, multiply it by itself and then write it back to the
         * accessor again. */
        ptr[0] = ptr[0] * ptr[0];
      });
    });

    /* queue::wait() will block until kernel execution finishes,
     * successfully or otherwise. */
    myQueue.wait();

  } catch (exception const& e) {
    std::cout << "SYCL exception caught: " << e.what() << '\n';
    std::cout << "SYCL exception caught: " << e.get_cl_code() << '\n';
    return 2;
  }

  /* We check that the result is correct. */
  if (data == 25) {
    std::cout << "Hurray! 5 * 5 is " << data << '\n';
    return 0;
  } else {
    std::cout << "Oops! Something went wrong... 5 * 5 is not " << data << "!\n";
    return 1;
  }
}

After building I executed the binary and got the following error output:

$ ./accessors 
./accessors: /usr/local/cuda-8.0/lib64/libOpenCL.so.1: no version information available (required by /usr/local/computecpp/lib/libComputeCpp.so)
./accessors: /usr/local/cuda-8.0/lib64/libOpenCL.so.1: no version information available (required by /usr/local/computecpp/lib/libComputeCpp.so)
./accessors: /usr/local/cuda-8.0/lib64/libOpenCL.so.1: no version information available (required by /usr/local/computecpp/lib/libComputeCpp.so)
SYCL exception caught: Error: [ComputeCpp:RT0101] Failed to create kernel ((Kernel Name: SYCL_class_multiply))
SYCL exception caught: -45
 SYCL Runtime closed with the following errors:
SYCL objects are still alive while the runtime is shutting down

 This probably indicates that a SYCL object was created  but not properly destroyed. 
terminate called without an active exception
Aborted (core dumped)

Hardware configuration is given below:

$ /usr/local/computecpp/bin/computecpp_info  /usr/local/computecpp/bin/computecpp_info: /usr/local/cuda-8.0/lib64/libOpenCL.so.1: no version information available (required by /usr/local/computecpp/bin/computecpp_info) /usr/local/computecpp/bin/computecpp_info: /usr/local/cuda-8.0/lib64/libOpenCL.so.1: no version information available (required by /usr/local/computecpp/bin/computecpp_info)
********************************************************************************

ComputeCpp Info (CE 0.7.0)

********************************************************************************

Toolchain information:

GLIBC version: 2.19 GLIBCXX: 20150426 This version of libstdc++ is supported.

********************************************************************************


Device Info:

Discovered 3 devices matching:   platform    : <any>   device type : <any>

-------------------------------------------------------------------------------- Device 0:

  Device is supported                     : NO - Device does not support SPIR   CL_DEVICE_NAME                          : GeForce GTX 750 Ti   CL_DEVICE_VENDOR                        : NVIDIA Corporation  CL_DRIVER_VERSION                       : 384.111   CL_DEVICE_TYPE     : CL_DEVICE_TYPE_GPU 
-------------------------------------------------------------------------------- Device 1:

  Device is supported                     : UNTESTED - Device not tested on this OS   CL_DEVICE_NAME                          : Intel(R) HD Graphics   CL_DEVICE_VENDOR                        : Intel(R) Corporation   CL_DRIVER_VERSION                       : r5.0.63503   CL_DEVICE_TYPE                          : CL_DEVICE_TYPE_GPU 
-------------------------------------------------------------------------------- Device 2:

  Device is supported                     : YES - Tested internally by Codeplay Software Ltd.   CL_DEVICE_NAME                          : Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz   CL_DEVICE_VENDOR             : Intel(R) Corporation   CL_DRIVER_VERSION                       :
1.2.0.475   CL_DEVICE_TYPE                          : CL_DEVICE_TYPE_CPU 

If you encounter problems when using any of these OpenCL devices, please consult this website for known issues: https://computecpp.codeplay.com/releases/v0.7.0/platform-support-notes

********************************************************************************

Please help to understand the error and to solve the same. Let me know if any more information is needed. I would like to run this sample code on my NVidia GPU.


Solution

  • ComputeCpp, an implementation of the open standard SYCL, outputs SPIR instructions by default, the NVidia OpenCL implementation is not able to consume SPIR instructions. Instead you will need to use ComputeCpp to output PTX instructions that can be understood by the NVidia hardware.

    To do this, add the argument "-DCOMPUTECPP_BITCODE=ptx64" when making your cmake call using the sample code project from GitHub.

    The FindComputeCpp.cmake file in this project takes this flag and gives the compiler instructions to output PTX. If you would like to do this with your own project you can take the relevant section from the FindComputeCpp.cmake file.