Is there a way to get OpenCL to give me a list of all unique physical devices which have an OpenCL implementation available? I know how to iterate through the platform/device list but for instance, in my case, I have one Intel-provided platform which gives me an efficient device implementation for my CPU, and the APP platform which provides a fast implementation for my GPU but a terrible implementation for my CPU.
Is there a way to work out that the two CPU devices are in fact the same physical device, so that I can choose the most efficient one and work with that, instead of using both and having them contend with each other for compute time on the single physical device?
I have looked at CL_DEVICE_VENDOR_ID and CL_DEVICE_NAME, but they don't solve my issue: CL_DEVICE_NAME will be the same for two separate physical devices of the same model (dual GPUs), and CL_DEVICE_VENDOR_ID gives me a different ID for my CPU depending on the platform.
An ideal solution would be some sort of unique physical device ID, but I'd be happy with manually altering the OpenCL configuration to rearrange the devices myself (if such a thing is possible).
As far as I have been able to investigate the issue, there is no reliable solution. If all your work is done within a single process, you may use the order of entries returned by clGetDeviceIDs or the cl_device_id values themselves (essentially they're pointers), but things get worse if you try to share those identifiers between processes.
See this blog post about it, which says:
The issue is that if you have two identical GPUs, you can’t distinguish between them. If you call clGetDeviceIDs, the order in which they are returned is actually unspecified, so if the first process picks the first device and the second takes the second device, they both may wind up oversubscribing the same GPU and leaving the other one idle.
However, he notes that NVIDIA and AMD provide their own custom extensions, cl_nv_device_attribute_query and cl_amd_device_topology. You may check whether these extensions are supported by your device, and then use them as follows (the code by the original author, for AMD):
// This cl_ext.h (with the AMD definitions) is provided as part of the AMD APP SDK
#include <CL/cl_ext.h>
#include <iostream>

cl_device_topology_amd topology;
cl_int status = clGetDeviceInfo(devices[i], CL_DEVICE_TOPOLOGY_AMD,
                                sizeof(cl_device_topology_amd), &topology, NULL);
if (status != CL_SUCCESS) {
    // Handle error
}
// Only the PCIe topology type carries bus/device/function numbers.
if (topology.raw.type == CL_DEVICE_TOPOLOGY_TYPE_PCIE_AMD) {
    std::cout << "INFO: Topology: " << "PCI[ B#" << (int)topology.pcie.bus
              << ", D#" << (int)topology.pcie.device << ", F#"
              << (int)topology.pcie.function << " ]" << std::endl;
}
or, for NVIDIA (code by me, adapted from the post linked above):
// Define the NVIDIA query tokens only if the installed headers lack them.
#ifndef CL_DEVICE_PCI_BUS_ID_NV
#define CL_DEVICE_PCI_BUS_ID_NV 0x4008
#endif
#ifndef CL_DEVICE_PCI_SLOT_ID_NV
#define CL_DEVICE_PCI_SLOT_ID_NV 0x4009
#endif

cl_int bus_id;
cl_int slot_id;
cl_int status = clGetDeviceInfo(devices[i], CL_DEVICE_PCI_BUS_ID_NV,
                                sizeof(cl_int), &bus_id, NULL);
if (status != CL_SUCCESS) {
    // Handle the error.
}
status = clGetDeviceInfo(devices[i], CL_DEVICE_PCI_SLOT_ID_NV,
                         sizeof(cl_int), &slot_id, NULL);
if (status != CL_SUCCESS) {
    // Handle the error.
}
std::cout << "Topology = [" << bus_id << ":" << slot_id << "]" << std::endl;
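For the "check whether these extensions are supported" step mentioned earlier, here is a minimal sketch of how I would do it. It assumes you have already fetched the space-separated extension string for the device with clGetDeviceInfo and CL_DEVICE_EXTENSIONS. The helper name hasExtension is my own, not part of OpenCL, and it compares whole tokens so that a longer extension name sharing the same prefix is not matched by accident.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not part of the OpenCL API): returns true if `name`
// appears as a whole token in the space-separated extension list, i.e. the
// string a device reports for CL_DEVICE_EXTENSIONS.
bool hasExtension(const std::string& extensionList, const std::string& name) {
    std::istringstream tokens(extensionList);
    std::string token;
    while (tokens >> token) {
        if (token == name) {
            return true;
        }
    }
    return false;
}

// Sketch of the intended use with a real device (needs the OpenCL headers
// and <vector>, so it is left as a comment here):
//
//   size_t size = 0;
//   clGetDeviceInfo(devices[i], CL_DEVICE_EXTENSIONS, 0, NULL, &size);
//   std::vector<char> buffer(size);
//   clGetDeviceInfo(devices[i], CL_DEVICE_EXTENSIONS, size, buffer.data(), NULL);
//   // Passing buffer.data() builds a std::string that stops at the NUL terminator.
//   bool amdTopology = hasExtension(buffer.data(), "cl_amd_device_topology");
//   bool nvAttributes = hasExtension(buffer.data(), "cl_nv_device_attribute_query");
```

Only issue the CL_DEVICE_TOPOLOGY_AMD or CL_DEVICE_PCI_BUS_ID_NV queries when the matching check succeeds; on other devices those queries should be expected to fail with an error status.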