I'm using LibSVM in a project that I am trying to parallelize with CUDA. The problem is that before training and prediction I store the relevant data in a struct defined as
struct svm_node
{
    int index;
    double value;
};
and allocated, for example, in this way:
struct svm_node** testnode;
testnode = (struct svm_node**)malloc(sz[0] * sz[1] * sizeof(struct svm_node*));
for (i = 0; i < sz[0] * sz[1]; i++) {
    testnode[i] = (struct svm_node*)malloc((no_classes * tnum + 2) * sizeof(struct svm_node));
}
So, in practice, I have a matrix that I access, for instance, in this way:
testnode[0][0].index;
testnode[0][0].value;
Now, index and value are computed with CUDA and stored in two contiguous vectors (linearized matrices). Is there any way to directly bind each vector's pointer to the testnode struct in order to "transfer" the data without using any for loop?
Is there any way to directly bind each vector's pointer to the testnode struct in order to "transfer" the data without using any for loop?
No, there is not. When allocations are made using multiple calls to a host memory allocator such as malloc or new (thus creating multiple host pointers), you cannot transfer all of the referenced data to the device using a single cudaMemcpyXXX operation. It will require one cudaMemcpy call per individually created host pointer.
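One way to reduce the number of host pointers is to allocate all the nodes as a single contiguous block and keep a separate table of row pointers into it, so that the testnode[i][j] syntax keeps working. The following is a minimal sketch of that idea in plain C, not the method used in the question. The helper names alloc_nodes_contiguous, fill_nodes, and free_nodes_contiguous are hypothetical, and the sketch assumes every row has the same length. Because index and value live in two separate vectors, something still has to interleave them into svm_node elements; here that is a host loop, but it could equally be a kernel on the device.

```c
#include <assert.h>
#include <stdlib.h>

struct svm_node
{
    int index;
    double value;
};

/* Hypothetical helper: allocate rows x cols nodes as ONE contiguous block,
   plus a table of row pointers so table[i][j] still works. */
struct svm_node **alloc_nodes_contiguous(size_t rows, size_t cols)
{
    assert(rows > 0 && cols > 0);

    struct svm_node **table = (struct svm_node **)malloc(rows * sizeof(struct svm_node *));
    struct svm_node *block = (struct svm_node *)malloc(rows * cols * sizeof(struct svm_node));
    if (table == NULL || block == NULL) {
        free(table);
        free(block);
        return NULL;
    }

    /* Row i starts at offset i * cols inside the single block. */
    for (size_t i = 0; i < rows; i++) {
        table[i] = block + i * cols;
    }
    return table;
}

/* Hypothetical helper: interleave two linearized vectors (row-major,
   rows * cols elements each) into the contiguous node block. */
void fill_nodes(struct svm_node **table, size_t rows, size_t cols,
                const int *index, const double *value)
{
    struct svm_node *block = table[0]; /* table[0] is the start of the block */
    for (size_t k = 0; k < rows * cols; k++) {
        block[k].index = index[k];
        block[k].value = value[k];
    }
}

/* Hypothetical helper: release the block and the row-pointer table. */
void free_nodes_contiguous(struct svm_node **table)
{
    if (table != NULL) {
        free(table[0]);
        free(table);
    }
}
```

With this layout, testnode[0] addresses the whole block. If the nodes were already packed into svm_node form on the device, a single cudaMemcpy of rows * cols * sizeof(struct svm_node) bytes into testnode[0] could replace the per-row copies. That assumes the device-side struct layout matches the host's.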