I am writing a program that does some basic object detection with CUDA. I ran into a problem where I allocate unified memory with cudaMallocManaged, do some processing with it, and then free it with cudaFree. Even though cudaFree never returns an error, the memory never seems to actually be released: Task Manager shows that both system memory usage and GPU shared memory usage are continuously increasing. Is there something fundamentally wrong with my understanding of unified memory, or is this a bug?
Minimal example:
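#include "cuda_runtime.h"
#include "device_launch_parameters.h"
#include <cassert>

int main()
{
    while (1)
    {
        void* ptr = nullptr;
        // Keep the CUDA calls outside assert() so they still run when NDEBUG is defined.
        cudaError_t err = cudaMallocManaged(&ptr, 1 << 20); // 1 MiB of managed memory
        assert(err == cudaSuccess);
        err = cudaFree(ptr);
        assert(err == cudaSuccess);
    }
}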
I'm using Windows 10 with CUDA 10.2; the driver version is 26.21.14.4122.
Either my driver installation was corrupt or this was a driver bug. I fixed it by reinstalling CUDA and then installing the latest GPU driver (the Game Ready driver from the NVIDIA website). I'm not sure why the installation was corrupt in the first place, though.
EDIT: The new driver version is 445.87.