I've been working with QEMU/KVM for a while and now need to pass through PCI devices. I did everything required to make this work: enabled the IOMMU, loaded the vfio module with modprobe, bound the device to vfio, checked that the vfio group was indeed created, and so on. But when I start QEMU with any PCI device I get this error message:
vfio: Failed to read device config space
I dug into QEMU's code to see what the issue might be and found that the failure occurs on a pread to the device. This happens even when the offset is 0, while a normal read on the same file descriptor works without problems (I changed the code to test this). Checking errno after the failed pread gives 'Illegal seek' (ESPIPE).
I wrote some code to check whether this also happens outside of QEMU (I thought something in QEMU's code might be interfering with the device), and I hit the same issue. Reading a normal file with pread works perfectly. Here is the test code; I broke it up a bit to point out the more relevant parts:
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <linux/vfio.h>

#define BUF_SIZE 4096

int main(){
    char buf[BUF_SIZE], buf1[BUF_SIZE], buf2[BUF_SIZE];
    int ret,group_fd, fd, fd2;
    size_t nbytes = 4096;
    ssize_t bytes_read;
    int iommu1, iommu2;
    int container, group, device, i;
    struct vfio_group_status group_status = { .argsz = sizeof(group_status) };
    struct vfio_iommu_type1_info iommu_info = { .argsz = sizeof(iommu_info) };
    struct vfio_iommu_type1_dma_map dma_map = { .argsz = sizeof(dma_map) };
    struct vfio_device_info device_info = { .argsz = sizeof(device_info) };

    /* Open the container and check the VFIO API version */
    container = open("/dev/vfio/vfio",O_RDWR);
    if(ioctl(container,VFIO_GET_API_VERSION)!=VFIO_API_VERSION){
        printf("Unknown api version: %m\n");
    }

    /* Open the group and make sure it is viable */
    group_fd = open("/dev/vfio/22",O_RDWR);
    printf("Group fd = %d\n", group_fd);
    ioctl(group_fd, VFIO_GROUP_GET_STATUS, &group_status);
    if (!(group_status.flags & VFIO_GROUP_FLAGS_VIABLE)){
        printf("Group not viable\n");
        return 1;
    }

    /* Attach the group to the container and enable the type1 IOMMU model */
    ret = ioctl(group_fd, VFIO_GROUP_SET_CONTAINER,&container);
    ret = ioctl(container,VFIO_SET_IOMMU,VFIO_TYPE1_IOMMU);
    ioctl(container, VFIO_IOMMU_GET_INFO, &iommu_info);

    /* Allocate some space and setup a DMA mapping */
    dma_map.vaddr = (unsigned long int) mmap(0, 1024 * 1024, PROT_READ | PROT_WRITE,MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
    dma_map.size = 1024 * 1024;
    dma_map.iova = 0; /* 1MB starting at 0x0 from device view */
    dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
    ioctl(container, VFIO_IOMMU_MAP_DMA, &dma_map);

    /* Get a file descriptor for the device */
    printf("\n\nGETTING DEVICE FD\n");
    fd = ioctl(group_fd,VFIO_GROUP_GET_DEVICE_FD,"0000:08:00.0");
    printf("Fd = %d\n",fd);
    printf("VFIO_GROUP_GET_DEV_ID = %lu\n",(unsigned long)VFIO_GROUP_GET_DEVICE_FD);
This read works fine and returns nbytes:
    ret = read(fd,buf,nbytes);
    if(ret<1){
        printf("ERROR: %m \n");
    }
This pread fails with return code -1 and errno 'Illegal seek':
    ret = pread(fd,buf,nbytes,0);
    if(ret<0){
        printf("ERROR: %m \n");
    }
Here I try read and pread on an ordinary file in sysfs to see whether pread fails there; both read and pread work just fine in this case:
    printf("TESTING PREAD ON A COMMON FILE\n");
    fd2 = open("/sys/bus/pci/devices/0000:08:00.0/device",O_RDONLY);
    ret = read(fd2,buf1,nbytes);
    if(ret<0){
        printf("ERROR: %m\n");
    }
    printf("Result from read: ret = %d, content = %s\n",ret,buf1);
    ret = pread(fd2,buf2,nbytes,2);
    if(ret<0){
        printf("ERROR: %m\n");
    }
    printf("Result from pread: ret = %d, content = %s\n",ret,buf2);
    close(fd2);
    getchar();
    close(fd);
    close(container);
    close(group_fd);
    return 0;
}
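For comparison, here is a minimal sketch of a config-space read done the way the VFIO API intends: ask the device for the config region with VFIO_DEVICE_GET_REGION_INFO, then pread at the offset the kernel reports instead of a hard-coded 0. The helper name read_config_space is made up for illustration, and this is not QEMU's actual code. The reported offset is a 64-bit value, so the sketch assumes off_t is 64 bits wide (hence the _FILE_OFFSET_BITS define).

```c
#define _FILE_OFFSET_BITS 64   /* must come before any #include */

#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/vfio.h>

/*
 * Hypothetical helper: read up to `len` bytes from the start of PCI config
 * space through a VFIO device fd. Returns the number of bytes read, or -1
 * on error (errno is set by the failing ioctl or pread).
 */
ssize_t read_config_space(int device_fd, void *buf, size_t len)
{
    struct vfio_region_info reg = {
        .argsz = sizeof(reg),
        .index = VFIO_PCI_CONFIG_REGION_INDEX,
    };

    /* Ask the kernel where the config region lives within the device fd */
    if (ioctl(device_fd, VFIO_DEVICE_GET_REGION_INFO, &reg) < 0)
        return -1;

    /* Never ask for more than the region holds */
    if (len > reg.size)
        len = (size_t)reg.size;

    /* The region offset is 64-bit; pread must not truncate it */
    return pread(device_fd, buf, len, (off_t)reg.offset);
}
```

In the test program above this would be called as read_config_space(fd, buf, 4) after the VFIO_GROUP_GET_DEVICE_FD ioctl; printing reg.offset before the pread shows which offset is actually being requested.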
I'm using a generic Linux kernel v4.7.8 with a uClibc-based userland for an embedded system. Does anyone have any idea why this might be happening? I'm clueless right now!
UPDATE: I installed Ubuntu 16.04 (kernel v4.4.0) on the same machine and repeated the steps: PCI passthrough works fine, and the pread in my test code also works perfectly. So I'm not sure what is going wrong with the custom generic kernel.
Following arash's suggestion, I tried pread(fd,buf,nbytes,SEEK_CUR) and got the same 'Illegal seek' error. The offset I get from ftell is 0xffffffff on both Ubuntu and the generic kernel.
I found the issue and have been meaning to post it here for a while, for anyone who might hit this wall. It turns out that the pread and pwrite functions in uClibc 0.9.33 are broken: they fail on offsets larger than 4 GiB. The patches from the link below fixed the problem for me: http://uclibc.10924.n7.nabble.com/backport-pread-pwrite-fix-for-0-9-33-branch-td11921.html
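A quick way to check whether a given libc is affected, without needing a VFIO device, is to test pread against a sparse regular file at an offset just above 4 GiB. The following is only a sketch: the function name check_pread_large_offset, the /tmp path, and the marker strings are my own choices, and it assumes the filesystem under /tmp supports sparse files larger than 4 GiB. It places the data with lseek + write rather than pwrite, since pwrite is affected by the same bug, and it puts a different marker at the 32-bit-truncated offset so a silently truncated pread is detected as well.

```c
#define _FILE_OFFSET_BITS 64   /* must come before any #include */

#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

_Static_assert(sizeof(off_t) == 8, "this test needs a 64-bit off_t");

/*
 * Returns 0 if pread() honours an offset above 4 GiB,
 *         1 if pread() fails or returns the wrong data,
 *        -1 if the test file could not be set up.
 */
int check_pread_large_offset(void)
{
    static const char low[]  = "low-mark";   /* lives at the truncated offset */
    static const char high[] = "high-mrk";   /* lives above 4 GiB */
    const off_t low_off  = 4096;
    const off_t high_off = ((off_t)1 << 32) + 4096;
    char path[] = "/tmp/pread-test-XXXXXX";
    char buf[sizeof(high)];
    int result = -1;

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);   /* file disappears once fd is closed */

    /* Place both markers using lseek + write only */
    if (lseek(fd, low_off, SEEK_SET) == low_off &&
        write(fd, low, sizeof(low)) == (ssize_t)sizeof(low) &&
        lseek(fd, high_off, SEEK_SET) == high_off &&
        write(fd, high, sizeof(high)) == (ssize_t)sizeof(high)) {
        /* The call under test: a positional read above 4 GiB */
        ssize_t n = pread(fd, buf, sizeof(buf), high_off);
        result = (n == (ssize_t)sizeof(buf) &&
                  memcmp(buf, high, sizeof(high)) == 0) ? 0 : 1;
    }

    close(fd);
    return result;
}
```

Calling this from main and printing the result is enough. On a working libc it should return 0; I have not verified that this exact sketch reproduces the failure on unpatched uClibc 0.9.33, but a pread that fails with 'Illegal seek' would make it return 1.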