We have some code along these lines:
aiocb* aiocbptr = new aiocb;
// populate aiocbptr with info for the write
aio_write( aiocbptr );
// Then do this periodically:
if (aio_error( aiocbptr ) == 0) {
    delete aiocbptr;
}
aio_error is meant to return 0 when the write is completed, and hence we assume that we can call delete on aiocbptr at this point.
This mostly seems to work OK, but we recently started experiencing random crashes. The evidence points to the data pointed to by aiocbptr being modified after the call to delete.
Is there any issue using aio_error to poll for aio_write completion like this? Is there a guarantee that the aiocb will not be modified after aio_error has returned 0?
This change seems to indicate that something may have since been fixed in aio_error. We are running on x86 RHEL7 Linux with glibc 2.17, which predates this fix.
We tried calling aio_suspend in addition to aio_error: once aio_error has returned 0, we call aio_suspend, which is meant to wait for the operation to complete. Since the operation should already have completed by then, aio_suspend should do nothing. However, this seemed to fix the crashes.
Yes, my commit fixed a missing memory barrier. Calling e.g. aio_suspend triggers the memory barrier and thus works around the problem too.