We have a project that involves an Nvidia GPU and an Intel Xeon Phi. The host code and the GPU code are written in Fortran and compiled with pgfortran. To offload some of the work to the Phi, we have to build a shared library with ifort (static linking does not work) and call its subroutines from the pgfortran part of the code. This way we can pass arrays from the pgfortran code to the Intel Fortran shared library, which communicates with the Xeon Phi.
Now I'm trying to pass a derived type that contains allocatable arrays from the pgfortran code to the ifort shared library, and I'm running into problems.
Here is a simple example (no Xeon Phi offload directives here):
caller.f90:
program caller
  type cell
    integer :: id
    real, allocatable :: a(:)
    real, allocatable :: b(:)
    real, allocatable :: c(:)
  end type cell
  integer :: n,i,j
  type(cell) :: cl(2)
  n=10
  do i=1,2
    allocate(cl(i)%a(n))
    allocate(cl(i)%b(n))
    allocate(cl(i)%c(n))
  end do
  do j=1, 2
    do i=1, n
      cl(j)%a(i)=10*j+i
      cl(j)%b(i)=10*i+j
    end do
  end do
  call offload(cl(1))
  print *, cl(1)%c
end program caller
called.f90:
subroutine offload(cl)
  type cell
    integer :: id
    real, allocatable :: a(:)
    real, allocatable :: b(:)
    real, allocatable :: c(:)
  end type cell
  type(cell) :: cl
  integer :: n
  print *, cl%a(1:10)
  print *, cl%b(1:10)
end subroutine offload
Makefile:
run: caller.o libcalled.so
	pgfortran -L. caller.o -lcalled -o $@
caller.o: caller.f90
	pgfortran -c caller.f90
libcalled.so: called.f90
	ifort -shared -fPIC $^ -o $@
Notice the "cl%a(1:10)" here; without the "(1:10)" nothing would be printed.
This code printed the elements of cl(1)%a and then hit a segmentation fault on the next line, where I tried to print the array cl(1)%b.
If I change "cl%a(1:10)" to "cl%a(1:100)" and delete the "print *, cl%b(1:10)" line, it gives a result of:
We can see that the elements of the b array are there, but I just cannot fetch them with "cl%b(1:10)".
I know that this may be caused by the compilers laying out the derived type differently. But I really want a way to pass this kind of derived type between compilers. Any solutions?
Thank you!
The ABIs of the compilers can differ. You should not pass the structures directly; instead, build them inside the subroutines from pointers, which you pass as type(c_ptr) or as assumed-size arrays (but a copy can happen then!).
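As a rough illustration of the assumed-size alternative, the components can be passed one by one instead of as a structure. This is only a sketch in a single file: the name offload_arrays, the bind(C) interface, and the use of real(c_float) for the components are assumptions made for the example, not part of the original code. Allocatable components are contiguous, so no copy should be needed for them here.

```fortran
! Hypothetical callee: receives the arrays separately, no derived type crosses the boundary.
subroutine offload_arrays(n, a, b, c) bind(C, name="offload_arrays")
  use iso_c_binding, only: c_int, c_float
  implicit none
  integer(c_int), value      :: n
  real(c_float), intent(in)  :: a(*), b(*)   ! assumed-size: only the base address is passed
  real(c_float), intent(out) :: c(*)
  c(1:n) = a(1:n) + b(1:n)
end subroutine offload_arrays

program caller_arrays
  use iso_c_binding, only: c_int, c_float
  implicit none
  interface
    subroutine offload_arrays(n, a, b, c) bind(C, name="offload_arrays")
      import :: c_int, c_float
      integer(c_int), value      :: n
      real(c_float), intent(in)  :: a(*), b(*)
      real(c_float), intent(out) :: c(*)
    end subroutine offload_arrays
  end interface
  type cell
    integer :: id
    real(c_float), allocatable :: a(:), b(:), c(:)
  end type cell
  type(cell) :: cl
  integer :: i
  allocate(cl%a(10), cl%b(10), cl%c(10))
  do i = 1, 10
    cl%a(i) = 10 + i
    cl%b(i) = 10*i + 1
  end do
  ! The derived type stays on the caller's side; only its arrays are handed over.
  call offload_arrays(int(size(cl%a), c_int), cl%a, cl%b, cl%c)
  print *, cl%c
end program caller_arrays
```

In a two-compiler build the subroutine would live in the library compiled by the other compiler, and the interface block would stay in the caller.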
The C interoperability introduced in Fortran 2003 is not meant only for interacting with C, but with any other compiler interoperable with C. That can be a different Fortran compiler.
Be aware that it is against the rules of Fortran to declare the same type in more than one place and use it as the same type, unless the type is sequence or bind(C). This is another reason why your program is not standard conforming.
called.f90:
subroutine offload(cl_c)
  use iso_c_binding
  ! Interoperable descriptor: sizes plus raw addresses of the arrays
  type, bind(C) :: cell_C
    integer :: id
    integer :: na, nb, nc
    type(c_ptr) :: a,b,c
  end type cell_C
  ! Local view of the data, built from the descriptor
  type cell
    integer :: id
    real, pointer :: a(:)
    real, pointer :: b(:)
    real, pointer :: c(:)
  end type cell
  type(cell) :: cl
  type(cell_C) :: cl_C
  integer :: n
  cl%id = cl_C%id
  ! Associate the Fortran pointers with the caller's memory
  call c_f_pointer(cl_C%a, cl%a, [cl_c%na])
  call c_f_pointer(cl_C%b, cl%b, [cl_c%nb])
  call c_f_pointer(cl_C%c, cl%c, [cl_c%nc])
  print *, cl%a(1:10)
  print *, cl%b(1:10)
end subroutine offload
caller.f90:
program caller
  use iso_c_binding
  ! Interoperable descriptor: sizes plus raw addresses of the arrays
  type, bind(C) :: cell_C
    integer :: id
    integer :: na, nb, nc
    type(c_ptr) :: a,b,c
  end type cell_C
  type cell
    integer :: id
    real, allocatable :: a(:)
    real, allocatable :: b(:)
    real, allocatable :: c(:)
  end type cell
  integer :: n,i,j
  type(cell),target :: cl(2)   ! target is required for c_loc
  type(cell_c) :: cl_c
  n=10
  do i=1,2
    allocate(cl(i)%a(n))
    allocate(cl(i)%b(n))
    allocate(cl(i)%c(n))
  end do
  do j=1, 2
    do i=1, n
      cl(j)%a(i)=10*j+i
      cl(j)%b(i)=10*i+j
    end do
  end do
  ! Fill the descriptor with the addresses and sizes of cl(1)'s arrays
  cl_c%a = c_loc(cl(1)%a)
  cl_c%b = c_loc(cl(1)%b)
  cl_c%c = c_loc(cl(1)%c)
  cl_c%na = size(cl(1)%a)
  cl_c%nb = size(cl(1)%b)
  cl_c%nc = size(cl(1)%c)
  cl_c%id = cl(1)%id   ! note: cl(1)%id is never set in this example
  call offload(cl_c)
  print *, cl(1)%c
end program caller
With gfortran and ifort:
>gfortran called.f90 -c -o called.o
>ifort caller.f90 -c -o caller.o
>ifort -o a.out called.o caller.o -lgfortran
>./a.out
11.0000000 12.0000000 13.0000000 14.0000000 15.0000000 16.0000000 17.0000000 18.0000000 19.0000000 20.0000000
11.0000000 21.0000000 31.0000000 41.0000000 51.0000000 61.0000000 71.0000000 81.0000000 91.0000000 101.000000
0.0000000E+00 0.0000000E+00 0.0000000E+00 0.0000000E+00 0.0000000E+00
0.0000000E+00 0.0000000E+00 0.0000000E+00 0.0000000E+00 0.0000000E+00
No dynamic libraries necessary here.
For 100% theoretical portability one could use c_int, c_float, ...; the formatting could be better and so on, but you get the point.
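A minimal sketch of what the descriptor might look like with explicit interoperable kinds. The program name and the two comparisons are additions for illustration; they only report whether the default kinds of the compiler at hand happen to coincide with the C kinds, which is what the version above silently relies on.

```fortran
program kinds_check
  use iso_c_binding, only: c_int, c_float, c_ptr, c_null_ptr
  implicit none
  ! Same descriptor as above, but with kinds taken from iso_c_binding
  type, bind(C) :: cell_C
    integer(c_int) :: id
    integer(c_int) :: na, nb, nc
    type(c_ptr)    :: a, b, c
  end type cell_C
  type(cell_C) :: cc
  ! Empty descriptor: zero sizes and null addresses
  cc = cell_C(0_c_int, 0_c_int, 0_c_int, 0_c_int, c_null_ptr, c_null_ptr, c_null_ptr)
  print *, 'default integer is c_int: ', kind(0) == c_int
  print *, 'default real is c_float:  ', kind(0.0) == c_float
end program kinds_check
```

With this descriptor, the arrays it points to would be declared real(c_float) on both sides as well.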
You can also overload the assignments between cell and cell_C to ease the conversion.
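One possible shape for such an overload, sketched for the receiving side only. The module name cell_conv, the pointer-based type cell_view, and the procedure view_from_c are hypothetical names chosen for this example. The opposite direction (cell to cell_C via c_loc) needs more care, because the right-hand side of a defined assignment is treated as a parenthesized expression, so taking its address inside the assignment procedure is not guaranteed to be safe.

```fortran
module cell_conv
  use iso_c_binding, only: c_int, c_float, c_ptr, c_f_pointer
  implicit none

  type, bind(C) :: cell_C
    integer(c_int) :: id
    integer(c_int) :: na, nb, nc
    type(c_ptr)    :: a, b, c
  end type cell_C

  ! Pointer-based view used inside the called library (hypothetical type)
  type cell_view
    integer :: id
    real(c_float), pointer :: a(:) => null()
    real(c_float), pointer :: b(:) => null()
    real(c_float), pointer :: c(:) => null()
  end type cell_view

  interface assignment(=)
    module procedure view_from_c
  end interface

contains

  ! Enables "view = descriptor": wraps the three c_f_pointer calls
  subroutine view_from_c(lhs, rhs)
    type(cell_view), intent(out) :: lhs
    type(cell_C),    intent(in)  :: rhs
    lhs%id = rhs%id
    call c_f_pointer(rhs%a, lhs%a, [rhs%na])
    call c_f_pointer(rhs%b, lhs%b, [rhs%nb])
    call c_f_pointer(rhs%c, lhs%c, [rhs%nc])
  end subroutine view_from_c

end module cell_conv

program demo
  use iso_c_binding, only: c_loc
  use cell_conv
  implicit none
  real(c_float), target :: a(3), b(3), c(3)
  type(cell_C)    :: cc
  type(cell_view) :: v
  a = [1.0, 2.0, 3.0]
  b = [10.0, 20.0, 30.0]
  c = 0.0
  cc%id = 7
  cc%na = 3; cc%nb = 3; cc%nc = 3
  cc%a = c_loc(a); cc%b = c_loc(b); cc%c = c_loc(c)
  v = cc               ! defined assignment, no data is copied
  v%c = v%a + v%b      ! writes through the pointer into c
  print *, c
end program demo
```

The subroutine offload above could then replace its explicit c_f_pointer calls with a single assignment from the incoming descriptor.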