I am trying to compile one of the FORTRAN 77 example programs given in the NVIDIA cuBLAS documentation. The example uses the FORTRAN bindings that NVIDIA provides to call cuBLAS functions from a FORTRAN application. The code uses a C-style macro, IDX2F, to do the device pointer index arithmetic expected by CUDA.
The full program is given below:
! Example B.2. Same Application Using Non-thunking cuBLAS Calls
!-------------------------------------------------------------
#define IDX2F (i,j,ld) ((((j)-1)*(ld))+((i)-1))
      subroutine modify ( devPtrM, ldm, n, p, q, alpha, beta )
      implicit none
      integer sizeof_real
      parameter (sizeof_real=4)
      integer ldm, n, p, q
#if ARCH_64
      integer*8 devPtrM
#else
      integer*4 devPtrM
#endif
      real*4 alpha, beta
      call cublas_sscal ( n-p+1, alpha,
     1                    devPtrM+IDX2F(p, q, ldm)*sizeof_real,
     2                    ldm)
      call cublas_sscal(ldm-p+1, beta,
     1                  devPtrM+IDX2F(p, q, ldm)*sizeof_real,
     2                  1)
      return
      end
      program matrixmod
      implicit none
      integer M,N,sizeof_real
#if ARCH_64
      integer*8 devPtrA
#else
      integer*4 devPtrA
#endif
      parameter(M=6,N=5,sizeof_real=4)
      real*4 a(M,N)
      integer i,j,stat
      external cublas_init, cublas_set_matrix, cublas_get_matrix
      external cublas_shutdown, cublas_alloc
      integer cublas_alloc, cublas_set_matrix, cublas_get_matrix
      do j=1,N
        do i=1,M
          a(i,j)=(i-1)*M+j
        enddo
      enddo
      call cublas_init
      stat= cublas_alloc(M*N, sizeof_real, devPtrA)
      if (stat.NE.0) then
        write(*,*) "device memory allocation failed"
        call cublas_shutdown
        stop
      endif
      stat = cublas_set_matrix(M,N,sizeof_real,a,M,devPtrA,M)
      if (stat.NE.0) then
        call cublas_free( devPtrA )
        write(*,*) "data download failed"
        call cublas_shutdown
        stop
      endif
No matter what I do, I keep getting these very annoying errors on compilation:
cuBLAStest1NonThunking.f:16:33:
1 devPtrM+IDX2F(p, q, ldm)*sizeof_real,
1
Error: Expected a right parenthesis at (1)
and
cuBLAStest1NonThunking.f:19:33:
1 devPtrM+IDX2F(p, q, ldm)*sizeof_real,
1
Error: Expected a right parenthesis at (1)
This error makes very little sense to me. I am using gfortran 8.5.0 and compiling with the -cpp option.
The original define in the NVIDIA docs,
#define IDX2F(i,j,ld) ((((j)-1)*(ld))+((i)-1))
has no space between IDX2F and (i,j,ld). With the mistaken space, the preprocessor treats IDX2F as a macro without arguments and replaces every occurrence of IDX2F with
(i,j,ld) ((((j)-1)*(ld))+((i)-1))
which is not what was intended: the call arguments (p, q, ldm) are simply left in place after the inserted text. Because the source is fixed-form Fortran, the extra (i,j,ld) text can also push the resulting line past column 72, so the rest of the line is cut off and some right parentheses 'are eaten'. The define itself has matching numbers of left and right parentheses, so the fix is to remove the space after IDX2F.
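
To check the corrected define without a GPU or cuBLAS, a minimal stand-alone sketch along these lines may help. The program name idxdemo, the variable off, and the values of p, q and ldm are made up for illustration and are not part of the NVIDIA example; only the define line is taken from the docs. It assumes fixed-form source (a .f file) compiled with gfortran -cpp.

```fortran
! Hypothetical stand-alone check, not part of the NVIDIA example.
! Build as fixed-form source, e.g.: gfortran -cpp idxdemo.f
! Note: no space between the macro name and its argument list.
#define IDX2F(i,j,ld) ((((j)-1)*(ld))+((i)-1))
      program idxdemo
      implicit none
      integer ldm, p, q, off
      parameter (ldm=6)
      p = 2
      q = 3
! The macro call below becomes ((((q)-1)*(ldm))+((p)-1)),
! that is (3-1)*6 + (2-1), the 0-based offset of element (p,q).
      off = IDX2F(p, q, ldm)
      write(*,*) 'offset in elements:', off
      end
```

With the values above this should print an offset of 13. If a compile error remains after the space is removed, running gfortran -cpp -E on the source file prints the preprocessed text, which makes it easy to see what each IDX2F call was expanded to and whether any line runs past column 72.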