I'm using Chapel 1.13.1, Gasnet 1.26.4, and Fedora release 24.
Trying to run hello6-taskpar-dist.chpl produces an error:
login_node> ./a.out -nl 1
bash: -c: line 0: unexpected EOF while looking for matching `''
bash: -c: line 1: syntax error: unexpected end of file
connection to work_node1 failed.
Terminated
My understanding is that Gasnet exports the environment of the login node to the work nodes, and some malformed definition is causing this problem (i.e., this is an environment bug).
Unfortunately, I'm not sure which scripts are run during execution of the Chapel binary, and I am finding it difficult to trace where and when things go wrong.
Doubly unfortunately, I bandaged this problem once before by unsetting the module function (unset module), but that fix no longer works:
login_node> unset module
login_node> ./a.out -nl 1
bash: -c: line 0: syntax error near unexpected token `('
bash: -c: line 0: ` cd '/home/me/TestCode/Chapel/gasnet-helloworld' ; env 'AMUDP_SLAVE_ARGS=1,login_node:34413,' './a.out_real' '-nl' '1' '-E' 'MANPATH=/home/me/usr/opt/chapel-gasnet/man:/usr/share/man:/usr/man:/usr/local/man' '-E' 'XDG_SESSION_ID=397' '-E' 'GUESTFISH_INIT=\e[1;34m' '-E' 'HOSTNAME=login_node' '-E' 'HARDWARE_PLATFORM=x86_64' '-E' 'RTE_INCLUDE=/usr/include/dpdk' '-E' 'TERM=xterm-256color' '-E' 'SHELL=/bin/bash' '-E' 'WISECONFIGDIR=/usr/share/wise2/' '-E' 'HISTSIZE=1000' '-E' 'SSH_CLIENT=24.255.27.170 58652 22' '-E' 'LOCALUSR=/home/me/usr' '-E' 'QTDIR=/usr/lib64/qt-3.3' '-E' 'OLDPWD=/home/me/usr/opt/chapel-gasnet' '-E' 'QTINC=/usr/lib64/qt-3.3/include' '-E' 'SSH_TTY=/dev/pts/0' '-E' 'CHPL_COMM=gasnet' '-E' 'ACTIVE_SHELLS=1' '-E' 'LOCALOPT=/home/me/usr/opt' '-E' 'SVN_EDITOR=vim' '-E' 'USER=me' '-E' 'LD_LIBRARY_PATH=/home/me/usr/lib' '-E' 'LS_COLORS=rs=0:di=38;5;33:ln=38;5;51:mh=00:pi=40;38;5;11:so=38;5;13:do=38;5;5:bd=48;5;232;38;5;11:cd=48;5;232;38;5;3:or=48;5;232;38;5;9:mi=01;05;37;41:su=48;5;196;38;5;15:sg=48;5;11;38;5;16:ca=48;5;196;38;5;226:tw=48;5;10;38;5;16:ow=48;5;10;38;5;21:st=48;5;21;38;5;15:ex=38;5;40:*.tar=38;5;9:*.tgz=38;5;9:*.arc=38;5;9:*.arj=38;5;9:*.taz=38;5;9:*.lha=38;5;9:*.lz4=38;5;9:*.lzh=38;5;9:*.lzma=38;5;9:*.tlz=38;5;9:*.txz=38;5;9:*.tzo=38;5;9:*.t7z=38;5;9:*.zip=38;5;9:*.z=38;5;9:*.Z=38;5;9:*.dz=38;5;9:*.gz=38;5;9:*.lrz=38;5;9:*.lz=38;5;9:*.lzo=38;5;9:*.xz=38;5;9:*.bz2=38;5;9:*.bz=38;5;9:*.tbz=38;5;9:*.tbz2=38;5;9:*.tz=38;5;9:*.deb=38;5;9:*.rpm=38;5;9:*.jar=38;5;9:*.war=38;5;9:*.ear=38;5;9:*.sar=38;5;9:*.rar=38;5;9:*.alz=38;5;9:*.ace=38;5;9:*.zoo=38;5;9:*.cpio=38;5;9:*.7z=38;5;9:*.rz=38;5;9:*.cab=38;5;9:*.jpg=38;5;13:*.jpeg=38;5;13:*.gif=38;5;13:*.bmp=38;5;13:*.pbm=38;5;13:*.pgm=38;5;13:*.ppm=38;5;13:*.tga=38;5;13:*.xbm=38;5;13:*.xpm=38;5;13:*.tif=38;5;13:*.tiff=38;5;13:*.png=38;5;13:*.svg=38;5;13:*.svgz=38;5;13:*.mng=38;5;13:*.pcx=38;5;13:*.mov=38;5;13:*.mpg=38;5;13:*.mpeg=38;5;13:*.m2v=38;5;13:*.mkv=38;5;13:*.webm=38;5;13:*.ogm=
38;5;13:*.mp4=38;5;13:*.m4v=38;5;13:*.mp4v=38;5;13:*.vob=38;5;13:*.qt=38;5;13:*.nuv=38;5;13:*.wmv=38;5;13:*.asf=38;5;13:*.rm=38;5;13:*.rmvb=38;5;13:*.flc=38;5;13:*.avi=38;5;13:*.fli=38;5;13:*.flv=38;5;13:*.gl=38;5;13:*.dl=38;5;13:*.xcf=38;5;13:*.xwd=38;5;13:*.yuv=38;5;13:*.cgm=38;5;13:*.emf=38;5;13:*.ogv=38;5;13:*.ogx=38;5;13:*.aac=38;5;45:*.au=38;5;45:*.flac=38;5;45:*.m4a=38;5;45:*.mid=38;5;45:*.midi=38;5;45:*.mka=38;5;45:*.mp3=38;5;45:*.mpc=38;5;45:*.ogg=38;5;45:*.ra=38;5;45:*.wav=38;5;45:*.oga=38;5;45:*.opus=38;5;45:*.spx=38;5;45:*.xspf=38;5;45:' '-E' 'CHPL_HOST_PLATFORM=linux64' '-E' 'LOCALBIN=/home/me/usr/bin' '-E' 'CCACHE_CPP2=' '-E' 'GUESTFISH_PS1=\[\e[1;32m\]><fs>\[\e[0;31m\] ' '-E' 'MAIL=/var/spool/mail/me' '-E' 'PATH=/home/me/usr/opt/chapel-gasnet/bin/linux64:/home/me/usr/opt/chapel-gasnet/util:/home/me/usr/bin:/s/bach/e/proj/rtrt/software/bin:/sbin:/bin:/usr/bin:/usr/local/bin:/usr/bin/X11:/usr/X/bin:/usr/local/java/bin:/s/chopin/f/proj/eclipse:/home/me/usr/opt/atom/bin/' '-E' 'RTE_SDK=/usr/share/dpdk' '-E' 'SSH_SERVERS=work_node1 work_node_2 work_node3' '-E' 'RTE_TARGET=x86_64-default-linuxapp-gcc' '-E' 'PWD=/home/me/TestCode/Chapel/gasnet-helloworld' '-E' 'JAVA_HOME=/usr/lib/jvm/java' '-E' 'XMODIFIERS=@im=none' '-E' 'EDITOR=vim' '-E' 'LANG=en_US.utf8' '-E' 'MODULEPATH=/etc/scl/modulefiles:/usr/share/Modules/modulefiles:/etc/modulefiles:/usr/share/modulefiles' '-E' 'LOADEDMODULES=' '-E' 'KDEDIRS=/usr' '-E' 'GUESTFISH_OUTPUT=\e[0m' '-E' 'S_COLORS=auto' '-E' 'ignoreeof=10' '-E' 'PS1=$ '-E' 'LOCALETC=/home/me/usr/etc' '-E' 'TRASHDIR=/home/me/.local/share/Trash/files' '-E' 'SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass' '-E' 'HISTCONTROL=ignoredups' '-E' 'PS2=> ' '-E' 'TEXINPUTS=~/Documents/LatexLibraries/pgfplots/tex/:~/Documents/LatexLibraries/subfigmat/' '-E' 'CHPL_HOME=/home/me/usr/opt/chapel-gasnet' '-E' 'SHLVL=1' '-E' 'HOME=/home/me' '-E' 'GASNET_SSH_CMD=ssh' '-E' 'LOGNAME=me' '-E' 'GASNET_SSH_OPTIONS=-x -o LogLevel=Error' '-E' 
'QTLIB=/usr/lib64/qt-3.3/lib' '-E' 'CVS_RSH=ssh' '-E' 'DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1885/bus' '-E' 'SSH_CONNECTION=some.ip.addr some.ip.addr2' '-E' 'GASNET_SSH_SERVERS=work_node1 work_node_2 work_node3' '-E' 'MODULESHOME=/usr/share/Modules' '-E' 'LESSOPEN=||/usr/bin/lesspipe.sh %s' '-E' 'XDG_RUNTIME_DIR=/run/user/1885' '-E' 'GASNET_SPAWNFN=S' '-E' 'GUESTFISH_RESTORE=\e[0m' '-E' 'CCACHE_HASHDIR=' '-E' 'BASH_FUNC_scl()=() { local CMD=$1;'
connection to work_node1 failed.
Terminated
(I've changed a number of potentially identifying details: usernames, hostnames, IP addresses, paths, et cetera.)
It would seem that this BASH_FUNC_scl function is also partially to blame, but unsetting it produces the same error.
Looking around, I found this xonsh issue on GitHub, which indicated that unsetting both scl and module would do the trick. It did not, but the error returns to the short 'unexpected EOF' error. Unsetting module or scl alone did not fix the issue and produces the same error.
login_node> unset module
login_node> unset scl
login_node> ./a.out -nl 1
bash: -c: line 0: unexpected EOF while looking for matching `''
bash: -c: line 1: syntax error: unexpected end of file
connection to work_node1 failed.
Terminated
There are two ways, I think, to solve this:
1. Get Gasnet to not push the environment out to the worker nodes. Our cluster already has the environment loaded when the user logs in (networked file system), so there is no need to push the entire environment over (if necessary, I can add things to the rc or profile scripts to get things in order). I like this best because it minimizes the work I have to do. Unfortunately, I know little to nothing about Gasnet or its use.
2. Fix the environment so that there is no unmatched ' character. I do not like this idea, because it requires a lot of work and could mean asking sysadmins to do sysadmin things, something they are often disinclined to do, especially for small projects and corner cases, of which this is both.
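For the second option, a first step is finding which exported definitions contain a single quote at all. A minimal sketch, assuming an awk that provides the POSIX ENVIRON array; list_quote_vars is a made-up helper name, not something Chapel or Gasnet provides:

```shell
# Print the names of exported definitions whose values contain a single quote.
# list_quote_vars is a hypothetical helper name used only for this sketch.
list_quote_vars() {
    awk 'BEGIN {
        for (name in ENVIRON)
            if (index(ENVIRON[name], "\047") > 0)   # \047 is the single quote character
                print name
    }' | sort
}
```

Running list_quote_vars on the login node prints one name per line. Exported bash functions are environment entries as well, so they are covered by the same scan. This only looks for single quotes; other special characters would need their own check.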
Any ideas?
EDIT:
It was suggested that I post my existing PS1 and the output of printchplenv.
PS1='$HOSTNAME> '
> printchplenv
machine info: Linux login_node 4.6.4-301.fc24.x86_64 #1 SMP Tue Jul 12 11:50:00 UTC 2016 x86_64
CHPL_HOME: /home/me/usr/opt/chapel-gasnet *
script location: /home/me/usr/opt/chapel-1.13.1_gasnet/util
CHPL_TARGET_PLATFORM: linux64
CHPL_TARGET_COMPILER: gnu
CHPL_TARGET_ARCH: unknown
CHPL_LOCALE_MODEL: flat
CHPL_COMM: gasnet *
CHPL_COMM_SUBSTRATE: udp
CHPL_GASNET_SEGMENT: everything
CHPL_TASKS: qthreads
CHPL_LAUNCHER: amudprun
CHPL_TIMERS: generic
CHPL_MEM: jemalloc
CHPL_MAKE: gmake
CHPL_ATOMICS: intrinsics
CHPL_NETWORK_ATOMICS: none
CHPL_GMP: gmp
CHPL_HWLOC: hwloc
CHPL_REGEXP: re2
CHPL_WIDE_POINTERS: struct
CHPL_AUX_FILESYS: none
I suspect that the problem is caused by the launcher code mangling the environment string when it encounters some magic shell characters included in your PS1, such as these:
" ' ` * ? [ ] | & < > ; ( ) # $ \
If this is the case, you should be able to obtain a quick fix by invoking the compiled binary with an overridden PS1 value, e.g.
PS1="" ./a.out -nl 1
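If more than one definition turns out to be involved, the same idea can be applied to several names at once without changing the login shell. A rough sketch; run_without is a hypothetical helper name, and whether dropping any particular set of names is enough on a given cluster is an assumption to test:

```shell
# run_without is a hypothetical wrapper for this sketch.
# Usage: run_without "NAME1 NAME2 ..." command [args...]
# The body runs in a subshell, so the calling shell keeps its own definitions.
run_without() (
    names=$1
    shift
    for name in $names; do
        unset -v "$name" 2>/dev/null   # drop a variable of that name, if any
        unset -f "$name" 2>/dev/null   # drop a shell function of that name, if any
    done
    exec "$@"                          # run the command with the reduced environment
)
```

For example, run_without "PS1 module scl" ./a.out -nl 1 starts the binary with those three names removed from its environment, while the interactive shell keeps its prompt and functions.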
The bug has been reported to the Chapel team. In fact, there is already a pull request that aims to resolve the issue.
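For illustration only, and not the change made in that pull request: the failure mode is what happens when a value is wrapped in single quotes without escaping the quotes it already contains. A minimal sketch of quoting that survives embedded quotes, using a hypothetical helper named shell_quote:

```shell
# shell_quote is a hypothetical helper for this sketch, not part of Chapel or Gasnet.
# It wraps its argument in single quotes and rewrites every embedded single
# quote as '\'' so that a shell reads the result back as one word.
# Trailing newlines in the value are lost to command substitution.
shell_quote() {
    printf "'%s'\n" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}
```

With value="it's", the naive form bash -c "echo '$value'" fails with the same "unexpected EOF while looking for matching" message shown in the question, while bash -c "echo $(shell_quote "$value")" prints it's.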