bash sh newline carriage-return linefeed

Are shell scripts sensitive to encoding and line endings?


I am making an NW.js app on macOS, and want to run the app in dev mode by double-clicking on an icon. In the first step, I'm trying to make my shell script work.

Using VS Code on Windows (to save time), I created a run-nw file at the root of my project, containing this:

#!/bin/bash

cd "src"
npm install

cd ..
./tools/nwjs-sdk-v0.17.3-osx-x64/nwjs.app/Contents/MacOS/nwjs "src" &

but I get this output:

$ sh ./run-nw

: command not found  
: No such file or directory  
: command not found  
: No such file or directory  

Usage: npm <command>

where <command> is one of:  (snip commands list)

(snip npm help)

npm@3.10.3 /usr/local/lib/node_modules/npm  
: command not found  
: No such file or directory  
: command not found

Some things I don't understand.

Not able to make it work properly, and suspecting something weird with the file itself, I created a new one directly on the Mac, using vim this time. I entered the exact same instructions, and... now it works without any issues.
A diff on the two files reveals exactly zero difference.

What can be the difference? What can make the first script not work? How can I find out?

Update

Following the accepted answer's recommendations, after the wrong line endings came back, I checked multiple things. It turns out that since I copied my ~/.gitconfig from my Windows machine, I had autocrlf=true, so every time I modified the bash file under Windows, it re-set the line endings to \r\n.
So, in addition to running dos2unix (which you will have to install using Homebrew on a Mac), if you're using Git, check your .gitconfig file.
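
One way to stop Git from reintroducing CRLF in a script is a `.gitattributes` rule. The sketch below is only an illustration: the `pin_lf` helper name and the example paths are assumptions, not something from my setup. It appends `text eol=lf` rules, which ask Git to check the matching files out with LF endings on every platform.

```shell
#!/bin/bash
# Hypothetical helper: append "PATH text eol=lf" rules to REPO/.gitattributes
# so that Git checks those paths out with LF endings on every platform.
# usage: pin_lf REPO PATH...
pin_lf() {
    local repo=$1 path
    shift
    for path in "$@"; do
        printf '%s text eol=lf\n' "$path" >> "$repo/.gitattributes"
    done
}

# Example (run from the project root):
#   pin_lf . run-nw '*.sh'
#
# A per-user alternative on the Windows machine is
#   git config --global core.autocrlf input
# which converts CRLF to LF on commit and leaves files alone on checkout.
```

Files already checked out with CRLF are not rewritten by the rule alone; they still need to be converted once.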


Solution

  • Yes. Bash scripts are sensitive to line-endings, both in the script itself and in data it processes. They should have Unix-style line-endings, i.e., each line is terminated with a Line Feed character (decimal 10, hex 0A in ASCII).

    DOS/Windows line endings in the script

    With Windows or DOS-style line endings, each line is terminated with a Carriage Return followed by a Line Feed character. The carriage return is normally invisible, but cat -v yourfile displays it as ^M:

    $ cat -v yourfile
    #!/bin/bash^M
    ^M
    cd "src"^M
    npm install^M
    ^M
    cd ..^M
    ./tools/nwjs-sdk-v0.17.3-osx-x64/nwjs.app/Contents/MacOS/nwjs "src" &^M
    

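    In this case, the carriage return (^M in caret notation or \r in C escape notation) is not treated as whitespace. Bash interprets the first line after the shebang (consisting of a single carriage return character) as the name of a command/program to run, hence ": command not found". Likewise, cd "src"^M tries to enter a directory whose name ends in a carriage return, and npm receives install^M, which it does not recognize, so it prints its usage text. The messages appear to start with a colon because the carriage return moves the cursor back to the start of the line, so the rest of the message overwrites what came before it.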
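
The failure is easy to reproduce in isolation. The following is a minimal sketch: the `make_crlf_script` helper and the temporary file are illustrative names, not part of the question. It writes a three-line script with CRLF endings and runs it with Bash. The exact wording of the error differs between Bash versions, so no output is shown here.

```shell
#!/bin/bash
# Hypothetical helper: write a tiny script with DOS (CRLF) line endings to FILE.
make_crlf_script() {
    # printf expands \r\n, so every line ends in CR followed by LF.
    printf '#!/bin/bash\r\n\r\necho hi\r\n' > "$1"
}

script=$(mktemp)
make_crlf_script "$script"

# Line 2 of the script is a lone CR, which Bash tries to run as a command.
# Line 3 still runs, but echo prints "hi" followed by a CR.
bash "$script"

rm -f "$script"
```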

    DOS/Windows line endings in input data

    Like above, if you have an input file with carriage returns:

    hello^M
    world^M
    

    then it will look completely normal in editors and when writing it to screen, but tools may produce strange results. For example, grep will fail to find lines that are obviously there:

    $ grep 'hello$' file.txt || grep -x "hello" file.txt
    $
    

    (no match because the line actually ends in ^M)
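
If the input cannot be converted up front, one option is to strip the carriage return on the fly before matching. This is a sketch under that assumption; `grep_line` is a made-up helper name.

```shell
#!/bin/bash
# A literal carriage return. In Bash, cr=$'\r' is an equivalent shorthand.
cr=$(printf '\r')

# Hypothetical helper: whole-line grep that tolerates a trailing CR.
# usage: grep_line PATTERN FILE
grep_line() {
    # Remove one trailing CR from each line, then match the whole line.
    sed "s/${cr}\$//" "$2" | grep -x -- "$1"
}

# Demo input with DOS line endings, like the file shown earlier.
demo=$(mktemp)
printf 'hello\r\nworld\r\n' > "$demo"
grep_line hello "$demo"    # prints: hello
rm -f "$demo"
```

Passing a literal CR to sed this way avoids relying on sed understanding the \r escape itself.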

    Appended text will seem to overwrite the line because the carriage return moves the cursor to the start of the line:

    $ sed -e 's/$/!/' file.txt
    !ello
    !orld
    

    String comparison will fail, even though strings appear to be the same when writing to screen:

    $ a="hello"; read b < file.txt
    $ if [[ "$a" = "$b" ]]
      then echo "Variables are equal."
      else echo "Sorry, $a is not equal to $b"
      fi
    
    Sorry, hello is not equal to hello
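
When only a variable is affected, the trailing carriage return can be removed with parameter expansion before comparing. A minimal sketch follows; `strip_cr` is a hypothetical helper, and the value of b is constructed by hand to mimic what read would store.

```shell
#!/bin/bash
# A literal carriage return. In Bash, cr=$'\r' is an equivalent shorthand.
cr=$(printf '\r')

# Hypothetical helper: print its argument without one trailing CR.
strip_cr() {
    printf '%s\n' "${1%"$cr"}"
}

a="hello"
b="hello${cr}"           # what `read b < file.txt` would store
b=$(strip_cr "$b")

if [ "$a" = "$b" ]; then
    echo "Variables are equal."
else
    echo "Sorry, $a is not equal to $b"
fi
# prints: Variables are equal.
```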
    

    Solutions

    The solution is to convert the file to use Unix-style line endings. There are a number of ways this can be accomplished:

    1. Using the dos2unix program:

      dos2unix filename
      
    2. Open the file in a capable text editor (Sublime, Notepad++, not Notepad) and configure it to save files with Unix line endings. For example, in Vim, run the following command before (re)saving:

      :set fileformat=unix
      
    3. If you have a version of the sed utility that supports the -i or --in-place option, e.g., GNU sed, you could run the following command to strip trailing carriage returns:

      sed -i 's/\r$//' filename
      

      With other versions of sed, you could use output redirection to write to a new file. Be sure to use a different filename for the redirection target (it can be renamed later).

      sed 's/\r$//' filename > filename.unix
      
    4. Similarly, the tr translation filter can be used to delete unwanted characters from its input:

      tr -d '\r' <filename >filename.unix
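
The redirect-then-rename step from the last two options can be wrapped in a small function. This is a sketch only: `to_unix` is a made-up name, and like the tr command above it deletes every carriage return in the file, not just those at line ends.

```shell
#!/bin/bash
# Hypothetical helper: convert FILE from CRLF to LF in place via a temp file.
# usage: to_unix FILE
to_unix() {
    local file=$1 tmp status
    tmp=$(mktemp) || return 1
    # Write the converted text to the temp file first. Copy it back only if
    # tr succeeded. Using cat with a redirect (rather than mv) keeps the
    # original file's permissions, e.g. its executable bit.
    tr -d '\r' < "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Example:
#   to_unix run-nw
```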
      

    Cygwin Bash

    With the Bash port for Cygwin, there’s a custom igncr option that can be set to ignore the Carriage Return in line endings (presumably because many of its users use native Windows programs to edit their text files). This can be enabled for the current shell by running set -o igncr.

    Setting this option applies only to the current shell process so it can be useful when sourcing files with extraneous carriage returns. If you regularly encounter shell scripts with DOS line endings and want this option to be set permanently, you could set an environment variable called SHELLOPTS (all capital letters) to include igncr. This environment variable is used by Bash to set shell options when it starts (before reading any startup files).
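
As a sketch of how that might look in a startup file such as ~/.bash_profile: the uname guard is my own addition, so that the snippet does nothing on systems whose Bash has no igncr option.

```shell
#!/bin/bash
# Only meaningful under Cygwin's Bash; other Bash builds reject igncr.
case "$(uname -s)" in
    CYGWIN*)
        set -o igncr        # ignore CR at line ends in this shell
        export SHELLOPTS    # pass the enabled options on to child Bash processes
        ;;
esac
```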

    Useful utilities

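    The file utility is useful for quickly seeing which line endings are used in a text file. It adds "with CRLF line terminators" to its description of a file with DOS line endings and "with CR line terminators" for classic Mac line endings; a file with Unix line endings gets no such note.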

    The GNU version of the cat utility has a -v, --show-nonprinting option that displays non-printing characters.

    The dos2unix utility is specifically written for converting text files between Unix, Mac and DOS line endings.
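
If none of these utilities is at hand, a rough check can be put together from tr and wc alone. The sketch below is a heuristic, not a replacement for file: `line_endings` is a hypothetical name, and it only counts CR and LF characters, so it assumes that equal counts mean every CR is part of a CRLF pair.

```shell
#!/bin/bash
# Hypothetical helper: classify FILE as unix, dos, mac or mixed by counting
# carriage returns and line feeds.
# usage: line_endings FILE
line_endings() {
    local cr lf
    # $(( ... )) also strips the padding some wc implementations add.
    cr=$(( $(tr -cd '\r' < "$1" | wc -c) ))
    lf=$(( $(tr -cd '\n' < "$1" | wc -c) ))
    if [ "$cr" -eq 0 ]; then
        echo unix       # no CR at all (also reported for an empty file)
    elif [ "$lf" -eq 0 ]; then
        echo mac        # CR only
    elif [ "$cr" -eq "$lf" ]; then
        echo dos        # assumed: every CR is paired with an LF
    else
        echo mixed
    fi
}

# Example:
#   line_endings run-nw
```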

    Useful links

    Wikipedia has an excellent article covering the many different ways of marking the end of a line of text, the history of such encodings and how newlines are treated in different operating systems, programming languages and Internet protocols (e.g., FTP).

    Files with classic Mac OS line endings

    With Classic Mac OS (pre-OS X), each line was terminated with a Carriage Return (decimal 13, hex 0D in ASCII). If a script file was saved with such line endings, Bash would only see one long line like so:

    #!/bin/bash^M^Mcd "src"^Mnpm install^M^Mcd ..^M./tools/nwjs-sdk-v0.17.3-osx-x64/nwjs.app/Contents/MacOS/nwjs "src" &^M
    

    Since this single long line begins with an octothorpe (#), Bash treats the line (and the whole file) as a single comment.
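
Such a file can be repaired by translating each carriage return into a line feed. A minimal sketch, with `mac_to_unix` as a made-up helper name (the dos2unix package also ships a mac2unix command for this purpose):

```shell
#!/bin/bash
# Hypothetical helper: write FILE to stdout with every CR turned into LF.
# Only use this on CR-only files. On a CRLF file it would turn each line
# ending into two line feeds.
# usage: mac_to_unix FILE > NEWFILE
mac_to_unix() {
    tr '\r' '\n' < "$1"
}

# Example:
#   mac_to_unix run-nw > run-nw.unix
```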

    Note: In 2001, Apple launched Mac OS X which was based on the BSD-derived NeXTSTEP operating system. As a result, OS X also uses Unix-style LF-only line endings and since then, text files terminated with a CR have become extremely rare. Nevertheless, I think it’s worthwhile to show how Bash would attempt to interpret such files.