fish

How can I set environment variables to non-utf8 values?


In fish, I want to set an environment variable to a value that is not valid UTF-8, such as \x80 (a byte sequence of length one whose single byte has the value 0x80, i.e., 128). I already tried

set -x a \x80

fish: Invalid token '\x80'
set -x a \x80
         ^

and

set -x a (string unescape '\x80')

(The latter produces no error message, but a is set to the empty string and $status is 1.)

Linux does not prescribe an encoding for environment variable values; they are just arbitrary null-terminated byte strings. So there must be some way to set them in a shell, because the shell cannot know how other applications are going to decode the values.

In bash and zsh it is easy:

export a=$'\x80'
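
To check that the exported value really is the single byte 0x80, one can dump it from a child process. This is a minimal sketch, assuming bash (or zsh) and the standard od and tr utilities; the variable name hex is only there for illustration.

```shell
# bash or zsh: $'...' is ANSI-C quoting, so \x80 becomes the raw byte 0x80
export a=$'\x80'

# Let a child process print the value it inherited, then dump it as hex bytes
hex=$(sh -c 'printf "%s" "$a"' | od -An -tx1 | tr -d ' \n')
echo "$hex"
```

The child process receives the value unchanged, which is the behaviour I would like to reproduce in fish.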

I want to do this for two reasons:

  1. Many of today's file systems still allow non-Unicode file names. Suppose one has mounted a read-only file system with a directory named some_dir\x80 and wants to add this directory to PATH (or any other environment variable).

  2. Imagine I want to use an old version of some software that interprets certain environment variable values as Latin-1 and cannot be changed to use Unicode.

    For example, my family name is "Döbler" (a German name), which is $'D\xf6bler' in Latin-1 encoding (written as a bash $'...' literal). This sequence of bytes is not a valid UTF-8 string. So I wonder: how can I pass my family name as a Latin-1 encoded environment variable value to that software while using fish as my shell?


Solution

  • The \x escape used to check whether the given byte was valid ASCII, while the \X escape (that's an uppercase "X") did not.

    This was changed in https://github.com/fish-shell/fish-shell/pull/9247, released in fish 3.6.0 in January 2023. Now \x is the same as \X.

    Either upgrade your fish installation (you're at least 20% of all commits ever made to fish behind the current release) or use \X, for example set -x a \X80.

    After that, use set --show foo to show the value of the variable called "foo".