While experimenting with inline environment variables, I noticed that although they are correctly applied to command execution, their expansion sometimes retains the previous value—but not always!
- GNU bash, version 5.2.21(1)-release (x86_64-pc-linux-gnu)
- Locale fr_FR.UTF-8 (floating-point numbers use a comma (,) as the decimal separator)

Running the following bash script:
#!/usr/bin/env bash
locale
echo
LC_ALL=C bash -c 'printf "%s %f\t%f\n" "$LC_NUMERIC" 3.141592654 3,141592654'
echo
LC_NUMERIC=C bash -c 'printf "%s %f\t%f\n" "$LC_NUMERIC" 3.141592654 3,141592654'
Produces this output with some unexpected results:
LANG=fr_FR.UTF-8
LANGUAGE=fr_FR:fr_CA:en
LC_CTYPE="fr_FR.UTF-8"
LC_NUMERIC=fr_FR.UTF-8
LC_TIME=fr_FR.UTF-8
LC_COLLATE="fr_FR.UTF-8"
LC_MONETARY=fr_FR.UTF-8
LC_MESSAGES="fr_FR.UTF-8"
LC_PAPER=fr_FR.UTF-8
LC_NAME=fr_FR.UTF-8
LC_ADDRESS=fr_FR.UTF-8
LC_TELEPHONE=fr_FR.UTF-8
LC_MEASUREMENT=fr_FR.UTF-8
LC_IDENTIFICATION=fr_FR.UTF-8
LC_ALL=
bash: line 1: printf: 3,141592654: invalid number
fr_FR.UTF-8 3.141593 3.000000
bash: ligne 1 : printf: 3,141592654: nombre non valable
C 3.141593 3.000000
- In the first test (LC_ALL=C), the floating-point numbers are correctly interpreted using a dot (.) as the decimal separator, but the printed value of $LC_NUMERIC remains fr_FR.UTF-8 instead of C.
- In the second test (LC_NUMERIC=C), the printed value correctly reflects the inline environment assignment.

Additionally, there’s another strange behaviour with inline environment variables:
Running:
LANG=C LC_NUMERIC=C printf '%s\t%f\t%f\n' "$LC_NUMERIC" 3.141592654 3,141592654
Outputs:
bash: printf: 3,141592654: invalid number
fr_FR.UTF-8 3.141593 3.000000
Even though LC_NUMERIC=C is used to correctly parse 3.141592654, the expansion of $LC_NUMERIC still prints its original value (fr_FR.UTF-8).
This may not necessarily be a bug, but the behaviour seems counter-intuitive and inconsistent as well, making it difficult to predict how inline environment variables will affect both variable expansion and command execution.
Any insights on this would be greatly appreciated!
EDIT
Further investigations shared with the bug-bash public mailing list show that it is a problem related to locale and the use of LC_ALL.
Given this ...
#!/usr/bin/env bash
locale
echo
LC_ALL=C bash -c 'printf "%s %f\t%f\n" "$LC_NUMERIC" 3.141592654 3,141592654'
echo
LC_NUMERIC=C bash -c 'printf "%s %f\t%f\n" "$LC_NUMERIC" 3.141592654 3,141592654'
... I observe that the appearances of $LC_NUMERIC are expanded by the child shells (they sit inside single quotes, so the outer shell passes them through untouched). The outer variable assignment is in effect in those child shells, which is not the case for the shell in which you are issuing the command with the inline variable assignment. Given that context, we can explain the observations:
- In the first test (LC_ALL=C), the floating-point numbers are correctly interpreted using a dot (.) as the decimal separator, but the printed value of $LC_NUMERIC remains fr_FR.UTF-8 instead of C.
Yes and yes.
The LC_NUMERIC variable is significant to Bash, but it is nevertheless a regular environment variable, not a shell parameter. Bash does not manage its value. In particular, Bash does not adjust it based on the value of LC_ALL, so it is natural that your child process inherits the parent's value for this variable, and that that's the value printed in your test.
But that value is mooted by the fact that LC_ALL is set. When set, this variable overrides all the category-specific locale variables, so when you're operating under the effect of LC_ALL=C, you get the period character (.) as the decimal separator regardless of the value of LC_NUMERIC.
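To make the two halves of that explanation visible separately, here is a minimal sketch. It assumes only that bash is on the PATH; fr_FR.UTF-8 does not have to be installed, because the script inspects the variable's value and the C-locale formatting, nothing else. The names child_lc_numeric and child_formatted are mine, chosen for illustration.

```shell
#!/usr/bin/env bash
# Assumption: fr_FR.UTF-8 may or may not be installed. If it is missing,
# bash may print a setlocale warning on stderr; the results are the same.
export LC_NUMERIC=fr_FR.UTF-8
unset LC_ALL

# The child inherits LC_NUMERIC unchanged: LC_ALL=C does not rewrite it.
child_lc_numeric=$(LC_ALL=C bash -c 'printf "%s" "$LC_NUMERIC"')

# LC_ALL=C still decides the formatting, so the separator is a period.
child_formatted=$(LC_ALL=C bash -c 'printf "%.2f" 3.5')

printf 'LC_NUMERIC seen by child:  %s\n' "$child_lc_numeric"
printf 'number formatted by child: %s\n' "$child_formatted"
```

The first line of output shows the inherited value, the second shows the formatting that LC_ALL imposed on top of it.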
- In the second test (LC_NUMERIC=C), the printed value correctly reflects the inline environment assignment.
Yes. In this example you pass that value for LC_NUMERIC in the child process's environment, so, naturally, that is the value to which it expands in that process. And that is sufficient to get the period used as the decimal separator.
Note well that none of this has anything in particular to do with the variable assignments being inline.
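A quick way to check that claim is to hand the child the same value twice, once with an inline assignment and once with an ordinary export made in a subshell. This is only a sketch; inline_value and exported_value are illustrative names of my own.

```shell
#!/usr/bin/env bash
unset LC_ALL

# Variant 1: inline assignment in front of the command.
inline_value=$(LC_NUMERIC=C bash -c 'printf "%s" "$LC_NUMERIC"')

# Variant 2: a plain export. The command substitution runs in a subshell,
# so the export does not leak into the current shell.
exported_value=$(export LC_NUMERIC=C; bash -c 'printf "%s" "$LC_NUMERIC"')

printf 'inline:   %s\n' "$inline_value"
printf 'exported: %s\n' "$exported_value"
```

Both variants should report the same value, because in either case the child simply reads what it finds in its environment.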
Additionally, there’s another strange behaviour with inline environment variables [...]
Not really. This one is the common issue that @chepner was referencing in comments. There are numerous dupes, but as long as I'm answering the above, I'll address this, too. Here ...
LANG=C LC_NUMERIC=C printf '%s\t%f\t%f\n' "$LC_NUMERIC" 3.141592654 3,141592654
... the first field of the output reflects how $LC_NUMERIC was expanded by the shell in which that command was issued. The inline variable assignment does not affect that shell, not even temporarily. It affects only the environment of the executed command (printf in this case). But it does affect that command's environment, so it has its normal effect on printf's choice of formatting characters.
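The same effect can be reproduced without involving locales at all. The sketch below uses a made-up variable, DEMO_SETTING, purely for illustration; the other names are likewise mine.

```shell
#!/usr/bin/env bash
DEMO_SETTING=outer   # hypothetical variable, not a real locale setting

# "$DEMO_SETTING" is expanded by the current shell before printf runs,
# so the inline assignment comes too late to influence the argument.
expanded_by_current_shell=$(DEMO_SETTING=inner printf '%s' "$DEMO_SETTING")

# Single quotes defer the expansion to the child shell, whose environment
# does contain the inline assignment.
expanded_by_child=$(DEMO_SETTING=inner bash -c 'printf "%s" "$DEMO_SETTING"')

# The inline assignment does not persist in the current shell either.
DEMO_SETTING=inner true
after_inline_assignment=$DEMO_SETTING

printf 'current shell expansion: %s\n' "$expanded_by_current_shell"
printf 'child shell expansion:   %s\n' "$expanded_by_child"
printf 'value afterwards:        %s\n' "$after_inline_assignment"
```

If you need the command to print the value it actually runs with, let the process that receives the assignment do the expansion, as the bash -c variant does here.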
And note here that whereas LC_ALL that you tested before is an all-category override, the LANG variable that you're using here is an all-category default. Thus, if you specified a different LANG, you would still get numeric formatting according to the locale specified by LC_NUMERIC.
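The override/default relationship can be written down as a small lookup. The function below is a hypothetical helper of my own (resolve_numeric_locale is not a bash or libc facility); it only mirrors the precedence POSIX describes for locale variables, where empty values count as unset. The final fallback to C is an assumption, since POSIX leaves that default implementation-defined.

```shell
#!/usr/bin/env bash
# Hypothetical helper: given the values of LC_ALL, LC_NUMERIC and LANG,
# print the locale name that would govern numeric formatting.
resolve_numeric_locale() {
    local lc_all=$1 lc_numeric=$2 lang=$3
    if [ -n "$lc_all" ]; then
        printf '%s\n' "$lc_all"        # all-category override
    elif [ -n "$lc_numeric" ]; then
        printf '%s\n' "$lc_numeric"    # category-specific setting
    elif [ -n "$lang" ]; then
        printf '%s\n' "$lang"          # all-category default
    else
        printf 'C\n'                   # assumed fallback (implementation-defined)
    fi
}

resolve_numeric_locale "C" "fr_FR.UTF-8" "fr_FR.UTF-8"   # LC_ALL wins
resolve_numeric_locale ""  "C"           "fr_FR.UTF-8"   # LC_NUMERIC wins over LANG
resolve_numeric_locale ""  ""            "fr_FR.UTF-8"   # LANG is the default
```

The first call corresponds to the LC_ALL=C test from the question, the second to the LANG=C LC_NUMERIC=C command with a different LANG swapped in.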