I have an application log with a field that contains strange characters.
I only see these characters when I view the file with the less command.
I tried copying the result of my line of code into a text file, and what I see is:
CTP_OUT=^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
I'd like to know if there is a way to find these null characters. I tried a grep command, but it didn't show anything.
I can hardly believe it, but I might write an answer involving cat!
The characters you are observing are non-printable characters, which are often written in caret notation, a way to visualize characters that have no printable form. As mentioned in the OP, ^@ is the representation of NULL.
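To make the notation concrete: for control characters, the symbol after the caret is the character whose code is the control code plus 64, so NUL (0) becomes ^@, TAB (9) becomes ^I and ESC (27) becomes ^[. A minimal sketch, assuming GNU cat; the variable name caret is only for illustration:

```shell
# Emit NUL (octal 000), TAB (011) and ESC (033), then let cat render them.
# -v handles NUL and ESC; -T is needed because -v leaves TAB untouched.
caret=$(printf '\000\011\033\n' | cat -vT)
printf '%s\n' "$caret"   # prints ^@^I^[
```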
If your file has non-printable characters, you can visualize them using cat -vET:
-E, --show-ends: display $ at end of each line
-T, --show-tabs: display TAB characters as ^I
-v, --show-nonprinting: use ^ and M- notation, except for LFD and TAB
source: man cat
I've added the -E and -T flags so that line ends and tabs, which -v leaves alone, are made visible as well.
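The effect of each flag is easiest to see in isolation. The following is a small sketch, assuming GNU cat (BSD cat spells these options differently); the variable names are made up for the example:

```shell
# Each flag on its own, then all three together (GNU cat assumed).
ends=$(printf 'a\tb\n' | cat -E)        # only marks the line end with $
tabs=$(printf 'a\tb\n' | cat -T)        # only converts the tab to ^I
ctrl=$(printf 'a\001b\n' | cat -v)      # only converts the control byte to ^A
all=$(printf 'a\tb\001\n' | cat -vET)   # everything at once
printf '%s\n' "$tabs" "$ctrl" "$all"
```

The last three variables should come out as a^Ib, a^Ab and a^Ib^A$, while ends still contains a raw tab followed by $.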
grep prints matching lines unchanged and does nothing to make non-printable characters visible, so you have to pipe its output to cat to see them.
Show all lines with non-printable characters:
$ grep -E '[^[:print:]]' --color=never file | cat -vET
Here, the ERE [^[:print:]] selects all non-printable characters.
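As a quick check of what this bracket expression catches, here is a sketch on a throwaway file. The sample content and the variable names are assumptions made for the example; -n is added only to show which lines matched. Note that a tab also counts as non-printable here:

```shell
# Hypothetical sample: line 2 contains a tab, line 3 a carriage return.
sample=$(mktemp)
printf 'plain\nwith\ttab\nwith\rcr\n' > "$sample"

# Same pipeline as above, with -n to prefix each match with its line number.
matches=$(grep -n -E '[^[:print:]]' --color=never "$sample" | cat -vET)
printf '%s\n' "$matches"

rm -f "$sample"
```

Only lines 2 and 3 should be reported, rendered as 2:with^Itab$ and 3:with^Mcr$.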
Show all lines with NULL:
$ grep -Pa '\x00' --color=never file | cat -vET
Be aware that we need Perl regular expressions (the -P option) here, as they understand hexadecimal and octal notation:
Various control characters can be written in C language style: \n matches a newline, \t a tab, \r a carriage return, \f a form feed, etc.
More generally, \nnn, where nnn is a string of three octal digits, matches the character whose native code point is nnn. You can easily run into trouble if you don't have exactly three digits. So always use three, or since Perl 5.14, you can use \o{...} to specify any number of octal digits.
Similarly, \xnn, where nn are hexadecimal digits, matches the character whose native ordinal is nn. Again, not using exactly two digits is a recipe for disaster, but you can use \x{...} to specify any number of hex digits.
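For the NULL character, these notations should be interchangeable in grep -P. The sketch below assumes a GNU grep built with PCRE support; the sample file and the variable names are made up for the example. It counts matching lines with -c for three spellings of the same byte:

```shell
# Hypothetical sample: only the second line contains a NUL byte.
sample=$(mktemp)
printf 'clean\nnul\000here\n' > "$sample"

# -a treats the file as text, -c prints the number of matching lines.
hex_count=$(grep -Pac '\x00' "$sample")     # two hex digits
oct_count=$(grep -Pac '\000' "$sample")     # three octal digits
brace_count=$(grep -Pac '\x{0}' "$sample")  # braced hex form
printf '%s %s %s\n' "$hex_count" "$oct_count" "$brace_count"

rm -f "$sample"
```

All three counts should be 1, since each pattern describes the same character.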
An example:
$ printf 'foo\012\011\011bar\014\010\012foobar\012\011\000\013\000car\012\011\011\011\012' > test.txt
$ cat test.txt
foo
bar
foobar
car
If we now use grep alone, we get the following:
$ grep -Pa '\x00' --color=never test.txt
car
But piping it to cat allows us to visualize the control characters:
$ grep -Pa '\x00' --color=never test.txt | cat -vET
^I^@^K^@car$
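Applied to the log from the question, the same pattern can be anchored on the field name. This is only a sketch: the log content below is invented, and it assumes that the CTP_OUT value is padded with NULL characters as in the question:

```shell
# Hypothetical log: only the second record has a NUL-padded CTP_OUT field.
log=$(mktemp)
printf 'id=1 CTP_OUT=OK\nid=2 CTP_OUT=\000\000\000\nid=3 CTP_OUT=DONE\n' > "$log"

# \x00+ matches one or more NUL bytes right after the field name;
# -n prefixes each hit with its line number.
hits=$(grep -Pan 'CTP_OUT=\x00+' --color=never "$log" | cat -vET)
printf '%s\n' "$hits"

rm -f "$log"
```

Only the second record should be reported, shown as 2:id=2 CTP_OUT=^@^@^@$.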
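Why --color=never: If your grep is set up with --color=always (for example through an alias), it adds escape sequences that the terminal interprets as colors. In the cat -vET output these show up as extra control characters, which makes the content confusing. With --color=auto, grep only colors output that goes to a terminal, so a pipe is not affected, but passing --color=never explicitly is safe in either case. With --color=always, the output looks like this: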
$ grep -Pa '\x00' --color=always test.txt | cat -vET
^I^[[01;31m^[[K^@^[[m^[[K^K^[[01;31m^[[K^@^[[m^[[Kcar$