sed hexdump

How to fix hyphen for sed using hd -c


I have a file (c.txt) with the following content. This is only one line of a large file.

7,VE,Bank–Charges

I am trying to run:

sed -i 's/-/ - /g' c.txt

to get the desired output of:

7,VE,Bank – Charges

which does not work. Please note that there are many other hyphen combinations, such as Vehicle–Maintenance etc.

I have run the following command to understand the problem:

cat c.txt | head -n2 | tail -n1 | hd -c

00000000  37 2c 56 45 2c 42 61 6e  6b e2 80 93 43 68 61 72  |7,VE,Bank...Char|
0000000   7   ,   V   E   ,   B   a   n   k 342 200 223   C   h   a   r
00000010  67 65 73 0a                                       |ges.|
0000010   g   e   s  \n                                                
0000014

From this it is clear that the hyphen is not a plain ASCII hyphen (hex 2d) but a sequence of 3 bytes (octal 342 200 223, hex e2 80 93), which is the UTF-8 encoding of an en dash (U+2013). So my question is: how can I write a sed command that fixes all such hyphen instances in the file, of which there are many? Is sed even usable here, or are there more suitable tools?
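Since the file is large, it may help to first list every line that contains non-ASCII bytes before deciding what to replace. The following is only a sketch: the scratch file and its three sample lines are made up for illustration, and it assumes a grep that supports -a (GNU and BSD grep do).

```shell
# Build a small stand-in for c.txt (hypothetical sample data, not the real file).
tmp=$(mktemp)
printf 'id,code,name\n7,VE,Bank\342\200\223Charges\n8,VE,Plain-Text\n' > "$tmp"

# In the C locale every byte is its own character, and bytes >= 0x80 are
# neither printable nor whitespace, so this prints the lines (with line
# numbers) that contain anything outside plain ASCII.
matches=$(LC_ALL=C grep -an '[^[:print:][:space:]]' "$tmp")
printf '%s\n' "$matches"

rm -f "$tmp"
```

Piping any reported line through hd -c, as above, then shows the exact byte sequence that needs to be matched.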


Solution

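  • I was able to figure it out thanks to Cyrus: match the three bytes of the dash with hexadecimal escapes (the \xHH syntax is a GNU sed extension).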

    sed -i 's/\xe2\x80\x93/ - /g' c.txt
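For anyone who wants to try this without touching their real data, here is a minimal sketch that runs the same substitution on a scratch copy of the sample line. It assumes GNU sed (for both -i and \xHH); the temporary file and the LC_ALL=C prefix are my additions, the latter to force byte-wise matching regardless of the current locale.

```shell
# Recreate the sample line in a scratch file (path from mktemp, not c.txt).
tmp=$(mktemp)
printf '7,VE,Bank\342\200\223Charges\n' > "$tmp"

# Replace the UTF-8 en dash (bytes e2 80 93) with a spaced ASCII hyphen.
LC_ALL=C sed -i 's/\xe2\x80\x93/ - /g' "$tmp"

result=$(cat "$tmp")
printf '%s\n' "$result"   # prints: 7,VE,Bank - Charges

rm -f "$tmp"
```

Note that the replacement here is an ASCII hyphen with spaces; to keep the en dash and only add the spaces, the right-hand side would need to contain the same three bytes.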