bash curl carriage-return tr

Handle the carriage returns of curl output when piped


I'm trying to alter the output of the curl command. For the sake of this question, say I want to shift that output 4 spaces to the right. Nothing else about the content of the output should change.

Here is an example of the curl output I get with no processing. The last line is being updated every second or so with up-to-date figures:

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
50  426M  50  213M    0     0  74.5M      0  0:00:07  0:00:07 --:--:-- 76.9M

Here is the execution log when piped to cat -vet to show invisible characters, as suggested by GordonDavisson in the comments (note that the last line is longer than the grey frame it is displayed in):

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current$
                                 Dload  Upload   Total   Spent    Left  Speed$
^M  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0^M  1  576M    1  9.9M    0     0  27.2M      0  0:00:21 --:--:--  0:00:21 27.3M^M 15  576M   15 89.0M    0     0  65.1M      0  0:00:08  0:00:01  0:00:07 65.2M^M 29  576M   29  169M    0     0  71.7M      0  0:00:08  0:00:02  0:00:06 71.7M^M 42  576M   42  247M    0     0  73.5M      0  0:00:07  0:00:03  0:00:04 73.5M^M 57  576M   57  328M    0     0  75.2M      0  0:00:07  0:00:04  0:00:03 75.2M^M 70  576M   70  405M    0     0  75.5M      0  0:00:07  0:00:05  0:00:02 79.0M^M 84  576M   84  485M    0     0  76.2M      0  0:00:07  0:00:06  0:00:01 79.2M^M 97  576M   97  564M    0     0  76.5M      0  0:00:07  0:00:07 --:--:-- 78.8M^M100  576M  100  576M    0     0  76.6M      0  0:00:07  0:00:07 --:--:-- 79.1M$

A basic attempt would look like this:

curl -O <url> 2>&1 | while read line; do echo "    $line" ; done

This works well for basic scripts, but not with curl, evidently because the last line gets updated continuously (I guess curl uses the \r character to move the cursor back to the beginning of the line and overwrite the previous content). With this code, the last line never shows, except once at the end of the download with the 100% figures.

I then tried something like this just to test it, but to my great surprise, nothing was printed anymore:

curl -O <url> 2>&1 | tr '\r' '\n' | while read line; do echo "    $line" ; done

In any case, this would not have done the job, since each carriage return would have been turned into a newline, printing a new line each time curl updates its output during the download. I do not want a new line each time curl updates the numbers.

I cannot figure out what is going on here, nor how to implement this functionality.


Solution

  • Input: The first 2 lines of the input now shown in the question are LF-terminated, while the rest of the input is CR-separated, with the final line ending in an LF. With the CRs and LFs showing:

    $ cat -A file
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current$
                                     Dload  Upload   Total   Spent    Left  Speed$
    ^M  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0^M  1  576M    1  9.9M    0     0  27.2M      0  0:00:21 --:--:--  0:00:21 27.3M^M 15  576M   15 89.0M    0     0  65.1M      0  0:00:08  0:00:01  0:00:07 65.2M^M 29  576M   29  169M    0     0  71.7M      0  0:00:08  0:00:02  0:00:06 71.7M^M 42  576M   42  247M    0     0  73.5M      0  0:00:07  0:00:03  0:00:04 73.5M^M 57  576M   57  328M    0     0  75.2M      0  0:00:07  0:00:04  0:00:03 75.2M^M 70  576M   70  405M    0     0  75.5M      0  0:00:07  0:00:05  0:00:02 79.0M^M 84  576M   84  485M    0     0  76.2M      0  0:00:07  0:00:06  0:00:01 79.2M^M 97  576M   97  564M    0     0  76.5M      0  0:00:07  0:00:07 --:--:-- 78.8M^M100  576M  100  576M    0     0  76.6M      0  0:00:07  0:00:07 --:--:-- 79.1M$
    

    and as it normally appears to the user:

    $ cat file
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
    100  576M  100  576M    0     0  76.6M      0  0:00:07  0:00:07 --:--:-- 79.1M
    

    Output using GNU awk: To keep that layout but indent it all by 4 blanks using GNU awk for multi-char RS and RT:

    $ cat file | awk -v RS='[\r\n]' '{printf "    %s%s", $0, RT}'
          % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                         Dload  Upload   Total   Spent    Left  Speed
        100  576M  100  576M    0     0  76.6M      0  0:00:07  0:00:07 --:--:-- 79.1M
    

    Output using any awk: If you don't have and can't install GNU awk, here are a couple of alternatives that work with any awk, where RS is a single character:

    $ cat file | awk '{printf "    %s%s", $0, RS} NR==2{RS="\r"}'
          % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                         Dload  Upload   Total   Spent    Left  Speed
        100  576M  100  576M    0     0  76.6M      0  0:00:07  0:00:07 --:--:-- 79.1M
    

    $ cat file | awk -v RS='\r' '{gsub(/\n/,"&    "); printf "    %s%s", $0, RS}'
          % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                         Dload  Upload   Total   Spent    Left  Speed
        100  576M  100  576M    0     0  76.6M      0  0:00:07  0:00:07 --:--:-- 79.1M
    

    Obviously, replace cat file with your curl command (with 2>&1, as in the question, since curl writes its progress meter to stderr).
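As a minimal sketch of that substitution, the second "any awk" command above can be wrapped in a shell function and fed by curl. The function name indent_progress and the fflush() call are additions for illustration, not part of the original commands; fflush() is there on the assumption that some awk implementations may otherwise hold CR-terminated updates in an output buffer. The printf line only simulates curl's progress meter, so the sketch can be tried without a download.

```shell
# Hypothetical helper: indent every LF- or CR-terminated chunk by 4 blanks.
indent_progress() {
    awk -v RS='\r' '{ gsub(/\n/, "&    "); printf "    %s%s", $0, RS; fflush() }'
}

# Simulated input: two LF-terminated header lines, then CR-separated updates.
printf '  %% Total\n  Dload\n\r  0  576M\r 57  576M\r100  576M\n' | indent_progress

# Real use (curl writes its progress meter to stderr, hence 2>&1):
#   curl -O <url> 2>&1 | indent_progress
```

If the updates still appear only at the end, input buffering in the awk in use may be the cause; that depends on the implementation and is not covered by this sketch.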

    See "Why does my tool output overwrite itself and how do I fix it?" for information on handling CRs in general, and "Why is using a shell loop to process text considered bad practice?" for issues with using a while-read loop.
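To illustrate the while-read issue from the question: read returns only when it sees a newline, so CR-separated updates arrive as one long line at the very end. The sketch below (count_reads is a hypothetical helper name) counts how often read returns for a simulated progress stream, with and without the tr conversion. It does not reproduce the timing; with a live curl, tr may also hold its output in a buffer when writing to a pipe, which would be consistent with nothing being printed for a while.

```shell
# Hypothetical helper: report how many times `read` returns for stdin.
count_reads() {
    n=0
    while IFS= read -r _line; do
        n=$((n + 1))
    done
    printf '%s\n' "$n"
}

# Three CR-separated updates ending in one LF: read sees a single line.
printf ' 15  576M\r 57  576M\r100  576M\n' | count_reads    # prints 1

# After converting CRs to LFs, read returns once per update.
printf ' 15  576M\r 57  576M\r100  576M\n' | tr '\r' '\n' | count_reads    # prints 3
```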


    Original answer before the OP added sample input/output to the question and I thought they wanted to replace CRs with LFs:

    It's hard to say exactly what's going on without seeing your curl output but try this (if the curl output has no LFs, just CRs):

    curl -O <url> 2>&1 | awk -v RS='\r' '{print "    " $0}'
    

    or this (if the curl output has CRLF or LFCR for newlines):

    curl -O <url> 2>&1 | awk '{sub(/\r$/,""); print "    " $0}'
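The CRLF variant can be checked without a download by feeding it simulated CRLF-terminated lines. This is only a sketch: the function name indent_crlf and the sample text are made up for illustration, and it covers the CRLF case only.

```shell
# Hypothetical helper: strip a trailing CR from each line, then indent by 4 blanks.
indent_crlf() {
    awk '{ sub(/\r$/, ""); print "    " $0 }'
}

# Two CRLF-terminated lines; the output lines end in LF only.
printf 'first\r\nsecond\r\n' | indent_crlf
```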