bash

How to change old date format in a file to a new format


I have a huge file, and it has around 200 lines like this:

started at Wed Jun  5 08:45:01 PM +0330 2024 -- ended at Wed Jun  5 10:35:34 PM +0330 2024.
started at Thu Jun  6 01:30:01 AM +0330 2024 -- ended at Thu Jun  6 03:17:18 AM +0330 2024.
started at Thu Jun  6 07:30:01 AM +0330 2024 -- ended at Thu Jun  6 09:19:19 AM +0330 2024.
started at Thu Jun  6 01:30:01 PM +0330 2024 -- ended at Thu Jun  6 03:19:16 PM +0330 2024.

I want to convert these lines from the old format, which is the output of $(date), to the new one, $(date +'%Y-%m-%d %H:%M:%S').

How can I do that? Is that even possible?

Expected output is:

started at 2024-06-05 20:45:01 -- ended at 2024-06-05 22:35:34.
started at 2024-06-06 01:30:01 -- ended at 2024-06-06 03:17:18.
started at 2024-06-06 07:30:01 -- ended at 2024-06-06 09:19:19.
started at 2024-06-06 13:30:01 -- ended at 2024-06-06 15:19:16.

Solution

  • As the whole log file uses the +0330 offset, I use TZ=Asia/Tehran, which seems to match your time zone. It would be better to use your own locale settings.

    If each line of your log file contains exactly two dates to convert, you could try something like:

    sed < datedlogs.txt 's/^started at \(.*\) +0330 \(.*\) -- ended at \(.*\) +0330 \(.*\)\./TZ="Asia\/Tehran" \1 \2\nTZ="Asia\/Tehran" \3 \4/' |
        TZ="Asia/Tehran" date -f - +'%F %T' |
        paste -d + - - |
        sed 's/^\(.*\)+\(.*\)$/started at \1 -- ended at \2/'
    

    Based on your sample, this produces the following (note that the trailing period is dropped):

    started at 2024-06-05 20:45:01 -- ended at 2024-06-05 22:35:34
    started at 2024-06-06 01:30:01 -- ended at 2024-06-06 03:17:18
    started at 2024-06-06 07:30:01 -- ended at 2024-06-06 09:19:19
    started at 2024-06-06 13:30:01 -- ended at 2024-06-06 15:19:16
    started at 2024-06-06 19:30:01 -- ended at 2024-06-06 21:16:15
    started at 2024-06-07 01:30:01 -- ended at 2024-06-07 03:17:47
    started at 2024-06-07 07:30:01 -- ended at 2024-06-07 09:03:05
    started at 2024-06-07 13:30:01 -- ended at 2024-06-07 15:19:55
    started at 2024-06-07 19:30:01 -- ended at 2024-06-07 21:17:41
    started at 2024-06-08 01:30:01 -- ended at 2024-06-08 03:18:12
    started at 2024-06-08 07:30:01 -- ended at 2024-06-08 09:20:31
    started at 2024-06-08 13:30:01 -- ended at 2024-06-08 15:19:16
    started at 2024-06-08 19:30:01 -- ended at 2024-06-08 21:20:01
    started at 2024-06-09 01:30:01 -- ended at 2024-06-09 03:15:19
    started at 2024-06-09 07:30:01 -- ended at 2024-06-09 09:19:07
    started at 2024-06-09 13:30:01 -- ended at 2024-06-09 15:16:44
    started at 2024-06-09 19:30:01 -- ended at 2024-06-09 21:15:16
    started at 2024-06-10 01:30:01 -- ended at 2024-06-10 03:17:37
    started at 2024-06-10 07:30:01 -- ended at 2024-06-10 09:16:38
    started at 2024-06-10 13:30:01 -- ended at 2024-06-10 15:17:45
    

    This is fast, because the date command runs only once for the whole file instead of once per line.
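
    To make the pipeline easier to follow, here is a small sketch that runs only the text-handling stages on one sample line and leaves out the date call itself. The variable names line, stage1 and rebuilt are only for illustration, and it assumes GNU sed, since \n in the replacement is a GNU extension.

    ```shell
    # One sample line from the question.
    line='started at Wed Jun  5 08:45:01 PM +0330 2024 -- ended at Wed Jun  5 10:35:34 PM +0330 2024.'

    # First sed: drop the +0330 field and emit one date string per line,
    # each prefixed with a TZ="..." rule for GNU date to read via -f.
    stage1=$(printf '%s\n' "$line" |
        sed 's/^started at \(.*\) +0330 \(.*\) -- ended at \(.*\) +0330 \(.*\)\./TZ="Asia\/Tehran" \1 \2\nTZ="Asia\/Tehran" \3 \4/')
    printf '%s\n' "$stage1"

    # paste and last sed: given one converted timestamp per line (as date
    # would print them), join each pair with "+" and rebuild the sentence.
    rebuilt=$(printf '%s\n' '2024-06-05 20:45:01' '2024-06-05 22:35:34' |
        paste -d + - - |
        sed 's/^\(.*\)+\(.*\)$/started at \1 -- ended at \2/')
    printf '%s\n' "$rebuilt"
    ```

    The first printf shows the two lines that date -f - receives for each log line; the second shows how a pair of converted timestamps is folded back into one line.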

    Or, better, with the offset taken from each line instead of being hard-coded:

    sed < datedlogs.txt 's/^started at \(.*\) \([+-][0-2][0-9][0-5][0-9]\) \(.*\) -- ended at \(.*\) \([+-][0-2][0-9][0-5][0-9]\) \(.*\)\./TZ="\2" \1 \3\nTZ="\5" \4 \6/' |
        TZ="Asia/Tehran" date -f - +'%F %T' |
        paste -d + - - |
        sed 's/^\(.*\)+\(.*\)$/started at \1 -- ended at \2/'
    

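    Here the original time zone offsets are extracted from the input. Check the result on a few lines first: a bare offset such as +0330 is not a standard POSIX TZ value (POSIX expects a zone name before the offset and counts positive offsets as west of UTC), so date may not interpret it as intended.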
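
    As an alternative, here is a sketch that does not call date at all. It assumes that every timestamp is already in the zone you want in the output (as in the sample, where all lines carry +0330), so it only rearranges fields and converts 12-hour to 24-hour time. The names reformat_dates and iso are made up for this example, and lines that do not match the expected layout are passed through unchanged.

    ```shell
    reformat_dates() {
        awk '
        # Build "YYYY-MM-DD HH:MM:SS" from month name, day, hh:mm:ss, AM/PM, year.
        function iso(mon, day, hms, ampm, year,    t, h) {
            split(hms, t, ":")
            h = t[1] % 12                      # 12 AM becomes 0, 12 PM becomes 0 then 12
            if (ampm == "PM") h += 12
            return sprintf("%04d-%02d-%02d %02d:%s:%s", year,
                (index("JanFebMarAprMayJunJulAugSepOctNovDec", mon) + 2) / 3,
                day, h, t[2], t[3])
        }
        # Fields: 3-9 hold the start date, 13-19 the end date; the offset is ignored.
        $1 == "started" && NF == 19 {
            print "started at " iso($4, $5, $6, $7, $9) \
                  " -- ended at " iso($14, $15, $16, $17, $19) "."
            next
        }
        { print }
        '
    }

    # Sample run on one line from the question.
    printf '%s\n' 'started at Wed Jun  5 08:45:01 PM +0330 2024 -- ended at Wed Jun  5 10:35:34 PM +0330 2024.' |
        reformat_dates
    # prints: started at 2024-06-05 20:45:01 -- ended at 2024-06-05 22:35:34.
    ```

    For the real file this would be run as reformat_dates < datedlogs.txt. Unlike the date-based pipelines, it keeps the trailing period, but it would give wrong results if lines used different offsets, because no time zone conversion takes place.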