The goal is to extract the numeric part of cost:xxxms and sort the entire log lines by that number.
Example of the log:
/var/log/hadoop/hdfs/hadoop-hdfs-datanode-datanode01.star.com.log.7:2025-04-24 11:56:57,334 WARN datanode.DataNode (BlockReceiver.java:receivePacket(701)) - Slow BlockReceiver write data to disk cost:377ms (threshold=300ms), volume=/data/sdg/hadoop/hdfs/data
/var/log/hadoop/hdfs/hadoop-hdfs-datanode-datanode01.star.com.log.7:2025-04-24 11:56:57,334 WARN datanode.DataNode (BlockReceiver.java:receivePacket(701)) - Slow BlockReceiver write data to disk cost:507ms (threshold=300ms), volume=/data/sdg/hadoop/hdfs/data
/var/log/hadoop/hdfs/hadoop-hdfs-datanode-datanode01.star.com.log.7:2025-04-24 11:56:57,334 WARN datanode.DataNode (BlockReceiver.java:receivePacket(701)) - Slow BlockReceiver write data to disk cost:337ms (threshold=300ms), volume=/data/sdg/hadoop/hdfs/data
/var/log/hadoop/hdfs/hadoop-hdfs-datanode-datanode01.star.com.log.7:2025-04-24 11:56:57,334 WARN datanode.DataNode (BlockReceiver.java:receivePacket(701)) - Slow BlockReceiver write data to disk cost:407ms (threshold=300ms), volume=/data/sdg/hadoop/hdfs/data
Example of the expected output:
/var/log/hadoop/hdfs/hadoop-hdfs-datanode-datanode01.star.com.log.7:2025-04-24 11:56:57,334 WARN datanode.DataNode (BlockReceiver.java:receivePacket(701)) - Slow BlockReceiver write data to disk cost:337ms (threshold=300ms), volume=/data/sdg/hadoop/hdfs/data
/var/log/hadoop/hdfs/hadoop-hdfs-datanode-datanode01.star.com.log.7:2025-04-24 11:56:57,334 WARN datanode.DataNode (BlockReceiver.java:receivePacket(701)) - Slow BlockReceiver write data to disk cost:377ms (threshold=300ms), volume=/data/sdg/hadoop/hdfs/data
/var/log/hadoop/hdfs/hadoop-hdfs-datanode-datanode01.star.com.log.7:2025-04-24 11:56:57,334 WARN datanode.DataNode (BlockReceiver.java:receivePacket(701)) - Slow BlockReceiver write data to disk cost:407ms (threshold=300ms), volume=/data/sdg/hadoop/hdfs/data
/var/log/hadoop/hdfs/hadoop-hdfs-datanode-datanode01.star.com.log.7:2025-04-24 11:56:57,334 WARN datanode.DataNode (BlockReceiver.java:receivePacket(701)) - Slow BlockReceiver write data to disk cost:507ms (threshold=300ms), volume=/data/sdg/hadoop/hdfs/data
We have tried the following commands, but without success:
grep 'cost:' log_file.txt | awk -F'cost:' '{print $0, $2}' | awk -F'ms' '{print $0, $1}' | sort -t' ' -k2,2nr
grep 'cost:' log_file.txt | awk -F'cost:' '{gsub("ms", "", $2); print $0, $2}' | sort -t' ' -k2,2nr
grep 'cost:' log_file.txt | awk -F'cost:' '{gsub("ms", "", $2); print $0, $2}' | sort -t' ' -k2,2nr | cut -d' ' -f1-
Using any awk, sort, and cut to implement a Decorate/Sort/Undecorate approach, in case there can be varying numbers of ':'s before 'cost:' in your input:
$ awk -v OFS='\t' 'match($0,/ cost:[0-9]+ms /) {print substr($0,RSTART+6)+0, $0}' file |
sort -k1,1n | cut -f2-
/var/log/hadoop/hdfs/hadoop-hdfs-datanode-datanode01.star.com.log.7:2025-04-24 11:56:57,334 WARN datanode.DataNode (BlockReceiver.java:receivePacket(701)) - Slow BlockReceiver write data to disk cost:95ms (threshold=300ms), volume=/data/sdg/hadoop/hdfs/data
/var/log/hadoop/hdfs/hadoop-hdfs-datanode-datanode01.star.com.log.7:2025-04-24 11:56:57,334 WARN datanode.DataNode (BlockReceiver.java:receivePacket(701)) - Slow BlockReceiver write data to disk cost:337ms (threshold=300ms), volume=/data/sdg/hadoop/hdfs/data
/var/log/hadoop/hdfs/hadoop-hdfs-datanode-datanode01.star.com.log.7:2025-04-24 11:56:57,334 WARN datanode.DataNode (BlockReceiver.java:receivePacket(701)) - Slow BlockReceiver write data to disk cost:377ms (threshold=300ms), volume=/data/sdg/hadoop/hdfs/data
/var/log/hadoop/hdfs/hadoop-hdfs-datanode-datanode01.star.com.log.7:2025-04-24 11:56:57,334 WARN datanode.DataNode (BlockReceiver.java:receivePacket(701)) - Slow BlockReceiver write data to disk cost:407ms (threshold=300ms), volume=/data/sdg/hadoop/hdfs/data
/var/log/hadoop/hdfs/hadoop-hdfs-datanode-datanode01.star.com.log.7:2025-04-24 11:56:57,334 WARN datanode.DataNode (BlockReceiver.java:receivePacket(701)) - Slow BlockReceiver write data to disk cost:507ms (threshold=300ms), volume=/data/sdg/hadoop/hdfs/data
The above was run on this sample input, with an additional cost:95ms line added to the end of the OP's posted sample input, as that's necessary to test numeric rather than alphabetic sorting of the cost numbers:
$ cat file
/var/log/hadoop/hdfs/hadoop-hdfs-datanode-datanode01.star.com.log.7:2025-04-24 11:56:57,334 WARN datanode.DataNode (BlockReceiver.java:receivePacket(701)) - Slow BlockReceiver write data to disk cost:377ms (threshold=300ms), volume=/data/sdg/hadoop/hdfs/data
/var/log/hadoop/hdfs/hadoop-hdfs-datanode-datanode01.star.com.log.7:2025-04-24 11:56:57,334 WARN datanode.DataNode (BlockReceiver.java:receivePacket(701)) - Slow BlockReceiver write data to disk cost:507ms (threshold=300ms), volume=/data/sdg/hadoop/hdfs/data
/var/log/hadoop/hdfs/hadoop-hdfs-datanode-datanode01.star.com.log.7:2025-04-24 11:56:57,334 WARN datanode.DataNode (BlockReceiver.java:receivePacket(701)) - Slow BlockReceiver write data to disk cost:337ms (threshold=300ms), volume=/data/sdg/hadoop/hdfs/data
/var/log/hadoop/hdfs/hadoop-hdfs-datanode-datanode01.star.com.log.7:2025-04-24 11:56:57,334 WARN datanode.DataNode (BlockReceiver.java:receivePacket(701)) - Slow BlockReceiver write data to disk cost:407ms (threshold=300ms), volume=/data/sdg/hadoop/hdfs/data
/var/log/hadoop/hdfs/hadoop-hdfs-datanode-datanode01.star.com.log.7:2025-04-24 11:56:57,334 WARN datanode.DataNode (BlockReceiver.java:receivePacket(701)) - Slow BlockReceiver write data to disk cost:95ms (threshold=300ms), volume=/data/sdg/hadoop/hdfs/data
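To see what each stage of the Decorate/Sort/Undecorate pipeline contributes, it may help to run the stages one at a time. The following is only a sketch: the three input lines are shortened, made-up stand-ins for the real log lines, and the variable names (tmp, decorated, sorted, result) are just for illustration.

```shell
# Hypothetical, shortened sample lines (not the real log), written to a temp file.
tmp=$(mktemp)
printf '%s\n' \
  'a.log:2025-04-24 11:56:57,334 WARN write cost:377ms (threshold=300ms)' \
  'a.log:2025-04-24 11:56:57,334 WARN write cost:95ms (threshold=300ms)' \
  'a.log:2025-04-24 11:56:57,334 WARN write cost:507ms (threshold=300ms)' > "$tmp"

# Decorate: prefix each matching line with its numeric cost and a tab.
# " cost:" is 6 characters long, so RSTART+6 is the first digit; adding 0
# makes awk keep only the leading number of that substring.
decorated=$(awk -v OFS='\t' 'match($0,/ cost:[0-9]+ms /) {print substr($0,RSTART+6)+0, $0}' "$tmp")

# Sort: numeric sort on the decoration (the first field) only.
sorted=$(printf '%s\n' "$decorated" | sort -k1,1n)

# Undecorate: drop the first tab-separated field, leaving the original lines.
result=$(printf '%s\n' "$sorted" | cut -f2-)

printf '%s\n' "$result"
rm -f "$tmp"
```

Printing $decorated instead of $result shows the temporary numeric key in front of each line. For largest-first order, sort -k1,1nr can be used in the sort stage instead.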