I have a CSV file that I create from an API response. I get it unsorted and the API response does not have a query parameter to sort the returned payload. The CSV looks like below, where column A has some device identifiers and column B has the configuration last sync of these devices. The time is represented in relative time as seen in the screenshot.
How can I naturally sort the CSV file to have the rows ordered in the following sequence seconds
, minutes
, hours
, days
, weeks
, months
, years
, and Never
?
I have tried to sort the entries using natsort
but I didn't get what I am trying to achieve:
import csv
import os
from natsort import natsort, natsort_keygen
natsort_key = natsort_keygen(key=lambda k: k["last_sync"])
fieldnames = ["device", "last_sync"]
with open("Devices_161846.csv", "rt") as csvfile:
csvreader = csv.DictReader(f=csvfile, fieldnames=fieldnames)
next(csvreader)
identities = list(csvreader)
sorted_identities = natsort.natsorted(identities, key=natsort_key)
new_file, ext = os.path.splitext(csvfile.name)
with open(f"{new_file}_Sorted.{ext}", "wt") as f:
csvwriter = csv.DictWriter(f=f, fieldnames=fieldnames)
csvwriter.writeheader()
csvwriter.writerows(sorted_identities)
I expect to get the same result, but sorted from newest to oldest.
I would use parse
from dateparser as a sorting key, this way :
# pip install dateparser
import csv, datetime, dateparser
def parse(st):
if st == "Never": return datetime.datetime.min
else:
if (dt:=dateparser.parse(st.replace("a few", "0"))):
return dt
else:
return datetime.datetime.now()
header = ["device", "last_sync"]
with (
open("/Devices_161846.csv", "r") as inpf,
open("Devices_161846_Sorted.csv", "w", newline="") as outf
):
reader = csv.DictReader(inpf, fieldnames=header)
writer = csv.DictWriter(outf, fieldnames=header)
writer.writeheader()
writer.writerows(
sorted(reader, key=lambda x: parse(x["last_sync"]), reverse=True)
)
Output (Devices_161846_Sorted.csv) :
device,last_sync
j,a few seconds ago
f,29 minutes ago
g,38 minutes ago
c,an hour ago
a,an hour ago
d,13 hours ago
h,a day ago
e,4 days ago
b,17 days ago
i,Never
Input used (Devices_161846.csv) :
a,an hour ago
b,17 days ago
c,an hour ago
d,13 hours ago
e,4 days ago
f,29 minutes ago
g,38 minutes ago
h,a day ago
i,Never
j,a few seconds ago