I have a CSV file with a column of IP addresses, MAC addresses and some other data. I want to sort all of the data by the IP addresses in ascending order
Input:
| IP Address | MAC Address | ID |
| --- | --- | --- |
| 10.10.10.1 | {mac1} | {id1} |
| 10.10.10.112 | {mac2} | {id2} |
| 10.10.10.17 | {mac3} | {id3} |
| 10.10.10.7 | {mac4} | {id4} |
What I expect the output to be:
| IP Address | MAC Address | ID |
| --- | --- | --- |
| 10.10.10.1 | {mac1} | {id1} |
| 10.10.10.7 | {mac4} | {id4} |
| 10.10.10.17 | {mac3} | {id3} |
| 10.10.10.112 | {mac2} | {id2} |
Where the IP addresses are in proper ascending order with their corresponding MAC and ID
I have tried using the pandas Dataframe.sort_values()
which without a key puts the IP addresses in incorrect ascending order because it looks at the IPs as integers and not as IP addresses. I have also tried to use sort_values()
with the parameter key=ipaddress.IPv4Address
but I just get a KeyError
df_new = df.sort_values(df['IP_Address'], key=ipaddress.IPv4Address)
I also tried sorted()
which puts the IP addresses in the correct order and into a list and then when I put that back into the data frame it doesn't sort the corresponding MAC and ID for each IP
This this approach using Series.str.split
and df.apply()
new_df = (df.sort_values(by=["IP Address"], key=lambda x: x.str.split(".")
.apply(lambda y: [int(z) for z in y])))
print(new_df)
IP Address MAC Address ID
0 10.10.10.1 {mac1} {id1}
3 10.10.10.7 {mac4} {id4}
2 10.10.10.17 {mac3} {id3}
1 10.10.10.112 {mac2} {id2}