I have a CSV file sitting on an FTP server. I can download the whole file using ftplib in Python, but that is an unnecessary compute and bandwidth burden for me.
My main concern is to read only the last line of the CSV file. Any help would be really appreciated.
Thanks!!!
You cannot read just the last line with FTP. There's no API for that. But you can read the last several bytes of the file, enough to be sure you have at least one complete (last) line.
Use FTP.size to get the size of the file, and estimate from it where the last line begins. Then, when downloading, pass that offset as the rest argument of FTP.retrbinary to start the transfer from there:
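from io import BytesIO

# ftp is assumed to be a connected, logged-in ftplib.FTP instance
filename = "/remote/path/file.csv"
size = ftp.size(filename)
# Start far enough before the end to cover at least one complete line
last_line_estimate = max(0, size - 1024)
flo = BytesIO()
# rest= asks the server to start the transfer at that byte offset
ftp.retrbinary("RETR " + filename, flo.write, rest=last_line_estimate)
flo.seek(0)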
Now flo contains the last 1024 bytes of the file (or the whole file, if it is shorter than that).
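The fixed 1024-byte guess can be too small when the last line is long. A minimal sketch of one way around that, assuming the server supports both SIZE and REST: keep doubling the window until the buffer contains a newline before the final line. The function name fetch_tail and the chunk parameter are made up for this illustration. The TYPE I command is there because some servers refuse SIZE in ASCII mode.

```python
from io import BytesIO


def fetch_tail(ftp, filename, chunk=1024):
    """Return trailing bytes of a remote file that hold its complete last line.

    `ftp` is assumed to be a connected, logged-in ftplib.FTP instance.
    `fetch_tail` and `chunk` are hypothetical names for this sketch.
    """
    ftp.voidcmd("TYPE I")  # switch to binary mode before asking for SIZE
    size = ftp.size(filename)
    while True:
        start = max(0, size - chunk)
        buf = BytesIO()
        # rest= makes the server begin the transfer at byte offset `start`
        ftp.retrbinary("RETR " + filename, buf.write, rest=start)
        data = buf.getvalue()
        # A newline before the final line means that line is complete;
        # start == 0 means we already have the whole file.
        if start == 0 or b"\n" in data.rstrip(b"\r\n"):
            return data
        chunk *= 2  # last line is longer than the window, so widen it
```

Each retry is a separate transfer, so a sensible initial chunk (a few times the typical line length) keeps this to a single round trip in the common case.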
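You can probably safely pass flo to any CSV parsing library, such as Pandas. I do not think it would mind too much about the truncated "first" line of the buffer, particularly if you do not try to access it after parsing. Note that Pandas treats that first line as the header by default, so the column names will be meaningless.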
import pandas as pd

df = pd.read_csv(flo)
last_line = df.tail(1)
If your particular parsing library does mind, you will have to locate the beginning of the last line and trim the preceding bytes.
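Trimming the preceding bytes can be sketched with just the standard library. This assumes the buffer holds the complete last line, that no quoted field contains an embedded newline, and that the file is UTF-8 encoded. The function name last_csv_row is made up for this illustration.

```python
import csv
from io import StringIO


def last_csv_row(tail, encoding="utf-8"):
    """Parse the final non-empty line of `tail` (bytes) as one CSV record.

    `last_csv_row` is a hypothetical helper; `tail` is the downloaded
    end of the file, for example flo.getvalue().
    """
    # Drop the trailing line terminator(s), then keep what follows
    # the last remaining newline: that is the last line.
    last = tail.rstrip(b"\r\n").split(b"\n")[-1]
    # Decode only the last line, so a multi-byte character cut in half
    # at the start of the buffer cannot cause a decoding error.
    text = last.decode(encoding)
    # csv.reader handles quoted fields and embedded delimiters.
    return next(csv.reader(StringIO(text)), [])
```

With the buffer from above, the call would be last_csv_row(flo.getvalue()), which returns the fields of the last line as a list of strings.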