python lftp

Track LFTP download from the status file


I am invoking the lftp command through a Python subprocess, using the -n flag of pget to set the maximum number of connections. While the download is in progress, a status file named filename.lftp-pget-status is created, and it is removed automatically once the download is over.

Here is a sample output of the status file for connections = 4

$ cat abc_20200619.gz.lftp-pget-status 
size=1837873446
0.pos=459472896
0.limit=459468363
1.pos=1301863572
1.limit=1378405085
2.pos=1735117533
2.limit=1837873446

I need to track the progress of the download from the status file. I am having trouble understanding its contents, since the number of partitions also shrinks towards the end of the download. I wrote the formula below to calculate the number of bytes downloaded, but I don't think it is correct.

bytes_downloaded = 0.pos + (1.pos - 0.limit) + (2.pos - 1.limit) + ... + (n.pos - (n-1).limit)

Does anyone know how to track lftp downloads from the status file?


Solution

  • Treat the "pos" and "limit" values as pairs {(Pᵢ, Lᵢ) | 0 ≤ i < n}, where the last limit Lₙ₋₁ is the complete size.

    The number of remaining bytes is Σᵢ₌₀ⁿ⁻¹(Lᵢ - Pᵢ), so the number of downloaded bytes is Lₙ₋₁ - (Σᵢ₌₀ⁿ⁻¹(Lᵢ - Pᵢ)). This expression can be rearranged in several ways, such as (Σᵢ₌₀ⁿ⁻¹(Pᵢ - Lᵢ)) + Lₙ₋₁, but the most useful form programmatically is probably (Σᵢ₌₀ⁿ⁻¹(Pᵢ)) - (Σᵢ₌₀ⁿ⁻²(Lᵢ)): the sum of all pos values minus the sum of every limit except the last. For the sample file in the question this gives 3496454001 - 1837873448 = 1658580553 bytes. An example algorithm is as follows:

    import re

    # Read the status file; it exists only while the download is in progress.
    with open('example.lftp-pget-status', 'r') as status_file:
        file_lines = status_file.readlines()

    # Drop the first line (size) and the last line (the final limit), then
    # keep only the number after '=' on each remaining line.
    values = [int(re.sub(r'.*=', '', line)) for line in file_lines[1:-1]]

    downloaded_bytes = sum(values[0::2])   # even indices: the pos values
    downloaded_bytes -= sum(values[1::2])  # odd indices: the limit values
    

    This answer is late, but hopefully helpful to anyone else who searches.
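
As a variation on the snippet above, the same calculation can be done by parsing the status file by key instead of by line position, which avoids relying on the order of the lines. This is only a sketch: the function names `parse_pget_status` and `downloaded_bytes` are made up for illustration, the file layout is assumed to be exactly what the question shows (a `size=` line followed by `N.pos=` and `N.limit=` lines), and it uses the "size minus remaining bytes" form of the formula without clamping chunks whose pos has run past their limit.

```python
import re

# Matches lines such as "0.pos=459472896" or "1.limit=1378405085".
# Assumption: the layout is the one shown in the question's sample file.
_CHUNK_LINE = re.compile(r"^(\d+)\.(pos|limit)=(\d+)$")


def parse_pget_status(text):
    """Return (size, chunks); chunks maps chunk index -> {"pos": .., "limit": ..}."""
    size = None
    chunks = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("size="):
            size = int(line.split("=", 1)[1])
            continue
        match = _CHUNK_LINE.match(line)
        if match:
            index, field, value = match.groups()
            chunks.setdefault(int(index), {})[field] = int(value)
    return size, chunks


def downloaded_bytes(text):
    """Total size minus the bytes still outstanding in each chunk."""
    size, chunks = parse_pget_status(text)
    if size is None:
        raise ValueError("no size= line found in status text")
    # Only count chunks for which both values were read.
    remaining = sum(
        chunk["limit"] - chunk["pos"]
        for chunk in chunks.values()
        if "pos" in chunk and "limit" in chunk
    )
    return size - remaining


# The sample status file from the question.
SAMPLE = """size=1837873446
0.pos=459472896
0.limit=459468363
1.pos=1301863572
1.limit=1378405085
2.pos=1735117533
2.limit=1837873446
"""

print(downloaded_bytes(SAMPLE))  # prints 1658580553
```

On the sample file this agrees with the sum-of-pos minus sum-of-limits form used above. When polling a live download, the file may disappear between checks or be read while it is being rewritten, so a caller would probably want to handle `FileNotFoundError` and retry on a `ValueError`.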