Tags: bash, sftp, lftp

Delete files older than a week in subdirectories via SFTP, without SSH


Context

Currently I am pushing data to an SFTP server which other processes and systems then use for further processing. All files share a root folder, but they are subdivided into subfolders according to certain categories. This folder structure must not be altered. After a certain time period (currently 7 days) I need to delete those files automatically.

Unfortunately, the server has strict access rights and I can only access a specific directory via SFTP; SSH etc. is forbidden. The challenge in automating this process lies within these restrictions.

So far I know that I can delete individual files with a one-liner like this:

echo "rm $_file_name" | sftp $username@$sftp_server

However, the problem I struggle with the most is listing the files on the SFTP server in one line and filtering that output by the date criterion.

Question

How can I set up a CRON job that deletes files older than a week in a directory, using only SFTP?


Note: I am aware of questions like here and here; however, these do not share the limitations I have.


Solution

  • After some time I figured out a solution in a stepwise learning process:

    Step 1: Retrieving all subdirectories

    First I needed to get all directories the files are stored in. Given the assumption that all relevant directories are subdirectories of /IN, my solution was to take the string the sftp command returns for that information and iterate over the split string.

    # Get the string the sftp-command returns for listing all directories in /IN.
    sftp_dirs=$(echo $(echo ls | sftp $username@$sftp_server:/IN))
    
    # Then erase the log information sftp adds to that string.
    # This leaves a string which can be split in order to iterate over it.
    process_dirs="${sftp_dirs/'Changing to: /IN sftp> ls '/}"
    
    # Now iterate over each directory and retrieve the files.
    for _dir in $(echo $process_dirs | tr " " "\n")
    do
         # Fill in Step 2
    done
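
    As a side note: the string replacement above depends on the exact wording of sftp's log output. A slightly more defensive sketch (assuming an OpenSSH sftp client, whose ls supports -1 to print one name per line) filters those lines out instead:

    # Hypothetical variant: request one name per line and drop sftp's
    # echoed prompt ("sftp> ...") and its "Changing to:" notice.
    sftp_dirs=$(echo "ls -1" | sftp $username@$sftp_server:/IN | grep -v '^sftp>' | grep -v '^Changing')
    
    for _dir in $sftp_dirs
    do
         # Fill in Step 2
    done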
    

    Step 2: Retrieving all files and their creation dates

    The files' creation timestamps on the SFTP server are the dates I pushed the files to the server. Thus, files stored on the server for more than 7 days can be identified by their creation time.

    The challenge here is to retrieve this information from the output of SFTP. Since I have all necessary rights to install whatever I need on the server running the CRON job, I used LFTP for this, as I couldn't do it with pure SFTP.

    Finally, the solution is to read an array from LFTP's output.

    # Returns an array with all files stored in $_dir with their respective timestamps.
    # $_dir is the loop-variable from Step 1
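    # The trailing comma after $username makes lftp use an empty password.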
    readarray -t _files <<< "$(lftp -u $username, sftp://$sftp_server:/IN/$_dir -e "cls -B --date --time-style long-iso; exit")"
    

    Note that I configured the timestamp style as long-iso for further processing.
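
    Each array element then follows a fixed layout, which is what the character offsets in Step 3 rely on (an illustration with a made-up file name):

    # 2019-05-02 14:37 some_file.csv
    # characters 0-9  -> the date, ${_file:0:10}
    # characters 17+  -> the name, ${_file:17}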

    Now iterate over all files in order to identify the ones older than 7 days:

    for _file in "${_files[@]}"
    do
        # Fill in Step 3
    done
    

    Step 3: Retrieve and delete old files

    The code here is straightforward:

    # Get the file's date in the desired date format
    # (remember that long-iso was chosen in Step 2)
    # $_file is the loop-variable from Step 2
    _file_date=$(date -d "${_file:0:10}" +%s)
    _file_name="/IN/$_dir/${_file:17}"
    
    # Compare the date-difference
    deletion_date=$(date -d 'now - 7 days' +%s)
    _datediff=$(( (_file_date - deletion_date) / 86400 ))
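    
    # Plausibility check with hypothetical dates: a file pushed 10 days ago
    # gives _file_date - deletion_date of about -3 days, so _datediff = -3
    # (deleted below); a file pushed 5 days ago gives +2 (kept).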
    

    Finally, if the date difference shows that the file is older than 7 days, delete it:

    if [ $_datediff -lt 0 ]
    then
        # Pass the file to sftp and issue the delete command.
        echo "rm $_file_name" | sftp $username@$sftp_server
    fi
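
    For completeness, below is a minimal sketch assembling the three steps into one script that CRON can run. The credentials are placeholders; it assumes key-based authentication for sftp, an empty password for lftp (as above), and no spaces in directory or file names.

    #!/bin/bash
    # Hypothetical values; adapt to your environment.
    username="myuser"
    sftp_server="sftp.example.com"
    
    # Cutoff computed once: epoch seconds of "now - 7 days".
    deletion_date=$(date -d 'now - 7 days' +%s)
    
    # Step 1: list all subdirectories of /IN and strip sftp's log output.
    sftp_dirs=$(echo $(echo ls | sftp $username@$sftp_server:/IN))
    process_dirs="${sftp_dirs/'Changing to: /IN sftp> ls '/}"
    
    for _dir in $(echo $process_dirs | tr " " "\n")
    do
        # Step 2: list this directory's files with long-iso timestamps.
        readarray -t _files <<< "$(lftp -u $username, sftp://$sftp_server:/IN/$_dir -e "cls -B --date --time-style long-iso; exit")"
    
        for _file in "${_files[@]}"
        do
            # Skip empty lines readarray may produce.
            [ -z "$_file" ] && continue
    
            # Step 3: split the line into date and name, then compare.
            _file_date=$(date -d "${_file:0:10}" +%s)
            _file_name="/IN/$_dir/${_file:17}"
            _datediff=$(( (_file_date - deletion_date) / 86400 ))
    
            if [ $_datediff -lt 0 ]
            then
                echo "rm $_file_name" | sftp $username@$sftp_server
            fi
        done
    done

    Scheduling is then an ordinary crontab entry, e.g. running the script daily at 03:00 (the path is hypothetical):

    0 3 * * * /usr/local/bin/sftp_cleanup.sh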