Context
Currently I am pushing data to an SFTP server which other processes and systems then use for further processing. All files share a root folder, but they are subdivided into subfolders according to certain categories. This folder structure must not be altered. After a certain time period (currently 7 days) I need to delete those files automatically.
Unfortunately, the server has strict access rights and I can only access a specific directory via SFTP; SSH etc. is forbidden. The challenge in automating this process lies within these restrictions.
So far I know that I can delete files with a one-liner like this:
echo "rm $_file_name" | sftp $username@$sftp_server
However, the problem I struggle with most is listing the files on the SFTP server in one line and filtering that output by the date criterion.
Question
How can I set up a cron job that deletes files older than a week in a directory, using only SFTP?
Note: I am aware of questions like here and here; however, these do not share the limitations I have.
After some time I figured out a solution in a stepwise learning process:
Step 1: Retrieving all subdirectories
First I needed to get all directories the files are stored in.
Given the assumption that all relevant directories are subdirectories of /IN, my solution was to capture the string the sftp command returns for that listing and iterate over the split string.
# Get the string the sftp-command returns for listing all directories in /IN.
sftp_dirs=$(echo $(echo ls | sftp $username@$sftp_server:/IN))
# Then erase the log-information from that string sftp appends to it.
# This leaves a string which can be split in order to iterate over it.
process_dirs="${sftp_dirs/'Changing to: /IN sftp> ls '/}"
# Now iterate over each directory and retrieve the files.
for _dir in $(echo $process_dirs | tr " " "\n")
do
# Fill in Step2
done
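For illustration, with two hypothetical subdirectories catA and catB, the raw capture and the cleaned string would look roughly like this (the exact log text may vary with the sftp version):

# $sftp_dirs (sftp's log text plus the directory names, collapsed to one line):
#   Changing to: /IN sftp> ls catA catB
# $process_dirs after stripping the log prefix:
#   catA catB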
Step 2: Retrieving all files and their creation dates
The files' creation timestamps on the SFTP server are the dates I pushed the files to the server. Thus, files stored on the server for more than 7 days can be identified by their creation time.
The challenge here is to retrieve this information from the output of SFTP. Since I have all necessary rights to install whatever I need on the server running the cron job, I used the help of LFTP for this, as I couldn't do it with pure SFTP.
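For example, on a Debian/Ubuntu machine running the cron job, installing LFTP is a one-liner (adjust the package manager for your distribution):

sudo apt-get install lftp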
Finally, the solution is to read an array from LFTP's output.
# Returns an array with all files stored in $_dir with their respective timestamps.
# $_dir is the loop-variable from Step 1
readarray -t _files <<< "$(lftp -u $username, sftp://$sftp_server:/IN/$_dir -e "cls -B --date --time-style long-iso; exit")"
Note that I configured the timestamp format to long-iso for further processing.
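To see why Step 3 can rely on fixed string offsets, here is what one element of the _files array looks like with this configuration (the file name is illustrative):

# One element of _files (long-iso layout: "YYYY-MM-DD HH:MM name"):
#   2019-08-05 14:23 report.csv
# Characters 0-9 hold the date, 11-15 the time,
# and everything from character 17 onwards is the file name:
# ${_file:0:10} -> "2019-08-05"
# ${_file:17}   -> "report.csv"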
Now iterate over all files to identify the ones older than 7 days:
for _file in "${_files[@]}"
do
# Fill in Step3
done
Step 3: Retrieve and delete old files
Here is the code, which is straightforward:
# Get the file's date in the desired date format
# (remember that long-iso was chosen in Step 2)
# _file is the loop variable from Step 2
_file_date=$(date -d "${_file:0:10}" +%s)
_file_name="/IN/$_dir/${_file:17}"
# Compute the difference to the deletion threshold in days
deletion_date=$(date -d 'now - 7 days' +%s)
_datediff=$(( (_file_date - deletion_date) / 86400 ))
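A quick worked example of the arithmetic (dates are illustrative): with a 7-day threshold, a file pushed 10 days ago yields a negative difference and is deleted, while a file pushed 3 days ago yields a positive one and is kept.

# deletion_date = now - 7 days
# file dated 10 days ago: ((now - 10d) - (now - 7d)) / 86400 = -3  -> delete
# file dated  3 days ago: ((now -  3d) - (now - 7d)) / 86400 = +4  -> keep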
Finally, if the date difference shows that the file is older than 7 days, delete it:
if [ $_datediff -lt 0 ]
then
# Pass the file to sftp and issue the delete command.
echo "rm $_file_name" | sftp $username@$sftp_server
fi
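Putting the pieces together, here is a minimal sketch of the complete script the cron job runs. The connection details at the top are hypothetical placeholders; it assumes key-based (passwordless) authentication is configured for both SFTP and LFTP.

#!/usr/bin/env bash
# Hypothetical connection details - replace with your own.
username="backupuser"
sftp_server="sftp.example.com"

# Compute the deletion threshold once.
deletion_date=$(date -d 'now - 7 days' +%s)

# Step 1: list all subdirectories of /IN and strip sftp's log output.
sftp_dirs=$(echo $(echo ls | sftp $username@$sftp_server:/IN))
process_dirs="${sftp_dirs/'Changing to: /IN sftp> ls '/}"

for _dir in $(echo $process_dirs | tr " " "\n")
do
    # Step 2: list the files in $_dir with long-iso timestamps via LFTP.
    readarray -t _files <<< "$(lftp -u $username, sftp://$sftp_server:/IN/$_dir -e "cls -B --date --time-style long-iso; exit")"

    for _file in "${_files[@]}"
    do
        # Step 3: extract date and name, then delete files older than 7 days.
        _file_date=$(date -d "${_file:0:10}" +%s)
        _file_name="/IN/$_dir/${_file:17}"
        _datediff=$(( (_file_date - deletion_date) / 86400 ))

        if [ $_datediff -lt 0 ]
        then
            echo "rm $_file_name" | sftp $username@$sftp_server
        fi
    done
done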