azure shell azure-blob-storage azure-storage azcopy

Filtering out empty files in azure-storage-azcopy sync


I am using azure-storage-azcopy sync to sync a source folder to a destination. This works so far, but I often receive empty files in the source folder, and those also get copied to the destination. How can I detect blank/empty files and skip them so they are not copied to the destination Azure storage?

Below is the command I am using so far:

./azcopy sync "/sftp/sourceFolder" "https://mypersonal.blob.core.windows.net/payment" --recursive=false --include="PPP*.*.pgp"

A shell script would probably also work, but I have no idea how to achieve this.

Please help.


Solution

  • How can I check and filter out blank/empty files and skip them from being copied to destination Azure storage?

    AFAIK, the azcopy sync command has no direct option to exclude empty files by size.

    Instead, you can use the shell script below. It creates a folder named empty, moves any empty files from the source folder into it, and then runs the sync, so only non-empty files are copied to the destination.

    sync.sh

    #!/bin/bash
    
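    # Create the holding folder for empty files (relative to the current directory) if it doesn't exist
    mkdir -p empty
    
    # Move empty files out of the source folder into the holding folder.
    # -maxdepth 1 keeps the search at the top level, matching --recursive=false below.
    find /sftp/sourceFolder -maxdepth 1 -type f -empty -exec mv {} empty/ \;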
    
    # Sync files from source to destination
    azcopy sync "/sftp/sourceFolder" "https://<storage account name>.blob.core.windows.net/test/important" --recursive=false --include="PPP*.*.pgp"
    

    Output:

    venkatesxxx:~$ bash sync.sh
    INFO: Authenticating to destination using Azure AD
    INFO: Any empty folders will not be processed, because source and/or destination doesn't have full folder support
    INFO: azcopy: A newer version 10.21.2 is available to download
    
    
    Job 7b5bxxxxx has started
    Log file is located at: /home/venkatesan/.azcopy/7b5be0dxxxxxx-21ded154cedc.log
    
    0.0 %, 0 Done, 0 Failed, 2 Pending, 2 Total, 2-sec Throughput (Mb/s): 0
    
    Job 7b5xxxx Summary
    Files Scanned at Source: 2
    Files Scanned at Destination: 1
    Elapsed Time (Minutes): 0.0667
    Number of Copy Transfers for Files: 2
    Number of Copy Transfers for Folder Properties: 0 
    Total Number Of Copy Transfers: 2
    Number of Copy Transfers Completed: 2
    Number of Copy Transfers Failed: 0
    Number of Deletions at Destination: 0
    Total Number of Bytes Transferred: 63
    Total Number of Bytes Enumerated: 63
    Final Job Status: Completed
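
If moving files out of the source folder is undesirable, one variation is to leave them in place and ask azcopy to skip them by name. The following is a minimal sketch under stated assumptions, not a verified recipe: it assumes the `--exclude-pattern` option of `azcopy sync` (a semicolon-separated list of name patterns), assumes that exclusion is applied together with the `--include` filter from the question, and assumes file names contain no `;` or wildcard characters. The function names `build_exclude_pattern` and `sync_non_empty` are hypothetical helpers introduced here for illustration.

```shell
#!/bin/bash

# Print the names of empty regular files located directly inside "$1",
# sorted and joined with ';' so they can be passed as one pattern list.
# Prints nothing when no empty files are found.
build_exclude_pattern() {
    local dir=$1
    find "$dir" -maxdepth 1 -type f -empty -exec basename {} \; \
        | sort \
        | paste -sd ';' -
}

# Run azcopy sync from "$1" to "$2", skipping files that are empty right now.
# Assumption: --exclude-pattern is accepted alongside the --include filter
# used in the original command.
sync_non_empty() {
    local src=$1 dest=$2
    local pattern
    pattern=$(build_exclude_pattern "$src")

    local args=(sync "$src" "$dest" --recursive=false --include="PPP*.*.pgp")
    if [ -n "$pattern" ]; then
        # Only add the exclusion when there is at least one empty file.
        args+=(--exclude-pattern="$pattern")
    fi

    azcopy "${args[@]}"
}
```

It could be invoked as `sync_non_empty "/sftp/sourceFolder" "https://<storage account name>.blob.core.windows.net/test/important"`. Because nothing is moved, a file that is empty during one run stays in the source folder and becomes eligible for syncing in a later run once it has content.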
    
