google-cloud-platform, airflow, sftp, google-cloud-networking

How to perform SFTP to a remote server using Airflow


We have a scenario where we need to copy a file from GCS (Google Cloud Storage) to a remote SFTP server via Airflow, without using any intermediate on-premise Unix server.

Is there any way to achieve this, with or without using a GCP compute service or a Docker pod, via Airflow?


Solution

  • Yes, you can use GCSToSFTPOperator:

    import os

    from airflow.providers.google.cloud.transfers.gcs_to_sftp import GCSToSFTPOperator

    # Airflow connection ID for the target SFTP server (configure it in the
    # Airflow UI or via an AIRFLOW_CONN_* environment variable).
    SFTP_CONN_ID = "sftp_default"

    copy_file_from_gcs_to_sftp = GCSToSFTPOperator(
        task_id="file-copy-gcs-to-sftp",
        sftp_conn_id=SFTP_CONN_ID,
        source_bucket=os.environ.get("GCP_GCS_BUCKET_1_SRC", "test-gcs-sftp"),
        source_object="file.csv",
        destination_path="/tmp/single-file/",
    )
    

    The operator downloads the file to a temporary directory on the Airflow worker and then uploads it to the SFTP server. You can read more about the operator in the docs.
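
    For completeness, here is a minimal sketch of how the operator might be wired into a DAG. The DAG ID, schedule, and connection ID are illustrative placeholders, not values from the question:

    import os
    from datetime import datetime

    from airflow import DAG
    from airflow.providers.google.cloud.transfers.gcs_to_sftp import GCSToSFTPOperator

    # Placeholder connection ID -- create a matching SFTP connection in Airflow first.
    SFTP_CONN_ID = "sftp_default"

    with DAG(
        dag_id="gcs_to_sftp_example",    # illustrative DAG ID
        start_date=datetime(2021, 1, 1),
        schedule_interval=None,          # trigger manually
        catchup=False,
    ) as dag:
        GCSToSFTPOperator(
            task_id="file-copy-gcs-to-sftp",
            sftp_conn_id=SFTP_CONN_ID,
            source_bucket=os.environ.get("GCP_GCS_BUCKET_1_SRC", "test-gcs-sftp"),
            source_object="file.csv",
            destination_path="/tmp/single-file/",
            # move_object=True would delete the GCS object after the transfer,
            # turning the copy into a move.
        )

    Because the file only passes through the worker's temporary directory, no intermediate on-premise server is needed, which matches the constraint in the question.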