parquet azure-data-factory

Parquet file name in Azure Data Factory


I'm copying data from an Oracle DB to ADLS using an Azure Data Factory copy activity. The copy produces a Parquet file containing the same data as the source table, but the resulting file gets a name like this:

data_32ecaf24-00fd-42d4-9bcb-8bb6780ae152_7742c97c-4a89-4133-93ea-af2eb7b7083f.parquet

I need the file to be named like this instead:

TableName-Timestamp.parquet

How can I do that with Azure Data Factory?

Another question: is there a way to add a folder hierarchy when the file is written? For example, I use the same pipeline to write several tables, and I want a new folder for each table. I can do that by creating a separate dataset for each table, but I'd like to know whether it can be done automatically (using dynamic content).

Thanks in advance.


Solution

  • You can achieve this with a pipeline parameter.

    Here is an example in which I copied data from Azure SQL Database to ADLS; it should also work for Oracle to ADLS.

    Pipeline parameter: add a parameter that holds the name of the Azure SQL/Oracle table to copy to ADLS.
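
    As a rough sketch of what this step looks like in the pipeline's JSON view: the parameter name `tablename` comes from the expression used later in this answer, while the pipeline name `CopyTableToParquet` and the default value are placeholders chosen for illustration.

    ```json
    {
      "name": "CopyTableToParquet",
      "properties": {
        "parameters": {
          "tablename": { "type": "string", "defaultValue": "test" }
        },
        "activities": []
      }
    }
    ```

    The value can then be supplied per run (for example from a trigger or a calling pipeline), so one pipeline can serve several tables.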

    Source dataset:

    Add dynamic content to set the table name.
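
    A possible shape for such a parameterized source dataset is sketched below. It assumes an Oracle source; the dataset name `SourceTable`, the linked service name `OracleLinkedService`, and the dataset parameter `tableName` are hypothetical, and a schema property may also be needed depending on the database.

    ```json
    {
      "name": "SourceTable",
      "properties": {
        "type": "OracleTable",
        "linkedServiceName": {
          "referenceName": "OracleLinkedService",
          "type": "LinkedServiceReference"
        },
        "parameters": {
          "tableName": { "type": "string" }
        },
        "typeProperties": {
          "table": { "value": "@dataset().tableName", "type": "Expression" }
        }
      }
    }
    ```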

    Source:

    Add dynamic content that sets the table name from the pipeline parameter.

    Sink dataset:

    Add dynamic content to set the Parquet file name.
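
    The sink dataset could look roughly like this. It is a sketch that assumes ADLS Gen2 (hence `AzureBlobFSLocation`); the names `SinkParquet`, `AdlsLinkedService`, and the dataset parameter `fileName` are made up for the example, and `container` stands for whatever file system you write to.

    ```json
    {
      "name": "SinkParquet",
      "properties": {
        "type": "Parquet",
        "linkedServiceName": {
          "referenceName": "AdlsLinkedService",
          "type": "LinkedServiceReference"
        },
        "parameters": {
          "fileName": { "type": "string" }
        },
        "typeProperties": {
          "location": {
            "type": "AzureBlobFSLocation",
            "fileSystem": "container",
            "fileName": { "value": "@dataset().fileName", "type": "Expression" }
          }
        }
      }
    }
    ```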

    Sink:

    Add dynamic content that builds the Parquet file name from the pipeline parameter, in the format TableName-Timestamp.parquet:

    @concat(pipeline().parameters.tablename,'-',utcnow())
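
    To show where this expression ends up, here is a sketch of the copy activity passing values into the two datasets. The dataset names `SourceTable` and `SinkParquet` and their parameters `tableName` and `fileName` are hypothetical. The expression above does not add a file extension by itself, so this version appends `'.parquet'` explicitly; that is an assumption about how the original setup produced the extension.

    ```json
    {
      "name": "CopyTable",
      "type": "Copy",
      "inputs": [
        {
          "referenceName": "SourceTable",
          "type": "DatasetReference",
          "parameters": {
            "tableName": {
              "value": "@pipeline().parameters.tablename",
              "type": "Expression"
            }
          }
        }
      ],
      "outputs": [
        {
          "referenceName": "SinkParquet",
          "type": "DatasetReference",
          "parameters": {
            "fileName": {
              "value": "@concat(pipeline().parameters.tablename,'-',utcnow(),'.parquet')",
              "type": "Expression"
            }
          }
        }
      ],
      "typeProperties": {
        "source": { "type": "OracleSource" },
        "sink": { "type": "ParquetSink" }
      }
    }
    ```

    By default `utcnow()` returns an ISO 8601 timestamp that contains colons. If you prefer a more compact name, one option is to replace `utcnow()` with `formatDateTime(utcnow(),'yyyyMMddHHmmss')`.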
    

    Then run the pipeline; the Parquet file will be named like TableName-Timestamp.parquet.

    About your other question:

    You can add dynamic content that sets the folder name for each table.

    For example, if we copy the table "test", the result is:

    container/test/test-2020-04-20T02:01:36.3679489Z.parquet
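
    One way to get that layout is to add a second parameter to the sink dataset and bind it to the folder path. The fragment below is a sketch showing only the relevant parts of the dataset, again assuming ADLS Gen2; `folderName` and `fileName` are hypothetical parameter names.

    ```json
    {
      "parameters": {
        "folderName": { "type": "string" },
        "fileName": { "type": "string" }
      },
      "typeProperties": {
        "location": {
          "type": "AzureBlobFSLocation",
          "fileSystem": "container",
          "folderPath": { "value": "@dataset().folderName", "type": "Expression" },
          "fileName": { "value": "@dataset().fileName", "type": "Expression" }
        }
      }
    }
    ```

    Passing `@pipeline().parameters.tablename` as `folderName` from the copy activity would then write each table into its own folder, and the folder should be created on write if it does not exist yet.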
    

    Hope this helps.