azure-data-factoryexpressionbuilder

Data Factory -> Transform Data -> Data Flow Filter Rows modified in the last 24 hours


I building Room Mailboxes based off some location/facility data stored in Data Lake G2. I have setup Data Factory and created the Linked services etc.. and can see the data in preview.

I now want to generate a CSV File of all the changes to the location data based off a few filters to the data.

I have 1 filter already that is working which gets rid of row that are not nessary: LocationActive=='true' && LocationReservable=='true'

The next filter I need is to only show me what has been modified in the last 24 hours.

e.g. (this is how it is currently formatting in the source data showing in preview. LocationUpdated: 2020-05-27 08:59:55 Data Factory Image

I am unsure how to build the next filter expression that will filter out anything that has not been modified in the last 24 hours.... something like:

subDays(currentDate(),1)<=toDate(LocationUpdated,'yyyy-MM-dd hh:mm:ss')

Will this work?? Better way? What am I doing wrong?

subDays(currentDate(),1)<=toDate(LocationUpdated,'yyyy-MM-dd hh:mm:ss')


Solution

  • Your expression is correct; if it gives false, then those records have not been updated in 24 hours. You are checking the condition for updated records within 24 hours, and the opposite of that is not updated.

    However, there are also some other ways to check this.

    If the timestamp of the last 24 hours is greater than LocationUpdated, then it is a record that has not been modified in the last 24 hours.

    All you have to do is check for a greater than condition.

    subDays(currentDate(),1) > toDate(LocationUpdated)

    Below is the sample data used.

    enter image description here

    You can also use the between function, which gives you true values between the given bounds.

    The idea is between(last_updated, previous_day, current_day).

    If this gives true, it means it has been updated within 24 hours; otherwise, it has not been updated in 24 hours.

    Expression

    between(toDate(LocationUpdated), currentDate() - 1, currentDate())

    Output:

    enter image description here