
In NiFi, what is the difference between FirstInFirstOutPrioritizer and OldestFlowFileFirstPrioritizer?


The user guide (https://nifi.apache.org/docs/nifi-docs/html/user-guide.html) describes the prioritizers as quoted below. Could you please help me understand how these two differ, ideally with a real-world example?

FirstInFirstOutPrioritizer: Given two FlowFiles, the one that reached the connection first will be processed first.

OldestFlowFileFirstPrioritizer: Given two FlowFiles, the one that is oldest in the dataflow will be processed first. 'This is the default scheme that is used if no prioritizers are selected.'


Solution

  • Imagine two processors A and B that are both connected to a funnel, and then the funnel connects to processor C.

    Scenario 1 - The connection between the funnel and processor C has first-in-first-out prioritizer.

    In this case, the flow files in the queue between the funnel and processor C will be processed strictly in the order in which they reached that queue.

    Scenario 2 - The connection between the funnel and processor C has oldest-flow-file-first prioritizer.

    In this case, there could already be flow files in the queue between the funnel and processor C, but if one of the processors transfers a flow file to that queue that is older than all the flow files already in it, that flow file will jump to the front.

    You could imagine that some flow files come from a portion of the flow that takes much longer to process than the rest, but both sets end up funneled into the same queue. The flow files from the slower branch entered the dataflow earlier, so they are considered older even though they reach the queue later.
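
The two scenarios can be sketched in a few lines of plain Python. This is only a toy model, not NiFi code: the `FlowFile` class, its `entered_flow_at` and `queued_at` fields, and both ordering functions are hypothetical names made up for illustration. The assumption is that first-in-first-out orders by when a flow file reached the queue, while oldest-flow-file-first orders by when it first entered the dataflow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowFile:
    # Hypothetical stand-in for a NiFi flow file; timestamps are arbitrary ticks.
    name: str
    entered_flow_at: int  # when the flow file first entered the dataflow
    queued_at: int        # when it reached the queue in front of processor C


def first_in_first_out(queue):
    # Scenario 1: order strictly by arrival time at this queue.
    return sorted(queue, key=lambda ff: ff.queued_at)


def oldest_flow_file_first(queue):
    # Scenario 2: order by age in the whole dataflow, ignoring arrival time here.
    return sorted(queue, key=lambda ff: ff.entered_flow_at)


# Two flow files came through the fast branch (processor A); one came through
# the slow branch (processor B), so it is older but reaches the queue last.
queue = [
    FlowFile("fast-1", entered_flow_at=10, queued_at=11),
    FlowFile("fast-2", entered_flow_at=12, queued_at=13),
    FlowFile("slow-1", entered_flow_at=1, queued_at=14),
]

fifo_order = [ff.name for ff in first_in_first_out(queue)]
oldest_order = [ff.name for ff in oldest_flow_file_first(queue)]

print(fifo_order)    # ['fast-1', 'fast-2', 'slow-1']
print(oldest_order)  # ['slow-1', 'fast-1', 'fast-2']
```

With first-in-first-out, `slow-1` waits behind everything that arrived before it; with oldest-flow-file-first, it jumps to the front because it has been in the flow the longest.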