
Regular expression to extract comma separated values?


I am trying to read values from a CSV file and store them in attributes using the ExtractText processor. The file contains a single line with 5 comma-separated values. Here is the content of my file:

jdbc:mysql://localhost:3306/test, com.mysql.jdbc.Driver, C:\Program Files\MySQL\mysql-connector.jar, root, root

I have manually added 5 properties to the ExtractText processor:

DatabaseConnectionURL
DatabaseDriverClass
DatabaseDriverLocation
DatabaseUser
Password

Now I want regular expressions for the 5 properties defined above in the ExtractText processor, so that the resulting attributes get the following values:

DatabaseConnectionURL = jdbc:mysql://localhost:3306/test
DatabaseDriverClass = com.mysql.jdbc.Driver
DatabaseDriverLocation = C:\Program Files\MySQL\mysql-connector.jar
DatabaseUser = root
Password = root

Can you please provide the regular expressions for the above 5 attributes?


Solution

  • Rishab,

    Use the ExtractText processor with the following regular expression to capture the whole line in a single attribute.

    ExtractedData:(^.*$)
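
To illustrate what this pattern captures, here is a minimal Python sketch. It only approximates the processor: ExtractText uses Java regular expressions, and the `content` string (including its trailing newline) is an assumption based on the sample line in the question.

```python
import re

# Sample file content from the question; the trailing newline is an assumption.
content = (
    "jdbc:mysql://localhost:3306/test, com.mysql.jdbc.Driver, "
    "C:\\Program Files\\MySQL\\mysql-connector.jar, root, root\n"
)

# (^.*$) has one capture group that spans the entire line.
match = re.search(r"(^.*$)", content)
extracted_data = match.group(1)

print(extracted_data)
```

The captured group is the full line without the trailing newline, which is the value the later steps split into fields.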
    

    Then use the UpdateAttribute processor with the getDelimitedField() expression, as demonstrated below, to assign values to flow file attributes.

    DatabaseConnectionURL:${ExtractedData:getDelimitedField(1)}
    
    DatabaseDriverClass:${ExtractedData:getDelimitedField(2)}
    
    DatabaseDriverLocation:${ExtractedData:getDelimitedField(3)}
    
    DatabaseUser:${ExtractedData:getDelimitedField(4)}
    
    Password:${ExtractedData:getDelimitedField(5)}
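
The field lookup above can be sketched in Python as follows. The helper `get_delimited_field` is a hypothetical, simplified stand-in: it only splits on a comma with a 1-based index and ignores the quoting and escaping that the real function supports. Because the sample line has a space after each comma, the raw fields most likely keep that leading space; in NiFi, chaining `:trim()` (for example `${ExtractedData:getDelimitedField(2):trim()}`) should remove it.

```python
def get_delimited_field(subject: str, index: int, delimiter: str = ",") -> str:
    """Hypothetical stand-in: return the 1-based field of a delimited line."""
    return subject.split(delimiter)[index - 1]


# Value assumed to be held by the ExtractedData attribute.
line = (
    "jdbc:mysql://localhost:3306/test, com.mysql.jdbc.Driver, "
    "C:\\Program Files\\MySQL\\mysql-connector.jar, root, root"
)

names = [
    "DatabaseConnectionURL",
    "DatabaseDriverClass",
    "DatabaseDriverLocation",
    "DatabaseUser",
    "Password",
]

# strip() plays the role of :trim() in the Expression Language.
attributes = {
    name: get_delimited_field(line, i).strip()
    for i, name in enumerate(names, start=1)
}

for name, value in attributes.items():
    print(f"{name} = {value}")
```

Without the `strip()` call, every field after the first would start with a space.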
    

    The NiFi Expression Language Guide describes getDelimitedField() as follows: "Parses the Subject as a delimited line of text and returns just a single field from that delimited text." It can be used in any configuration property that supports the NiFi Expression Language. For a detailed explanation of getDelimitedField(), see the guide:

    https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html#getdelimitedfield
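
Since the question asked for regular expressions, here is a hedged sketch of a regex-only alternative: one pattern with five capture groups that also drops the whitespace around each comma. It is written for Python's `re` module; the pattern should behave the same under Java regex, but that has not been verified inside ExtractText. As far as I know, ExtractText exposes each capture group as a numbered attribute (`name.1`, `name.2`, and so on), so treat the mapping to the five attribute names as an assumption.

```python
import re

# One lazy, comma-free group per field; \s* absorbs padding around commas.
FIELD = r"\s*([^,]+?)\s*"
pattern = re.compile("^" + ",".join([FIELD] * 5) + "$")

# Sample line from the question, including its trailing space.
line = (
    "jdbc:mysql://localhost:3306/test, com.mysql.jdbc.Driver, "
    "C:\\Program Files\\MySQL\\mysql-connector.jar, root, root "
)

match = pattern.match(line)
url, driver_class, driver_location, user, password = match.groups()

print(url)
print(driver_class)
print(driver_location)
print(user)
print(password)
```

This approach breaks if a value itself contains a comma, which is one reason to prefer getDelimitedField() for delimited data.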

    Hope this solution helps solve your problem.

    Don't forget to accept if it worked and let me know if you run into any issues.