java regex solr regex-lookarounds dataimporthandler

Solr DIH: RegExTransformer


I need to apply a transformation to the third column below:

ACAC | 0 | 01
ACAC | 0 | 0101
ACAC | 0 | 0102
ACAC | 0 | 010201

I need to transform "010201" to "01/02/01".

So I need to:

  1. trim all trailing 0 characters
  2. split the value into two-digit groups and join them with a "/" character.

This transformation runs inside a Solr DataImportHandler transformer, which uses the Java regex library internally.

Is there any way to achieve that?

I've tried using this regex:

(\d[1-9]{1})

It gives me:

01/04/01/

But I need:

01/04/01

The replacement expression is:

$&/

Any ideas?


Solution

  • You can use

    \d{2}(?=(?:\d{2})+$)
    

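    and replace with $0/ (see the regex demo).

    Details

      • \d{2} matches two digits.
      • (?=(?:\d{2})+$) is a positive lookahead that requires one or more two-digit groups to follow, up to the end of the string. The last pair therefore never matches, so no trailing "/" is added.
      • $0 in the replacement stands for the whole match.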

    In the RegExTransformer code, use

    <field column="colname" regex="\d{2}(?=(?:\d{2})+$)" replaceWith="$0/" />
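To check the pattern outside Solr, here is a minimal plain-Java sketch. It assumes the transformer behaves like a replaceAll with the same regex and replacement; the class name PairSlashDemo and the helper addSlashes are made up for illustration and are not part of Solr.

```java
import java.util.regex.Pattern;

class PairSlashDemo {
    // Two digits that are followed by one or more complete two-digit
    // groups reaching the end of the input. The last pair never matches.
    private static final Pattern PAIR =
            Pattern.compile("\\d{2}(?=(?:\\d{2})+$)");

    // Hypothetical helper: re-emits each match ($0) followed by a slash.
    static String addSlashes(String code) {
        return PAIR.matcher(code).replaceAll("$0/");
    }

    public static void main(String[] args) {
        // Sample values taken from the question's third column.
        for (String code : new String[] {"01", "0101", "0102", "010201"}) {
            System.out.println(code + " -> " + addSlashes(code));
        }
    }
}
```

For the sample values this should leave "01" unchanged and turn "010201" into "01/02/01". Note that the sketch only covers the slash insertion, not the trimming of trailing zeros mentioned in the question.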