
Dangling metacharacter * error in Spark SQL


The regex below works in Hive but not in Spark.

In Spark it throws the error "dangling metacharacter * at index 3":

select regexp_extract('a|b||c','^(\\|*(?:(?!\\|\\|\\w(?!\\|\\|)).)*)');

I also tried escaping the * as \\*, but it still throws the same error: dangling metacharacter * at index 3.
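
One possible explanation, offered as an assumption rather than a confirmed diagnosis: Spark compiles the pattern with Java's regex engine (java.util.regex), and an error at index 3 is what that engine would report for a pattern starting with ^(|*, that is, the original pattern with its backslashes stripped. In Spark SQL string literals a backslash is itself an escape character by default, so one extra layer of unescaping can turn \\| into a bare |. The sketch below is plain Java, not Spark code, and the class and method names are hypothetical. It compares the pattern as the engine should see it with the same pattern after all backslashes are removed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class; not part of Spark or Hive.
class DanglingStarDemo {
    // The pattern as the regex engine should see it: each pipe escaped once.
    static final String ESCAPED = "^(\\|*(?:(?!\\|\\|\\w(?!\\|\\|)).)*)";
    // Assumption: the same pattern after every backslash has been stripped.
    static final String STRIPPED = ESCAPED.replace("\\", "");

    // Returns null if the regex compiles, otherwise the engine's error description.
    static String compileError(String regex) {
        try {
            Pattern.compile(regex);
            return null;
        } catch (PatternSyntaxException e) {
            return e.getDescription();
        }
    }

    // Returns capture group 1 of the first match, or "" if nothing matches.
    static String extractGroup1(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : "";
    }

    public static void main(String[] args) {
        // Without backslashes, the * directly follows a bare | and has nothing to repeat.
        System.out.println("stripped: " + compileError(STRIPPED));
        // With the backslashes intact, the pattern compiles and can be applied.
        System.out.println("escaped:  " + compileError(ESCAPED));
        System.out.println("group 1:  " + extractGroup1(ESCAPED, "a|b||c"));
    }
}
```

If this is indeed the cause, the pattern itself is fine and only the escaping needs to change. A character class such as [|] avoids the problem entirely because it matches a literal pipe without any backslash.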


Solution

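  • You can use regexp_replace with a simpler pattern that keeps only the text before the last || (col stands for your column or string expression):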

    regexp_replace(col, '^(.*)[|]{2}.*$', '$1')
    

    See the regex demo.
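
To check what this replacement does without a Spark session, the same pattern can be run through Java's String.replaceAll, which uses the same java.util.regex engine. This is a minimal sketch; the class and method names are hypothetical, and it only mirrors the regex behavior, not Spark's handling of NULLs or columns.

```java
// Hypothetical demo class; mirrors regexp_replace(col, '^(.*)[|]{2}.*$', '$1').
class LastDoublePipeDemo {
    // Keeps everything before the last "||"; input without "||" is returned unchanged.
    static String beforeLastDoublePipe(String value) {
        // [|] matches a literal pipe without needing any backslash escaping.
        return value.replaceAll("^(.*)[|]{2}.*$", "$1");
    }

    public static void main(String[] args) {
        System.out.println(beforeLastDoublePipe("a|b||c"));   // a|b
        System.out.println(beforeLastDoublePipe("a||b||c"));  // a||b
        System.out.println(beforeLastDoublePipe("abc"));      // abc
    }
}
```

Because (.*) is greedy, the cut happens at the last || in the string, not the first. When no || is present the pattern does not match and the value passes through untouched.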

    Regex details: