sqlregexhadoophive

regexp to extract multiple occurrences of a pattern in a string


I'm running hive 0.13. I have a column that has various strings that look like this: a:ABC,b:SDSD,c:213123#a:SDS,b:UIOU,c:89797#a:TYRQQQW,b:UIOUIOYYO,c:546654 a:DFSS,b:TYRTTN,c:12323#a:HJH,b:YTUUUTYUTYT,c:67890 a:TYY,b:OPIUIU,c:86768 They could of any length, one set a,b,c values will always be separated by a '#'.

Now, what I'm trying to do is, extract only the b column, like: b:SDSD,b:UIOU,b:UIOUIOYYO b:TYRTTN,b:YTUUUTYUTYT

What I've been trying to do is something like regexp_replace(column,'^channel:+[A-Z]{3,10},',")

I.e., replace all values that isn't b:... with blank, but this isn't working.

Could someone please correct me or suggest a better way?

Thanks.


Solution

  • [^b]:[^,]*,?
    

    Try this.Replace by empty space.See demo.

    https://regex101.com/r/wU7sQ0/27