regexlooker-studiore2

REGEXP_EXTRACT in google data studio


I'm trying to generate 3 new calculated fields in Google Data Studio / Looker Studio using the REGEXP_EXTRACT function.

Here is the sample data that I have on a Google sheet:

Sample data
Serviços de Impressão > Impressoras > Falha na impressão > Troca de Tonner
Aplicativos e Softwares > Avaliação de Aplicativos e Software > Pacote Office
Computadores e Periféricos > Manutenção > Teclado / Mouse > Aquisição de equipamento
Acessos > Certificado Digital

Each > represents a division and ideally what I would like to do is extract the first three fields and disregard the rest, something like this:

Calculated field 1 Calculated field 2 Calculated field 3
Serviços de Impressão Impressoras Falha na impressão
Aplicativos e Softwares Avaliação de Aplicativos e Software Pacote Office
Computadores e Periféricos Manutenção Teclado / Mouse
Acessos Certificado Digital null

I managed to generate a code to extract the first calculated field using

REGEXP_EXTRACT(Sample data,'^(.+?)>')

but in the second I didn't know how to do it, since I can always have one or more separators > as in the example of the last line.

How can I formulate the codes for calculated fields 2 and 3 please?


Solution

  • The second field can be extracted with

     > (.*?)(?: > |$)
    

    The third field - if any - can be obtained with

     > .*? > (.*?)(?: > |$)
    

    The (?: > |$) non-capturing group matches either space + > + space, or end of string.