I'm trying to generate 3 new calculated fields in Google Data Studio / Looker Studio using the REGEXP_EXTRACT
function.
Here is the sample data that I have on a Google sheet:
Sample data |
---|
Serviços de Impressão > Impressoras > Falha na impressão > Troca de Tonner |
Aplicativos e Softwares > Avaliação de Aplicativos e Software > Pacote Office |
Computadores e Periféricos > Manutenção > Teclado / Mouse > Aquisição de equipamento |
Acessos > Certificado Digital |
Each >
represents a division and ideally what I would like to do is extract the first three fields and disregard the rest, something like this:
Calculated field 1 | Calculated field 2 | Calculated field 3 |
---|---|---|
Serviços de Impressão | Impressoras | Falha na impressão |
Aplicativos e Softwares | Avaliação de Aplicativos e Software | Pacote Office |
Computadores e Periféricos | Manutenção | Teclado / Mouse |
Acessos | Certificado Digital | null |
I managed to generate a code to extract the first calculated field using
REGEXP_EXTRACT(Sample data,'^(.+?)>')
but in the second I didn't know how to do it, since I can always have one or more separators >
as in the example of the last line.
How can I formulate the codes for calculated fields 2 and 3 please?
The second field can be extracted with
> (.*?)(?: > |$)
The third field - if any - can be obtained with
> .*? > (.*?)(?: > |$)
The (?: > |$)
non-capturing group matches either space + >
+ space, or end of string.