I have dummy data related to product code of books:
XXX | AAA | BB | Document Code | Test Type |
---|---|---|---|---|
P01 | BOO | 01 | BK-001-CR | 1. Content Quality Testing - Proofreading - Fact-checking - Final editing 2. Design and Layout Testing |
P01 | BOO | 02 | BK-002-CR | 1. Physical Production Testing 2. Distribution Testing |
P02 | OOK | 01 | BK-001-CR | 1. Content Quality Testing - Proofreading - Fact-checking - Final editing 2. Design and Layout Testing |
P03 | KOO | 01 | BK-001-RT | 1. Content Quality Testing - Proofreading - Fact-checking - Final editing 2. Design and Layout Testing |
as listed in this link.
I want to clean it up into expected output as follows:
Product Code | Document Code | Test Type |
---|---|---|
P01.BOO.01 P02.OOK.01 |
BK-001-CR | Content Quality Testing - Proofreading - Fact-checking - Final editing |
P01.BOO.01 P02.OOK.01 |
BK-001-CR | Design and Layout Testing |
P01.BOO.02 | BK-002-CR | Physical Production Testing |
P01.BOO.02 | BK-002-CR | Distribution Testing |
P03.KOO.01 | BK-001-RT | Content Quality Testing - Proofreading - Fact-checking - Final editing |
P03.KOO.01 | BK-001-RT | Design and Layout Testing |
as listed in this link.
For the output, I have tried with formula in Product Code column:
=ARRAYFORMULA(
TEXTJOIN(" "; TRUE;
FILTER(
IMPORTRANGE("https://docs.google.com/spreadsheets/d/1dIVsBszYwQnELHwu8C7BaamZnQTz2zogzKlrkq72IZQ/edit?gid=0#gid=0"; "Test!B2:B5") & "." &
IMPORTRANGE("https://docs.google.com/spreadsheets/d/1dIVsBszYwQnELHwu8C7BaamZnQTz2zogzKlrkq72IZQ/edit?gid=0#gid=0"; "Test!C2:C5") & "." &
IMPORTRANGE("https://docs.google.com/spreadsheets/d/1dIVsBszYwQnELHwu8C7BaamZnQTz2zogzKlrkq72IZQ/edit?gid=0#gid=0"; "Test!D2:D5");
(REGEXMATCH(IMPORTRANGE("https://docs.google.com/spreadsheets/d/1dIVsBszYwQnELHwu8C7BaamZnQTz2zogzKlrkq72IZQ/edit?gid=0#gid=0"; "Test!F2:F5"); I2) +
REGEXMATCH(IMPORTRANGE("https://docs.google.com/spreadsheets/d/1dIVsBszYwQnELHwu8C7BaamZnQTz2zogzKlrkq72IZQ/edit?gid=0#gid=0"; "Test!B2:B5"); H2)) > 0
)
)
)
In Document Code column:
=QUERY(IMPORTRANGE("https://docs.google.com/spreadsheets/d/1dIVsBszYwQnELHwu8C7BaamZnQTz2zogzKlrkq72IZQ/edit?gid=0#gid=0"; "Test!E2:F5"); "SELECT Col1 WHERE Col2 CONTAINS '" & I2 & "'")
In Test Type column:
=ARRAYFORMULA(
UNIQUE(
FILTER(
TRIM(
FLATTEN(
SPLIT(
REGEXREPLACE(
IMPORTRANGE("https://docs.google.com/spreadsheets/d/1dIVsBszYwQnELHwu8C7BaamZnQTz2zogzKlrkq72IZQ/edit?gid=0#gid=0"; "Test!F2:F5");
"\d+\.";
"~"
);
"~"
)
)
);
(TRIM(FLATTEN(SPLIT(REGEXREPLACE(IMPORTRANGE("https://docs.google.com/spreadsheets/d/1dIVsBszYwQnELHwu8C7BaamZnQTz2zogzKlrkq72IZQ/edit?gid=0#gid=0"; "Test!F2:F5"); "\d+\."; "~"); "~"))) <> "")
)
)
)
However, there is still an error and it still does not match the expected output.
Is there any other formula?
Here's a generalized approach which you may adapt accordingly:
=let(Λ;importrange("1dIVsBszYwQnELHwu8C7BaamZnQTz2zogzKlrkq72IZQ";"B2:F7");Σ;lambda(x;choosecols(Λ;x));a;Σ(1);b;Σ(2);c;Σ(3);d;Σ(4);e;Σ(5);
reduce(tocol(;1);unique(tocol(d;1));lambda(x;y;vstack(x;reduce(tocol(;1);sequence(max(index(ifna(--regexextract(split(xlookup(y;d;e);char(10));"^\d+")))));lambda(f;q;vstack(f;
hstack(join(char(10);filter(a&"."&b&"."&c;d=y));y;regexextract(xlookup(y;d;e);"(?s).*"&q&". (.*?)(?:\n"&q+1&". |$)")))))))))