Want to extract outline values in a google sheet, using a formula
Tried Regex that will work with Arrayformula(IFERROR(REGEXEXTRACT(RANGE,"REGEX")))
almost there but I'm thinking someone has the solution or an easier way to do this?
1(a) Background
1(a)(i) Historical context
1(a)(i)(A) Early influences
1(a)(i)(A)(1) Roman civilization
IIIV.
AA. Background
1.1 Historical context
34 Historical context
a. Early influences
i. Roman civilization
- Background
- Historical context
We are
I. Roman civilization
II. Romans
IV.
IV. I am sad
iii
Expected output
1(a)
1(a)(i)
1(a)(i)(A)
1(a)(i)(A)(1)
IIIV.
AA.
1.1
34
a.
i.
-
-
I.
II.
IV.
IV.
iii
This seems to work but everything gets split over multiple columns, Removing some of the parenthesis from capture groups to prevent split resulted in not extracting "-"
Arrayformula(IFERROR(REGEXEXTRACT(RANGE,"REGEX")))
Regex:
^(([A-Za-z]|\d+)((\([A-Za-z\d]+\))|(\.\d+))*\.?|\-)\s+([^\n]+)
Possible solutions?
concatenating the output Arrayformula([working CONCATENATE formula for variable columns](IFERROR(REGEXEXTRACT(RANGE,"REGEX")))
I think the capture groups are needed for the regex to work, if thats true and if the output with google sheets by default uses a split() any ()into a new column, then perhaps using a substitute to replace any "(" or ")" with ~
prior to the regexextract
and then after extracting these values
Then I was thinking to just use something like Regexreplace(Range," .+","")
but this would extract out the first word of a sentence.
Can you test:
=arrayformula(ifna(regexextract(A:A&"","^.*?[^a-zA-Z](?: |$)"),ifna(regexextract(A:A&"","^[a-z]*(?: |$)"))))