regexgoogle-sheetsregex-replace

Extract text from a cell without extracting punctuation in Google Sheets


In Column F, I have a list of states and their abbreviations.

enter image description here

I want to use a regex function to extract the state and the state code so the output looks like the contents in Columns G and H.

How do I do this? Thanks!

When I type in:

=regexextract(F3,"(\w+)(\s\w+)?")

The function works for states with one word names like Alaska and Florida, but separates the contents of multi-word states into multiple columns. it also fails to capture all three words in "District of Columbia". I could fix it to capture 3 words, but it wouldn't resolve the fundamental problem of keeping the output in a single cell in a single column.

enter image description here

I am at a loss of how to extract the state code from the quotation marks.


Solution

  • You can capture word characters optionally repeated by a space and again word characters in the first group, and capture all between the double quotes in the second group.

    =regexextract(F2,"(\w+(?:\s+\w+)*),""([^""]*)""")
    

    enter image description here