Tags: excel, excel-formula, extract, multiple-occurrence

MS Excel: Extract between the Nth and Nth Character in a string


Using an MS Excel formula (no VBA/macro), I would like to extract the text between the Nth and Nth occurrence of a character or letter in a string. Example: I have text in cells A2 and A3, and I would like to extract the text located between the 4th space and the 9th space in each of the following strings.

Cell A2: Johnny went to the store and bought an apple and some grapes

Cell A3: We were not expecting to go home so early but we had no other choice due to rain

Results:

Cell A2: store and bought an apple

Cell A3: to go home so early


Solution

  • With Microsoft 365 or Excel 2019 you can use TEXTJOIN() in combination with FILTERXML():

    Formula in B1:

    =TEXTJOIN(" ",,FILTERXML("<t><s>"&SUBSTITUTE(A1," ","</s><s>")&"</s></t>","//s[position()>4][position()<6]"))
    

    The XPath expression first selects all elements from the 5th word onwards and then returns only the first 5 elements of that node set. Therefore, //s[position()>4][position()<6] can also be written as //s[position()>4 and position()<10].
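
    The word positions can be sanity-checked outside Excel. The sketch below is not part of the formula solution; it only mirrors the same selection (words 5 through 9, i.e. the text between the 4th and 9th space) in Python. The function name and parameters are made up for illustration, and it assumes words are separated by single spaces.

```python
def text_between_spaces(text, start_space, end_space):
    """Return the text between the start_space-th and end_space-th space.

    Hypothetical helper: splitting on " " gives the same word list that
    SUBSTITUTE builds as <s> elements, so words[start_space:end_space]
    corresponds to //s[position()>start_space and position()<=end_space].
    """
    words = text.split(" ")
    return " ".join(words[start_space:end_space])


a2 = "Johnny went to the store and bought an apple and some grapes"
a3 = "We were not expecting to go home so early but we had no other choice due to rain"

print(text_between_spaces(a2, 4, 9))  # store and bought an apple
print(text_between_spaces(a3, 4, 9))  # to go home so early
```

    Both outputs match the expected results in the question, which confirms that position()>4 and position()<10 are the right bounds for the 4th and 9th space.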