maybe someone of you can help me or maybe point me in the right direction how to solve this problem in excel.
I have a sheet with cells containing mixed values like:
texxtext W.1234.56 text
text text U.3853.23
P.3853.34 texttext
I just need the specific ArticleNumber (X.XXXX.XX) in between. Like this RegEx Pattern ([A-Za-z].\d\d\d\d.\d\d)
Is there a way to achieve this with Formular Functions?
Thanks in advance!
[A-Za-z].\d\d\d\d.\d\d
PY():
If applicable, you can use the PY()
function for this:
Formula in B1
:
=PY(
xl("A1:A3").replace(r'.*\b([A-Za-z]\.\d{4}\.\d{2})\b.*', r'\1', regex=True)[0].values
)
‡ At time of writing PY()
is only available within the beta-channel of a ms365 subscription.
SEARCH():
Otherwise, try something more simple first to see if it would be sufficient to use wildcard in a regular SEARCH()
pattern:
=MID(A1:A3,SEARCH("?.????.??",A1:A3),9)
‡ With this method I assume there is no, or a very slim, chance that there are other substring in your text that follow that exact pattern.
FILTERXML():
Another, more intricate way would be to use FILTERXML()
. That way you can utilize xpath-patterns that can check for the pattern at hand:
=FILTERXML("<t><s>"&SUBSTITUTE(A1:A3," ","</s><s>")&"</s></t>","//s[string-length()=9][concat(substring(., 2, 1),substring(.,7,1))='..'][translate(substring(.,1,1),'ABCDEFGHIJKLMNOPQRSTUVWXYZ','')=''][concat(substring(.,2,4),substring(.,8))*0=0]")
Here the and-ing structure of predicates mean:
[string-length()=9]
- Check that a node is 9 characters long;[concat(substring(., 2, 1),substring(.,7,1))='..']
- When we concatenate the 2nd and 7th character it equals '..';[translate(substring(.,1,1),'ABCDEFGHIJKLMNOPQRSTUVWXYZ','')='']
- If we translate the character at the 1st index from any (uppercase) alphachar to nothing it equals an empty string;[concat(substring(.,2,4),substring(.,8))*0=0]
- If we concat the remaining characters and multiply this by zero it would equal zero.‡ Here I have assumed a space-delimited string to begin with. We use the spaces to split the text into nodes. This is not as straightforward nomore if punctuation comes into play.
‡‡ More information on FILTERXML()
here