I want to return all the words which start and end with letters or numbers. They may contain at most one period (.) or hyphen (-) in the word.
So, ab.ab is valid but ab. is not valid.
import re
reg = r"[\d\w]+([-.][\d\w]+)?"
s = "sample text"
print(re.findall(reg, s))
It is not working because of the parentheses. How can I apply the ? to the combination [-.][\d\w]+ as a whole?
If ab. is not valid and should not be matched, and the period or the hyphen should not be at the start or at the end, you could match one or more digits or letters, followed by an optional part that matches a dot or a hyphen followed by one or more digits or letters. Making that optional part a non-capturing group, (?:...), also means re.findall returns the whole match instead of only the contents of the group.
(?<!\S)[a-zA-Z\d]+(?:[.-][a-zA-Z\d]+)?(?!\S)
Explanation
(?<!\S)
Negative lookbehind to assert that what is on the left is not a non-whitespace character
[a-zA-Z\d]+
Match one or more lowercase/uppercase letters or digits
(?:[.-][a-zA-Z\d]+)?
An optional non-capturing group that matches a dot or a hyphen followed by one or more lowercase/uppercase letters or digits
(?!\S)
Negative lookahead to assert that what is on the right is not a non-whitespace character
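As a rough sketch of how this behaves in Python: the sample strings below are made up for illustration and are not from the question. The first call shows why the original pattern looked broken (with one capturing group, re.findall returns only that group's text), and the second uses the non-capturing pattern from above.

```python
import re

# Pattern from the question: the parentheses form a capturing group,
# so re.findall returns only what that group matched, not the whole word.
original = r"[\d\w]+([-.][\d\w]+)?"
print(re.findall(original, "ab.ab abc"))  # ['.ab', '']

# Pattern from this answer: non-capturing group plus whitespace boundaries.
pattern = re.compile(r"(?<!\S)[a-zA-Z\d]+(?:[.-][a-zA-Z\d]+)?(?!\S)")

# Hypothetical sample text (an assumption, chosen to cover the edge cases).
s = "ab.ab ab. a-b .ab abc a.b.c 42"

matches = pattern.findall(s)
print(matches)  # ['ab.ab', 'a-b', 'abc', '42']
```

Note that this treats "words" as whitespace-delimited tokens, so a word directly followed by punctuation such as a comma would not match; if that matters, the lookarounds would need to be adjusted.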