I want to match if one string is contained in another, regardless the order of characters. For example, if I have a string submarine I want to be able to detect marines as a match.
The way I'm currently handling this is through lists:
def match(x, y):
x, y = list(x), list(y)
for i in x:
try:
y.remove(i)
except ValueError:
return False
return True
But this is inefficient when I try to match many combinations.
I thought then to use regex, but didn't make it.
Any ideas?
You can make use of a character class [SEARCH_WORD]
where each character will be searched for independently. By setting the +
quantifier after it, you will look for 1 or more characters, and by adding \b
word boundaries, you will only match whole words:
r'\b[submarine]+\b'
See the regex demo and the IDEONE demo:
import re
s = "I have a string submarine I want to be able to detect marines as a match"
kw = "submarine"
r = re.compile(r"\b[{0}]+\b".format(kw))
print(r.findall(s))
NOTE: If your input can contain non-word characters, especially characters like ^
, ]
, \
or -
, escape those with re.escape
and use r"(?<!\w)[{0}]+(?!\w)".format(re.escape("submarine"))
.
import re
s = "I have a string ^submarine I want to be able to detect ^marines as a match"
kw = "^submarine"
r = re.compile(r"(?<!\w)[{0}]+(?!\w)".format(re.escape(kw)))
print(r.findall(s))
See IDEONE demo