I'm trying to extract all the #hashtags from the "Tags: #tag1 #tag2" line of a multimarkdown plaintext file. (I'm in Python multiline mode.)
I've tried using lookaheads:
^(?=Tags:\s.*)#(\w+)\b
and lookbehinds:
#(\w+)\b(?<=Tags:^\s)
Plain vanilla #(\w+)\b
works, except it picks up any #hashtag that might appear later in the document.
Any hints, help, instruction appreciated.
import re
text = "\n\n#bogus\nTags: #foo #bar\n"
First, you need to get the line:
line = re.findall(r'Tags:.+\n', text)
# line = ['Tags: #foo #bar\n']
Lastly, you need to get the tags from the line:
tags = re.findall(r'#(\w+)', line[0])
# tags = ['foo', 'bar']
Or, if you want to keep the leading #, drop the capture group:
tags = re.findall(r'#\w+', line[0])
# tags = ['#foo', '#bar']
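If it helps, the two steps can be wrapped in one function. This is only a sketch: the name extract_tags is made up for illustration, and it assumes you want the first "Tags:" line and an empty list when there is none. It anchors the search with ^ and $ in multiline mode, so the line is still found when the file does not end with a newline.

```python
import re

def extract_tags(text):
    """Return the hashtags on the first 'Tags:' line, without the '#'."""
    # re.MULTILINE makes ^ and $ match at every line boundary,
    # so the Tags: line does not need a trailing newline.
    match = re.search(r'^Tags:.*$', text, re.MULTILINE)
    if match is None:
        return []  # no Tags: line in this document
    # Second step: pull the hashtags out of that one line only.
    return re.findall(r'#(\w+)', match.group(0))

text = "\n\n#bogus\nTags: #foo #bar\n"
print(extract_tags(text))  # ['foo', 'bar']
```

Using re.search instead of re.findall for the first step avoids indexing line[0], which would raise an IndexError on a document without a Tags: line.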
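A lookbehind won't work here: Python's re module only accepts fixed-width lookbehind patterns, and the amount of text between "Tags:" and each hashtag varies. The lookahead attempt fails for a different reason: (?=Tags:\s.*) consumes no characters, so the # that follows it would have to sit at the very start of the line, which is where the T of "Tags:" is.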
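To see the lookbehind limitation for yourself, here is a small check using only the standard re module. The helper name compiles is just for illustration. A variable-width lookbehind is rejected at compile time, while a fixed-width one compiles but can only reach the hashtag directly after "Tags: ".

```python
import re

def compiles(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

text = "\n\n#bogus\nTags: #foo #bar\n"

# Variable-width lookbehind (.* can be any length): rejected by re.
print(compiles(r'(?<=Tags:.*)#(\w+)'))  # False

# Fixed-width lookbehind compiles, but only the first tag qualifies,
# because #bar is not immediately preceded by "Tags: ".
print(re.findall(r'(?<=Tags: )#(\w+)', text))  # ['foo']
```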