from urllib.request import urlopen
import re
urlpath = urlopen("http://blablabla.com/file")
string = urlpath.read().decode('utf-8')
pattern = re.compile('*.docx"')
onlyfiles = pattern.findall(string)
print(onlyfiles)
Target output:
['http://blablabla.com/file/1.docx','http://blablabla.com/file/2.docx']
But I got this:
[]
When I run this code, I get the following error message:
re.error: nothing to repeat at position 0
It is caused by the star at the start of the pattern in this line:
pattern = re.compile('*.docx"')
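This is not a Python bug. In a regular expression, * is a quantifier that repeats the token before it, and at position 0 there is nothing before it to repeat. It is not a wildcard as in shell globbing.
See also this related question: regex error - nothing to repeat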
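To make the cause concrete, here is a minimal sketch that reproduces the error and then compiles a variant where the star has something to repeat. The sample string is made up for illustration, and the exact wording of the error message may vary between Python versions.

```python
import re

# A leading * has no preceding token to repeat, so compilation fails.
try:
    re.compile('*.docx"')
except re.error as exc:
    message = str(exc)
    print(message)

# ".*" means "any character, repeated", which is the regex counterpart of a glob "*".
# The dot before "docx" is escaped so it matches a literal ".".
ok = re.compile(r'.*\.docx"')
print(ok.findall('<a href="1.docx">'))  # hypothetical sample input
```

Note that the greedy `.*` also swallows everything in front of the file name, so this version is only meant to show why the original pattern failed.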
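Try a word-character or explicit character-class pattern instead, with raw strings and the dot escaped so that it matches a literal ".":
pattern = re.compile(r'\w*\.docx"')
# or
pattern = re.compile(r'[a-zA-Z0-9]*\.docx"')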
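If the goal is the full URLs from the target output, one possible approach is to capture the href value and join it to the base URL. This is only a sketch: the HTML string below is a hypothetical stand-in for whatever the page actually returns, and it assumes the links are relative href attributes in double quotes.

```python
import re
from urllib.parse import urljoin

# Assumed base URL (with trailing slash) and a made-up sample of the page content.
base_url = "http://blablabla.com/file/"
html = '<a href="1.docx">1</a> <a href="2.docx">2</a> <a href="notes.txt">n</a>'

# Capture only the href value when it ends in .docx; the closing quote stays outside the group.
pattern = re.compile(r'href="([^"]*\.docx)"')

# Resolve each captured name against the base URL.
onlyfiles = [urljoin(base_url, name) for name in pattern.findall(html)]
print(onlyfiles)
# ['http://blablabla.com/file/1.docx', 'http://blablabla.com/file/2.docx']
```

For real pages, an HTML parser is usually more robust than a regex, but for a simple directory listing this may be enough.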