python, regex, urlparse

Split a URL into its components in Python


I have a huge list of URLs that are all like this:

http://www.example.com/site/section1/VAR1/VAR2

Here VAR1 and VAR2 are the dynamic elements of the URL. I want to extract only VAR1 from this URL string. I've tried to use urlparse, but the output looks like this:

ParseResult(scheme='http', netloc='www.example.com', path='/site/section1/VAR1/VAR2', params='', query='', fragment='')

Solution

  • You can apply the str.split() method and slice off the last two segments; VAR1 is the first element of the result:

    >>> url = "http://www.example.com/site/section1/VAR1/VAR2"
    >>> url.split("/")[-2:]
    ['VAR1', 'VAR2']
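Splitting the whole URL string works for the example, but it would pick up the wrong piece if a URL carried a query string or a trailing slash. A minimal sketch that splits only the path component is shown here. It assumes VAR1 is always the second-to-last path segment, as in the question; the function name extract_var1 is made up for illustration.

```python
from urllib.parse import urlsplit


def extract_var1(url):
    """Return the second-to-last path segment (VAR1 in the question's layout)."""
    # Work on the path only, so the scheme, host, query string and
    # fragment cannot interfere with the split.
    path = urlsplit(url).path
    # Drop empty pieces produced by the leading (or a trailing) slash.
    segments = [segment for segment in path.split("/") if segment]
    if len(segments) < 2:
        raise ValueError(f"expected at least two path segments in {url!r}")
    return segments[-2]


print(extract_var1("http://www.example.com/site/section1/VAR1/VAR2"))  # VAR1
```

For the huge list mentioned in the question, something like [extract_var1(u) for u in urls] would apply it to every entry, assuming all URLs follow the same layout.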