python pandas dataframe numpy code-splitting

Splitting a URL whose parameter values keep changing position in Python


I need to split a URL whose parameters change position very often.

For example, here is the URL with the request token in three different positions:

01:-https://127.0.0.1/?action=login&type=login&status=success&request_token=oCS44HJQT2ZSCGb39H76CjgXb0s2klwA

02:-https://127.0.0.1/?request_token=43CbEWSxdqztXNRpb2zmypCr081eF92d&action=login&type=login&status=success

03:-https://127.0.0.1/?&action=login&request_token=43CbEWSxdqztXNRpb2zmypCr081eF92d&type=login&status=success

From these URLs I need only the value of the request token, i.e. the alphanumeric string that comes after 'request_token=', such as '43CbEWSxdqztXNRpb2zmypCr081eF92d'.

To split the URL I'm using this code:

request_token = driver.current_url.split('=')[1].split('&action')[0]

But it fails when request_token is not in the position the code expects (directly after the '?' and followed by '&action').

Can anyone suggest a way to extract the token from such a URL in a single line of Python? It would be a great help.

Note: I'm using driver.current_url because I'm working with Selenium.


Solution

  • Assuming you have the URLs as strings, you could use a regular expression to isolate the request token wherever it appears in the query string.

    import re
    urls = ['https://127.0.0.1/?action=login&type=login&status=success&request_token=oCS44HJQT2ZSCGb39H76CjgXb0s2klwA',
            'https://127.0.0.1/?request_token=43CbEWSxdqztXNRpb2zmypCr081eF92d&action=login&type=login&status=success',
            'https://127.0.0.1/?&action=login&request_token=43CbEWSxdqztXNRpb2zmypCr081eF92d&type=login&status=success']
    for url in urls:
        # Lazily capture everything after 'request_token=' up to the next '&' or the end of the URL.
        m = re.match(r'.*request_token=(.*?)(?:&|$)', url)
        if m:
            print(m.group(1))
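
As an alternative to a regular expression, the standard library's urllib.parse module can parse the query string into a dictionary, so the position of request_token no longer matters. The following is only a minimal sketch under that assumption; the helper name get_request_token is hypothetical and not part of the original question or answer.

```python
from urllib.parse import urlparse, parse_qs


def get_request_token(url):
    """Return the value of the request_token query parameter, or None if it is absent.

    get_request_token is a hypothetical helper name used for illustration.
    """
    # urlparse isolates the part after '?'; parse_qs maps each name to a list of values.
    params = parse_qs(urlparse(url).query)
    values = params.get('request_token')
    return values[0] if values else None


urls = ['https://127.0.0.1/?action=login&type=login&status=success&request_token=oCS44HJQT2ZSCGb39H76CjgXb0s2klwA',
        'https://127.0.0.1/?request_token=43CbEWSxdqztXNRpb2zmypCr081eF92d&action=login&type=login&status=success',
        'https://127.0.0.1/?&action=login&request_token=43CbEWSxdqztXNRpb2zmypCr081eF92d&type=login&status=success']

for url in urls:
    print(get_request_token(url))
```

If a single line is preferred, the same idea should collapse to parse_qs(urlparse(driver.current_url).query)['request_token'][0], with the caveat that this form raises KeyError when the parameter is missing.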