With this Python code,
...
resp = logout_session.get(logout_url, headers=headers, verify=False, allow_redirects=False)
soup = BeautifulSoup(resp.content, "html.parser")
print(soup.prettify())
I was able to make an API call, and the response content looks like this:
<!DOCTYPE html>
<html>
<head>...</head>
<body>
<div class="container">
<div class="title logo" id="header">
<img alt="" id="business-logo-login" src="/customviews/image/business_logo:f0a067275aba3c71c62cffa2f50ac69c/"/>
</div>
<div class="input-group alert alert-success text-center" id="title" role="alert">
Successfully signed out
</div>
<div class="input-group alert text-center">
<a href="/saml-idp/portal/">
Login again
</a>
</div>
<div>
<p>
You will be redirected to https://idpftc.business.com/saml/Gy736KPK3v1aWDPECRZKAn/proxy_logout/ after 5 seconds ...
</p>
<script language="javascript" nonce="">
window.onload = window.setTimeout(function() {
window.location.replace("https://idpftc.business.com/saml/Gy736KPK3v1aWDPECRZKAn/proxy_logout/?SAMLResponse=3VjJkuNIjv2VtKijLJObJIphlWnGfd93Xtoo7vsukvr6ZkRU");}, 5000);
</script>
</div>
</div>
</body>
</html>
Now I want to extract this URL:
https://idpftc.business.com/saml/Gy736KPK3v1aWDPECRZKAn/proxy_logout/?SAMLResponse=3VjJkuNIjv2VtKijLJObJIphlWnGfd93Xtoo7vsukvr6ZkRU
from this content. Does anyone know how to do this in Python?
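The URL sits inside inline JavaScript, not in an HTML attribute, so BeautifulSoup cannot select it directly. Try a regular expression that captures the quoted argument of window.location.replace():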
import re
# resp = requests.get(...)
url = re.search(r'window\.location\.replace\("([^"]+)', resp.text).group(1)
print(url)
Prints:
https://idpftc.business.com/saml/Gy736KPK3v1aWDPECRZKAn/proxy_logout/?SAMLResponse=3VjJkuNIjv2VtKijLJObJIphlWnGfd93Xtoo7vsukvr6ZkRU
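Note that the one-liner above raises an AttributeError if the page ever comes back without the redirect script, because re.search() returns None when nothing matches. A slightly more defensive variant might look like the sketch below. The helper name extract_redirect_url and the sample string are my own, made up for illustration; the sample is just the script block copied from your output, and accepting single quotes as well as double quotes is an assumption about what the server could send.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlsplit

# Matches window.location.replace("...") or window.location.replace('...')
# and captures the quoted argument.
REDIRECT_RE = re.compile(r'window\.location\.replace\(\s*["\']([^"\']+)["\']')


def extract_redirect_url(html: str) -> Optional[str]:
    """Return the redirect URL found in the inline script, or None if absent."""
    match = REDIRECT_RE.search(html)
    return match.group(1) if match else None


# Stand-in for resp.text, copied from the response shown in the question.
sample = '''<script language="javascript" nonce="">
window.onload = window.setTimeout(function() {
window.location.replace("https://idpftc.business.com/saml/Gy736KPK3v1aWDPECRZKAn/proxy_logout/?SAMLResponse=3VjJkuNIjv2VtKijLJObJIphlWnGfd93Xtoo7vsukvr6ZkRU");}, 5000);
</script>'''

url = extract_redirect_url(sample)
print(url)

# If only the SAMLResponse value is needed, pull it out of the query string.
saml_response = parse_qs(urlsplit(url).query)["SAMLResponse"][0]
print(saml_response)
```

In your script you would pass resp.text instead of sample and check for None before using the result.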