How do I get a Confluence page_id given a page_url?
For example, if this is the display URL: https://confluence.som.yale.edu/display/SC/Finding+the+Page+ID+of+a+Confluence+Page
I want to get its page_id using the Confluence REST API.
Do you use atlassian-python-api?
In that case you can parse your URL to get the Confluence space (SC) and the page title (Finding the Page ID of a Confluence Page), then use confluence.get_page_id(space, title).
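from atlassian import Confluence

page_url = "https://confluence.som.yale.edu/display/SC/Finding+the+Page+ID+of+a+Confluence+Page"
# user and pwd are your Confluence credentials, defined elsewhere
confluence = Confluence(
    url='https://confluence.som.yale.edu/',
    username=user,
    password=pwd)
# The last two path segments are the space key and the page title
space, title = page_url.split("/")[-2:]
# In display URLs, spaces in the title are encoded as '+'
title = title.replace("+", " ")
page_id = confluence.get_page_id(space, title)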
Note that when your title contains a special character (+, ü, ä, ...), your page URL will already contain the id, like this: https://confluence.som.yale.edu/pages/viewpage.action?pageId=1234567890
so you might want to check for it first.
EDIT: here is a version of what your function could look like:
import re
from urllib.parse import unquote_plus

from atlassian import Confluence

# Regex pattern to match a pageId that is already present in the URL
page_id_in_url_pattern = re.compile(r"\?pageId=(\d+)")


def get_page_id_from_url(confluence, url):
    # If the URL already carries the id (viewpage.action?pageId=...), use it directly
    match = page_id_in_url_pattern.search(url)
    if match:
        return match.group(1)
    # Otherwise the last two path segments are the space key and the page title
    space, title = url.split("/")[-2:]
    # Decode percent-escapes (e.g. '%C3%BC') and turn '+' back into spaces
    title = unquote_plus(title)
    return confluence.get_page_id(space, title)


if __name__ == "__main__":
    from getpass import getpass

    user = input('Login: ')
    pwd = getpass('Password: ')
    page_url = "https://confluence.som.yale.edu/display/SC/Finding+the+Page+ID+of+a+Confluence+Page"
    confluence = Confluence(
        url='https://confluence.som.yale.edu/',
        username=user,
        password=pwd)
    print(get_page_id_from_url(confluence, page_url))
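Since the question asks about the REST API itself, here is a rough sketch of the same lookup without the wrapper library. It assumes a Confluence Server style instance where the content resource lives at /rest/api/content and accepts spaceKey and title query parameters (on Confluence Cloud the path is usually prefixed with /wiki). The helper names and the sample JSON payload are made up for illustration; only the URL building and response parsing are shown, so it runs without network access.

```python
import json
from urllib.parse import unquote_plus, urlencode, urlsplit


def build_content_lookup_url(base_url, space, title):
    """Build the REST URL that searches for a page by space key and title."""
    # urlencode escapes the title again (spaces become '+')
    query = urlencode({"spaceKey": space, "title": title})
    return f"{base_url.rstrip('/')}/rest/api/content?{query}"


def lookup_url_for_display_url(page_url):
    """Turn a /display/<space>/<title> URL into the matching REST lookup URL."""
    parts = urlsplit(page_url)
    space, title = parts.path.split("/")[-2:]
    base_url = f"{parts.scheme}://{parts.netloc}"
    return build_content_lookup_url(base_url, space, unquote_plus(title))


def extract_page_id(response_body):
    """Return the id of the first search result, or None if nothing matched."""
    results = json.loads(response_body).get("results", [])
    return results[0]["id"] if results else None


if __name__ == "__main__":
    page_url = "https://confluence.som.yale.edu/display/SC/Finding+the+Page+ID+of+a+Confluence+Page"
    print(lookup_url_for_display_url(page_url))

    # Hypothetical, trimmed-down response body; a real one has more fields
    sample_body = '{"results": [{"id": "1234567890", "type": "page"}], "size": 1}'
    print(extract_page_id(sample_body))
```

To actually query the server you would send an authenticated GET to the printed URL, for example with requests.get(url, auth=(user, pwd)), and pass the response text to extract_page_id.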