Tags: python, confluence, confluence-rest-api, atlassian-python-api

How to download a Confluence page attachment with Python?


With the atlassian-python-api 1.15.1 module and Python 3.6, how can I download a file attached to a Confluence page?

The page actions section of the API documentation mentions a method, get_attachments_from_content, with which I can successfully obtain a list of all page attachments along with their metadata. At the end of this question there is an example of what I get when printing one of the items under the results key.

What I have already tried is using the wget module to download the attachment:

fname = wget.download(base_server_name + attachment['_links']['download'])

However, the downloaded file is not the attachment from the page. Instead I get a large HTML file that looks like a stripped-down login page. I'm also not sure wget is the right tool here; I'd prefer a solution that uses the atlassian-python-api itself, since it handles authentication on its own.

One item from the "results" key:

{'id': '56427526', 'type': 'attachment', 'status': 'current', 'title': 'main.c', 'metadata': {'mediaType': 'application/octet-stream', 'labels': {'results': [], 'start': 0, 'limit': 200, 'size': 0, '_links': {'self': 'https://foo.bar.com/confluence/rest/api/content/56427526/label'}}, '_expandable': {'currentuser': '', 'properties': '', 'frontend': '', 'editorHtml': ''}}, 'extensions': {'mediaType': 'application/octet-stream', 'fileSize': 363, 'comment': ''}, '_links': {'webui': '/pages/viewpage.action?pageId=14648850&preview=%2F14648850%2F56427526%2Fmain.c', 'download': '/download/attachments/14648850/main.c?version=1&modificationDate=1580726185883&api=v2', 'self': 'https://foo.bar.com/confluence/rest/api/content/56427526'}, '_expandable': {'container': '/rest/api/content/14648850', 'operations': '', 'children': '/rest/api/content/56427526/child', 'restrictions': '/rest/api/content/56427526/restriction/byOperation', 'history': '/rest/api/content/56427526/history', 'ancestors': '', 'body': '', 'version': '', 'descendants': '/rest/api/content/56427526/descendant', 'space': '/rest/api/space/~Tim'}}
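For readability, here is a minimal sketch of how the fields that matter for a download could be pulled out of such an item. The helper name `describe_attachment` and the trimmed-down `item` dict are only illustrative; they are not part of atlassian-python-api.

```python
from urllib.parse import urlsplit, parse_qs

# Trimmed copy of the item printed above; only the keys used below are kept.
item = {
    'id': '56427526',
    'title': 'main.c',
    'extensions': {'mediaType': 'application/octet-stream', 'fileSize': 363, 'comment': ''},
    '_links': {
        'download': '/download/attachments/14648850/main.c?version=1&modificationDate=1580726185883&api=v2',
        'self': 'https://foo.bar.com/confluence/rest/api/content/56427526',
    },
}


def describe_attachment(item):
    """Hypothetical helper: collect the fields relevant to downloading."""
    link = item['_links']['download']  # relative link, not a full URL
    query = parse_qs(urlsplit(link).query)
    return {
        'title': item['title'],
        'size': item['extensions']['fileSize'],
        'download': link,
        'version': int(query['version'][0]),
    }


info = describe_attachment(item)
print(info['title'], info['size'], info['version'])
```

Note that the download link is relative, which is why I prepend the server name to it in the wget call.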


Solution

  • While I didn't find a way to download the files directly with the atlassian-python-api module, I managed to do it with the requests module, thanks to this answer. Here's the code used to download all attachments visible in the page:

    from atlassian import Confluence
    import requests

    confluence = Confluence(
        url="https://my.server.com/Confluence",
        username="MyUsername",
        password="MyPassword")

    # List the page's attachments (metadata only), then fetch each file.
    attachments_container = confluence.get_attachments_from_content(page_id=12345678, start=0, limit=500)
    attachments = attachments_container['results']
    for attachment in attachments:
        fname = attachment['title']
        # The download link is relative to the Confluence base URL.
        download_link = confluence.url + attachment['_links']['download']
        # Authenticate explicitly; an anonymous request returns the login page.
        r = requests.get(download_link, auth=(confluence.username, confluence.password), stream=True)
        if r.status_code == 200:
            with open(fname, "wb") as f:
                # Write in 8 KiB chunks rather than the default of one byte at a time.
                for chunk in r.iter_content(chunk_size=8192):
                    f.write(chunk)
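The body of the loop above could also be factored into a small reusable function. This is only a sketch under some assumptions: `download_attachment` and its parameters are names I made up, and `session` is expected to behave like a `requests.Session` whose `auth` attribute has already been set to the username/password pair.

```python
import os


def download_attachment(session, base_url, attachment, dest_dir="."):
    """Hypothetical helper: stream one attachment to dest_dir, return its path."""
    # Join the base URL and the relative download link without doubling the slash.
    url = base_url.rstrip("/") + attachment["_links"]["download"]
    # basename() guards against a title that contains path separators.
    path = os.path.join(dest_dir, os.path.basename(attachment["title"]))
    with session.get(url, stream=True) as response:
        # Raise on 4xx/5xx instead of silently skipping the file.
        response.raise_for_status()
        with open(path, "wb") as f:
            for chunk in response.iter_content(chunk_size=8192):
                f.write(chunk)
    return path
```

With a `requests.Session()` whose `auth` is `(confluence.username, confluence.password)`, it would be called as `download_attachment(session, confluence.url, attachment)` for each item in `attachments`. Reusing one session keeps the credentials in a single place and lets requests reuse the connection between downloads.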