Tags: rest, github, curl, azure-devops

How do I get and download the contents of a file in GitHub using the REST API?


I am trying to use the GitHub API to retrieve and download the contents of a file in my GitHub repository to mimic how I am using the Azure DevOps REST API.

My ADO URL is:

https://dev.azure.com/<Org>/<Project>/_apis/git/repositories/<Repository>/items?versionType=branch&version=develop&path=<Path to file>/DEV1.yml&download=true

The above works perfectly: if I type it into my browser, it immediately downloads the DEV1.yml file.

I am trying to mimic this implementation using GitHub's REST API. My URL looks like:

https://api.github.com/repos/<Org>/<Repository>/contents/<Path to file>/DEV1.yml&download=true

This GitHub URL does not work in the browser. I assume that is because the repo is private and I need a token. However, when I use curl:

curl -H "Authorization: Bearer <PAT>" https://api.github.com/repos/<Org>/<Repository>/contents/<Path to file>/DEV1.yml&download=true

This returns a large JSON object containing base64-encoded content. I tried using this URL in place of the existing ADO one, and it failed.

Are the 2 above URLs not the same?


Solution

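  • Yes, they are different. GitHub's contents endpoint has no equivalent of ADO's download=true parameter; by default it returns JSON metadata about the file rather than the file itself.

    Downloading a file through the GitHub REST API takes two steps.

    1. The first step is to get the download URL.

    The URL format looks like this (<Project Name> here is the repository owner, i.e. the user or organization):

    https://api.github.com/repos/<Project Name>/<Repository Name>/contents/<File Name>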
    

    The response is a JSON object describing the file. Among other fields, it contains "content" (the file contents, base64-encoded) and "download_url".
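
    The JSON returned by step 1 already carries the file in its base64-encoded "content" field, so it can be decoded directly when that is more convenient. The following is a minimal sketch, assuming the response has the usual "encoding" and "content" fields; build_contents_url and decode_contents_response are hypothetical helper names, and the sample payload is made up for illustration.

```python
import base64
from urllib.parse import quote


def build_contents_url(owner: str, repo: str, path: str) -> str:
    """Build the contents-endpoint URL, percent-encoding the file path."""
    # quote() keeps "/" as-is by default, so directory separators survive.
    return f"https://api.github.com/repos/{owner}/{repo}/contents/{quote(path)}"


def decode_contents_response(payload: dict) -> bytes:
    """Return the raw file bytes from a contents-endpoint JSON response."""
    if payload.get("encoding") != "base64":
        # This sketch only handles inline base64 content.
        raise ValueError(f"unexpected encoding: {payload.get('encoding')!r}")
    # The base64 text may contain newlines; b64decode discards them by default.
    return base64.b64decode(payload["content"])


# Hand-made payload standing in for a real API response (illustrative only).
sample = {
    "encoding": "base64",
    "content": base64.b64encode(b"env: DEV1\n").decode(),
}
print(decode_contents_response(sample))  # b'env: DEV1\n'
```

    In real use, the payload would be the parsed JSON from the step 1 request rather than the hand-made dictionary above.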

    2. The second step is to use the download URL to get the file contents.

    The URL format looks like this:

    https://raw.githubusercontent.com/<Project Name>/<Repository Name>/<Branch Name>/<File Name>?token=<Random token related to the revision>
    

    Note that with this approach the first step cannot be skipped; without it you cannot obtain the token that the download URL requires.

    This can be done in Python:

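    import requests

    # Required information (replace the placeholders)
    project_name = "xxx"      # repository owner (user or organization)
    repository_name = "xxx"
    branch_name = "xxx"
    file_name = "xxx"         # path to the file inside the repository
    PAT = "xxx"               # personal access token
    url = f"https://api.github.com/repos/{project_name}/{repository_name}/contents/{file_name}"
    headers = {"Authorization": "token " + PAT}

    # Step 1: fetch the file metadata, which includes the download URL.
    # "ref" selects the branch; omit it to use the repository's default branch.
    metadata = requests.get(url, headers=headers, params={"ref": branch_name})
    metadata.raise_for_status()
    download_url = metadata.json()["download_url"]

    # Step 2: download the file contents from the download URL.
    file_content = requests.get(download_url, headers=headers)
    file_content.raise_for_status()
    print(file_content.text)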
    

    This successfully retrieves the latest contents of the file (I also tested against a private repository).
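
    As a possible alternative to the two-step flow, the contents endpoint can also be asked for the file itself by sending a raw media type in the Accept header, which GitHub documents as application/vnd.github.raw+json. The following is a minimal sketch under that assumption, using only the Python standard library; build_raw_request and download_file are hypothetical helper names, and the owner, repository, path, branch, and token values in the usage comment are placeholders.

```python
import urllib.request
from urllib.parse import quote, urlencode


def build_raw_request(owner, repo, path, ref, token):
    """Build a request asking the contents endpoint for raw file bytes."""
    url = (
        f"https://api.github.com/repos/{owner}/{repo}/contents/{quote(path)}"
        f"?{urlencode({'ref': ref})}"  # ref selects the branch, tag, or commit
    )
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            # Assumed media type: asks for raw contents instead of JSON metadata.
            "Accept": "application/vnd.github.raw+json",
        },
    )


def download_file(owner, repo, path, ref, token, dest):
    """Download one file from a repository and write it to dest."""
    request = build_raw_request(owner, repo, path, ref, token)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=30) as response:
        data = response.read()
    with open(dest, "wb") as fh:
        fh.write(data)
    return dest


# Example call (placeholders, not real values):
# download_file("<Org>", "<Repository>", "<Path to file>/DEV1.yml",
#               "develop", "<PAT>", "DEV1.yml")
```

    This keeps the request shape close to the ADO URL in the question: one authenticated GET, with the branch passed as a query parameter.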