github3.py

Recommended way to list all repos/commits for a given user using github3.py


I'm building a GitHub application to pull commit information from our internal repos. I'm using the following code to iterate over all commits:

gh = login(token=gc.ACCESS_TOKEN)
for repo in gh.iter_repos():
    for commit in repo.iter_commits():
        print(commit.__dict__)
        print(commit.additions)
        print(commit.author)
        print(commit.commit)
        print(commit.committer)
        print(commit.deletions)
        print(commit.files)
        print(commit.total)

The additions/deletions/total values all come back as 0, and the files attribute is always []. When I open the commit's url, I can see that this is not the case. I've verified through curl calls that the API does have a record of these attributes.

Reading further in the documentation, it seems that iter_commits is deprecated in favor of iter_user_commits. Might this be why it is not returning all the information about the commits? However, this method does not return any repositories for me when I use it like this:

gh = login(token=gc.ACCESS_TOKEN)
user = gh.user()
for repo in gh.iter_user_repos(user):

In short, I'm wondering what the recommended method is to get all commits for all the repositories a user has access to.


Solution

  • There's nothing wrong with using iter_repos on a logged-in GitHub instance.

    Here's what's happening (this is described in github3.py's documentation): when you list a resource through GitHub's API, not all of its attributes are returned. If you want the full information, you have to request it for each commit individually. Your code should look like this:

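    from github3 import login

    gh = login(token=gc.ACCESS_TOKEN)  # gc: the config object from the question
    for repo in gh.iter_repos():
        for commit in repo.iter_commits():
            # The listing returns partial commits; refresh() requests the full record.
            commit.refresh()
            print(commit.additions)
            print(commit.deletions)
            # etc.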
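
If you need the refreshed commits in more than one place, the same loop could be wrapped in a small generator. This is only a sketch: `iter_full_commits` is a hypothetical helper name, not part of github3.py, and it assumes the `iter_repos`, `iter_commits` and `refresh` methods used in the code above.

```python
def iter_full_commits(gh):
    """Yield (repo, commit) pairs with every commit fully populated.

    `gh` is assumed to be a logged-in github3.py GitHub instance that
    exposes iter_repos(), as in the question.
    """
    for repo in gh.iter_repos():
        for commit in repo.iter_commits():
            # The listing only carries a subset of the attributes, so ask
            # the API for the complete commit before handing it out.
            commit.refresh()
            yield repo, commit
```

It would be used as `for repo, commit in iter_full_commits(gh): print(commit.additions)`. Keep in mind that each `refresh()` is a separate API request, so walking every commit of every repository can use up a large share of your rate limit.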