python github3.py

How to solve exception 410 in Python?


I am trying to extract issue data from a repo on GitHub using github3.py. The following is the part of my code that extracts issues from a repo.

I used these libraries in the main code:

from github3 import login
from mysql.connector import IntegrityError
import config as cfg
import project_list
from github3.exceptions import NotFoundError
from github3.exceptions import GitHubException
import datetime
from database import Database
import sys
import re
import time

Then the main code is:

DEBUG = False

def process(url, start):

    re_pattern = re.compile(u'[^\u0000-\uD7FF\uE000-\uFFFF]', re.UNICODE)
    splitted = url.split("/")
    org_name = splitted[3]
    repo_name = splitted[4]

    while True:
        try:
            gh = login(token = cfg.TOKEN)
            repo = gh.repository(org_name, repo_name)

            print("{} =====================".format(repo))
            if start is None:
                i = 1
            else:
                i = int(start)
            if start is None:
                j = 1
            else:
                j = int(start)
            Database.connect()
            while True:
                
                try:
                    issue = repo.issue(i)
                    
                    issue_id = issue.id
                    issue_number = issue.number
                    status_issue = str(issue.state)                    
                    close_author = str(issue.closed_by)
                    com_count = issue.comments_count
                    title = re_pattern.sub(u'\uFFFD', issue.title)
                    created_at = issue.created_at
                    closed_at = issue.closed_at
                    now = datetime.datetime.now()
                    reporter = str(issue.user)
                    body_text = issue.body_text
                    body_html = issue.body_html

                    if body_text is None:
                        body_text = ""
                    if body_html is None:
                        body_html = ""

                    body_text = re_pattern.sub(u'\uFFFD', body_text)
                    body_html = re_pattern.sub(u'\uFFFD', body_html)
                    Database.insert_issues(issue_id, issue_number, repo_name,status_issue , close_author, com_count, title, reporter, created_at, closed_at, now, body_text, body_html)
                    print("{} inserted.".format(issue_id))

                    if DEBUG:
                        break

                except NotFoundError as e:
                    print("Exception @ {}: {}".format(i, str(e)))
                
                except IntegrityError as e:
                    print("Data was there @ {}".format(str(e)))
                i += 1
                j += 1
        except GitHubException as e:
            print("Exception: {}".format(str(e)))
            time.sleep(1000)
            i -= 1
            j -= 1

if __name__ == "__main__":
    if len(sys.argv) == 1:
        sys.exit("Please specify project name: python issue-github3.py <project name>")

    if len(sys.argv) == 2:
        start = None
        print("Start from the beginning")
    else:
        start = sys.argv[2]

    project = sys.argv[1]
    url = project_list.get_project(project)
    process(url, start)

The code above works, and I can extract issues from a repo on GitHub.

Problem: after 100 issues have been extracted successfully from a repo, the script prints "Exception: 410 Issues are disabled for this repo".

How could I solve this problem?

As shown in the main code, I handled exception 404 (i.e., issue not found) by importing NotFoundError with from github3.exceptions import NotFoundError and adding the code below:

except NotFoundError as e:
    print("Exception @ {}: {}".format(i, str(e)))

Given the main code, which exception class and code should I use to handle exception 410?


Solution

  • I found an easy workaround, but it doesn't solve the problem completely.

    As I mentioned before, the exception occurs after 100 issues have been extracted successfully (i.e., issue number 101 is the problem), and as @LhasaDad said above, the repo has no issue number 101 (I checked manually). So we just need to put 102 instead of None where start = None, then run the code again.
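
A more general workaround than editing the start value is to skip any issue number whose lookup fails with a 410, the same way the 404 case is skipped. The message printed in the question ("Exception: ...") matches the outer except GitHubException handler, so the 410 is caught there, and after the sleep the while loop starts over and resets i from start. The sketch below is not github3.py code: FakeResponseError and fake_fetch are hypothetical stand-ins, and it assumes the real exception exposes the HTTP status as a code attribute, which should be checked against the installed github3.py version.

```python
class FakeResponseError(Exception):
    """Hypothetical stand-in for a github3.py response error (not the real class)."""

    def __init__(self, code, message):
        super().__init__("{} {}".format(code, message))
        self.code = code  # assumed attribute holding the HTTP status


def collect_issues(fetch_issue, start, stop, skip_codes=(404, 410)):
    """Fetch issue numbers start..stop inclusive.

    Numbers whose lookup fails with a status in skip_codes are skipped.
    Returns a tuple (fetched, skipped).
    """
    fetched, skipped = [], []
    for number in range(start, stop + 1):
        try:
            fetched.append(fetch_issue(number))
        except Exception as exc:
            # Assumption: the error carries the HTTP status in `code`.
            if getattr(exc, "code", None) in skip_codes:
                print("Skipping issue {}: {}".format(number, exc))
                skipped.append(number)
                continue
            raise  # other failures (rate limit, network) are not hidden
    return fetched, skipped


def fake_fetch(number):
    """Hypothetical replacement for repo.issue(number); issue 101 is gone."""
    if number == 101:
        raise FakeResponseError(410, "Issues are disabled for this repo")
    return {"number": number}


fetched, skipped = collect_issues(fake_fetch, start=99, stop=103)
print([issue["number"] for issue in fetched])  # [99, 100, 102, 103]
print(skipped)                                 # [101]
```

In the script from the question, fetch_issue would correspond to repo.issue, and the check on the status code would go in the inner try block next to the NotFoundError handler, so that i keeps advancing instead of the whole loop restarting.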