json git curl github

How to clone all repos (including private repos) from GitHub?


I'm trying to clone all of my repos at once to my computer, all of which are private. I've tried countless one-liners and scripts (namely, the ones here and here), but none of them work.

Initially I would get errors about JSON not being able to parse the response, which I eventually realized was because the response was empty: I had no public repos. When I made a test public repo, the API returned a JSON object with the info for that specific repo, but none of the private ones. From what I understand, I need to pass both my username and an access token to GitHub. I generated the access token at Settings > Developer settings > Personal access tokens.

I've tried both of the following formats to no avail:

curl -i -u [[USERNAME]]:[[TOKEN]] -s https://api.github.com/users/[[USERNAME]]/repos?per_page=100 [[...]]

curl -i -u [[USERNAME]] -s https://api.github.com/users/[[USERNAME]]/repos?per_page=100&access_token=[[TOKEN]] [[...]]

The [[...]] part that follows is various code snippets like the ones in the links above. I believe these parts are fine, as they clone public repos without any issues, and rather the issue lies in me not being able to see my private repos despite having an access token. It is important to note that when you generate the access token, you define the scopes for what it can do, and I've defined mine with full access to everything, including repo, which should grant it control over private repos.

Additionally, sometimes when I would try the command above, I would get the following response:

HTTP/1.1 401 Unauthorized
Server: GitHub.com
Date: Fri, 13 Oct 2017 08:08:01 GMT
Content-Type: application/json; charset=utf-8
Content-Length: 93
Status: 401 Unauthorized
X-GitHub-Media-Type: github.v3; format=json
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 41
X-RateLimit-Reset: 1507884238
Access-Control-Expose-Headers: ETag, Link, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval
Access-Control-Allow-Origin: *
Content-Security-Policy: default-src 'none'
Strict-Transport-Security: max-age=31536000; includeSubdomains; preload
X-Content-Type-Options: nosniff
X-Frame-Options: deny
X-XSS-Protection: 1; mode=block
X-Runtime-rack: 0.060685
X-GitHub-Request-Id: D038:4E67:1349CC:2CB494:59E07461

{
  "message": "Bad credentials",
  "documentation_url": "https://developer.github.com/v3"
}

This happens even though I know my credentials are fine.

Does anyone have any idea what's going wrong for me? I've been running circles around this for hours and have come up empty.


Solution

  • Alright, after days of trawling through random SO posts/Gists and the API docs, I figured it out. The breakthrough came from this post in particular: the issue was how I structured my GET request. There are two endpoints to choose from, /users/[[USERNAME]]/repos and /user/repos. The first only lists public repos, even when you authenticate; the second lists everything the authenticated user can access. GitHub doesn't make this obvious, go figure ¯\_(ツ)_/¯

    Here is a properly formatted curl command to get all (public AND private) repos for a user:

    curl -iH "Authorization: token [[TOKEN]]" https://api.github.com/user/repos
    

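    The [[TOKEN]] part should be your OAuth token (a personal access token). To generate one, read here, or in short: go to Settings > Developer settings > Personal access tokens and create a token with the repo scope.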

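    The -i flag includes the HTTP response headers in the output. The two important things to look for there are the status line, which should be 200 OK rather than 401 Unauthorized, and the X-OAuth-Scopes header, which lists the scopes granted to the token and should include repo.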

    The -H flag adds a request header, here the Authorization header that carries the token.
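When reading the -i output, the status line and the X-OAuth-Scopes header (one of the headers named in the 401 response above) are usually the quickest way to tell whether the token was accepted. Here is a minimal sketch that pulls just those two values out of a response; the function name summarize_headers is made up for illustration, and it assumes the header block is terminated by an empty line, as curl -i prints it.

```shell
# Print the HTTP status code and the X-OAuth-Scopes header from
# `curl -i` output read on stdin. Stops at the end of the header block.
summarize_headers() {
  tr -d '\r' | awk '
    NR == 1 { print "status: " $2 }
    tolower($1) == "x-oauth-scopes:" { $1 = ""; sub(/^ /, ""); print "scopes: " $0 }
    /^$/ { exit }
  '
}

# Possible usage (needs network access and a real token):
#   curl -si -H "Authorization: token [[TOKEN]]" https://api.github.com/user/repos | summarize_headers
```

A status of 401 with no scopes line would point at the token itself rather than at the endpoint.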

    Now, to clone all of your repos, use this one-line command. This gist here has many code snippets that achieve this in PHP, Ruby, Python, etc., but I personally like the bash solution:

    for i in $(curl -s -H "Authorization: token [[TOKEN]]" "https://api.github.com/user/repos?per_page=100" | grep ssh_url | cut -d ':' -f 2-3 | tr -d '",'); do git clone "$i"; done
    

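    Some notes about the above: per_page=100 is the largest page size the API accepts, so with more than 100 repos you also have to request the following pages (page=2, page=3, ...), and cloning via ssh_url only works if you have an SSH key registered with GitHub.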
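For accounts with more than 100 repos, one way to handle the extra pages is to loop until a page comes back without any ssh_url lines. This is only a sketch: the GITHUB_TOKEN variable and both function names are assumptions for illustration, and it relies on the API returning pretty-printed JSON (one field per line), just like the one-liner does.

```shell
# Extract the ssh_url values from a JSON response read on stdin,
# using the same grep/cut/tr approach as the one-liner.
extract_ssh_urls() {
  grep '"ssh_url"' | cut -d ':' -f 2-3 | tr -d '", '
}

# Print the ssh_url of every repo the token can see, one per line,
# requesting page after page until a page yields no URLs.
list_all_ssh_urls() {
  local page=1 urls
  while :; do
    urls=$(curl -s -H "Authorization: token ${GITHUB_TOKEN}" \
      "https://api.github.com/user/repos?per_page=100&page=${page}" | extract_ssh_urls)
    [ -z "$urls" ] && break   # empty page (or an error response): stop
    printf '%s\n' "$urls"
    page=$((page + 1))
  done
}

# Possible usage (needs network access and GITHUB_TOKEN set):
#   list_all_ssh_urls | while read -r url; do git clone "$url"; done
```

Keeping the URL extraction in its own function makes it easy to try against a saved response before pointing it at the live API.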

    EDIT: Something extra I came up with while messing around with the Ruby version of the logic: if you (as a user) are part of multiple organizations and don't want to download the repos from some of them, you can create a blacklist of strings that match the organization names. For example, I want to clone all repos I have access to, but I don't want to clone the repos in "Google" or "Twitter":

    curl -s -H "Authorization: token [[TOKEN]]" "https://api.github.com/user/repos?per_page=100" | ruby -rjson -e 'JSON.parse(STDIN.read).each { |repo| system("git", "clone", repo["ssh_url"]) unless ["Google", "Twitter"].any? { |org| repo["full_name"].include?(org) } }'
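The same blacklist idea can also be sketched in plain shell, for anyone who would rather not involve Ruby. The function name exclude_orgs is made up for illustration. It filters a list of SSH clone URLs on stdin and, as an approximation of matching on full_name, does a case-sensitive substring match against the owner/repo part of each URL (which also includes the trailing .git).

```shell
# Read clone URLs on stdin and print only those whose "owner/repo.git"
# part contains none of the strings given as arguments.
exclude_orgs() {
  local url path org skip
  while IFS= read -r url; do
    path=${url#*:}   # "git@github.com:Google/x.git" -> "Google/x.git"
    skip=0
    for org in "$@"; do
      case "$path" in
        *"$org"*) skip=1 ;;
      esac
    done
    [ "$skip" -eq 0 ] && printf '%s\n' "$url"
  done
  return 0
}

# Possible usage, given a file urls.txt with one SSH clone URL per line:
#   exclude_orgs Google Twitter < urls.txt | while read -r url; do git clone "$url"; done
```

Because this is a substring match, a short blacklist entry can exclude more than intended, so organization names should be spelled out in full.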