git github merge

Is it possible to find all commits that origin/master pointed to at some point in history?


I am currently spelunking in the messy history of a large monorepo that enforced very few conventions about how work is merged in.

  1. Sometimes people rebased onto master and then fast-forward merged their branches with no merge commit.
  2. Sometimes people merged their code directly into master with a merge commit. In this case the first parent is master.
  3. Sometimes people reverse-merged master into their branch, and then fast-forwarded master to the merge commit they created. In this case the first parent belongs to the feature branch and the second parent is master.

Is there any magic variation of git log or other git commands that will show me the "linear" history of origin/master? That is, I want to see all the commits that were at some point pointed to by origin/master, and none of the commits that were only on a feature branch.

Effectively, I am trying to discover the commits that would hypothetically show up in git reflog master with a non-expiring reflog, except on the remote side (GitHub) where reflogs are not kept.

If everyone had picked strategy 1) or 2), then git log --first-parent might be sufficient for what I am attempting to do, but that does not work in this case because of the messy merging strategies.


Additional context:

I am ultimately trying to find the commit at the tip of origin/master roughly 24 hours before the current time. Sorting by commit date does not work, because the history includes commits that were committed earlier and only merged into origin/master within the last 24 hours. That is why I am trying to filter commits down to those that at one point lived on origin/master.


Solution

  • The first quick check you can run: see what local reflog you have for your local origin/master ref:

    git reflog origin/master
    

    This ref is updated only when you run git fetch (or git pull), so the granularity of the updates depends on how often you run it. If all you need is "what was the state of origin/master yesterday?", this should be enough.
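
As a rough illustration of that check, git can resolve a ref "as of" a point in time from the local reflog using the `ref@{date}` syntax. The sketch below wraps that in Python; the helper names `reflog_spec` and `local_tip_before` are made up for this example, and the answer is only as good as your fetch history. Git may print a warning on stderr if the reflog does not reach back far enough.

```python
import subprocess


def reflog_spec(ref="origin/master", hours=24):
    """Build a reflog date selector such as 'origin/master@{24 hours ago}'."""
    return f"{ref}@{{{hours} hours ago}}"


def local_tip_before(ref="origin/master", hours=24, repo="."):
    """Ask git which commit the LOCAL copy of `ref` pointed to `hours` ago.

    This reads the local reflog only, so it reflects the last fetch made
    before that moment, not the exact state of the remote.
    """
    result = subprocess.run(
        ["git", "-C", repo, "rev-parse", "--verify", reflog_spec(ref, hours)],
        capture_output=True,
        text=True,
        check=True,  # raises CalledProcessError if the ref or reflog is missing
    )
    return result.stdout.strip()
```

Calling `local_tip_before()` inside a clone is equivalent to running `git rev-parse --verify "origin/master@{24 hours ago}"` by hand.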


  • Although you don't have direct access to the reflog on GitHub, GitHub's API has an "events" API, which contains roughly the same information (and more):

    https://docs.github.com/en/rest/activity/events?apiVersion=2022-11-28

    more specifically:

    You will have to choose a way to query the API and filter the response on your side (e.g. curl + jq, node, python, go, you name it). The events you are interested in will match the following filter:

    The fields you are looking for are:
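
A minimal sketch of the query-and-filter step in Python, under stated assumptions: it assumes the relevant entries are `PushEvent` events whose `payload.ref` is `refs/heads/master`, and that `created_at`, `payload.before` and `payload.head` are the fields of interest. Those names follow GitHub's documented PushEvent payload rather than anything spelled out above, and the function names (`fetch_events`, `master_pushes`, `tip_at`) are hypothetical. Only the first page of events is fetched, so pagination is left out.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone

# Repository events endpoint of the GitHub REST API.
API = "https://api.github.com/repos/{owner}/{repo}/events"


def fetch_events(owner, repo, token=None):
    """Fetch the first page of repository events (performs a network call)."""
    req = urllib.request.Request(
        API.format(owner=owner, repo=repo),
        headers={"Accept": "application/vnd.github+json"},
    )
    if token:
        req.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def master_pushes(events, ref="refs/heads/master"):
    """Keep only pushes to `ref`, as (time, before, head), oldest first."""
    pushes = [
        (
            datetime.fromisoformat(e["created_at"].replace("Z", "+00:00")),
            e["payload"]["before"],
            e["payload"]["head"],
        )
        for e in events
        if e.get("type") == "PushEvent"
        and e.get("payload", {}).get("ref") == ref
    ]
    return sorted(pushes, key=lambda p: p[0])


def tip_at(pushes, when):
    """Return the sha `ref` pointed to at `when`, or None if nothing is known.

    Assumes `pushes` reaches back at least as far as `when`; otherwise the
    `before` of the oldest known push is only a best guess.
    """
    earlier = [p for p in pushes if p[0] <= when]
    if earlier:
        return earlier[-1][2]  # head of the last push at or before `when`
    return pushes[0][1] if pushes else None


# Example (needs network access; OWNER and REPO are placeholders):
# yesterday = datetime.now(timezone.utc) - timedelta(hours=24)
# print(tip_at(master_pushes(fetch_events("OWNER", "REPO")), yesterday))
```

Keeping the filtering separate from the HTTP call means the same logic can be applied to events saved from curl or any other client.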