git dvc

"dvc push" after several local commits


I work on a project that uses DVC (Data Version Control). Let's say I make several local commits, something like this:

# make changes for experiment 1
dvc add my_data_file
git add my_data_file.dvc
git commit -m "Experiment 1"

# make changes for experiment 2
# which change both code and data
dvc add my_data_file
git add my_data_file.dvc
git commit -m "Experiment 2"

# make changes for experiment 3
# which change both code and data
dvc add my_data_file
git add my_data_file.dvc
git commit -m "Experiment 3"

# Finally I'm done
# push changes:
dvc push
git push

However, there is one problem: dvc push will only push the data from experiment 3. Is there any way to push the data from all local commits (i.e. starting from the first commit that diverged from the remote branch)?

Currently I see two options:

  1. Tag each commit and push it with dvc push -T
  2. After the "Experiment 3" commit, execute git checkout commit-hash && dvc push for every local commit not yet pushed to the remote.
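
For reference, option 2 could be scripted roughly as follows. This is only a sketch, not a DVC feature: the function name push_unpushed_dvc_data is made up for illustration, and it assumes that the current branch has an upstream configured (so @{u} resolves) and that the working tree is clean.

```shell
# Sketch: run `dvc push` once for every local commit that is not yet on the
# upstream branch. The function name is hypothetical, not part of DVC or Git.
push_unpushed_dvc_data() {
    # Remember the current branch (or commit, if HEAD is detached).
    start=$(git symbolic-ref --quiet --short HEAD || git rev-parse HEAD) || return 1

    # Commits reachable from HEAD but not from the upstream, oldest first.
    commits=$(git rev-list --reverse '@{u}..HEAD') || return 1

    status=0
    for commit in $commits; do
        # Check out the commit so that its .dvc files are in the workspace,
        # then upload the data they reference.
        git checkout --quiet "$commit" && dvc push || { status=1; break; }
    done

    # Go back to where we started, even if one of the pushes failed.
    git checkout --quiet "$start" || status=1
    return "$status"
}
```

Calling push_unpushed_dvc_data and then git push would replace the final two commands above. It still checks out each commit in turn, so it only automates the manual steps.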

Both these options seem cumbersome and error-prone. Is there any better way to do it?


Solution

  • @NShiny, there is a related ticket:

    support push/pull/metrics/gc, etc across different commits.

    Please, give it a vote so that we know how to prioritize it.

    As a workaround, I would recommend running dvc install. It installs a pre-push Git hook that runs dvc push automatically:

    Git pre-push hook executes dvc push before git push to upload files and directories under DVC control to remote.
    

    It means, though, that you need to run git push after every git commit :(
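
With the hook installed (run dvc install once per clone), the per-experiment routine might look like the sketch below. The helper name commit_and_push_experiment and its two arguments are made up for illustration; the file name in the usage line is the one from the question.

```shell
# Sketch: commit one experiment and push it straight away, so that the
# pre-push hook installed by `dvc install` uploads the matching data.
# The function name and its arguments are hypothetical.
commit_and_push_experiment() {
    data_file=$1   # file tracked by DVC, e.g. my_data_file
    message=$2     # Git commit message

    dvc add "$data_file" &&
    git add "$data_file.dvc" &&
    git commit -m "$message" &&
    git push       # the pre-push hook runs `dvc push` before Git uploads
}
```

For example: commit_and_push_experiment my_data_file "Experiment 1". Because every commit is pushed immediately, the data for each experiment reaches the DVC remote before the next dvc add runs.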