TL;DR: Given a URL to an SVN repo, how can I extract the paths of all the files that have changed since a given revision number?
Long story: I have a script that downloads some files from an SVN repo every 'n' hours. But if only one file has changed, I do not need to re-download everything, just that file. I've tried checking every file with PySVN to see if its revision number has changed, but that is too slow (for a folder with 6 files it takes ~20 seconds). Is there any way I can improve this?
I am working in Python with PySVN. I've seen that pysvn.Client.log has a 'changed_paths' attribute, but I do not know how to handle it :\ The program runs on both Linux and Windows, so the solution should be cross-platform (if possible).
Use pysvn.Client().log() to find all the changes. For example:
all_logs = pysvn.Client().log( path,
                revision_start=pysvn.Revision( pysvn.opt_revision_kind.head ),
                revision_end=pysvn.Revision( pysvn.opt_revision_kind.number, last_revision ),
                discover_changed_paths=True )
This will always return at least one log entry, for last_revision itself. You can either just ignore that entry, or use last_revision+1 and catch the error from svn about a missing revision.
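To turn those log entries into the list of changed files, walk the changed_paths of each entry. Here is a minimal, untested sketch; repo_url is a placeholder for your repository URL, and the entries are accessed dict-style as pysvn documents them:

import pysvn

client = pysvn.Client()
all_logs = client.log( repo_url,
                revision_start=pysvn.Revision( pysvn.opt_revision_kind.head ),
                revision_end=pysvn.Revision( pysvn.opt_revision_kind.number, last_revision ),
                discover_changed_paths=True )

changed_files = set()
for entry in all_logs:
    # Skip the entry for last_revision itself -- those files were already downloaded.
    if entry['revision'].number <= last_revision:
        continue
    for change in entry['changed_paths']:
        # Each change records the repository path and the action ('A', 'M', 'D', 'R').
        changed_files.add( change['path'] )

changed_files then holds every repository path touched since last_revision, which is the set of files worth re-downloading.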
pysvn.Client().update() will figure out the smartest way to get the changes into your working copy.
Remember that you can check out only part of the whole repo by selecting which folder to start with and then using the depth feature to get only the folder you need. For example:
pysvn.Client().checkout( URL, path, depth=pysvn.depth.files )
Then you only need to call update() to keep those files up to date.
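Putting the two steps together, a rough sketch of the periodic job might look like this (URL and wc_path are placeholders for your repository URL and local working-copy path):

import pysvn

client = pysvn.Client()

# One-time sparse checkout: only the files directly inside URL, no subfolders.
client.checkout( URL, wc_path, depth=pysvn.depth.files )

# Every 'n' hours: update() transfers only what actually changed in the repo.
client.update( wc_path )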
Barry Scott, author of pysvn.