I am not completely sure whether Docker is enough for R development or whether I should use it in conjunction with Packrat. I have read several posts that state that Docker is sufficient. The only place that supports this claim is this post. However, I was not able to build that example due to errors in the git2r installation.
My overall goal is to have full control of the package versions I use, so my analysis will still work even if the package is later upgraded.
You need both. Think of the Docker image as just the final product of your source code, which includes the Dockerfile and every piece of data used to build the image.
You should pin the Docker base image (avoid `FROM blah:latest`) to be sure that the underlying libraries and tools will always be the same. Don't use base images such as debian/testing that may change on every run of `apt-get install`.
If you don't use Packrat, then whenever you need to rebuild your image you may pull in a newer version of some library that no longer works with your code, for instance because a function you used has since been deprecated.
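To make the two points above concrete, here is a minimal Dockerfile sketch that combines a pinned base image with a Packrat restore. The base image tag (`rocker/r-ver:3.4.4`), the `/analysis` directory, and the `analysis.R` script name are assumptions for illustration, not something taken from the question. It also assumes the project was already initialised with Packrat, so that `packrat/packrat.lock`, `packrat/init.R` and `.Rprofile` exist in the build context, and that `packrat/lib*` is excluded via `.dockerignore`.

```dockerfile
# Pinned base image: an explicit version tag instead of :latest
# (the tag is an assumption; use whichever R version you developed against).
FROM rocker/r-ver:3.4.4

# Hypothetical project directory inside the image.
WORKDIR /analysis

# Copy only the lockfile first, so this layer and the restore below
# are reused from the build cache until the lockfile changes.
COPY packrat/packrat.lock packrat/packrat.lock

# Install Packrat itself, then install the exact package versions
# recorded in the lockfile into the project's private library.
RUN R -e 'install.packages("packrat"); packrat::restore()'

# Copy the rest of the project (scripts, data, .Rprofile, packrat/init.R).
COPY . .

# Hypothetical entry point; .Rprofile is expected to activate Packrat.
CMD ["Rscript", "analysis.R"]
```

Packages that depend on system libraries will probably also need the matching `apt-get install` lines before the restore step, and Packrat itself is installed unpinned here, so treat this as a starting point rather than a fully reproducible build.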
And of course, version your own code. At the very least, tag it so you can easily go back in time and start a new build from that point.
This is the minimum you can do, because things like a broken Docker Hub or CRAN repository can still happen. Saving a versioned Docker image in a private Docker registry is just the final step.