We started using Jenkins a few months ago, and the home directory has now grown to about 50 GB. I noticed that the jobs and workspace directories are about 20 GB each. How can I clean them up? What kind of strategy should I use?
Consider the various Jenkins areas that tend to grow excessively. The key areas are system logs, job logs, artifact storage, and job workspaces. This answer details options to manage each of them.
System logs may be found in <JENKINS_HOME>/logs or /var/log/jenkins/jenkins.log, depending on your installation. By default, Jenkins does not always include log rotation (logrotate), especially if running straight from the war. The solution is to add logrotate. This Cloudbees post and my S/O response add details.
You can also set the Jenkins System Property hudson.triggers.SafeTimerTask.logsTargetDir to relocate the logs outside the <JENKINS_HOME>.
Each job has an option to [ X ] Discard old builds. As of LTS 2.222.1, Jenkins introduced a Global Build discarder (pull #4368) with similar options and default actions. This is a global setting. Prior to that, job logs (and artifacts) were retained forever by default (not good).
Advanced options can manage artifact retention (from the post-build action "Archive the artifacts") separately.
The Jobs directory contains a directory for every job (and folders if you use them). Inside each job directory is the job config.xml (a few KB in size), plus a builds directory. builds has a numbered directory for each retained build, holding the build log, a copy of the config.xml at runtime, and possibly some additional files of record (changelog.xml, injectedEnvVars.txt). If you choose the Archive the artifacts option, there's also an archive directory, which contains the artifacts from that build.
The Jenkins System Property jenkins.model.Jenkins.buildsDir lets you relocate the builds outside the <JENKINS_HOME>.
I would strongly recommend relocating both the system logs and the job / build logs (and artifacts). By moving the system logs and build logs (and artifacts if ticked) outside of <JENKINS_HOME>, what's left is the really important stuff needed to back up and restore Jenkins and its jobs in the event of disaster or migration. Carefully read and understand the steps "to support migration of existing build records" to avoid build-related errors. Relocating also makes it much easier to analyze which jobs consume all the space and why (i.e., logs vs artifacts).
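To illustrate the "logs vs artifacts" analysis, here is a minimal Python sketch that walks a builds location and reports, per job, how many bytes sit in archive directories versus everything else. It assumes the layout described above (job/builds/<number>/archive); the function names tree_size and job_usage are made up for this example, and jobs nested inside folders are not handled.

```python
import os
from pathlib import Path


def tree_size(path):
    """Total bytes of regular files under path (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):  # does not descend into symlinked dirs
        for name in files:
            fp = Path(root) / name
            if not fp.is_symlink():
                total += fp.stat().st_size
    return total


def job_usage(jobs_dir):
    """Map each top-level job name to bytes used by artifacts vs. logs and records."""
    usage = {}
    for job in sorted(p for p in Path(jobs_dir).iterdir() if p.is_dir()):
        builds = job / "builds"
        if not builds.is_dir():
            continue  # e.g. a folder item; nested jobs are out of scope here
        artifacts = 0
        for build in builds.iterdir():
            # Skip permalink symlinks such as lastSuccessfulBuild, if present.
            if build.is_dir() and not build.is_symlink():
                archive = build / "archive"
                if archive.is_dir():
                    artifacts += tree_size(archive)
        total = tree_size(builds)
        usage[job.name] = {
            "artifacts": artifacts,
            "logs_and_records": total - artifacts,
        }
    return usage
```

Calling job_usage on the jobs directory (or on the relocated builds root, if its layout matches) and sorting the result by either value shows quickly which jobs need a stricter discard policy.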
Workspaces are where the source code is checked out and the job (build) is executed. Workspaces should be ephemeral. The Best Practice is to start with an empty workspace and clean up when you are done, using the Workspace Cleanup plugin (cleanWs()), unless you need to keep the workspace.
The OP's indication of workspaces in the Jenkins controller suggests jobs are being run on the master. That's not a good (or secure) practice, although lightweight pipelines always execute on master. Mis-configured job pipelines will also fall back to master. For better security, you can set up a node that physically runs on the same server as the master.
You can use the cleanWs() EXCLUDE and INCLUDE patterns to selectively clean the workspace if deleting everything is not viable.
There are two Jenkins System Properties to control the location of the workspace directory. For the master: jenkins.model.Jenkins.workspacesDir and for the nodes/agents: hudson.model.Slave.workspaceRoot. Again, as these are ephemeral, get them out of <JENKINS_HOME> so you can better manage and monitor them.
Finally, one more space consideration...
Both maven and npm cache artifacts in a local repository. Typically, that is located in the user's $HOME directory. If you increment versions often, that content will get stale and bloated. It's a cache, so take a time hit every once in a while and purge it or otherwise manage the content.
However, it's possible to relocate the cache elsewhere through maven and npm settings. Also, if running a maven step, every step has the Advanced option to use a private repository, which is located within the job's workspace. The benefit is that you know exactly what your build uses; there is no contamination. The downside is massive duplication and wasted space if all jobs have private repos and you never clean them out or delete the workspaces, or longer build times on every run if you do clean them. Consider using cleanWs() or a separate job to purge as needed.
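As one possible shape for such a purge job, the following Python sketch lists (and optionally deletes) cache files that have not been modified for a given number of days. The function name purge_stale and its parameters are hypothetical, and modification time is only a rough proxy for staleness, so treat this as a starting point rather than a recommended policy.

```python
import time
from pathlib import Path


def purge_stale(cache_dir, max_age_days, dry_run=True, now=None):
    """Return files older than max_age_days; delete them unless dry_run is set."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    stale = []
    # sorted() materializes the listing before anything is deleted.
    for path in sorted(Path(cache_dir).rglob("*")):
        if path.is_file() and not path.is_symlink() and path.stat().st_mtime < cutoff:
            stale.append(path)
            if not dry_run:
                path.unlink()
    return stale
```

For example, purge_stale(Path.home() / ".m2" / "repository", 90) would only report candidates in the default maven local repository; pass dry_run=False once the list looks right. Empty directories are left behind in this sketch, and the next build re-downloads whatever it still needs.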