Jenkins is extremely slow when viewing job pages (over 3 minutes, with a cold disk cache). The main page displays fine; the problem is only when viewing pages for individual jobs.
I think that the problem started with a recent update of Jenkins+plugins, but how can I go about troubleshooting a problem like this?
How can I troubleshoot a problem like this?
First, make sure you can reproduce the problem reliably; otherwise you can't tell whether a change actually fixed it. If the slowdown only occurs when the disk cache is cold, clearing the disk cache (instructions for Linux) before each test lets you reproduce it on demand.
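On Linux, the usual way to get a cold cache is to flush dirty pages with sync and then write 3 to /proc/sys/vm/drop_caches. A minimal sketch, assuming you are root; the function name drop_disk_cache and its optional argument are illustrative and not part of any tool:

```shell
# Hypothetical helper (not part of Jenkins): empty the Linux page cache.
# Needs root. The optional argument is an assumption added so the function can be
# pointed at a scratch file for a dry run; by default it writes to the real kernel knob.
drop_disk_cache() {
    sync                                        # write dirty pages to disk first
    echo 3 > "${1:-/proc/sys/vm/drop_caches}"   # 3 = page cache + dentries + inodes
}
```

Run drop_disk_cache right before loading the slow job page. This affects the whole machine, so everything else on the server will be slow for a while too.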
Jenkins' "Manage Plugins" (under the Manage Jenkins section) lets you individually disable and downgrade plugins. If you suspect a particular plugin is causing problems, this can help you confirm.
strace can show the system calls that Jenkins is making. First, get the main Jenkins PID:
root@server:~# ps -ef | grep jenkins
jenkins 589 1 0 17:03 ? 00:00:00 /usr/bin/daemon --name=jenkins --inherit --env=JENKINS_HOME=/home/jenkins --output=/var/log/jenkins/jenkins.log --pidfile=/var/run/jenkins/jenkins.pid --umask=027 -- /usr/bin/java -Djava.awt.headless=true -jar /usr/share/jenkins/jenkins.war --webroot=/var/cache/jenkins/war --httpPort=8080 --ajp13Port=-1
jenkins 591 589 7 17:03 ? 00:00:51 /usr/bin/java -Djava.awt.headless=true -jar /usr/share/jenkins/jenkins.war --webroot=/var/cache/jenkins/war --httpPort=8080 --ajp13Port=-1
(The PID is 591 in this case: the java process itself, not the /usr/bin/daemon wrapper at 589.)
Next, run strace. Because Jenkins is multi-threaded, you'll want to add -f to trace all threads.
strace -p 591 -f
If you're lucky, you'll find an obvious cause of slowdown. (In my case, one of the threads was repeatedly opening each previous build's build.xml for the particular job I was trying to view.)
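When the trace scrolls by too fast to read, it can help to save it to a file (strace's -o option) and count which paths get opened most often. The following is only a sketch: top_opened_files is a made-up helper name, and /tmp/jenkins.strace is an assumed log location.

```shell
# Hypothetical helper: list the files a traced process opens most often.
# $1 is a log written by e.g. `strace -f -p 591 -o /tmp/jenkins.strace`
# $2 is how many paths to show (default 10).
top_opened_files() {
    # Keep only open()/openat() calls, pull out the first quoted string
    # (the path), then count duplicates and sort by count, highest first.
    grep -E '(^|[^a-z_])(open|openat)\(' "$1" \
        | sed -n 's/^[^"]*"\([^"]*\)".*/\1/p' \
        | sort | uniq -c | sort -rn | head -n "${2:-10}"
}
```

For example, top_opened_files /tmp/jenkins.strace 20 would show the 20 most frequently opened paths. A long run of builds/*/build.xml entries near the top would point to the kind of behaviour described above.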
strace monitors system calls and tells you what a process is doing; jstack shows the call stack of each thread in a process, which helps tell you why it's doing it (what it's trying to accomplish).
jstack takes a PID and needs to run as the same user as the process you're inspecting. (See here for more details.)
sudo -u jenkins jstack 591
This displays quite a lot of information: a stack trace for each of Jenkins' threads, most of them dominated by library and framework code (request handling, XML parsing, and so on). Somewhere in there, though, you should be able to find the stack trace for the particular request handler that's running slow, and some portion of that trace will indicate what it's trying to do.
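To cut the dump down, you can keep only the threads whose stack mentions something you already suspect. jstack separates threads with blank lines, so awk's paragraph mode can treat each thread as one record. This is a rough sketch; threads_matching is a hypothetical helper, and the pattern in the usage example is only a guess at what might be relevant.

```shell
# Hypothetical helper: print only the thread stanzas of a jstack dump that match a regex.
# $1 is an awk regular expression; any remaining arguments are dump files
# (with none given, the dump is read from standard input).
threads_matching() {
    pat=$1; shift
    # RS= (empty) switches awk to paragraph mode: one record per blank-line-separated
    # block, i.e. one thread header plus its stack frames.
    awk -v RS= -v ORS='\n\n' -v pat="$pat" '$0 ~ pat' "$@"
}
```

Usage, assuming the same PID as above: sudo -u jenkins jstack 591 | threads_matching 'hudson.model.Run'. Taking two or three dumps a few seconds apart and comparing the matching threads shows whether the same request is still stuck in the same place.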