I am fairly new to managing my own servers and have a problem where some Apache2 processes hog 100% of the CPU. This lasts for about half an hour, and a couple of hours later it starts again. It even happens right after restarting Apache or rebooting the server.
The server hosts two very low-traffic websites. They run on a CMS called Pimcore, which itself is built on Symfony.
DigitalOcean Droplet, 1 GB RAM
Ubuntu 18.04
PHP 7.2 (FPM)
MySQL (the mysql client reports Ver 14.14)
Pimcore CMS (built on Symfony)
I ran the exact same setup on Ubuntu 17.10 before, and everything was fine. Since moving to a new server with Ubuntu 18.04 (a fresh install of everything), I have been seeing these issues.
I suspect it has something to do with some PHP script executing, but I can't figure out the exact source of it.
Does anybody have an idea what could be going on?
Apache2 Config:
<IfModule mpm_prefork_module>
StartServers 2
MinSpareServers 3
MaxSpareServers 5
MaxRequestWorkers 20
MaxConnectionsPerChild 3000
MaxClients 15
</IfModule>
Apache Modules:
Loaded Modules:
core_module (static)
so_module (static)
watchdog_module (static)
http_module (static)
log_config_module (static)
logio_module (static)
version_module (static)
unixd_module (static)
access_compat_module (shared)
alias_module (shared)
auth_basic_module (shared)
authn_core_module (shared)
authn_file_module (shared)
authz_core_module (shared)
authz_host_module (shared)
authz_user_module (shared)
autoindex_module (shared)
deflate_module (shared)
dir_module (shared)
env_module (shared)
filter_module (shared)
mime_module (shared)
mpm_prefork_module (shared)
negotiation_module (shared)
php7_module (shared)
proxy_module (shared)
proxy_fcgi_module (shared)
reqtimeout_module (shared)
rewrite_module (shared)
setenvif_module (shared)
socache_shmcb_module (shared)
ssl_module (shared)
status_module (shared)
wsgi_module (shared)
error.log:
Fri May 1 11:01:48 2020 (1309): Error Cannot kill process 1069: Success!
[Fri May 01 11:01:49.207718 2020] [core:notice] [pid 923] AH00051: child pid 1309 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Fri May 01 11:13:15.899518 2020] [core:notice] [pid 923] AH00051: child pid 1333 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Fri May 01 11:13:15.899963 2020] [core:notice] [pid 923] AH00051: child pid 1383 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Fri May 01 11:13:15.899975 2020] [core:notice] [pid 923] AH00051: child pid 1406 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Fri May 01 11:13:15.900004 2020] [core:notice] [pid 923] AH00051: child pid 1305 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Fri May 01 11:13:15.900059 2020] [mpm_prefork:notice] [pid 923] AH00169: caught SIGTERM, shutting down
[Fri May 01 11:13:16.073253 2020] [mpm_prefork:notice] [pid 1605] AH00163: Apache/2.4.29 (Ubuntu) OpenSSL/1.1.1 mod_wsgi/4.5.17 Python/3.6 configured -- resuming normal operations
[Fri May 01 11:13:16.073329 2020] [core:notice] [pid 1605] AH00094: Command line: '/usr/sbin/apache2'
[Fri May 01 11:14:17.466068 2020] [core:notice] [pid 1605] AH00051: child pid 1613 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Fri May 01 11:14:17.466137 2020] [core:notice] [pid 1605] AH00051: child pid 1636 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Fri May 01 11:14:17.466181 2020] [mpm_prefork:notice] [pid 1605] AH00169: caught SIGTERM, shutting down
[Fri May 01 11:14:17.608696 2020] [mpm_prefork:notice] [pid 1685] AH00163: Apache/2.4.29 (Ubuntu) OpenSSL/1.1.1 mod_wsgi/4.5.17 Python/3.6 configured -- resuming normal operations
[Fri May 01 11:14:17.608770 2020] [core:notice] [pid 1685] AH00094: Command line: '/usr/sbin/apache2'
[Fri May 01 11:16:49.360625 2020] [core:notice] [pid 1685] AH00052: child pid 1696 exit signal Segmentation fault (11)
[Fri May 01 11:16:49.360697 2020] [core:notice] [pid 1685] AH00052: child pid 1717 exit signal Segmentation fault (11)
[Fri May 01 11:16:49.360708 2020] [core:notice] [pid 1685] AH00052: child pid 1719 exit signal Segmentation fault (11)
[Fri May 01 11:16:49.360724 2020] [core:notice] [pid 1685] AH00052: child pid 1722 exit signal Segmentation fault (11)
[Fri May 01 11:16:49.360780 2020] [mpm_prefork:notice] [pid 1685] AH00169: caught SIGTERM, shutting down
[Fri May 01 11:17:05.637473 2020] [mpm_prefork:notice] [pid 924] AH00163: Apache/2.4.29 (Ubuntu) OpenSSL/1.1.1 mod_wsgi/4.5.17 Python/3.6 configured -- resuming normal operations
[Fri May 01 11:17:05.651236 2020] [core:notice] [pid 924] AH00094: Command line: '/usr/sbin/apache2'
[Fri May 01 11:25:37.817879 2020] [core:notice] [pid 924] AH00051: child pid 946 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Fri May 01 11:25:37.817946 2020] [core:notice] [pid 924] AH00051: child pid 948 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Fri May 01 11:25:37.817960 2020] [core:notice] [pid 924] AH00051: child pid 1022 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Fri May 01 11:25:37.817972 2020] [core:notice] [pid 924] AH00051: child pid 1055 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Fri May 01 11:25:37.817984 2020] [core:notice] [pid 924] AH00051: child pid 1056 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Fri May 01 11:25:37.818020 2020] [mpm_prefork:notice] [pid 924] AH00169: caught SIGTERM, shutting down
[Fri May 01 11:25:37.957502 2020] [mpm_prefork:notice] [pid 1394] AH00163: Apache/2.4.29 (Ubuntu) OpenSSL/1.1.1 mod_wsgi/4.5.17 Python/3.6 configured -- resuming normal operations
[Fri May 01 11:25:37.957577 2020] [core:notice] [pid 1394] AH00094: Command line: '/usr/sbin/apache2'
top Output:
top - 11:34:04 up 17 min, 1 user, load average: 4.00, 3.92, 2.79
Tasks: 91 total, 5 running, 49 sleeping, 0 stopped, 0 zombie
%Cpu(s): 99.7 us, 0.3 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 1008848 total, 112120 free, 483364 used, 413364 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 334772 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1405 www-data 20 0 561180 94488 43856 R 24.9 9.4 2:20.77 /usr/sbin/apache2 -k start
1427 www-data 20 0 550904 75304 36568 R 24.9 7.5 1:56.12 /usr/sbin/apache2 -k start
1429 www-data 20 0 552952 76684 36432 R 24.9 7.6 2:13.89 /usr/sbin/apache2 -k start
1437 www-data 20 0 550904 74748 36568 R 24.9 7.4 1:48.41 /usr/sbin/apache2 -k start
916 mysql 20 0 1410004 206444 17124 S 0.3 20.5 0:01.96 /usr/sbin/mysqld --daemonize --pid-file=/run/mysqld/mysqld.pid
1 root 20 0 159764 8756 6472 S 0.0 0.9 0:01.37 /sbin/init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 [kthreadd]
3 root 20 0 0 0 0 I 0.0 0.0 0:00.00 [kworker/0:0]
4 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 [kworker/0:0H]
6 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 [mm_percpu_wq]
7 root 20 0 0 0 0 S 0.0 0.0 0:00.07 [ksoftirqd/0]
8 root 20 0 0 0 0 I 0.0 0.0 0:00.08 [rcu_sched]
9 root 20 0 0 0 0 I 0.0 0.0 0:00.00 [rcu_bh]
10 root rt 0 0 0 0 S 0.0 0.0 0:00.00 [migration/0]
11 root rt 0 0 0 0 S 0.0 0.0 0:00.00 [watchdog/0]
12 root 20 0 0 0 0 S 0.0 0.0 0:00.00 [cpuhp/0]
13 root 20 0 0 0 0 S 0.0 0.0 0:00.00 [kdevtmpfs]
14 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 [netns]
15 root 20 0 0 0 0 S 0.0 0.0 0:00.00 [rcu_tasks_kthre]
16 root 20 0 0 0 0 S 0.0 0.0 0:00.00 [kauditd]
17 root 20 0 0 0 0 S 0.0 0.0 0:00.00 [khungtaskd]
18 root 20 0 0 0 0 S 0.0 0.0 0:00.00 [oom_reaper]
19 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 [writeback]
20 root 20 0 0 0 0 S 0.0 0.0 0:00.00 [kcompactd0]
When tracing one of these high-CPU PIDs with strace (a sample command is shown after the excerpt below), there is what seems like an endless stream of this:
stat("/var/www/XYZ/XYZ/web/static/css/print.css", {st_mode=S_IFREG|0664, st_size=250, ...}) = 0
sendto(24, "\264\0\0\0\3SELECT id, \n (CASE"..., 184, MSG_DONTWAIT, NULL, 0) = 184
poll([{fd=24, events=POLLIN|POLLERR|POLLHUP}], 1, 86400000) = 1 ([{fd=24, revents=POLLIN}])
recvfrom(24, "\1\0\0\1\2/\0\0\2\3def\vXYZ\5cache\5"..., 3174, MSG_DONTWAIT, NULL, NULL) = 265
sendto(24, "<\0\0\0\3SELECT sourceId FROM docume"..., 64, MSG_DONTWAIT, NULL, 0) = 64
poll([{fd=24, events=POLLIN|POLLERR|POLLHUP}], 1, 86400000) = 1 ([{fd=24, revents=POLLIN}])
recvfrom(24, "\1\0\0\1\1]\0\0\2\3def\vXYZ\26docume"..., 2909, MSG_DONTWAIT, NULL, NULL) = 120
sendto(24, "\236\0\0\0\3SELECT id,language FROM doc"..., 162, MSG_DONTWAIT, NULL, 0) = 162
poll([{fd=24, events=POLLIN|POLLERR|POLLHUP}], 1, 86400000) = 1 ([{fd=24, revents=POLLIN}])
recvfrom(24, "\1\0\0\1\2\32\0\0\2\3def\0\0\0\2id\2id\f?\0\v\0\0\0\3!\0"..., 2789, MSG_DONTWAIT, NULL, NULL) = 95
sendto(24, "\264\0\0\0\3SELECT id, \n (CASE"..., 184, MSG_DONTWAIT, NULL, 0) = 184
poll([{fd=24, events=POLLIN|POLLERR|POLLHUP}], 1, 86400000) = 1 ([{fd=24, revents=POLLIN}])
recvfrom(24, "\1\0\0\1\2/\0\0\2\3def\vXYZ\5cache\5"..., 2694, MSG_DONTWAIT, NULL, NULL) = 265
sendto(24, "\271\0\0\0\3SELECT id, \n (CASE"..., 189, MSG_DONTWAIT, NULL, 0) = 189
poll([{fd=24, events=POLLIN|POLLERR|POLLHUP}], 1, 86400000) = 1 ([{fd=24, revents=POLLIN}])
recvfrom(24, "\1\0\0\1\2/\0\0\2\3def\vXYZ\5cache\5"..., 2429, MSG_DONTWAIT, NULL, NULL) = 104
access("/var/www/XYZ/XYZ/app/views/content/default.html.php", F_OK) = -1 ENOENT (No such file or directory)
access("/var/www/XYZ/XYZ/app/Resources/views/content/default.html.php", F_OK) = -1 ENOENT (No such file or directory)
access("/var/www/XYZ/XYZ/src/AppBundle/Resources/public/areas/blockquote/icon.png", F_OK) = -1 ENOENT (No such file or directory)
access("/var/www/XYZ/XYZ/src/AppBundle/Resources/public/areas/horizontal-line/icon.png", F_OK) = -1 ENOENT (No such file or directory)
access("/var/www/XYZ/XYZ/src/AppBundle/Resources/public/areas/gallery-single-images/icon.png", F_OK) = -1 ENOENT (No such file or directory)
...
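For reference, a trace like the one above can be captured by attaching strace to one of the busy Apache children reported by top, roughly like this (the PID is just an example taken from the top output; use whichever child is pinning the CPU):

# Attach to a running Apache child, follow forked children,
# and print longer string arguments so queries and paths stay readable.
sudo strace -p 1405 -f -s 256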
OK, so in the end I was able to track down the problem.
Crawlers from some popular SEO tools were requesting non-existent URLs at a very high frequency, and every one of those requests was handled by Pimcore. That is what was driving up the CPU and RAM consumption.
After blocking those crawlers via .htaccess, everything is back to normal.
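For anyone who needs to do the same, here is a minimal sketch of the kind of .htaccess rule I mean. The User-Agent names below are placeholders, not the actual crawlers (check your access log for the real ones); it relies on mod_rewrite, which is already loaded according to the module list above:

<IfModule mod_rewrite.c>
    RewriteEngine On
    # Placeholder bot names -- replace with the user agents found in your access log
    RewriteCond %{HTTP_USER_AGENT} (SomeSeoBot|AnotherSeoCrawler) [NC]
    RewriteRule .* - [F,L]
</IfModule>

This answers matching user agents with 403 Forbidden during URL mapping, so the request never reaches Pimcore's front controller.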