Spamassassin Bayes Not Working


I'm getting no Bayes score on any email, and there appears to be no Bayes filtering despite my best efforts. I'm not a Linux or SpamAssassin guru, so I'm asking for some help.

I have built up these settings in local.cf while trying to get Bayes working:

use_bayes 1 
bayes_auto_learn 1 
bayes_min_ham_num 100 
bayes_auto_learn_threshold_nonspam -0.001 
bayes_auto_learn_threshold_spam 6.0 
allow_user_rules 0
add_header all Status _YESNO_, score=_SCORE_ required=_REQD_ tests=_TESTS_ autolearn=_AUTOLEARN_ version=_VERSION_ 

The spamassassin log file shows one error consistently:

Fri Mar  3 12:36:27 2017 [10104] info: spamd: creating default_prefs: /root/.spamassassin/user_prefs 
Fri Mar  3 12:36:27 2017 [10104] info: spamd: failed to create readable default_prefs: /root/.spamassassin/user_prefs 

and one error sporadically:

warn: plugin: eval failed: bayes: (in learn) locker: safe_lock: cannot create tmp lockfile /root/.spamassassin/bayes.lock.[mydomain.com].27903 for /root/.spamassassin/bayes.lock: Permission denied 

/root/.spamassassin/user_prefs does exist, and I've given everyone full permissions on it in an attempt to resolve this problem, with no effect:

-rwxrwxrwx 1 root root    1273 Mar  1 14:46 user_prefs* 

My spamd launch command explicitly sets the user as spamd, but the main service is still running as root while children appear to be spawning properly. Here's ps output:

root     10093  0.0  1.5 145988 66952 ?        Ss   02:25   0:05 /usr/sbin/spamd --create-prefs --max-children=5 --username=spamd --helper-home-dir=/var/log/spamassassin/ --syslog=/var/log/spamassassin/spam .log -d --pidfile=/var/run/spamd.pid 
spamd    10104  0.0  1.8 155348 75544 ?        S    02:25   0:22 spamd child 
spamd    23753  0.0  1.7 151732 72000 ?        S    10:30   0:02 spamd child 

My bayes database exists for root:

sa-learn --dump magic 
0.000          0          3          0  non-token data: bayes db version 
0.000          0       1727          0  non-token data: nspam 
0.000          0        111          0  non-token data: nham 
0.000          0     103812          0  non-token data: ntokens 
0.000          0 1484629200          0  non-token data: oldest atime 
0.000          0 1488559525          0  non-token data: newest atime 
0.000          0 1488323169          0  non-token data: last journal sync atime 
0.000          0          0          0  non-token data: last expiry atime 
0.000          0          0          0  non-token data: last expire atime delta 
0.000          0          0          0  non-token data: last expire reduction count 

All of my email headers contain one of these two results, even though I have tried to force autolearn on with the thresholds shown above:

autolearn=unavailable autolearn_force=no 

or

autolearn=no autolearn_force=no 

Lastly, here's a full snippet of the spamassassin log showing it identifying spam, but not applying any bayes processing, while apparently working as root:

Fri Mar  3 12:55:11 2017 [10104] info: spamd: connection from localhost [::1]:54673 to port 783, fd 6 
Fri Mar  3 12:55:11 2017 [10104] info: spamd: creating default_prefs: /root/.spamassassin/user_prefs 
Fri Mar  3 12:55:11 2017 [10104] info: spamd: failed to create readable default_prefs: /root/.spamassassin/user_prefs 
Fri Mar  3 12:55:11 2017 [10104] info: spamd: processing message <5323541152335322929654419@lo.nbvc12345.xyz> for root:1010 
Fri Mar  3 12:55:11 2017 [10104] info: spamd: identified spam (10.5/3.0) for root:1010 in 0.2 seconds, 8433 bytes. 
Fri Mar  3 12:55:11 2017 [10104] info: spamd: result: Y 10 - HTML_FONT_LOW_CONTRAST,HTML_MESSAGE,MPART_ALT_DIFF,RCVD_IN_BRBL_LASTEXT,RCVD_IN_SBL_CSS,RDNS_NONE,T_REMOTE_IMAGE,URIBL_BLOCKED,URIBL_DBL_SPAM,URIBL_SBL,URIBL_SBL_A scantime=0.2,size=8433,user=root,uid=1010,required_score=3.0,rhost=localhost,raddr=::1,rport=54673,mid=<5323541152335322929654419@lo.nbvc12345.xyz>,autolearn=no autolearn_force=no 
Fri Mar  3 12:55:11 2017 [10093] info: prefork: child states: II 

and here's a full snippet of it missing spam, because of no bayes filtering:

Fri Mar  3 13:01:31 2017 [10104] info: spamd: connection from localhost [::1]:56926 to port 783, fd 6 
Fri Mar  3 13:01:31 2017 [10104] info: spamd: creating default_prefs: /root/.spamassassin/user_prefs 
Fri Mar  3 13:01:31 2017 [10104] info: spamd: failed to create readable default_prefs: /root/.spamassassin/user_prefs 
Fri Mar  3 13:01:31 2017 [10104] info: spamd: processing message <sie819-8bh73y10780523sdgw_ds7fid385303272d2g-8h723se@email.searchresultsnewinfo.com> for root:1010 
Fri Mar  3 13:01:31 2017 [10104] info: spamd: clean message (1.3/3.0) for root:1010 in 0.3 seconds, 8104 bytes. 
Fri Mar  3 13:01:31 2017 [10104] info: spamd: result: . 1 - RDNS_NONE,URIBL_BLOCKED scantime=0.3,size=8104,user=root,uid=1010,required_score=3.0,rhost=localhost,raddr=::1,rport=56926,mid=<sie819-8bh73y10780523sdgw_ds7fid385303272d2g-8h723se@email.searchresultsnewinfo.com>,autolearn=no autolearn_force=no 
Fri Mar  3 13:01:31 2017 [10093] info: prefork: child states: II 

Is the problem that the Bayes database is in /root/.spamassassin and the child processes can't access it? Where should it be, or is it something else? I've reached the limit of my knowledge on this. Any help is appreciated.


Solution

  • The problem is that the spamd parent process runs as root while the spamd children run as the user "spamd", which has insufficient permissions to access the Bayes database in /root/.spamassassin.

    For SpamAssassin to use the Bayes database, it must be in a location accessible to the children, and I had to tell SpamAssassin where to find it by adding this line to local.cf:

    bayes_path /var/spamassassin/bayesdb/bayes
    

    I then needed to create /var/spamassassin/bayesdb (not /var/spamassassin/bayesdb/bayes as the trailing "bayes" is the prefix for the filenames that will be in the bayesdb folder) and make user "spamd" the owner:

    mkdir -p /var/spamassassin/bayesdb
    cd /var
    chown -R spamd:some_group spamassassin
    

    I then moved the existing bayes database files from /root/.spamassassin to /var/spamassassin/bayesdb, performed the same ownership operation as above on the files, and the bayes filtering started working properly.

    I did not resolve the issue with the children attempting to create user_prefs in /root/.spamassassin. It is the same permissions problem, but it doesn't seem to affect Bayes, which was all I was after.
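
The relocation steps above can be sketched as a small shell helper. This is only an illustration under some assumptions: `migrate_bayes` is a hypothetical function name (not part of SpamAssassin), the database files are assumed to carry the `bayes_` prefix (for example `bayes_toks` and `bayes_seen`), and the owner argument is optional so that the helper can be tried on scratch directories without root.

```shell
#!/bin/sh
# Sketch only: relocate Bayes database files to a directory the
# unprivileged spamd children can reach.
# migrate_bayes is a hypothetical helper, not a SpamAssassin command.
# Usage: migrate_bayes SRC_DIR DST_DIR [OWNER]
migrate_bayes() {
    src=$1
    dst=$2
    owner=${3:-}

    # Create the target directory. The trailing "bayes" in bayes_path is a
    # filename prefix, so only the directory itself is created here.
    mkdir -p "$dst" || return 1

    # Move every file whose name starts with "bayes_" (assumed naming).
    # Lock files such as bayes.lock do not match and are left behind.
    for f in "$src"/bayes_*; do
        [ -e "$f" ] || continue
        mv "$f" "$dst"/ || return 1
    done

    # Optionally hand ownership to the account the children run as.
    # This step needs root when the owner differs from the current user.
    if [ -n "$owner" ]; then
        chown -R "$owner" "$dst" || return 1
    fi
}
```

With the paths from this answer, the call would look like `migrate_bayes /root/.spamassassin /var/spamassassin/bayesdb spamd:some_group`, run as root and probably safest while spamd is stopped. Afterwards, running `sa-learn --dump magic` as the spamd user (for example via `sudo -u spamd`) should show the same nspam and nham counts as before if the new bayes_path is being picked up.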