amazon-web-servicesamazon-ec2servicerhel7systemctl

How to debug a failed systemctl service (code=exited, status=217/USER)?


I'm trying to add my first service on rhel7 (which resides in AWS/EC2), but - the service is not configured correctly - as I get:

[ec2-user@ip-172-30-1-96 ~]$ systemctl status clouddirectd.service -l
● clouddirectd.service - CloudDirect Daemon
   Loaded: loaded (/usr/lib/systemd/system/clouddirectd.service; enabled; vendor preset: disabled)
   Active: activating (auto-restart) (Result: exit-code) since Tue 2018-01-09 16:09:42 EST; 8s ago
 Main PID: 10064 (code=exited, status=217/USER)

Jan 09 16:09:42 ip-172-30-1-96.us-west-1.compute.internal systemd[1]: clouddirectd.service: main process exited, code=exited, status=217/USER
Jan 09 16:09:42 ip-172-30-1-96.us-west-1.compute.internal systemd[1]: Unit clouddirectd.service entered failed state.
Jan 09 16:09:42 ip-172-30-1-96.us-west-1.compute.internal systemd[1]: clouddirectd.service failed.

Also:

[ec2-user@ip-172-30-1-96 ~]$ systemctl is-active clouddirectd
activating
[ec2-user@ip-172-30-1-96 ~]$ sudo systemctl list-units --type service --all | grep clouddirectd
  clouddirectd.service                                  loaded    activating auto-restart CloudDirect Daemon

And my unit file is:

[ec2-user@ip-172-30-1-96 ~]$ cat /usr/lib/systemd/system/clouddirectd.service
[Unit]
Description=CloudDirect Daemon
After=network.target

[Service]
Environment=AWS_SHARED_CREDENTIALS_FILE=/etc/sonar/.aws/credentials
#ExecStart=/usr/lib/sonar/clouddirect/virtualenv/bin/python /usr/bin/sonar/clouddirectd -c /etc/sonar/clouddirect/clouddirectd.conf
ExecStart=/usr/lib/sonar/clouddirect/virtualenv/bin/python /usr/bin/clouddirect -c /etc/sonar/clouddirect.conf
# @PERM@ allow group write permission on newly created files
UMask=0007
#User=clouddirectd
User=clouddirect
Group=sonar
KillSignal=SIGINT
TimeoutStopSec=60min
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target

Can you suggest how to debug this systemctl service so it won't keep dying and auto restarting?


Solution

  • The error 217 indicate the user did not exist at the time the service tried to start. In your case the user specified in your service is clouddirect.

     Main PID: 10064 (code=exited, status=217/USER)
    
    Jan 09 16:09:42 ip-172-30-1-96.us-west-1.compute.internal systemd[1]: clouddirectd.service: main process exited, code=exited, status=217/USER
    

    This could be caused if that is not the actual user name (for example if it has a typo), it can also be caused if the user is part of some external user store (ex: LDAP or Active Directory) and the service which needs to start that allows the Linux server to access the external user store is not up yet. For example vasd.service starts a product used to allow Linux to authenticate against Active Directory, if vasd.service is not up and you have specified a user that is only available in Active Directory you would want to add that service in your After= line. For example:

    After=network.target vasd.service