Troubleshooting the server when something goes wrong is a big part of every system administrator’s work. This is especially true for your system’s services, as they run constantly and process information the whole time. Services can depend on other services and on the server’s system, and there will be situations in an administrator’s life where system services fail or refuse to start. In this process, we will show you how to troubleshoot them when something goes wrong.
To Start With: What Do You Need?
To complete this process, you will require a working installation of the CentOS 7 operating system with root privileges and a console-based text editor of your choice.
The Process
To show you how to troubleshoot services, we will deliberately introduce an error into the Apache service’s configuration file and then show you how to track it down and fix it:
- Log in as root and type the following command to append an invalid line to httpd.conf:
echo "THIS_IS_AN_ERRORLINE" >> /etc/httpd/conf/httpd.conf
- Next, reload the httpd service and show its output:
systemctl reload httpd.service
systemctl status httpd.service -l
- Let’s revert this error line:
sed -i 's/THIS_IS_AN_ERRORLINE//g' /etc/httpd/conf/httpd.conf
- Now, reload the service again and check its status:
systemctl reload httpd.service
systemctl status httpd.service
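The edit-and-revert steps above can be rehearsed on a scratch file before touching the live configuration. The following is only a minimal sketch, assuming GNU sed as shipped with CentOS 7; remove_marker_line is a hypothetical helper name, and the two directives written to the scratch file are placeholder content. Unlike the substitution used above, which leaves an empty line behind, the sed d command deletes the whole line:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of any package): delete every line that
# consists solely of the given marker. The marker is used as a sed regex,
# so keep it to plain letters, digits and underscores.
remove_marker_line() {
    local marker="$1" file="$2"
    sed -i "/^${marker}\$/d" "$file"
}

# Rehearse on a scratch file instead of /etc/httpd/conf/httpd.conf.
scratch="$(mktemp)"
printf 'Listen 80\nServerName localhost\n' > "$scratch"   # placeholder content
echo "THIS_IS_AN_ERRORLINE" >> "$scratch"                 # introduce the error
remove_marker_line "THIS_IS_AN_ERRORLINE" "$scratch"      # revert it
cat "$scratch"                                            # marker line is gone
rm -f "$scratch"
```

On the real server, the same helper would be pointed at /etc/httpd/conf/httpd.conf, ideally after making a backup copy of the file.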
How it works…
In this fairly short process, we showed you how an example service behaves when its configuration contains errors, and what you can do to fix it. Services can malfunction in many different ways, and solving those kinds of problems can be a big part of a system administrator’s job.
So, what have we learned from this experience?
We started this process by appending a line of text to the main Apache configuration file. The line is not valid configuration syntax, so the httpd service cannot interpret it. Then, we used the systemctl reload command to reload the server’s configuration file. As said before, not all services have the reload option, so if your service of interest does not support it, use the restart command instead. When Apache tries to reload the configuration file with our changes, it refuses to accept the new configuration because of the invalid syntax that we introduced. Since we are only reloading the configuration, the running Apache process is not affected by this problem and stays online using its original configuration. The systemctl command will print out the following error message, giving us a hint of what to do next:
Job for httpd.service failed. Take a look at systemctl status httpd.service and journalctl -xe for details.
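Whether a service supports reload at all can be checked before choosing between reload and restart. The sketch below assumes the CanReload unit property reported by systemctl show; pick_refresh_action is a hypothetical helper name used only for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "reload" if the unit advertises reload
# support, otherwise print "restart".
pick_refresh_action() {
    local unit="$1"
    # systemctl show -p CanReload prints a single line such as
    # "CanReload=yes" or "CanReload=no".
    if [ "$(systemctl show -p CanReload "$unit")" = "CanReload=yes" ]; then
        echo reload
    else
        echo restart
    fi
}

# Usage on a live system (left commented out so nothing is changed here):
# systemctl "$(pick_refresh_action httpd.service)" httpd.service
```

Keep in mind that a restart stops the running process first, so with a broken configuration the service may not come back up, whereas a failed reload leaves the old process running.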
As suggested by the error output, the systemctl status command is a very powerful tool for seeing what is going on behind the scenes with this service and for finding the reason for any failure (here you can also see that Apache is still running). If you run systemctl status with the -l flag, it prints the output lines in full instead of shortening them, which can help you even more.
The output of this command shows us the exact reason why the configuration reload failed, so we can easily track down the cause of the problem (the output has been truncated):
AH00526: Syntax error on line 354 of /etc/httpd/conf/httpd.conf: Invalid command 'THIS_IS_AN_ERRORLINE', perhaps misspelled or defined by a module not included in the server configuration
This output is part of the complete journald log information. If you want to read more about it, please refer to the Tracking system resources with journald process in this chapter. With this very useful information from the output, we can easily spot the problem, undo the introduction of the error line using the sed command, and reload the service again; this time everything will work fine.
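When the status or journal output is long, the file and line number of the syntax error can be pulled out with a small filter. This is a sketch that assumes the message keeps the "Syntax error on line N of FILE:" shape quoted above; locate_syntax_error is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: read log text on stdin and print FILE:LINE for
# every Apache syntax-error message found; other lines are ignored.
locate_syntax_error() {
    sed -n 's/.*Syntax error on line \([0-9]*\) of \([^:]*\):.*/\2:\1/p'
}

# Demo with the (abbreviated) message quoted in this process:
printf '%s\n' \
    'AH00526: Syntax error on line 354 of /etc/httpd/conf/httpd.conf: Invalid command' \
    | locate_syntax_error
# prints: /etc/httpd/conf/httpd.conf:354

# On a live system the filter could be fed from the status output:
# systemctl status httpd.service -l --no-pager | locate_syntax_error
```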
So, in summary, systemctl status is a very convenient command that can be tremendously helpful in tracking down problems with your service. Most services are very sensitive to syntax errors, and sometimes a single misplaced space character is enough to make a service refuse to work. Therefore, system administrators must work precisely all the time.
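Because syntax errors are so easy to introduce, it can help to test the configuration before asking the service to reload it. The following sketch relies on apachectl configtest, which parses the Apache configuration and returns a non-zero exit status on a syntax error; safe_reload_httpd is a hypothetical wrapper name and not something this process or the httpd package provides:

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: reload httpd only if the configuration parses.
safe_reload_httpd() {
    if apachectl configtest; then
        systemctl reload httpd.service
    else
        echo "Configuration test failed; httpd was not reloaded." >&2
        return 1
    fi
}

# Usage on a live system, as root (left commented out on purpose):
# safe_reload_httpd && systemctl status httpd.service -l
```

The design choice here is simply to fail early: the configuration test reports the same kind of syntax error as the failed reload, but before systemd is involved at all.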