This article walks through the steps for troubleshooting problems on Linux servers. We use a simple example scenario to show where to look at each stage. The most important principle in troubleshooting is to order our thinking from simple causes to complex ones.
Example Problem: We cannot access our Linux server, and the website or services on it are not available.
Sample Problem Solution
First, if we cannot connect to our Linux server via SSH, we can try the console instead: through VMware ESX if the server is virtual, or through iLO or iDRAC if it is physical.
Once we are logged in, if we do not have detailed information about the OS, the first thing to check is the OS distribution. When we search Google for the problem, including the distribution name leads us to a relevant solution more quickly.
You can use the “cat /etc/*release” command to identify the OS on Linux.
cat /etc/*release
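As a minimal sketch, the same information can be read from a script. We assume here that the distribution ships an os-release file with ID and VERSION_ID fields, which is common but not guaranteed. The helper name os_info is our own and purely illustrative.

```shell
# os_info: hypothetical helper that prints "<ID> <VERSION_ID>" from an
# os-release style file. The path defaults to /etc/os-release.
# The body runs in a subshell so the sourced variables do not leak out.
os_info() (
    . "${1:-/etc/os-release}" || exit 1
    printf '%s %s\n' "${ID:-unknown}" "${VERSION_ID:-unknown}"
)

# Typical use on a live system:
# os_info
```

The printed ID is a convenient search term to combine with the error message.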
As a second step, if the server has also dropped out of monitoring, the first thing to check is the remaining disk space. On the system we are logged in to, we check it with the “df -kh” command. If a filesystem is 100% full, that is a likely reason the server became inaccessible.
df -kh
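To avoid reading the table by eye, the check can be sketched as a small filter. This is only an illustration: full_filesystems is a hypothetical helper name, the 90% default threshold is our assumption, and the parsing assumes the POSIX output format of “df -P” with mount points that contain no spaces.

```shell
# full_filesystems: hypothetical helper. Reads `df -P` output on stdin and
# prints "<mount point> <usage>" for every filesystem whose usage is at or
# above the given percentage (default 90).
full_filesystems() {
    awk -v limit="${1:-90}" '
        NR > 1 {                      # skip the header line
            use = $5                  # fifth column is the capacity, e.g. 97%
            sub(/%/, "", use)
            if (use + 0 >= limit + 0) print $6, $5
        }'
}

# Typical use on a live system:
# df -P | full_filesystems 90
```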
If the disk is not full, the fact that both the related services and SSH are unreachable points to one likely cause: the firewall on the server.
Often the firewall is stopped during the initial installation of a Linux server but never disabled, so after the server reboots the firewall starts again and our services become inaccessible.
To check this, we look at the firewall with the “systemctl status firewalld” command. On Ubuntu, you can use the “ufw status” command.
systemctl status firewalld
In this scenario the firewall turns out to be disabled, so it is not the cause. If SSH is still unreachable, the SSH service or its configuration may have changed. First of all, we check whether the service is running with the following command.
systemctl status sshd
If the SSH service is running, we check that it is listening on the relevant port.
Note: The command below only checks port 22. For an Apache service, we would check port 80 or 443 instead.
netstat -tulpn | grep :22
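On systems where netstat is not installed, “ss -tln” gives similar information. The following is a rough sketch of a scripted check. The helper name port_listening is hypothetical, and the parsing assumes the usual ss column layout, with the local address in the fourth column.

```shell
# port_listening: hypothetical helper. Reads `ss -tln` output on stdin and
# exits 0 if some socket in LISTEN state uses the given local port.
port_listening() {
    awk -v port="$1" '
        $1 == "LISTEN" {
            # The port is the text after the last colon, which also
            # handles IPv6 forms such as [::]:22.
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit !found }'
}

# Typical use on a live system:
# ss -tln | port_listening 22 && echo "port 22 is listening"
```

Matching the exact port number avoids the false positives that a plain grep for “:22” can give, for example on port 2222.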
At the network level, we see that there is no problem on the server side. In this case, we will check the “sshd” service’s config file. Since everything is a file in Linux, we can easily edit the config file of each service.
If we do not know the path of the sshd config file, we can use the following commands to find it.
man <service name> or locate <service name>
man firewalld
man sshd
Now that we have found the location of the SSH config file, we check the relevant settings. In this scenario, the settings related to the service show no problem.
vi /etc/ssh/sshd_config
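When the file is long, it can help to view only the settings that are actually in effect. The sketch below strips comments and blank lines. The helper name active_directives is hypothetical. As far as we know, OpenSSH also offers “sshd -t” to check the file for syntax errors, which normally requires root.

```shell
# active_directives: hypothetical helper. Prints only the lines of a config
# file that are neither blank nor comments (first non-space character is #).
active_directives() {
    grep -Ev '^[[:space:]]*(#|$)' "$1"
}

# Typical use on a live system:
# active_directives /etc/ssh/sshd_config
# sshd -t    # validate the configuration; prints nothing when it is valid
```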
After that, what remains is to check “journalctl” and the most recent log files. We aim to reach a solution by searching Google for the messages we find in the logs.
To check the SSH-related logs, we use the “journalctl -u ssh” command. On distributions where the unit is named “sshd”, as in the status command earlier, use that name instead.
journalctl -u ssh
If that gives no results, we use the “journalctl -xe” command to examine the most recent logs.
journalctl -xe
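Journal output can be long, so a simple keyword filter may help narrow it down before searching. This is a rough sketch only: the helper name suspicious_lines and the keyword list are our own assumptions, and real problems can be logged with other wording.

```shell
# suspicious_lines: hypothetical helper. Keeps only the lines on stdin that
# contain one of a few common failure keywords, ignoring case.
suspicious_lines() {
    grep -Ei 'error|fail|denied|refused'
}

# Typical use on a live system (the unit may be named ssh or sshd):
# journalctl -u sshd --since "1 hour ago" --no-pager | suspicious_lines
```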
If we still cannot find any information about the error, we check the “/var/log” directory.
Here we list the files in reverse time order. The “ls -latr” command is very useful because it places the most recently modified log files at the bottom of the listing. In this scenario, the listing shows that the “cron” and “messages” log files were written most recently. We examine these files and continue the error resolution from there.
ls -latr
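If the directory holds many files, the listing can be trimmed to the last few entries. The sketch below assumes file names without newlines. The helper name newest_files and the default count of 5 are hypothetical.

```shell
# newest_files: hypothetical helper. Prints the N most recently modified
# entries of a directory, oldest first, so the newest one is the last line.
newest_files() {
    ls -1tr "${1:-/var/log}" | tail -n "${2:-5}"
}

# Typical use on a live system:
# newest_files /var/log 5
```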