In this article, we walk through building a simple reputation-based detection system. Reputation is a measure of how trustworthy a file, domain, or IP address is, based on its past association with malicious activity. The aim of a reputation-based detection system is to detect interactions with low-reputation entities, such as opening a low-reputation file or making a request to a low-reputation IP address. In a network that uses such a system, a request to an IP address that is often associated with malware, and has therefore lost reputation, is flagged as suspicious. We will use free 3rd party sources for reputation data, such as malwaredomainlist.com, SANS, and abuse.ch.
The system basically consists of 3 stages:
1. Monitoring of network traffic
Clients must forward their network connection logs to the security server so that the logs can be reviewed. In this study, the traffic was captured with “tcpdump” and forwarded in “.pcap” format.
2. Collecting data from 3rd party sources for reputation information
Some of the IP addresses that have been associated with malware in the past can be obtained from the list shared by “abuse.ch”. A simple script, “get_IPs.sh”, downloads this list and writes the data to the “IP_list.txt” file.
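The collection step can also be sketched in Python. This is a minimal sketch, not the original script: the function names are hypothetical, `FEED_URL` is a placeholder that must be replaced with the real list address, and the feed is assumed to be plain text with one IP address per line and `#` comment lines.

```python
import ipaddress
import urllib.request
from pathlib import Path

# Hypothetical placeholder: replace with the address of the list you use.
FEED_URL = "https://example.invalid/ipblocklist.txt"


def parse_blocklist(text):
    """Return the set of valid IP addresses found in a plain-text feed."""
    ips = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        try:
            ips.add(str(ipaddress.ip_address(line)))
        except ValueError:
            continue  # ignore anything that is not a bare IP address
    return ips


def merge_into_file(new_ips, path="IP_list.txt"):
    """Add new_ips to the file at path, keeping the entries already there.

    Returns the number of addresses that were not in the file before.
    """
    target = Path(path)
    known = parse_blocklist(target.read_text()) if target.exists() else set()
    merged = known | set(new_ips)
    target.write_text("\n".join(sorted(merged)) + "\n")
    return len(merged) - len(known)


def update_from_feed(url=FEED_URL, path="IP_list.txt"):
    """Download the feed and merge its addresses into the local list."""
    with urllib.request.urlopen(url, timeout=30) as response:
        text = response.read().decode("utf-8", errors="replace")
    return merge_into_file(parse_blocklist(text), path)
```

Merging instead of overwriting keeps addresses that have since been dropped from the feed, which matches the goal of adding new entries to the existing data.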
The list at that address must be collected daily so that new entries are added to the existing data. On Linux systems, “crontab” can run the “get_IPs.sh” script automatically every day. We use the “crontab -e” command to define a new job, and add “0 18 * * * /home/omer/get_IPs.sh” as the last line so that the script runs at 6 pm (18:00) every day. This ensures that the data from the 3rd party source is updated daily.
3. Comparison of addresses and data from sources
In this stage, we extract the IP addresses from the incoming logs, compare them with the 3rd party source data, and print a warning on the screen if there is a match. First, we prepared a function that extracts the IP addresses from the logs in “.pcap” format and collects them in a list.
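One way to write such an extraction function using only the standard library is sketched below. It is not the original code, and the name `extract_ips` is hypothetical. It assumes a classic pcap file (not pcapng) with microsecond timestamps, captured on Ethernet, and it reads only untagged IPv4 frames; VLAN-tagged and IPv6 traffic are skipped.

```python
import socket
import struct

ETHERTYPE_IPV4 = 0x0800
LINKTYPE_ETHERNET = 1


def extract_ips(pcap_path):
    """Return the unique IPv4 source/destination addresses in a pcap file."""
    ips = set()
    with open(pcap_path, "rb") as f:
        header = f.read(24)  # pcap global header
        if len(header) < 24:
            return []
        # The magic number tells us the byte order used by the writer.
        magic = struct.unpack("<I", header[:4])[0]
        if magic == 0xA1B2C3D4:
            endian = "<"
        elif magic == 0xD4C3B2A1:
            endian = ">"
        else:
            raise ValueError("not a classic pcap file")
        linktype = struct.unpack(endian + "I", header[20:24])[0]
        if linktype != LINKTYPE_ETHERNET:
            raise ValueError("only Ethernet captures are handled")
        while True:
            record = f.read(16)  # per-packet header
            if len(record) < 16:
                break
            _, _, incl_len, _ = struct.unpack(endian + "IIII", record)
            frame = f.read(incl_len)
            if len(frame) < 34:
                continue  # too short for Ethernet + IPv4 headers
            if struct.unpack("!H", frame[12:14])[0] != ETHERTYPE_IPV4:
                continue  # not IPv4 (ARP, IPv6, VLAN-tagged, ...)
            # Ethernet header is 14 bytes; IPv4 src/dst sit at offsets 12 and 16.
            ips.add(socket.inet_ntoa(frame[26:30]))
            ips.add(socket.inet_ntoa(frame[30:34]))
    return sorted(ips)
```

A packet library such as Scapy or dpkt would handle more link types and protocols; the manual parsing here only keeps the sketch free of dependencies.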
We prepared another function that reads the .txt file containing the IP addresses collected in the 2nd stage and loads them into a second list.
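A reader for the reputation file might look like the sketch below. The name `load_reputation_list` is hypothetical, and the file is assumed to hold one address per line, possibly with blank lines or `#` comments.

```python
def load_reputation_list(path="IP_list.txt"):
    """Read one IP address per line and return them as a list."""
    addresses = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            entry = line.strip()
            # Skip blank lines and comment lines.
            if entry and not entry.startswith("#"):
                addresses.append(entry)
    return addresses
```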
Then, by taking the intersection of the 2 lists, we prepared the code that determines whether there has been any communication with low-reputation IP addresses.
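The comparison itself is a set intersection. The sketch below assumes both inputs are lists of IP address strings. The function names and the warning text are illustrative, not taken from the original code.

```python
def find_low_reputation_contacts(seen_ips, low_reputation_ips):
    """Return the addresses that appear in both lists, sorted and de-duplicated."""
    return sorted(set(seen_ips) & set(low_reputation_ips))


def format_warnings(matches):
    """Build one warning line per matching address."""
    return [f"WARNING: traffic involving low-reputation IP {ip}" for ip in matches]


if __name__ == "__main__":
    # Example values only; real input comes from the logs and the reputation file.
    seen = ["10.0.0.5", "203.0.113.7", "203.0.113.7"]
    bad = ["203.0.113.7", "198.51.100.9"]
    for line in format_warnings(find_low_reputation_contacts(seen, bad)):
        print(line)
```

Converting to sets makes each lookup constant time on average, which matters once the reputation list grows to many thousands of entries.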
When we run the application, any IP address in the log records that also appears in the reputation list is printed on the screen.