Introduction
The Internet is developing rapidly and making people's lives more convenient. As people come to rely on online services, protecting information effectively becomes essential: its integrity, privacy, and availability must all be taken into account. As network security grows in importance, security products such as firewalls, intrusion detection systems, vulnerability scanning systems, and update service systems continue to appear, and large volumes of security data, such as router logs, syslog, and host logs, are available for auditing. Yet even as more security measures are adopted, network security incidents have not decreased. This is partly due to the ever-expanding scale of the Internet, but there is no doubt that the network security situation is becoming more serious.
The security guarantee of an information system is a defense system comprising four levels: protection, detection, reaction, and recovery (NURBOL, 2010). An IDS (Intrusion Detection System) is a system that performs intrusion detection. It is responsible for "supervision and early warning": by collecting data from system programs, operating systems, network packets, applications, and so on, it discovers behavior that endangers system security or violates security policies. Unlike a general information system, an intrusion detection system requires complete data collection to enforce its security policy. It must sometimes process a large number of warnings, which demands high computing performance. However, the quality of the security data produced by intrusion detection technology, at home and abroad, is very low, and a large amount of manual analysis is required.
Many IDS products exist today, but their basic principles are the same. An IDS is mainly divided into three modules: packet sniffing, the alert detection engine, and alert reporting.
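The three modules can be illustrated with a minimal sketch. This is not the design of any particular IDS product: the names `Packet`, `Signature`, `Alert`, `sniff`, `detect`, and `report` are hypothetical, packets are replayed from an in-memory list rather than captured from a network interface, and a signature is reduced to a simple predicate on the packet.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List


@dataclass(frozen=True)
class Packet:
    # Hypothetical, heavily simplified view of a captured packet.
    src: str
    dst: str
    dst_port: int
    payload: bytes


@dataclass(frozen=True)
class Signature:
    # A signature is modeled as a name plus a predicate on a packet.
    name: str
    matches: Callable[[Packet], bool]


@dataclass(frozen=True)
class Alert:
    signature: str
    src: str
    dst: str


def sniff(packets: Iterable[Packet]) -> Iterator[Packet]:
    """Module 1: packet sniffing (here it only replays the given packets)."""
    yield from packets


def detect(packets: Iterable[Packet],
           signatures: List[Signature]) -> Iterator[Alert]:
    """Module 2: detection engine, one alert per (packet, matching signature)."""
    for pkt in packets:
        for sig in signatures:
            if sig.matches(pkt):
                yield Alert(sig.name, pkt.src, pkt.dst)


def report(alerts: Iterable[Alert]) -> List[str]:
    """Module 3: alert reporting, formatted as plain text lines."""
    return [f"[ALERT] {a.signature}: {a.src} -> {a.dst}" for a in alerts]


# Illustrative input: two packets and one made-up signature.
packets = [
    Packet("10.0.0.5", "10.0.0.1", 80, b"GET /etc/passwd"),
    Packet("10.0.0.6", "10.0.0.1", 443, b"hello"),
]
signatures = [
    Signature("passwd-access", lambda p: b"/etc/passwd" in p.payload),
]
lines = report(detect(sniff(packets), signatures))
```

Because the detection engine emits one alert for every packet that matches a signature, repeated matching traffic translates directly into repeated alerts, which is one reason alert volume grows so quickly.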
Many problems surrounding intrusion detection systems remain to be solved: signature generation, measurement of attack detection performance, alert analysis, and so on. Alert analysis in particular has been a research hot spot since 2000. Experience with existing security products shows that no single product can meet users' security requirements. Firewalls cannot prevent unknown security incidents, the alerts generated by intrusion detection systems suffer from serious false positives and false negatives, and the volume of data from the various security data sources is beyond what humans can process. The amount of security data generated in a large-scale network is huge: a 100 Mbps access network can often generate more than one hundred thousand alerts per hour (Li Dong et al., 2009). Among the large number of alerts generated every day, actual security incidents are usually overwhelmed by redundant alerts (i.e., false alerts). Many techniques have been applied to analyzing IDS alerts, including fuzzy theory, information theory, statistics, data mining, machine learning (Shudong Li et al., 2021), pattern recognition, and artificial intelligence; their purpose is to discover real attacks among the large number of alerts generated by the IDS. How to remove these redundant alerts in real time and improve alert quality is an urgent problem in large-scale network security protection.
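To give a sense of scale, one hundred thousand alerts per hour is roughly 28 alerts per second. The sketch below shows one very simple way to suppress repeated alerts in a single pass. It is not the method proposed in this article: the tuple layout, the function name `suppress_duplicates`, and the 60-second window are assumptions made for illustration, and the approach removes only exact repeats of the same (signature, source, destination) triple. It does not decide whether an alert is a false positive.

```python
from typing import Dict, Iterable, List, Tuple

# Assumed alert layout: (timestamp in seconds, signature, source, destination).
AlertT = Tuple[float, str, str, str]


def suppress_duplicates(alerts: Iterable[AlertT],
                        window: float = 60.0) -> List[AlertT]:
    """Keep an alert only if the same (signature, src, dst) triple has not
    been seen within the last `window` seconds.

    Assumes alerts arrive sorted by timestamp. The window slides: every
    repeat, kept or suppressed, refreshes the last-seen time of its triple.
    """
    last_seen: Dict[Tuple[str, str, str], float] = {}
    kept: List[AlertT] = []
    for ts, sig, src, dst in alerts:
        key = (sig, src, dst)
        prev = last_seen.get(key)
        if prev is None or ts - prev > window:
            kept.append((ts, sig, src, dst))
        last_seen[key] = ts
    return kept


# Illustrative, made-up alert stream.
stream = [
    (0.0, "scan", "10.0.0.5", "10.0.0.1"),
    (10.0, "scan", "10.0.0.5", "10.0.0.1"),   # repeat within the window
    (20.0, "sqli", "10.0.0.7", "10.0.0.1"),   # different signature and source
    (65.0, "scan", "10.0.0.5", "10.0.0.1"),   # 55 s after the previous repeat
    (200.0, "scan", "10.0.0.5", "10.0.0.1"),  # 135 s gap, kept again
]
kept = suppress_duplicates(stream)

# Average alert rate implied by 100,000 alerts per hour.
alerts_per_second = 100_000 / 3600
```

Each alert is handled with a single dictionary lookup, so the pass is linear in the number of alerts, which matters at these rates. The dictionary grows with the number of distinct triples, so a real deployment would also need to expire old entries.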
But how do false alerts occur? Taking the open-source intrusion detection system SNORT as an example, a signature (a set of conditions that a data packet must satisfy) can generate a large number of alerts. This brings two main problems for alert analysis: