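References:
Intrusion Detection and Correlation
Security Information and Event Management Implementation
Gartner Magic Quadrant for Security Information and Event Management (annual report)
Since it was my first day at work and the senior engineers were busy, I had to look up a lot of material on my own. I only skimmed the alert correlation book listed above, but here is a short summary anyway:
First, a basic concept: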
A successful intrusion into a computer goes through three stages:
1. Surveillance stage
2. Exploitation stage
3. Masquerading stage
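Below, an example is used to introduce alert correlation in terms of these three stages:
First, the overall alert correlation workflow:
Next comes our example, that is, the alerts raised when the sensors detect the attack:
Finally, one very important point: Alert Merging, an indispensable step (or tool) throughout alert correlation:
It shows up in every step described below and is the core operation of alert correlation.
Starting from Chapter 3, the book describes each step of alert correlation in detail. Here is my very rough summary: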
Chapter 4 ALERT COLLECTION
- Alert Normalization
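As the name suggests, this step converts all alert data into a standard format; the book also covers the relevant specifications and standards. In our example the attack names are inconsistent, so they are changed as follows:
Alert Preprocessing
Data preprocessing is an essential step in any data mining process. Here it mainly consists of the following parts: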
2.1 Determining the Alert Time
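Regarding time, the IDMEF standard defines the following three different classes to represent time.
① CreateTime: This is the time when the alert is created by the analyzer. This is the only timestamp considered mandatory by the IDMEF standard.
② DetectTime: This is the time when the event(s) producing an alert are detected by the analyzer. In the case of more than one event, it is the time the first event was detected.
③ AnalyzerTime: This is the time at the analyzer when the alert is sent to a correlation system (which is called manager in the IDMEF terminology).
As for which of the three timestamps to use, the book generally recommends the second (DetectTime), although another may be chosen depending on requirements and circumstances; see the book for the criteria. There is also another way to handle time: an external source such as NTP or SNTP, GPS/GOES/WWV clocks, which provides an external absolute time. This can easily cause problems of its own, for example: lost alert messages could present a problem for the correlation process. The book explains this issue; roughly, the ordering of timestamps can leave part of the information missing, so it has to be taken into account. The book also gives a solution, which is worth reading if you are interested.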
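The timestamp choice described above can be sketched as follows. This is only a minimal illustration, not the book's implementation: the function name `choose_alert_time` and the dictionary layout are assumptions made for this example; only the three IDMEF time classes come from the text.

```python
from datetime import datetime, timezone
from typing import Optional


def choose_alert_time(alert: dict) -> Optional[datetime]:
    """Pick the timestamp used for correlation.

    Prefers DetectTime (the book's usual recommendation) and falls back to
    CreateTime, the only timestamp IDMEF makes mandatory. Ignoring
    AnalyzerTime is a choice of this sketch, not a rule from the book.
    """
    return alert.get("DetectTime") or alert.get("CreateTime")


# Hypothetical timestamps, for illustration only.
created = datetime(2016, 7, 16, 8, 0, 5, tzinfo=timezone.utc)
detected = datetime(2016, 7, 16, 8, 0, 2, tzinfo=timezone.utc)

with_detect = choose_alert_time({"CreateTime": created, "DetectTime": detected})
without_detect = choose_alert_time({"CreateTime": created})
```

When DetectTime is missing, the sketch silently uses CreateTime; a real system might also record which field was used.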
2.2 Determining the Alert’s Source and Target
2.3 Determining the Attack’s Name
The results of these two steps are as follows:
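Filling in the name, source, and target can be sketched roughly as below. The name table `CANONICAL_NAMES`, the field names, and the rule of falling back to the sensor's own host when source or target is missing are all assumptions of this sketch; the book's actual rules should be checked before relying on it.

```python
# Hypothetical mapping from sensor-specific spellings to one canonical name.
CANONICAL_NAMES = {
    "portscan": "Portscan",
    "port scan": "Portscan",
    "iis exploit": "IIS Exploit",
    "iis-exploit": "IIS Exploit",
}


def preprocess(alert: dict, sensor_host: str) -> dict:
    """Return a copy of the alert with name, source and target filled in."""
    out = dict(alert)
    raw_name = str(out.get("name") or "").strip()
    # Unknown names are kept as reported rather than guessed.
    out["name"] = CANONICAL_NAMES.get(raw_name.lower(), raw_name or "unknown")
    # Assumption: an alert without source/target refers to the sensor's host.
    if not out.get("source"):
        out["source"] = sensor_host
    if not out.get("target"):
        out["target"] = sensor_host
    return out


fixed = preprocess({"name": "port scan", "target": "10.0.0.20"}, "10.0.0.5")
```

The input dictionary is copied, so the raw alert stays available for auditing.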
Chapter 5 ALERT AGGREGATION AND VERIFICATION
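This chapter covers the aggregation and verification of different alerts. It is the core of the processing pipeline, so every formula and every aggregation requirement should be understood carefully.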
Alert Fusion
The task of the alert fusion phase is to combine alerts that result from the independent detection of the same attack occurrence by different intrusion detection systems.
The goal is roughly to merge alerts that are obvious duplicates. The formula is as follows:
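A minimal sketch of such a fusion condition is shown here. It encodes only what this section states (alerts close in time and from different sensors can be fused); the attribute-equality check, the 2-second window, and all field values are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    alert_id: int
    sensor: str
    name: str
    source: str
    target: str
    time: float  # seconds since some reference point


def can_fuse(a: Alert, b: Alert, window: float = 2.0) -> bool:
    """True if two alerts look like the same attack seen by two sensors."""
    return (
        a.sensor != b.sensor            # must come from different sensors
        and a.name == b.name
        and a.source == b.source
        and a.target == b.target
        and abs(a.time - b.time) <= window
    )


# Hypothetical values modeled on the example: alert2/alert3 come from
# different sensors, alert6/alert7 from the same one.
alert2 = Alert(2, "nids-1", "Portscan", "1.2.3.4", "10.0.0.20", 100.0)
alert3 = Alert(3, "nids-2", "Portscan", "1.2.3.4", "10.0.0.20", 100.5)
alert6 = Alert(6, "hids-1", "Local Exploit", "10.0.0.20", "10.0.0.20", 300.0)
alert7 = Alert(7, "hids-1", "Local Exploit", "10.0.0.20", "10.0.0.20", 300.2)
```

With these values `can_fuse(alert2, alert3)` holds, while `can_fuse(alert6, alert7)` does not because both alerts share a sensor.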
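From the formula above: alert2 and alert3 are very close in time and were reported by different sensors, so they can be fused. alert6 and alert7 are also very close in time, but they were reported by the same sensor, so they cannot be fused. The result is as follows:
Alert Verification
The main purpose of this subsection is to verify alerts and remove those that are irrelevant or wrong, so as to form the "big picture".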
When a sensor outputs an alert, there are three possibilities:
① The sensor has correctly identified a successful attack. This alert is most likely relevant (i.e., a true positive).
② The sensor has correctly identified an attack, but the attack failed to meet its objectives. Although some sites might be interested in failed attack attempts, the alert should be differentiated from a successful instance. This kind of alert is called an irrelevant positive or noncontextual (reflecting the missing contextual information that the IDS would require to determine a failed attack).
③ The sensor incorrectly identified an event as an attack. The alert represents incorrect information (i.e., a false positive).
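So, from the above: although these are all genuine attack alerts, alert1 is an IIS exploit, which targets Windows, but here it was launched against a Linux host. It therefore belongs to the second kind of alert: the attack missed its target. (Note: Microsoft IIS is Internet Information Services, Microsoft's web server.) The result after this step is as follows: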
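The check applied to alert1 can be sketched as a lookup against an asset inventory. The tables `ASSETS` and `ATTACK_REQUIREMENTS`, the addresses, and the verdict names are hypothetical; this is an illustration of the idea under those assumptions, not the book's procedure.

```python
from enum import Enum


class Verdict(Enum):
    RELEVANT = "relevant"        # attack could have succeeded
    IRRELEVANT = "irrelevant"    # real attack, but wrong kind of target
    UNKNOWN = "unknown"          # not enough information to decide


# Hypothetical inventory and attack prerequisites.
ASSETS = {"10.0.0.20": {"os": "linux"}, "10.0.0.30": {"os": "windows"}}
ATTACK_REQUIREMENTS = {"IIS Exploit": {"os": "windows"}}


def verify(attack_name: str, target: str) -> Verdict:
    """Compare what the attack needs with what the target actually runs."""
    needs = ATTACK_REQUIREMENTS.get(attack_name)
    host = ASSETS.get(target)
    if needs is None or host is None:
        return Verdict.UNKNOWN
    if all(host.get(key) == value for key, value in needs.items()):
        return Verdict.RELEVANT
    return Verdict.IRRELEVANT


alert1_verdict = verify("IIS Exploit", "10.0.0.20")  # Linux target
```

Returning `UNKNOWN` instead of guessing keeps alerts about unlisted hosts from being discarded by mistake.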
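For the many other specific verification techniques, consult the book.
2.1 Passive Approach
2.2 Active Approach
These are two ways of obtaining the information needed for verification: the passive approach relies on information about hosts and the network that was collected in advance, while the active approach checks the target after an alert has been raised. The details are in the book.
Attack Thread Reconstruction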
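Put simply, this step reconstructs attack threads, that is, series of alerts from one source against one target. It relies mainly on the formula below: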
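Grouping alerts into threads can be sketched as follows, assuming a thread is a run of alerts with the same source and target in which consecutive alerts are at most `window` seconds apart. The window size and the field names are assumptions of this sketch.

```python
def build_threads(alerts, window=120.0):
    """Group alerts by (source, target), splitting on long time gaps."""
    threads = []
    current = {}  # (source, target) -> most recent thread for that pair
    for alert in sorted(alerts, key=lambda a: a["time"]):
        key = (alert["source"], alert["target"])
        thread = current.get(key)
        if thread is not None and alert["time"] - thread[-1]["time"] <= window:
            thread.append(alert)
        else:
            thread = [alert]
            current[key] = thread
            threads.append(thread)
    return threads


# Hypothetical alerts: two against host B, one against host C.
sample = [
    {"id": 1, "source": "A", "target": "B", "time": 0.0},
    {"id": 2, "source": "A", "target": "B", "time": 30.0},
    {"id": 3, "source": "A", "target": "C", "time": 40.0},
]
threads = build_threads(sample)
thread_ids = [[a["id"] for a in t] for t in threads]
```

Here `thread_ids` is `[[1, 2], [3]]`: the first two alerts share source and target, the third has a different target.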
Based on the source and target of the attacks, we get the following result:
Attack Session Reconstruction
This step puts alerts that are logically connected together. The formula is as follows:
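One way to picture such a link is sketched below. It assumes that a network-based alert is tied to a host-based alert when the host alert comes from the attacked machine, involves a service listening on the attacked port, and follows shortly afterwards. These conditions, the field names, and the 60-second window are assumptions for illustration and may differ from the book's formula.

```python
def link_sessions(network_alerts, host_alerts, window=60.0):
    """Return (network_id, host_id) pairs that appear to belong together."""
    links = []
    for net in network_alerts:
        for host in host_alerts:
            same_machine = host["host"] == net["target"]
            same_port = net["target_port"] in host["listening_ports"]
            soon_after = 0.0 <= host["time"] - net["time"] <= window
            if same_machine and same_port and soon_after:
                links.append((net["id"], host["id"]))
    return links


# Hypothetical alerts: a web attack on port 80, then a host-level alert.
net_alerts = [{"id": 5, "target": "10.0.0.20", "target_port": 80, "time": 200.0}]
hst_alerts = [
    {"id": 6, "host": "10.0.0.20", "listening_ports": {80}, "time": 210.0},
    {"id": 9, "host": "10.0.0.30", "listening_ports": {80}, "time": 215.0},
]
links = link_sessions(net_alerts, hst_alerts)
```

Only the pair `(5, 6)` is linked, since alert 9 was raised on a different machine.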
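Based on the open-port relationship above, we get the following result: alert5 attacks through open port 80, which the earlier surveillance and scan had revealed, so these alerts are related and can also be aggregated.
Attack Focus Recognition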
The task of the attack focus recognition phase is to identify hosts that are either the source or the target of a substantial amount of attacks.
This step identifies whether an attack is one-to-many or many-to-one, for example a DDoS. It is also a fairly important step.
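A simple way to spot both shapes is to count distinct peers per host, as in this sketch. The threshold of 3 and the `(source, target)` pair format are assumptions chosen for the example.

```python
from collections import defaultdict


def find_attack_foci(pairs, threshold=3):
    """Return (one_to_many_sources, many_to_one_targets).

    pairs is an iterable of (source, target) tuples taken from alerts.
    """
    targets_by_source = defaultdict(set)
    sources_by_target = defaultdict(set)
    for source, target in pairs:
        targets_by_source[source].add(target)
        sources_by_target[target].add(source)
    one_to_many = {s for s, ts in targets_by_source.items() if len(ts) >= threshold}
    many_to_one = {t for t, ss in sources_by_target.items() if len(ss) >= threshold}
    return one_to_many, many_to_one


# Hypothetical DDoS-like pattern: three sources hit the same victim.
pairs = [("bot1", "victim"), ("bot2", "victim"), ("bot3", "victim"), ("bot1", "other")]
scanners, victims = find_attack_foci(pairs)
```

In this data `victims` contains only `"victim"` and `scanners` is empty, because no single source reaches three distinct targets.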
Chapter 6 HIGH-LEVEL ALERT STRUCTURES
Multistep Correlation
Multistep correlation is used to identify high-level attack patterns that are composed of several individual attacks.
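In other words, the stages of an attack are no longer analyzed in isolation; they are organized into an attack chain so that the high-level attack can be recognized.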
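Matching such a chain can be sketched as an ordered search over time-sorted alerts. The scenario used here (scan, break-in, privilege escalation) is the one assumed for the example in this section; the `kind` labels, field names, and the simple first-match strategy are assumptions of the sketch.

```python
SCENARIO = ["scan", "break-in", "privilege-escalation"]


def match_scenario(alerts, scenario=SCENARIO):
    """Return the ids of alerts matching the scenario steps in time order,
    or None if some step never occurs after the previous one."""
    matched = []
    step = 0
    for alert in sorted(alerts, key=lambda a: a["time"]):
        if step < len(scenario) and alert["kind"] == scenario[step]:
            matched.append(alert["id"])
            step += 1
    return matched if step == len(scenario) else None


# Hypothetical alerts against one target.
chain = [
    {"id": 1, "kind": "scan", "time": 10.0},
    {"id": 2, "kind": "break-in", "time": 20.0},
    {"id": 3, "kind": "privilege-escalation", "time": 30.0},
]
full_match = match_scenario(chain)
partial_match = match_scenario(chain[:2])
```

`full_match` is `[1, 2, 3]`, while `partial_match` is `None` because the escalation step is missing.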
In our example, alerts 10 and 11 form a connected attack chain, so they can be aggregated. The result is below:
For our example, we assume that a multistep scenario has been defined that includes a scanning step, a break-in step, and an escalation of privileges step.
Impact Analysis
In the impact analysis phase, the effects of an attack on the proper operation of the network are determined. The idea of this phase is to analyze alerts (and the corresponding attacks) with regard to the context in which they occur. In all previous steps, alerts are evaluated independently of their environment. Only the internal information that is stored within an alert is used to determine whether it can be aggregated with others into higher-level meta-alerts. A notable exception is the alert verification phase, which includes external information to check whether it is possible for a certain attack to have succeeded or not.
Alert Prioritizing
As the name suggests, this step ranks alerts. In our example, only the complete attack chain counts as high priority; everything else is low priority.
Alert Sanitization
(sanitization: the dictionary gives "environmental hygiene" for this word - -)
The process of removing or disguising sensitive information from data is called sanitization. There are two forms of sanitization, anonymizing and pseudonymizing sanitization.
The sanitization process should, besides preserving the properties required for analysis, also satisfy the following requirements:
(1) Understand structure of data. Extract only the useful attributes.
(2) Preserve data integrity.
(3) Operate fail-safe. Handle sensitive data with care.
Those are the requirements for alert sanitization. The book describes how to carry it out, along with several automated approaches that are worth exploring in depth, but keep in mind that automation has its own trade-offs.
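A pseudonymizing sanitizer that follows the three requirements might look like the sketch below. Using a keyed HMAC for pseudonyms, the field lists, and the `host-` prefix are choices made for this example and are not taken from the book.

```python
import hashlib
import hmac

SENSITIVE_FIELDS = ("source", "target")   # replaced by pseudonyms
SAFE_FIELDS = ("name", "time")            # copied unchanged


def pseudonymize(value: str, key: bytes) -> str:
    """Map a value to a stable pseudonym; equal inputs give equal outputs."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "host-" + digest[:12]


def sanitize(alert: dict, key: bytes) -> dict:
    """Keep known-safe fields, pseudonymize sensitive ones, drop the rest."""
    clean = {}
    for field, value in alert.items():
        if field in SENSITIVE_FIELDS:
            clean[field] = pseudonymize(str(value), key)
        elif field in SAFE_FIELDS:
            clean[field] = value
        # Fail-safe: any field not explicitly listed is left out.
    return clean


secret = b"example-key-change-me"  # hypothetical key
raw = {"name": "Portscan", "time": 100.0, "source": "1.2.3.4",
       "target": "10.0.0.20", "payload": "GET /etc/passwd"}
shared = sanitize(raw, secret)
```

Because the same address always maps to the same pseudonym, alerts can still be correlated by source and target after sanitization.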
Chapter 7 LARGE-SCALE CORRELATION
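First, the model for large-scale correlation:
Below is the Hierarchical Correlation Schema:
This chapter is mainly about correlation in today's large-scale distributed systems. Again, this is only a very rough summary.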
Pattern Specification
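This section is mainly about specifying patterns.
1.1 Definitions
Unify the definitions used across different machines and different subsystems.
A pattern is formed when the following conditions are met. Pattern P is valid, if the following properties hold: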
1 Each set of alerts is at least linked to one other set.
2 Every set except one (called the root set) contains exactly one send event as the last event of the host sequence. The root set contains no send event.
3 The connection graph contains no cycles. The connection graph is built by considering each alert set as a vertex and each link between two sets as an edge between the corresponding vertices.
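The three properties can be checked mechanically, as in this sketch. The representation is an assumption made for illustration: each alert set is a list of event labels, the label `"send"` marks a send event, and links are unordered pairs of set names.

```python
def is_valid_pattern(event_sets, links):
    """Check the three validity properties for a pattern.

    event_sets: dict mapping set name -> ordered list of event labels.
    links: list of (set_a, set_b) pairs between existing sets.
    """
    # Property 1: every set is linked to at least one other set.
    linked = {name for a, b in links if a != b for name in (a, b)}
    if any(name not in linked for name in event_sets):
        return False

    # Property 2: exactly one root without send; others end with one send.
    roots = 0
    for events in event_sets.values():
        sends = [i for i, e in enumerate(events) if e == "send"]
        if not sends:
            roots += 1
        elif sends != [len(events) - 1]:
            return False
    if roots != 1:
        return False

    # Property 3: no cycles, checked with union-find over the links.
    parent = {name: name for name in event_sets}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return False
        parent[root_a] = root_b
    return True


sets = {"A": ["e1", "send"], "B": ["e2", "send"], "root": ["e3"]}
tree_ok = is_valid_pattern(sets, [("A", "root"), ("B", "root")])
cycle_bad = is_valid_pattern(sets, [("A", "root"), ("B", "root"), ("A", "B")])
```

The first pattern forms a tree rooted at `root` and passes; adding the link between `A` and `B` closes a cycle and fails property 3.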
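1.2 Attack Specification Language (ASL)
A standardized description language, for example:
![ASL](http://img.blog.csdn.net/20160716161703695)
and so on.
1.3 Language Grammar
Put simply, this is the grammar for combining words into sentences.
![Language Grammar](http://img.blog.csdn.net/20160716161942129)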
Pattern Detection
The purpose of the pattern detection process is to identify alert instances that satisfy an attack scenario written in ASL.
2.1 Basic Data Structures
Pattern Graph:
Messages: (ID, timestamp, list of (attribute, value)). The ID of the message is set to the identification of the node of the pattern graph. The timestamp denotes the time of occurrence of the original event and the attribute/value list holds the values of the relevant event attributes that have been taken from the original event attributes. The ID of a message defines its type. Different actual message instances with an identical ID are considered to be of the same message type.
2.2 Constraints
Constraints are divided into dynamic constraints, static constraints, and pre-agreed ones.
2.3 Detection Process
node constraints:
<0, t1, (a1,0)>, <0, t2, (a1,1)>, <1, t3, (a2,0)>, <1, t4, (a2,1)>, and <3, t5, (a3,0)>:
The pattern graph material is fairly complex; it is worth studying properly when there is time.
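The message tuples described above can be modeled with a small data class. This sketch only mirrors the (ID, timestamp, attribute/value list) layout; the class name, the concrete timestamps standing in for t1 to t5, and the grouping helper are assumptions for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Message:
    node_id: int                              # ID of the pattern graph node
    timestamp: float                          # time of the original event
    attributes: Tuple[Tuple[str, int], ...]   # (attribute, value) pairs

    @property
    def message_type(self) -> int:
        """Messages with the same ID are of the same type."""
        return self.node_id


def group_by_type(messages):
    """Collect messages per type, keeping their original order."""
    groups = defaultdict(list)
    for message in messages:
        groups[message.message_type].append(message)
    return dict(groups)


# The five example messages, with 1.0..5.0 standing in for t1..t5.
messages = [
    Message(0, 1.0, (("a1", 0),)),
    Message(0, 2.0, (("a1", 1),)),
    Message(1, 3.0, (("a2", 0),)),
    Message(1, 4.0, (("a2", 1),)),
    Message(3, 5.0, (("a3", 0),)),
]
counts = {kind: len(items) for kind, items in group_by_type(messages).items()}
```

Here `counts` is `{0: 2, 1: 2, 3: 1}`: two messages each for nodes 0 and 1, and one for node 3.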
Another very important piece is the Detection Algorithm. An example follows:
2.4 Implementation Issues
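Work through this part yourself; it felt a bit convoluted to me.
Summary:
Nothing is more exciting than learning something new. Unfortunately there is still a lot in here that I did not understand. There is still time, though, and I also need to find out which parts the job will require me to focus on. These are things to pay attention to going forward. Keep going~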