[role="xpack"]
[[ml-configuring-detector-custom-rules]]
= Customizing detectors with custom rules

<<ml-ad-rules,Custom rules>> – or _job rules_ as {kib} refers to them – enable you
to change the behavior of anomaly detectors based on domain-specific knowledge.

Custom rules describe _when_ a detector should take a certain _action_ instead
of following its default behavior. To specify the _when_, a rule uses
a `scope` and `conditions`. You can think of `scope` as the categorical
specification of a rule, while `conditions` are the numerical part.
A rule can have a scope, one or more conditions, or a combination of
scope and conditions. For the full list of specification details, see the
{ref}/ml-put-job.html#put-customrules[`custom_rules` object] in the create
{anomaly-jobs} API.
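
A rule may also combine the two. For example, the following fragment (a sketch
with hypothetical field and filter names) makes a detector skip results only
when the scoped field value is in the referenced filter _and_ the condition
holds:

[source,js]
----------------------------------
"custom_rules": [{
  "actions": ["skip_result"],
  "scope": {
    "my_field": {
      "filter_id": "my_filter"
    }
  },
  "conditions": [{
    "applies_to": "actual",
    "operator": "lt",
    "value": 10.0
  }]
}]
----------------------------------
// NOTCONSOLE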

[[ml-custom-rules-scope]]
== Specifying custom rule scope

Let us assume we are configuring an {anomaly-job} to detect DNS data
exfiltration. Our data contains the fields `subdomain` and
`highest_registered_domain`. We can use a detector that looks like
`high_info_content(subdomain) over highest_registered_domain`. If we run such a
job, it is possible that we discover a lot of anomalies on frequently used
domains that we have reason to trust. As security analysts, we are not
interested in such anomalies. Ideally, we could instruct the detector to skip
results for domains that we consider safe. Using a rule with a scope allows us
to achieve this.

First, we need to create a list of our safe domains. Such lists are called
_filters_ in {ml}. Filters can be shared across {anomaly-jobs}.

You can create a filter in **Anomaly Detection > Settings > Filter Lists** in
{kib} or by using the {ref}/ml-put-filter.html[put filter API]:

[source,console]
----------------------------------
PUT _ml/filters/safe_domains
{
  "description": "Our list of safe domains",
  "items": ["safe.com", "trusted.com"]
}
----------------------------------
// TEST[skip:needs-licence]

Now, we can create our {anomaly-job} specifying a scope that uses the
`safe_domains` filter for the `highest_registered_domain` field:

[source,console]
----------------------------------
PUT _ml/anomaly_detectors/dns_exfiltration_with_rule
{
  "analysis_config" : {
    "bucket_span": "5m",
    "detectors" : [{
      "function": "high_info_content",
      "field_name": "subdomain",
      "over_field_name": "highest_registered_domain",
      "custom_rules": [{
        "actions": ["skip_result"],
        "scope": {
          "highest_registered_domain": {
            "filter_id": "safe_domains",
            "filter_type": "include"
          }
        }
      }]
    }]
  },
  "data_description" : {
    "time_field": "timestamp"
  }
}
----------------------------------
// TEST[skip:needs-licence]

As time advances and we see more data and more results, we might encounter new
domains that we want to add to the filter. We can do that under
**Anomaly Detection > Settings > Filter Lists** in {kib} or by using the
{ref}/ml-update-filter.html[update filter API]:

[source,console]
----------------------------------
POST _ml/filters/safe_domains/_update
{
  "add_items": ["another-safe.com"]
}
----------------------------------
// TEST[skip:setup:ml_filter_safe_domains]
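
Similarly, we can remove domains that we no longer trust. For example (a
sketch; `remove_items` works like `add_items` but deletes entries from the
filter):

[source,console]
----------------------------------
POST _ml/filters/safe_domains/_update
{
  "remove_items": ["safe.com"]
}
----------------------------------
// TEST[skip:setup:ml_filter_safe_domains]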

Note that we can use any of the `partition_field_name`, `over_field_name`, or
`by_field_name` fields in the `scope`.

In the following example, we scope multiple fields:

[source,console]
----------------------------------
PUT _ml/anomaly_detectors/scoping_multiple_fields
{
  "analysis_config" : {
    "bucket_span": "5m",
    "detectors" : [{
      "function": "count",
      "partition_field_name": "my_partition",
      "over_field_name": "my_over",
      "by_field_name": "my_by",
      "custom_rules": [{
        "actions": ["skip_result"],
        "scope": {
          "my_partition": {
            "filter_id": "filter_1"
          },
          "my_over": {
            "filter_id": "filter_2"
          },
          "my_by": {
            "filter_id": "filter_3"
          }
        }
      }]
    }]
  },
  "data_description" : {
    "time_field": "timestamp"
  }
}
----------------------------------
// TEST[skip:needs-licence]

Such a detector skips results when the values of all three scoped fields are
included in the referenced filters.

[[ml-custom-rules-conditions]]
== Specifying custom rule conditions

Imagine a detector that looks for anomalies in CPU utilization. Given a machine
that is idle for long enough, a small movement in CPU utilization could result
in anomalous results where the `actual` value is quite small, for example, 0.02.
Given our knowledge about how CPU utilization behaves, we might determine that
anomalies with such small actual values are not interesting for investigation.

Let us now configure an {anomaly-job} with a rule that skips results where CPU
utilization is less than 0.20:

[source,console]
----------------------------------
PUT _ml/anomaly_detectors/cpu_with_rule
{
  "analysis_config" : {
    "bucket_span": "5m",
    "detectors" : [{
      "function": "high_mean",
      "field_name": "cpu_utilization",
      "custom_rules": [{
        "actions": ["skip_result"],
        "conditions": [
          {
            "applies_to": "actual",
            "operator": "lt",
            "value": 0.20
          }
        ]
      }]
    }]
  },
  "data_description" : {
    "time_field": "timestamp"
  }
}
----------------------------------
// TEST[skip:needs-licence]

When there are multiple conditions, they are combined with a logical `AND`. This
is useful when we want the rule to apply to a range. We create a rule with two
conditions, one for each end of the desired range.

Here is an example where a count detector skips results when the count is
greater than 30 and less than 50:

[source,console]
----------------------------------
PUT _ml/anomaly_detectors/rule_with_range
{
  "analysis_config" : {
    "bucket_span": "5m",
    "detectors" : [{
      "function": "count",
      "custom_rules": [{
        "actions": ["skip_result"],
        "conditions": [
          {
            "applies_to": "actual",
            "operator": "gt",
            "value": 30
          },
          {
            "applies_to": "actual",
            "operator": "lt",
            "value": 50
          }
        ]
      }]
    }]
  },
  "data_description" : {
    "time_field": "timestamp"
  }
}
----------------------------------
// TEST[skip:needs-licence]

[[ml-custom-rules-lifecycle]]
== Custom rules in the lifecycle of a job

Custom rules only affect results created after the rules were applied. Let us
imagine that we have configured an {anomaly-job} and it has been running for
some time. After observing its results, we decide that we can employ rules to
get rid of some uninteresting results. We can use the
{ref}/ml-update-job.html[update {anomaly-job} API] to do so. However, the rule
we added is only in effect for results created from the moment we added the
rule onwards. Past results remain unaffected.
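
For example, we could add a condition-based rule to the first detector of an
existing job along these lines (a sketch; the update replaces any
`custom_rules` previously defined for the detector at `detector_index` 0):

[source,console]
----------------------------------
POST _ml/anomaly_detectors/cpu_with_rule/_update
{
  "detectors": [{
    "detector_index": 0,
    "custom_rules": [{
      "actions": ["skip_result"],
      "conditions": [{
        "applies_to": "actual",
        "operator": "lt",
        "value": 0.20
      }]
    }]
  }]
}
----------------------------------
// TEST[skip:needs-licence]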

[[ml-custom-rules-filtering]]
== Using custom rules vs. filtering data

It might appear that using rules is just another way of filtering the data that
feeds into an {anomaly-job}. For example, a rule that skips results when the
partition field value is in a filter sounds equivalent to having a query that
filters out such documents. However, there is a fundamental difference. When the
data is filtered before reaching a job, it is as if it never existed for the
job. With rules, the data still reaches the job and affects its behavior
(depending on the rule actions).

For example, a rule with the `skip_result` action means all data is still
modeled. On the other hand, a rule with the `skip_model_update` action means
results are still created even though the model is not updated by data matched
by the rule.
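
Both actions can also be listed together when we want neither results nor model
updates for the matched data. A fragment sketch with hypothetical field and
filter names:

[source,js]
----------------------------------
"custom_rules": [{
  "actions": ["skip_result", "skip_model_update"],
  "scope": {
    "my_field": {
      "filter_id": "my_filter"
    }
  }
}]
----------------------------------
// NOTCONSOLE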