Rules
Basic Functionality
How processors process log messages is defined via configurable rules. Each rule contains a filter that is used to select log messages. Other parameters within the rules define how certain log messages should be transformed. Those parameters depend on the processor for which they were created.
Rule Files
Rules are defined as YAML objects or JSON objects. Rules can be distributed over different files or multiple rules can reside within one file. Each file contains multiple YAML documents or a JSON array of JSON objects. The YAML format is preferred, since it is a superset of JSON and has better readability.
Depending on the filter, a rule can trigger for different types of messages or just for specific log messages. In general, specific rules are applied first. Whether a rule is considered specific or generic depends on the directory in which it is located.
Further details can be found in the section for processors.
filter: 'command: execute'  # A comment
labeler:
  label:
    action:
    - execute
description: '...'

filter: 'command: "execute something"'
labeler:
  label:
    action:
    - execute
description: '...'
---
filter: 'command: "terminate something"'
labeler:
  label:
    action:
    - execute
description: '...'

{
  "filter": "command: execute",
  "labeler": {
    "label": {
      "action": ["execute"]
    }
  },
  "description": "..."
}

[
  {
    "filter": "command: execute",
    "labeler": {
      "label": {
        "action": ["execute"]
      }
    },
    "description": "..."
  },
  {
    "filter": "command: execute",
    "labeler": {
      "label": {
        "action": ["execute"]
      }
    },
    "description": "..."
  }
]
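The following Python sketch shows one way a consumer could read a JSON rule file like the ones above. The helper name load_json_rules is hypothetical and not part of any library. The sketch assumes that a file holds either a single rule object or an array of rule objects, and it only checks that each rule contains a filter, as stated in the section on basic functionality.

```python
import json


def load_json_rules(text):
    """Hypothetical helper: parse JSON rule text into a list of rule dicts."""
    parsed = json.loads(text)
    # Accept both a single rule object and an array of rule objects.
    rules = parsed if isinstance(parsed, list) else [parsed]
    for rule in rules:
        if "filter" not in rule:
            raise ValueError("every rule needs a 'filter' key")
    return rules


rule_file_text = """
[
  {
    "filter": "command: execute",
    "labeler": {"label": {"action": ["execute"]}},
    "description": "..."
  },
  {
    "filter": "command: execute",
    "labeler": {"label": {"action": ["execute"]}},
    "description": "..."
  }
]
"""

rules = load_json_rules(rule_file_text)
for rule in rules:
    print(rule["filter"])
```

Note that json.loads rejects the text if a comma between two keys is missing, so a malformed rule file fails early instead of yielding a partial rule.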
Log message field value access
All rules reference fields or field values of log messages.
This can be done via the dot notation.
To reference a nested field inside the log event, just give the whole path from the event root
to the desired field.
To reference the field "information" in the following example, use the
notation more.nested.information.
To access a specific item inside a list of the event, extend the dotted
notation with an index.
In the following example, the list element "lists" is accessed with the
notation more.nested.sometimes.1.
To select more than one element, slice the list with the pattern
start:stop:step_size, e.g. more.nested.sometimes.0:2, which returns
["inside", "lists"].
This slicing is based on native Python list slicing.
{
  "some": "data",
  "more": {
    "nested": {
      "information": "is here",
      "sometimes": ["inside", "lists", "of", "elements"]
    }
  }
}
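The dotted notation with indices and slices can be illustrated with a short Python sketch. The function resolve_dotted is a hypothetical helper written for this example and is not the implementation used by the processors. It assumes that field names contain no dots and that list positions are written as plain integers or as start:stop:step_size.

```python
def resolve_dotted(event, dotted_path):
    """Hypothetical helper: follow a dotted path through dicts and lists."""
    current = event
    for part in dotted_path.split("."):
        if isinstance(current, dict):
            current = current[part]
        elif isinstance(current, list):
            if ":" in part:
                # start:stop:step_size, an empty position means "use the default"
                bounds = [int(p) if p else None for p in part.split(":")]
                current = current[slice(*bounds)]
            else:
                current = current[int(part)]
        else:
            raise KeyError(part)
    return current


event = {
    "some": "data",
    "more": {
        "nested": {
            "information": "is here",
            "sometimes": ["inside", "lists", "of", "elements"],
        }
    },
}

print(resolve_dotted(event, "more.nested.information"))    # is here
print(resolve_dotted(event, "more.nested.sometimes.1"))    # lists
print(resolve_dotted(event, "more.nested.sometimes.0:2"))  # ['inside', 'lists']
```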
Warning
The dotted field notation is available in all processors. However, the use of indices to
access list elements is not available in the Clusterer, the Labeler and the
Pseudonymizer.
Filter
The filters are based on the Lucene query language, but contain some additional enhancements.
It is possible to filter for keys and values in log messages.
Dot notation is used to access subfields in log messages.
A filter for {'field': {'subfield': 'value'}} can be specified by
field.subfield: value.
If a key is given without a value, the filter only checks for the existence of that key.
The filter filter: field.subfield would therefore match
{'field': {'subfield': 'value'}} regardless of the value of subfield.
The special key * can be used to always match on any input.
Thus, the filter filter: * would match any input document.
The filter in the following example matches log messages in which the field
ip_address has the value 192.168.0.1.
All transformations defined by this rule are therefore applied only
to log messages that match this criterion.
This example is not complete, since rules are specific to processors and require additional options.
{ "filter": "ip_address: 192.168.0.1" }
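The matching behavior described above (key with value, key without value, and the special key *) can be sketched in Python. This is only an illustration of the described semantics, not the actual filter implementation. The names lookup and simple_filter_matches are hypothetical, and comparing values as strings is an assumption made for this sketch.

```python
_MISSING = object()  # sentinel, so that a stored None still counts as "present"


def lookup(event, dotted_key):
    """Hypothetical helper: return the value at a dotted key or _MISSING."""
    current = event
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return _MISSING
        current = current[part]
    return current


def simple_filter_matches(event, key, value=None):
    """Return True if the event satisfies a single 'key' or 'key: value' filter."""
    if key == "*":
        return True  # the special key matches any input
    found = lookup(event, key)
    if found is _MISSING:
        return False
    # Without a value only the existence of the key is checked.
    return value is None or str(found) == value


event = {"field": {"subfield": "value"}, "ip_address": "192.168.0.1"}

print(simple_filter_matches(event, "field.subfield"))             # True
print(simple_filter_matches(event, "ip_address", "192.168.0.1"))  # True
print(simple_filter_matches(event, "ip_address", "10.0.0.1"))     # False
print(simple_filter_matches({}, "*"))                             # True
```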
It is possible to use filters with field names that contain white spaces or special symbols
of the Lucene syntax. However, these characters have to be escaped with a backslash.
The filter filter: 'field.a subfield(test): value' must be escaped as
filter: 'field.a\ subfield\(test\): value'.
This escaping is only necessary within the filter.
Other references to this field do not require it.
If the file is in the JSON format, it is necessary to escape twice: once for
the filter itself and once for JSON.
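The double escaping for JSON can be checked with Python's standard json module. The field name used here is a shortened, illustrative variant with only an escaped white space. The sketch shows that a single backslash in the filter becomes two backslashes in the JSON text and that parsing the JSON text restores the single backslash.

```python
import json

# The filter as the filter parser should receive it: one backslash before the
# white space. A raw string keeps the backslash untouched in Python.
lucene_filter = r"field.a\ subfield: value"

# Serializing to JSON escapes the backslash a second time.
json_text = json.dumps({"filter": lucene_filter})
print(json_text)  # {"filter": "field.a\\ subfield: value"}

# Parsing the JSON text yields the singly escaped filter again.
restored = json.loads(json_text)["filter"]
print(restored == lucene_filter)  # True
```

In a YAML file with a single-quoted string, the backslash is taken literally, so the filter is written with one backslash only.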
Operators
A subset of Lucene query operators is supported:
NOT: Condition is not true.
AND: Connects two conditions. Both conditions must be true.
OR: Connects two conditions. At least one of them must be true.
In the following example log messages are filtered for which event_id: 1 is true and
ip_address: 192.168.0.1 is false.
This example is not complete, since rules are specific to processors and require additional options.
{ "filter": "event_id: 1 AND NOT ip_address: 192.168.0.1" }
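The meaning of the three operators can be sketched with small predicate combinators in Python. This illustrates the logic only and does not parse filter strings. All function names are hypothetical, values are compared as strings by assumption, and a missing field is treated as a condition that is not true.

```python
def has_value(key, value):
    """Condition 'key: value' on a flat event (values compared as strings)."""
    return lambda event: key in event and str(event[key]) == value


def not_(condition):
    return lambda event: not condition(event)


def and_(left, right):
    return lambda event: left(event) and right(event)


def or_(left, right):
    return lambda event: left(event) or right(event)


# event_id: 1 AND NOT ip_address: 192.168.0.1
rule_filter = and_(
    has_value("event_id", "1"),
    not_(has_value("ip_address", "192.168.0.1")),
)

print(rule_filter({"event_id": 1, "ip_address": "10.0.0.1"}))     # True
print(rule_filter({"event_id": 1, "ip_address": "192.168.0.1"}))  # False
print(rule_filter({"event_id": 2, "ip_address": "10.0.0.1"}))     # False
```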
RegEx-Filter
It is possible to use regular expressions to match values.
For this, the field that is matched with a regex pattern must be added to the optional field
regex_fields in the rule definition.
In the following example the field ip_address is defined as a regex field.
The rule filters for log messages in which the value of ip_address starts with
192.168.0.
This example is not complete, since rules are specific to processors and
require additional options.
filter: 'ip_address: "192\.168\.0\..*"'
regex_fields:
- ip_address
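The pattern from this example can be tried out with Python's re module. The sketch assumes that the pattern has to match the whole field value. Whether the filter anchors the pattern in exactly this way is an assumption, and the helper name value_matches is hypothetical.

```python
import re

# Same pattern as in the rule above: the escaped dots match literal dots,
# the trailing .* accepts any remaining characters.
pattern = re.compile(r"192\.168\.0\..*")


def value_matches(value):
    """Return True if the whole value matches the pattern."""
    return pattern.fullmatch(value) is not None


print(value_matches("192.168.0.17"))  # True
print(value_matches("192.168.1.17"))  # False
print(value_matches("192x168y0z17"))  # False, because the dots are escaped
```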