-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Add CEF processor to Ingest node #122491
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add CEF processor to Ingest node #122491
Conversation
I realize this draft is still in progress, and you likely already have plans for these items.
Additionally, there is a CEF v1 specification (our |
I asked Lee H about micro-benchmarking, and JMH is being used (see https://github.com/elastic/elasticsearch/tree/main/benchmarks#elasticsearch-microbenchmark-suite). So this could add a benchmark under that suite of tests. |
Will this be comparable to the microbenchmarking that is done in the beats processor? |
@joegallo Your opinion on this please? cc: @andrewkroh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My read of this is that you are not following the spec for newlines and carriage returns. (edit: I've elaborated in #122491 (review)).
I would like a corpus of example cef messages added as plain text test fixtures files so that java string escaping and formatting doesn't muddy the waters around the correct behavior. (edit: see my next comment, I want plain text files for some of these test inputs so that it's easier to talk about what bytes are contained in the files, and therefore what those should correspond to in the input and output of the cef processor, without also having to keep the java string literal escaping rules in mind at the same time, since there's overlap between the two). |
|
||
public void testEscapedMessage() { | ||
String message = "CEF:0|security\\compliance|threat\\|->manager|1.0|100|message contains escapes|10|" | ||
+ "spt=1232 msg=Newlines in messages\\\nare allowed.\\\r\\\nAnd so are carriage feeds\\\\newlines\\\\\\=. dpt=4432"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This string in java contains the following substrings, but I'm going to mentally unescape with the java string escaping rules:
messages{BACKSLASH}{CARRIAGERETURN}are allowed
allowed.{BACKSLASH}{LINEFEED}{BACKSLASH}{CARRIAGERETURN}And
That's not the escaping that's written into the spec. My read of the spec is that the existence of actual carriage return and linefeed characters in the input is entirely irrelevant (indeed perhaps non-existent in well formed inputs?) -- rather, the spec says that if you see an actual backlash character followed by an actual n
or r
character then those need to be treated as being carriage returns and linefeeds.
So here the test and the code are checking the wrong thing, or at least that's my read of the code and the java escaping and the cef escaping and the spec.
It's entirely possible that the existing beats implementation has these same bugs, I haven't checked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, a reasonable implementation would be to treat carriage returns and line feeds as allowed (maybe they're bad inputs, but we'll pass them through into the output unchanged), but then I question whether the preceding unescaped backslash should have been allowed, or treated as an error, etc...
That is, for example, is this test intended to check how the cef processor handles {BACKSLASH}n
, {CARRIAGERETURN}
, or {BACKSLASH}{CARRIAGERETURN}
? My read of the spec is that we're supposed to handle the first case (and I'm not sure we do), that the second case in undefined (and I think we might allow it?, which would be fine, I suppose), and that the third case is probably an illegal escape and should be rejected (but what we do right now is accept it and emit a carriage return).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
7c8a373 should handle things in a better way..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then there is this scenario of having escaped{BACKSLASH}
followed by character n
.. Meaning the string actually needs a {BACKSLASH}
character and n
needs to be processed as n
.
I tried this approach -35a307d
Please suggest otherwise
Closes - #126201
This PR creates a new CEF ingest node processor. The CEF processor converts a Common Event Format logs into a JSON structure. This processor also maps relevant CEF fields to ECS mappings without a need for additional processors in Ingest pipeline
Encoding rules from the spec
Ensure the following when encoding symbols in CEF:
<space>
.that the pipes in the extension do not need escaping. For example:
Sep 19 08:26:10 host CEF:0|security|threatmanager|1.0|100|detected a
| in message|10|src=10.0.0.1 act=blocked a | dst=1.1.1.1
another backslash (). For example:
Sep 19 08:26:10 host CEF:0|security|threatmanager|1.0|100|detected a
\ in packet|10|src=10.0.0.1 act=blocked a \ dst=1.1.1.1
Equal signs in the header need no escaping. For example:
Sep 19 08:26:10 host CEF:0|security|threatmanager|1.0|100|detected a =
in message|10|src=10.0.0.1 act=blocked a = dst=1.1.1.1
Note that multiple lines are only allowed in the value part of the extensions. For
example:
Sep 19 08:26:10 host CEF:0|security|threatmanager|1.0|100|Detected a
threat. No action needed.|10|src=10.0.0.1 msg=Detected a threat.\n No
action needed
Example
An example CEF parsing would look like
CEF LOG
Parsed CEF content
gradle check
?