Commit 156186d

author
Sebastian Wagner
committed
DOC: cymru cap parser: document parsing of new format
1 parent 420a8d4 commit 156186d

1 file changed: +18 -0 lines changed
intelmq/bots/parsers/cymru/parser_cap_program.py

Lines changed: 18 additions & 0 deletions
@@ -228,6 +228,24 @@ def parse_line_old(self, line, report):
         yield event
 
     def parse_line_new(self, line, report):
+        """
+        The format is the following:
+        category|address|asn|timestamp|optional_information|asninfo
+        It is therefore very similar to CSV, just with the pipe as separator.
+        category: the type (resulting in classification.*); optional_information needs to be parsed differently per category
+        address: source.ip
+        asn: source.asn
+        timestamp: time.source
+        optional_information: needs special care.
+            For some categories it needs parsing, as it contains a mapping of keys to values, where the meaning of the keys can differ between the categories.
+            For categories in MAPING_COMMENT, this field only contains one value.
+            For the category 'bruteforce' *both* situations apply.
+            Previously, the bruteforce events only had the protocol in the comment,
+            while most other categories had a mapping. Now, the bruteforce category also uses
+            the type-value syntax. So we need to support both formats, the old and the new.
+            See also https://github.com/certtools/intelmq/issues/1794
+        asninfo: source.as_name
+        """
         category, ip, asn, timestamp, notes, asninfo = line.split('|')
 
         # to detect bogus lines like 'hostname: sub.example.comport: 80'