diff --git a/CHANGELOG.md b/CHANGELOG.md index 5a5ff90258..fbab6cb6ed 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -77,6 +77,7 @@ Please refer to the [NEWS](NEWS.md) for a list of changes which have an affect o ### Documentation - Fix and refresh links to mailing lists (PR#2609 by Kamil Mańkowski) - `Aggregate Bot`: Add illustration graphics (PR#2612 by Sebastian Wagner). +- `scripts/generate-feeds-docs.py`: Remove necessity to specify placeholders for feed name and provider, generate feed code automatically (PR#2653 by Sebastian Wagner). ### Packaging - Replace `/opt/intelmq` example paths in bots with variable `VAR_STATE_PATH` for correct paths in LSB-path setups like with packages (PR#2587 by Sebastian Wagner). diff --git a/docs/admin/beta-features.md b/docs/admin/beta-features.md index 2666225f62..0ebeed7c9a 100644 --- a/docs/admin/beta-features.md +++ b/docs/admin/beta-features.md @@ -3,7 +3,6 @@ SPDX-License-Identifier: AGPL-3.0-or-later --> - # Beta Features ## Using Supervisor as a Process Manager @@ -45,7 +44,6 @@ process_manager: supervisor After this it is possible to manage bots like before with `intelmqctl` command. - ## Using AMQP Message Broker Starting with IntelMQ 1.2 the AMQP protocol is supported as message queue. To use it, install a broker, for example @@ -183,7 +181,3 @@ However, there are currently a few cavecats: queue - In the logs, you can see the main thread initializing first, then all of the threads which log with the name `[bot-id].[thread-id]`. - - - - diff --git a/docs/admin/common-problems.md b/docs/admin/common-problems.md index 9a2775631f..fd1b441aff 100644 --- a/docs/admin/common-problems.md +++ b/docs/admin/common-problems.md @@ -3,7 +3,6 @@ SPDX-License-Identifier: AGPL-3.0-or-later --> - # Common Problems ## IntelMQ @@ -42,7 +41,6 @@ errors. This section has been moved to the [Management Guide](management/intelmq.md#orphaned-queues). 
- ### Multithreading is not available for this bot Multithreading is not available for some bots and AMQP broker is @@ -60,10 +58,8 @@ support Multithreading include: If you think this mapping is wrong, please report a bug. - ## IntelMQ API - ### IntelMQCtlError If the command is not configured correctly, you will see exceptions on @@ -118,4 +114,4 @@ other tweaks. SQLite does not only need write access to the database itself, but also the folder the database file is located in. Please check that the webserver has write permissions to the folder the session file is -located in. \ No newline at end of file +located in. diff --git a/docs/admin/configuration/intelmq-api.md b/docs/admin/configuration/intelmq-api.md index bc39c08637..a790f8a15c 100644 --- a/docs/admin/configuration/intelmq-api.md +++ b/docs/admin/configuration/intelmq-api.md @@ -3,7 +3,6 @@ SPDX-License-Identifier: AGPL-3.0-or-later --> - # Configuring IntelMQ API Depending on your setup you might have to install `sudo` to make it @@ -98,4 +97,4 @@ Therefore, SELinux needs to be disabled: setenforce 0 ``` -We welcome contributions to provide SELinux policies. \ No newline at end of file +We welcome contributions to provide SELinux policies. 
diff --git a/docs/admin/configuration/intelmq-manager.md b/docs/admin/configuration/intelmq-manager.md index f48f3876dc..638c5bb1c4 100644 --- a/docs/admin/configuration/intelmq-manager.md +++ b/docs/admin/configuration/intelmq-manager.md @@ -3,7 +3,6 @@ SPDX-License-Identifier: AGPL-3.0-or-later --> - # Configuring IntelMQ Manager In the file `/usr/share/intelmq-manager/html/js/vars.js` set `ROOT` to the URL of your `intelmq-api` installation - by diff --git a/docs/admin/configuration/intelmq.md b/docs/admin/configuration/intelmq.md index ee3be55a08..e0d887ca2c 100644 --- a/docs/admin/configuration/intelmq.md +++ b/docs/admin/configuration/intelmq.md @@ -3,7 +3,6 @@ SPDX-License-Identifier: AGPL-3.0-or-later --> - # Configuring IntelMQ ## Directories @@ -156,6 +155,7 @@ Some information can as well be found in Python's documentation on the used If the path `_on_error` exists for a bot, the message is also sent to this queue, instead of (only) dumping the file if configured to do so. + ##### Pipeline **`source_pipeline_broker`** @@ -204,6 +204,7 @@ configured to do so. (required, integer) broker database that the bot will use to connect and send messages (requirement from redis broker). + ##### Miscellaneous **`load_balance`** diff --git a/docs/admin/configuration/redis.md b/docs/admin/configuration/redis.md index 5d20b9a8e8..7f0ee09086 100644 --- a/docs/admin/configuration/redis.md +++ b/docs/admin/configuration/redis.md @@ -1,4 +1,5 @@ # Redis Pipeline (Message broker) + - # Using Elasticsearch as a database for IntelMQ If you wish to run IntelMQ with Elasticsearch or full ELK stack (Elasticsearch, Logstash, Kibana) it is entirely diff --git a/docs/admin/database/mssql.md b/docs/admin/database/mssql.md index 39ed7051bc..8a1825e476 100644 --- a/docs/admin/database/mssql.md +++ b/docs/admin/database/mssql.md @@ -9,4 +9,4 @@ For MSSQL support, the library `pymssql>=2.2` is required. To output data to MSSQL use SQL Output Bot with parameter `engine` set to `mssql`. 
-For more information see SQL Output Bot documentation page. \ No newline at end of file +For more information see SQL Output Bot documentation page. diff --git a/docs/admin/database/postgresql.md b/docs/admin/database/postgresql.md index 9013b4a0bf..1e375db1bc 100644 --- a/docs/admin/database/postgresql.md +++ b/docs/admin/database/postgresql.md @@ -20,6 +20,7 @@ You have two basic choices to run PostgreSQL: ### PostgreSQL Server Version Any supported version of PostgreSQL should work (v>=13 as of January 2025) [[1]](https://www.postgresql.org/support/versioning/). + ### events table definition (`intelmq_psql_initdb`) IntelMQ comes with the `intelmq_psql_initdb` command line tool designed to help with creating the @@ -71,6 +72,7 @@ get to test if the user `intelmq` can authenticate): ```bash psql -h localhost intelmq-events intelmq < /tmp/initdb.sql ``` + ## EventDB Utilities Some scripts related to the EventDB are located in the @@ -225,4 +227,4 @@ data loss - you need to do this step manually. While null characters (`0`, not SQL "NULL") in TEXT and JSON/JSONB fields are valid, data containing null characters can cause troubles in some combinations of clients, servers and each settings. To prevent unhandled errors and data which -can't be inserted into the database, all null characters are escaped (`u0000`) before insertion. \ No newline at end of file +can't be inserted into the database, all null characters are escaped (`u0000`) before insertion. diff --git a/docs/admin/database/splunk.md b/docs/admin/database/splunk.md index b3342af2d3..f9bcc1d763 100644 --- a/docs/admin/database/splunk.md +++ b/docs/admin/database/splunk.md @@ -3,7 +3,6 @@ SPDX-License-Identifier: AGPL-3.0-or-later --> - # Sending IntelMQ events to Splunk 1. 
Go to Splunk and configure in order to be able to receive diff --git a/docs/admin/database/sqlite.md b/docs/admin/database/sqlite.md index 0faa1bfeb9..cdcc812da7 100644 --- a/docs/admin/database/sqlite.md +++ b/docs/admin/database/sqlite.md @@ -17,4 +17,4 @@ sqlite> .read /tmp/initdb.sql Then, set the `database` parameter to the `your-db.db` file path. -To output data to SQLite use SQL Output Bot with parameter `engine` set to `sqlite`. For more information see SQL Output Bot documentation page. \ No newline at end of file +To output data to SQLite use SQL Output Bot with parameter `engine` set to `sqlite`. For more information see SQL Output Bot documentation page. diff --git a/docs/admin/faq.md b/docs/admin/faq.md index 92dce512e8..d8d213de09 100644 --- a/docs/admin/faq.md +++ b/docs/admin/faq.md @@ -3,7 +3,6 @@ SPDX-License-Identifier: AGPL-3.0-or-later --> - # Frequently asked questions ## How can I improve the speed? @@ -100,4 +99,3 @@ If you installed manually via pip (note that this also deletes all configuration pip3 uninstall intelmq rm -r /opt/intelmq ``` - diff --git a/docs/admin/hardware-requirements.md b/docs/admin/hardware-requirements.md index 064bff8d57..702d43d628 100644 --- a/docs/admin/hardware-requirements.md +++ b/docs/admin/hardware-requirements.md @@ -3,7 +3,6 @@ SPDX-License-Identifier: AGPL-3.0-or-later --> - # Hardware Requirements Do you ask yourself how much RAM do you need to give your new IntelMQ @@ -11,7 +10,6 @@ virtual machine? The honest answer is simple and pointless: It depends ;) - ## IntelMQ and the messaging queue (broker) IntelMQ uses a messaging queue to move the messages between the bots. 
diff --git a/docs/admin/installation/dockerhub.md b/docs/admin/installation/dockerhub.md index 9db54343f9..2b27d0b92c 100644 --- a/docs/admin/installation/dockerhub.md +++ b/docs/admin/installation/dockerhub.md @@ -3,7 +3,6 @@ SPDX-License-Identifier: AGPL-3.0-or-later --> - # Installation from DockerHub This guide provides instruction on how to install IntelMQ and it's components using Docker. @@ -53,8 +52,6 @@ environment variables `INTELMQ_API_USER` for the username and !!! note If you get an **Permission denied** error, you should run `chown -R $USER:$USER example_config` - - ## Docker without docker-compose If not already installed, please install diff --git a/docs/admin/installation/linux-packages.md b/docs/admin/installation/linux-packages.md index e0c2e7fa0d..3135a499b8 100644 --- a/docs/admin/installation/linux-packages.md +++ b/docs/admin/installation/linux-packages.md @@ -3,7 +3,6 @@ SPDX-License-Identifier: AGPL-3.0-or-later --> - # Installation as Linux package This guide provides instructions on how to install IntelMQ and it's components from Linux distribution's package repository. diff --git a/docs/admin/installation/pypi.md b/docs/admin/installation/pypi.md index 8323a90295..fda4b02193 100644 --- a/docs/admin/installation/pypi.md +++ b/docs/admin/installation/pypi.md @@ -3,7 +3,6 @@ SPDX-License-Identifier: AGPL-3.0-or-later --> - # Installation from PyPI This guide provides instruction on how to install IntelMQ and it's components using the Python Package Index (PyPI) @@ -92,7 +91,6 @@ sudo -u intelmq /opt/intelmq/venv/bin/pip install intelmq intelmq-api intelmq-ma sudo /opt/intelmq/venv/bin/intelmqsetup ``` - ## Installing IntelMQ API (optional) The `intelmq-api` packages ships: @@ -165,4 +163,4 @@ This file needs to be placed in the correct place for your Apache 2 installation - On Debian and Ubuntu, the file needs to be placed at `/etc/apache2/conf-available.d/manager-apache.conf` and then execute `a2enconf manager-apache`. 
- On CentOS, RHEL and Fedora, the file needs to be placed at `/etc/httpd/conf.d/` and reload the webserver. -- On openSUSE, the file needs to be placed at `/etc/apache2/conf.d/` and reload the webserver. \ No newline at end of file +- On openSUSE, the file needs to be placed at `/etc/apache2/conf.d/` and reload the webserver. diff --git a/docs/admin/integrations/cifv3.md b/docs/admin/integrations/cifv3.md index 1fcfa7bdcb..bb49a2b681 100644 --- a/docs/admin/integrations/cifv3.md +++ b/docs/admin/integrations/cifv3.md @@ -13,4 +13,4 @@ indicators. CIFv3 can correlate indicators via the UUID attribute. Can be used to submit indicators to a CIFv3 instance by using the [CIFv3 API](https://github.com/csirtgadgets/bearded-avenger-deploymentkit/wiki/REST-API). -Look at the CIFv3 API Output Bot for more information. \ No newline at end of file +Look at the CIFv3 API Output Bot for more information. diff --git a/docs/admin/integrations/misp.md b/docs/admin/integrations/misp.md index 4c955d3f82..a13de3dc96 100644 --- a/docs/admin/integrations/misp.md +++ b/docs/admin/integrations/misp.md @@ -48,4 +48,3 @@ Can be used to directly create MISP events in a MISP instance by using the [MISP API](https://misp.gitbooks.io/misp-book/content/automation/). Look at the Bots documentation page for more information. - diff --git a/docs/admin/intro.md b/docs/admin/intro.md index f144e408e3..16f6195499 100644 --- a/docs/admin/intro.md +++ b/docs/admin/intro.md @@ -3,7 +3,6 @@ SPDX-License-Identifier: AGPL-3.0-or-later --> - # Intro This guide provides instructions on how to install, configure and manage IntelMQ and it's components. 
diff --git a/docs/admin/management/intelmq-api.md b/docs/admin/management/intelmq-api.md index 335b75e2f3..80606d6be6 100644 --- a/docs/admin/management/intelmq-api.md +++ b/docs/admin/management/intelmq-api.md @@ -3,15 +3,12 @@ SPDX-License-Identifier: AGPL-3.0-or-later --> - # Managing IntelMQ API - ## Running - For development purposes and testing you can run directly using `hug`: ```bash hug -m intelmq_api.serve -``` \ No newline at end of file +``` diff --git a/docs/admin/upgrade.md b/docs/admin/upgrade.md index 20092220f7..f50baa54cf 100644 --- a/docs/admin/upgrade.md +++ b/docs/admin/upgrade.md @@ -3,7 +3,6 @@ SPDX-License-Identifier: AGPL-3.0-or-later --> - # Upgrade instructions In order to upgrade your IntelMQ installation it is recommended to follow these five steps: diff --git a/docs/dev/adding-feeds.md b/docs/dev/adding-feeds.md index 78c0056f77..83ee022347 100644 --- a/docs/dev/adding-feeds.md +++ b/docs/dev/adding-feeds.md @@ -1,9 +1,8 @@ - # Adding Feeds Adding a feed doesn't necessarily require any programming experience. There are several collector and parser bots intended for general use. Depending on the data source you are trying to add as a feed, it might be only a matter of creating a working combination of collector bot (such as URL Fetcher) configuration and a parser bot (such as CSV parser) configuration. When you are satisfied with the configurations, add it to the `intelmq/etc/feeds.yaml` file using the following template and open a [pull request](https://github.com/certtools/intelmq/pulls)! @@ -31,12 +30,123 @@ Adding a feed doesn't necessarily require any programming experience. There are If the data source utilizes some unusual way of distribution or uses a custom format for the data it might be necessary to develop specialized bot(s) for this particular data source. Always try to use existing bots before you start developing your own. Please also consider extending an existing bot if your use-case is close enough to it's features. 
If you are unsure which way to take, start an [issue](https://github.com/certtools/intelmq/issues) and you will receive guidance. +## Howto + +### Choosing the collector + +### Choosing the parser + +### Classification + +### Other static fields + +* Feed accuracy +* TLP +* Event Description + * Target + * Text + * URL +* Protocol + * Application Protocol + * Transport Protocol + +## Example Feeds + +### Simple List + +As an example, let's add the - very simple - feed *Toxic IP Addresses (CIDR)* by StopForumSpam to the documentation. The data URL is https://www.stopforumspam.com/downloads/toxic_ip_cidr.txt and contains a list of IP Network Ranges in CIDR notation, separated by newlines. + +As the resource is available via HTTP, we will use the [HTTP Collector](../user/bots.md#intelmq.bots.collectors.http.collector_http) for the data retrieval and [Generic CSV Parser](../user/bots.md#intelmq.bots.parsers.generic.parser_csv) for parsing. +For the collector, we only specify the module to use (the HTTP collector, as seen on the bots documentation), an estimate on the feed accuracy (as it is a blacklist, not 100%, but still reasonably high), the resource URL to download and the rate limit of 1 hour, as there might be frequent updates. + +For the parser we again specify the module name and the required parameter (columns) to map the input data field to the IntelMQ field `source.network`. Further we add some static field values which are equal for all data lines. 
+ +```yaml +Stop Forum Spam: + Toxic IP Addresses: + description: IP Networks that are believed will only ever be used for abuse + documentation: https://www.stopforumspam.com/downloads + revision: 2025-09-21 + public: true + bots: + collector: + module: intelmq.bots.collectors.http.collector_http + parameters: + accuracy: 80 + http_url: https://www.stopforumspam.com/downloads/toxic_ip_cidr.txt + rate_limit: 3600 + parser: + module: intelmq.bots.parsers.generic.parser_csv + parameters: + columns: source.network + default_fields: + classification.type: blacklist + protocol.application: http + protocol.transport: tcp + event_description.target: web forums + event_description.text: web forum spam + event_description.url: https://www.stopforumspam.com/ + tlp: white +``` + +### TSV document + +As a next example, let's add a feed for https://hole.cert.pl/domains/v2/domains.csv (7 MB). +Contrary to its file name ending, the separator is not a comma, but a tab character. +The file contains four columns: +``` +PozycjaRejestru AdresDomeny DataWpisu DataWykreslenia +285107 0-1-x.06215785.xyz 2025-04-02T09:02:19+00:00 +332655 d15k2d11r6t6rl.cloudfront.net 2025-06-12T17:06:08+00:00 2025-06-13T13:54:55+00:00 +[...] +``` + +The feed's description is at https://cert.pl/en/warning-list/ and it says the list of blocked domains is updated about every 5 minutes. In IntelMQ we usually don't need such high refresh rates, but setting it to half an hour is reasonable for most use cases. +The list is composed automatically and contains domains for warnings, so the accuracy is lower. +As the description says the listed domains are websites, we can again assume the protocol is HTTP/TCP. Although the list is about phishing websites, its use case is a warning/blacklist and therefore the classification is blacklist. In the event description we explain the kind of blacklist. +The most crucial part is the mapping of the columns to IntelMQ fields. In this case, they are given in Polish. 
+- `PozycjaRejestru`: Position in the Register. We do not need this in IntelMQ, so we save it as `extra.certpl_register`. +- `AdresDomeny`: The domain address, lands in `source.fqdn`. This is the information we care about. +- `DataWpisu`: The date of entry, and +- `DataWykreslenia`: The date of deletion + - This is a tricky situation, as we have no clear indication of the time at which the information is current. Based on the feed description, if the deletion date is not present, the time of fetching the data (`time.observation`) is closest to the meaning of `time.source`. + - To set `time.source` accordingly, a custom parser or a downstream expert would be required instead of the Generic CSV Parser. + - For simplicity, we map these columns to `extra.first_seen` and `extra.expiration_date`. Both fields are already in use by other bots and feeds. + +```yaml +CERT.PL: + Hole Domains v2: + description: Dangerous websites Warning List + documentation: https://cert.pl/en/warning-list/ + revision: 2025-09-23 + public: true + bots: + collector: + module: intelmq.bots.collectors.http.collector_http + parameters: + accuracy: 50 + rate_limit: 1800 + http_url: https://hole.cert.pl/domains/v2/domains.csv + parser: + module: intelmq.bots.parsers.generic.parser_csv + parameters: + columns: extra.certpl_register,source.fqdn,extra.first_seen,extra.expiration_date + default_fields: + classification.type: blacklist + protocol.application: http + protocol.transport: tcp + event_description.target: users + event_description.text: phishing + event_description.url: https://cert.pl/en/warning-list/ + tlp: white +``` + ## Feeds Wishlist This is a list with potentially interesting data sources, which are either currently not supported or the usage is not clearly documented in IntelMQ. If you want to **contribute** new feeds to IntelMQ, this is a great place to start! !!! note - Some of the following data sources might better serve as an expert bot for enriching processed events. 
+ Some of the following data sources might also serve as an expert bot for enriching processed events. - Lists of feeds: - [threatfeeds.io](https://threatfeeds.io) @@ -49,6 +159,7 @@ This is a list with potentially interesting data sources, which are either curre - Some third party intelmq bots: [NRDCS IntelMQ fork](https://github.com/NRDCS/intelmq/tree/certlt/intelmq/bots) - List of potentially interesting data sources: - [Abuse.ch SSL Blacklists](https://sslbl.abuse.ch/blacklist/) + - [aa419 Fake Banks List](https://db.aa419.org/fakebankslist.php) - [AbuseIPDB](https://www.abuseipdb.com/pricing) - [Adblock Plus](https://adblockplus.org/en/subscriptions) - [apivoid IP Reputation API](https://www.apivoid.com/api/ip-reputation/) @@ -76,7 +187,7 @@ This is a list with potentially interesting data sources, which are either curre - [Google Webmaster Alerts](https://www.google.com/webmasters/) - [GPF Comics DNS Blacklist](https://www.gpf-comics.com/dnsbl/export.php) - [Greensnow](https://blocklist.greensnow.co/greensnow.txt) - - [Greynoise](https://developer.greynoise.io/reference/community-api) + - [Greynoise](https://docs.greynoise.io/docs/using-the-greynoise-community-api) - [HP Feeds](https://github.com/rep/hpfeeds) - [IBM X-Force Exchange](https://exchange.xforce.ibmcloud.com/) - [ImproWare AntiSpam](https://antispam.imp.ch/) @@ -84,9 +195,10 @@ This is a list with potentially interesting data sources, which are either curre - [James Brine](https://jamesbrine.com.au/) - [Joewein](http://www.joewein.net) - Maltrail: - - [Malware](https://github.com/stamparm/maltrail/tree/master/trails/static/images/malware) - - [Suspicious](https://github.com/stamparm/maltrail/tree/master/trails/static/images/suspicious) - - [Mass Scanners](https://github.com/stamparm/maltrail/blob/master/trails/static/images/mass_scanner.txt) + - [Malware](https://github.com/stamparm/maltrail/tree/master/trails/static/malware) + - 
[Suspicious](https://github.com/stamparm/maltrail/tree/master/trails/static/suspicious) + - [Malicious](https://github.com/stamparm/maltrail/tree/master/trails/static/malicious) + - [Mass Scanners](https://github.com/stamparm/maltrail/blob/master/trails/static/mass_scanner.txt) (for whitelisting) - [Malshare](https://malshare.com/) - [MalSilo Malware URLs](https://malsilo.gitlab.io/feeds/dumps/url_list.txt) @@ -110,7 +222,7 @@ This is a list with potentially interesting data sources, which are either curre - [SANS ISC](https://isc.sans.edu/api/) - [ShadowServer Sandbox API](http://www.shadowserver.org/wiki/pmwiki.php/Services/Sandboxapi) - [Shodan search API](https://shodan.readthedocs.io/en/latest/tutorial.html#searching-shodan) - - [Snort](http://labs.snort.org/feeds/ip-filter.blf) + - [Snort](https://www.snort.org/downloads/ip-block-list) - [stopforumspam Toxic IP addresses and domains](https://www.stopforumspam.com/downloads) - [Spamhaus Botnet Controller List](https://www.spamhaus.org/bcl/) - [SteveBlack Hosts File](https://github.com/StevenBlack/hosts) diff --git a/docs/dev/bot-development.md b/docs/dev/bot-development.md index 2aba2adcab..caaf0c080b 100644 --- a/docs/dev/bot-development.md +++ b/docs/dev/bot-development.md @@ -1,95 +1,21 @@ +# Bot Development Guide -# Bot Development +This guide will show you all the necessary steps to develop a new bot for IntelMQ. -Here you should find everything you need to develop a new bot. - -## Steps - -1. Create appropriately placed and named python file. -2. Use correct parent class. -3. Code the functionality you want (with mixins, inheritance, etc). -4. Create appropriately placed test file. -5. Prepare code for testing your bot. -6. Add documentation for your bot. -7. Add changelog and news info. 
- -## Layout Rules - -``` -intelmq/ - lib/ - bot.py - cache.py - message.py - pipeline.py - utils.py - bots/ - collector/ - / - collector.py - parser/ - / - parser.py - expert/ - / - expert.py - output/ - / - output.py - etc/ - runtime.yaml -``` - -Assuming you want to create a bot for a new 'Abuse.ch' feed. It turns out that here it is necessary to create different -parsers for the respective kind of events (e.g. malicious URLs). Therefore, the usual hierarchy `intelmq/bots/parser//parser.py` would not be suitable because it is necessary to have more parsers for each Abuse.ch Feed. The solution is to use the same hierarchy with an additional "description" in the file name, separated by underscore. Also see the section *Directories and Files naming*. - -Example (including the current ones): - -``` -/intelmq/bots/parser/abusech/parser_domain.py -/intelmq/bots/parser/abusech/parser_ip.py -/intelmq/bots/parser/abusech/parser_ransomware.py -/intelmq/bots/parser/abusech/parser_malicious_url.py -``` - -#### Directories Hierarchy on Default Installation - -- Configuration Files Path: `/opt/intelmq/etc/` -- PID Files Path: `/opt/intelmq/var/run/` -- Logs Files and dumps Path: `/opt/intelmq/var/log/` -- Additional Bot Files Path, e.g. templates or databases: - `/opt/intelmq/var/lib/bots/[bot-name]/` - -#### Directories and Files naming - -Any directory and file of IntelMQ has to follow the Directories and Files naming. Any file name or folder name has to: - -- be represented with lowercase and in case of the name has multiple words, the spaces between them must be removed or replaced by underscores -- be self-explaining what the content contains. - -In the bot directories name, the name must correspond to the feed provider. If necessary and applicable the feed name can and should be used as postfix for the filename. 
- -Examples: - -``` -intelmq/bots/parser/taichung/parser.py -intelmq/bots/parser/cymru/parser_full_bogons.py -intelmq/bots/parser/abusech/parser_ransomware.py -``` - - -## Guide +## Placing and naming ### Naming your bot class Class name of the bot (ex: PhishTank Parser) must correspond to the type of the bot (ex: Parser) e.g. `PhishTankParserBot` +## Coding + ### Choosing the parent class Please use the correct bot type as parent class for your bot. The `intelmq.lib.bot` module contains the following classes: @@ -205,6 +131,16 @@ and provides the methods: - `cache_flush` - `cache_get_redis_instance` +#### Cache + +Bots can use a Redis database as a cache instance. Use the `intelmq.lib.cache.Cache` class to set this up and/or look at existing bots, like the `cymru_whois` expert, to see how the cache can be used. Bots must set a TTL for all keys that are cached to avoid caches growing endlessly over time. Bots must use the Redis databases >= 10, but not those already used by other bots. Look at `find intelmq -type f -name '*.py' -exec grep -r 'redis_cache_db' {} +` to see which databases are already used. + +The databases < 10 are reserved for the IntelMQ core: + +- 2: pipeline +- 3: statistics +- 4: tests + ### Pipeline Interactions We can call three methods related to the pipeline: @@ -286,6 +222,8 @@ self.logger.debug('Connecting to %r.', host) The bot class itself has error handling implemented. The bot itself is allowed to throw exceptions and **intended to fail**! The bot should fail in case of malicious messages, and in case of unavailable but necessary resources. The bot class handles the exception and will restart until the maximum number of tries is reached and fail then. Additionally, the message in question is dumped to the file `/opt/intelmq/var/log/[bot-id].dump` and removed from the queue. +## Configuration and parameter handling + ### Initialization Maybe it is necessary so setup a Cache instance or load a file into memory. 
Use the `init` function for this purpose: @@ -409,7 +347,9 @@ BOT = MyParserBot One line can lead to multiple events, thus `parse_line` can't just return one Event. Thus, this function is a generator, which allows to easily return multiple values. Use `yield event` for valid Events and `return` in case of a void result (not parsable line, invalid data etc.). -### Tests +## Tests and documentation + +### Unit Tests In order to do automated tests on the bot, it is necessary to write tests including sample data. Have a look at some existing tests: @@ -455,45 +395,19 @@ When calling the file directly, only the tests in this file for the bot will be See the `testing` section about how to run the tests. -### Cache - -Bots can use a Redis database as cache instance. Use the `intelmq.lib.utils.Cache` class to set this up and/or look at existing bots, like the `cymru_whois` expert how the cache can be used. Bots must set a TTL for all keys that are cached to avoid caches growing endless over time. Bots must use the Redis databases >= 10, but not those already used by other bots. Look at `find intelmq -type f -name '*.py' -exec grep -r 'redis_cache_db' {} +` to see which databases are already used. - -The databases < 10 are reserved for the IntelMQ core: - -- 2: pipeline -- 3: statistics -- 4: tests - ### Documentation -Please document your added/modified code. - -For doc strings, we are using the -[sphinx-napoleon-google-type-annotation](http://www.sphinx-doc.org/en/stable/ext/napoleon.html#type-annotations). - -Additionally, Python's type hints/annotations are used, see PEP484. - +Documentation is an integral part of the development process. -## Testing Pre-releases +IntelMQ uses Python's type hints/type annotations where possible. -The installation procedures is slightly different for the pre-releases. 
- -### Installation with packages - -For native packages, you can find the unstable packages of the next version here: -[Installation Unstable Native Packages](https://software.opensuse.org/download.html?project=home%3Asebix%3Aintelmq%3Aunstable&package=intelmq). -The unstable repository only has a limited set of packages, so enable the stable repository in parallel. - -### Installation with pip +For doc strings, we are using the +[sphinx-napoleon-google-type-annotation](http://www.sphinx-doc.org/en/stable/ext/napoleon.html#type-annotations) where applicable. -For the installation with pip, use the `--pre` parameter as shown here following command: +#### Bot documentation -```bash -pip3 install --pre intelmq -``` +#### Feed documentation -### Testing +## Getting the code upstream -All other steps are not different per installation variant. -Please report any issues you find in our [Issue Tracker](https://github.com/certtools/intelmq/issues/new). +Entry to the change log and news files diff --git a/docs/dev/data-format.md b/docs/dev/data-format.md index e5b1833476..063dd3c7ca 100644 --- a/docs/dev/data-format.md +++ b/docs/dev/data-format.md @@ -3,7 +3,6 @@ SPDX-License-Identifier: AGPL-3.0-or-later --> - # Data Format Data passed between bots is called a Message. There are two types of Messages: Report and Event. Report is produced by collector bots and consists of collected raw data (CSV, JSON, HTML, etc) and feed metadata. It is passed to a parser bot which parses Report into a single or multiple Events. Expert bots and output bots handle only Events. @@ -34,26 +33,22 @@ See The first and last ASNs of the original 16-bit integers, namely 0 and 65,535, and the last ASN of the 32-bit numbers, namely 4,294,967,295 are reserved and should not be used by operators. - ### Accuracy Accuracy type. A Float between 0 and 100. - ### Base64 Base64 type. Always gives unicode strings. Sanitation encodes to base64 and accepts binary and unicode strings. 
- ### Boolean Boolean type. Without sanitation only python bool is accepted. Sanitation accepts string 'true' and 'false' and integers 0 and 1. - ### ClassificationTaxonomy `classification.taxonomy` type. @@ -175,7 +170,6 @@ The following additional conversions are available with the convert function: - `utc_isoformat`: Parse date generated by datetime.isoformat() - `fuzzy` (or None): Use dateutils' fuzzy parser, default if no specific parser is given - ### FQDN Fully qualified domain name type. @@ -186,7 +180,6 @@ dot is not allowed. To prevent values like '10.0.0.1:8080' (#1235), we check for the non-existence of ':'. - ### Float Float type. Without sanitation only python float/integer/long is @@ -194,7 +187,6 @@ accepted. Boolean is explicitly denied. Sanitation accepts strings and everything float() accepts. - ### IPAddress Type for IP addresses, all families. Uses the ipaddress module. @@ -203,7 +195,6 @@ Sanitation accepts integers, strings and objects of ipaddress.IPv4Address and ip Valid values are only strings. 0.0.0.0 is explicitly not allowed. - ### IPNetwork Type for IP networks, all families. Uses the ipaddress module. @@ -213,7 +204,6 @@ If host bits in strings are set, they will be ignored (e.g 127.0.0.1/32). Valid values are only strings. - ### Integer Integer type. Without sanitation only python integer/long is accepted. @@ -221,7 +211,6 @@ Bool is explicitly denied. Sanitation accepts strings and everything int() accepts. - ### JSON JSON type. @@ -230,7 +219,6 @@ Sanitation accepts any valid JSON objects. Valid values are only unicode strings with JSON objects. - ### JSONDict JSONDict type. @@ -239,14 +227,12 @@ Sanitation accepts pythons dictionaries and JSON strings. Valid values are only unicode strings with JSON dictionaries. - ### LowercaseString Like string, but only allows lower case characters. Sanitation lowers all characters. - ### Registry Registry type. Derived from UppercaseString. @@ -254,12 +240,10 @@ Registry type. 
Derived from UppercaseString. Only valid values: AFRINIC, APNIC, ARIN, LACNIC, RIPE. RIPE-NCC and RIPENCC are normalized to RIPE. - ### String Any non-empty string without leading or trailing whitespace. - ### TLP TLP level type. Derived from UppercaseString. @@ -268,7 +252,6 @@ Only valid values: WHITE, GREEN, AMBER, RED. Accepted for sanitation are different cases and the prefix 'tlp:'. - ### URL URI type. Local and remote. @@ -278,7 +261,6 @@ For local URIs (file) a missing host is replaced by localhost. Valid values must have the host (network location part). - ### UppercaseString Like string, but only allows upper case characters. diff --git a/docs/dev/environment.md b/docs/dev/environment.md index 925e35c29f..c3d5941aff 100644 --- a/docs/dev/environment.md +++ b/docs/dev/environment.md @@ -3,7 +3,6 @@ SPDX-License-Identifier: AGPL-3.0-or-later --> - # Development Environment ## Directories @@ -147,4 +146,3 @@ source .venv/bin/activate # Use for virtual environment installation intelmqctl start ``` - diff --git a/docs/dev/extensions-packages.md b/docs/dev/extensions-packages.md index f73f659952..de005550eb 100644 --- a/docs/dev/extensions-packages.md +++ b/docs/dev/extensions-packages.md @@ -57,4 +57,4 @@ file would then have the following section: ``` Once you have installed your package, you can run ``intelmqctl list bots`` to check if your bot was -properly registered. \ No newline at end of file +properly registered. diff --git a/docs/dev/guidelines.md b/docs/dev/guidelines.md index 7723514e53..e9e9937e52 100644 --- a/docs/dev/guidelines.md +++ b/docs/dev/guidelines.md @@ -3,7 +3,6 @@ SPDX-License-Identifier: AGPL-3.0-or-later --> - # Development Guidelines ## Coding-Rules @@ -148,4 +147,4 @@ License and Authors files can be found at the root of repository. - Credit to the authors file must be always retained. 
When a new contributor (person and/or organization) improves in some way the repository content (code or documentation), he or she might add his name to the list of contributors. -License and authors must be only listed in an external file but not inside the code files. \ No newline at end of file +License and authors must be only listed in an external file but not inside the code files. diff --git a/docs/dev/intro.md b/docs/dev/intro.md index 9b9d049626..b880485455 100644 --- a/docs/dev/intro.md +++ b/docs/dev/intro.md @@ -3,7 +3,6 @@ SPDX-License-Identifier: AGPL-3.0-or-later --> - # Intro This guide is for developers of IntelMQ. It explains the code architecture, coding guidelines as well as ways you can contribute code or documentation. If you have not done so, please read the @@ -39,4 +38,4 @@ The [IntelMQ-DevArchive](https://lists.cert.at/mailman3/hyperkitty/list/intelmq- ## GitHub The ideal way to propose changes and additions to IntelMQ is to open -a [Pull Request](https://github.com/certtools/intelmq/pulls) on GitHub. \ No newline at end of file +a [Pull Request](https://github.com/certtools/intelmq/pulls) on GitHub. diff --git a/docs/dev/release.md b/docs/dev/release.md index fcd78590cf..743a7d1701 100644 --- a/docs/dev/release.md +++ b/docs/dev/release.md @@ -1,14 +1,41 @@ - # Release procedure -General assumption: You are working on branch maintenance, the next -version is a bug fix release. For feature releases it is slightly -different. +## Make a pre-release + +Consider whether a pre-release would be necessary or good. A pre-release requires less effort than a stable release. It only needs: +- Proper version numbers in `setup.py`, e.g. `1.2.3-alpha1` or `1.2.3-rc1` +- Optional: A git tag and GitHub release (mark it as pre-release) +- Push to PyPI +- Proper version numbers in `debian/changelog`, e.g. 
`1.2.3~alpha1-1` or `1.2.3~rc1-1` + - The tilde `~` makes sure it is considered older than the final `1.2.3-1` + +### Testing Pre-releases + +The installation procedure is slightly different for the pre-releases. + +#### Installation with packages + +For native packages, you can find the unstable packages of the next version here: +[Installation Unstable Native Packages](https://software.opensuse.org/download.html?project=home%3Asebix%3Aintelmq%3Aunstable&package=intelmq). +The unstable repository only has a limited set of packages, so enable the stable repository in parallel. + +#### Installation with pip + +For the installation with pip, use the `--pre` parameter as shown in the following command: + +```bash +pip3 install --pre intelmq +``` + +#### Testing + +All other steps are the same for every installation variant. +Please report any issues you find in our [Issue Tracker](https://github.com/certtools/intelmq/issues/new). ## Check before @@ -70,7 +97,6 @@ python3 setup.py sdist bdist_wheel * Upload the files including signatures to PyPI with e.g. twine: `twine upload -u __token__ -p $APITOKEN dist/intelmq...` (or set the API Token in `.pypirc`). - ## Documentation Since using mkdocs (see https://docs.intelmq.org) nothing needs to be done anymore. @@ -95,7 +121,7 @@ Releasing a new Docker image is very easy. - Clone [IntelMQ Docker Repository](https://github.com/certat/intelmq-docker) with `git clone https://github.com/certat/intelmq-docker.git --recursive` as this repository contains submodules - If the `intelmq-docker` repository is not updated yet, use `git pull --recurse-submodules` to pull the latest changes from their respective repository. - Run `./build.sh`, check your console if the build was successful. -- Run `./test.sh` - It will run nosetests3 with the exotic flag. All +- Run `./test.sh` - It will run all tests with the exotic flag. All errors/warnings will be displayed. 
- Change the `build_version` in `publish.sh` to the new version you want to release. diff --git a/docs/dev/structure.md b/docs/dev/structure.md index 48d553e69d..845c764143 100644 --- a/docs/dev/structure.md +++ b/docs/dev/structure.md @@ -1,9 +1,8 @@ - # System Overview In the `intelmq/lib/` directory you can find some libraries: @@ -25,3 +24,87 @@ In the `intelmq/lib/` directory you can find some libraries: ### Code Architecture ![Code Architecture](../static/images/intelmq-arch-schema.png) + +## Directories Hierarchy on Default Installation + +- Configuration Files Path: `/opt/intelmq/etc/` +- PID Files Path: `/opt/intelmq/var/run/` +- Logs Files and dumps Path: `/opt/intelmq/var/log/` +- Additional Bot Files Path, e.g. templates or databases: + `/opt/intelmq/var/lib/bots/[bot-name]/` + +## Repository and software file layout + +This is the directory and file structure of the layout including a brief description of their meanings. +For a better overview, some details are left out. + +* `contrib/` (collection of useful tools related to IntelMQ, but not officially part of it and not necessarily well-tested or maintained) +* `debian/` (Packaging definitions and rules for Debian-based distributions) +* `docs/` (the documentation you are reading right now) + * `admin/` (for IntelMQ system administration) + * `dev/` (for IntelMQ development) + * `user/` (for IntelMQ usage) +* `intelmq/` + * `bin/` + * `intelmqctl.py` (the primary command line interface `intelmqctl`) + * `intelmqdump.py` (for handling bot dump files) + * `bots/` + * `collector/` + * `` + * `collector_.py` + * `parser/` + * `` + * `parser_.py` + * `expert/` + * `` + * `expert_.py` + * `output/` + * `` + * `output_.py` + * `etc/` + * `runtime.yaml` (default configuration) + * `feeds.yaml` (documented supported feeds) + * `harmonization.conf` (default data format fields specification) + * `lib/` (IntelMQ's internal libraries) + * `bot.py` (bot class definitions) + * `cache.py` + * `datatypes.py` + * 
`exceptions.py` + * `message.py` (message class definitions including Report and Event) + * `pipeline.py` (handling of the message queue aka "pipeline") + * `processmanager.py` + * `upgrades.py` + * `utils.py` (utility functions) + * `mixins/` (Additional helper classes for bots) + * `cache.py` + * `http.py` + * `sql.py` + * `stomp.py` + * `tests/` (IntelMQ unit tests) + * `assets/` (assets used in multiple tests) + * `bin/` + * `lib/` + * `bots/` + * same structure as in `intelmq/bots/` + +### Bot naming conventions + +Assume you want to create a bot for a new 'Abuse.ch' feed. +It turns out that here it is necessary to create different parsers for the respective kind of events (e.g. malicious URLs). +The solution is to use one directory for the feed provider (Abuse.ch) and multiple parser files named like the feed name, separated by underscore. + +Example for multiple parsers related to one feed provider: + +``` +/intelmq/bots/parser/abusech/parser_domain.py +/intelmq/bots/parser/abusech/parser_ip.py +/intelmq/bots/parser/abusech/parser_ransomware.py +/intelmq/bots/parser/abusech/parser_malicious_url.py +``` + +The same applies to other bot types and also services or protocols. E.g. there are two HTTP collectors (a "normal" one and a stream collector) or two Microsoft collectors (supporting two different APIs). 
+ +Any directory name and file name of IntelMQ has to: + +- be lowercase; if the name has multiple words, the spaces between them must be removed or replaced by underscores +- be self-descriptive of its content diff --git a/docs/dev/testing.md b/docs/dev/testing.md index a300d63900..01b700dbe1 100644 --- a/docs/dev/testing.md +++ b/docs/dev/testing.md @@ -3,7 +3,6 @@ SPDX-License-Identifier: AGPL-3.0-or-later --> - # Testing ## Additional test requirements @@ -59,4 +58,4 @@ INTELMQ_TEST_DATABASES=1 INTELMQ_TEST_EXOTIC=1 pytest intelmq/tests/ ## Configuration test files -The tests use the configuration files in your working directory, not those installed in `/opt/intelmq/etc/` or `/etc/`. You can run the tests for a locally changed intelmq without affecting an installation or requiring root to run them. \ No newline at end of file +The tests use the configuration files in your working directory, not those installed in `/opt/intelmq/etc/` or `/etc/`. You can run the tests for a locally changed intelmq without affecting an installation or requiring root to run them. 
diff --git a/docs/help.md b/docs/help.md index 6ea8f77340..8153449c57 100644 --- a/docs/help.md +++ b/docs/help.md @@ -35,5 +35,3 @@ If your organisation is a member of the [CSIRTs Network](https://csirtsnetwork.e - [Aaron Kaplan](https://github.com/aaronkaplan/) (founder of IntelMQ) - [Institute for Common Good Technology](https://commongoodtechnology.org/) (chairmen Sebastian Wager is an IntelMQ maintainer and developer) - [Intevation GmbH](https://intevation.de/) (Develops and maintains several IntelMQ components) - - diff --git a/docs/index.md b/docs/index.md index ddb66c6bb7..c38bcc608b 100644 --- a/docs/index.md +++ b/docs/index.md @@ -9,7 +9,6 @@ ![IntelMQ](docs/static/images/Logo_Intel_MQ.svg) - # Introduction **IntelMQ** is a solution for IT security teams (CERTs & CSIRTs, SOCs diff --git a/docs/overview.md b/docs/overview.md index a52467c948..63ef30035c 100644 --- a/docs/overview.md +++ b/docs/overview.md @@ -136,4 +136,4 @@ Developed and maintained by [CERT.at](https://cert.at). The list of useful scripts contributed to the IntelMQ universe can be found in the main repository. -→ [Repository: intelmq/contrib](https://github.com/certtools/intelmq/tree/develop/contrib) \ No newline at end of file +→ [Repository: intelmq/contrib](https://github.com/certtools/intelmq/tree/develop/contrib) diff --git a/docs/unsorted/botnet-concept.md b/docs/unsorted/botnet-concept.md index 57eef1569d..0334a7de20 100644 --- a/docs/unsorted/botnet-concept.md +++ b/docs/unsorted/botnet-concept.md @@ -3,7 +3,6 @@ SPDX-License-Identifier: AGPL-3.0-or-later --> - #### Botnet Concept The \"botnet\" represents all currently configured bots which are explicitly enabled. 
It is, in essence, the graph of diff --git a/docs/unsorted/intelmq-3.0-architecture.md b/docs/unsorted/intelmq-3.0-architecture.md index ac24809cba..3ffc642959 100644 --- a/docs/unsorted/intelmq-3.0-architecture.md +++ b/docs/unsorted/intelmq-3.0-architecture.md @@ -46,9 +46,6 @@ See [#1424](https://github.com/certtools/intelmq/issues/1424) ## UX - - - ### Devops/ Sysadmin perspective #### Docker @@ -81,11 +78,8 @@ _Think about_: shadowserver already created some training material. Build on thi _Category_: OPTIONAL component, but highly needed. - ## Architecture - - ### Message queue _Task_: Create a Kafka MQ backend: add Kafka as a replaceable MQ for IntelMQ 3.0 @@ -97,7 +91,6 @@ _Think about_: Using [Apache Pulsar](https://pulsar.apache.org/) _Category_: SHOULD - ## Notification settings _Task_: Keep notification settings per event: Where to (destination mail/host address), how (protocol, authentication (SSL client certificate), etc), how often/time information (intervals etc.) @@ -108,14 +101,12 @@ See also https://github.com/certtools/intelmq/issues/758 _Category_: this feature should be OPTIONAL but is NEEDED by several users. - ## Configuration parameter handling in Bots and a bot's unified documentation _Task_: Handle bots' configuration parameters by the core, providing type sanitation, checks, default values and documentation. _Background_: Currently every bot needs to handle these issues itself, but many of these checks could be done centrally in a generic way. At upgrades, new configuration might get introduced and the bots need to provide defaults values although they are available in BOTS. Error handling on parameters must be done for every bot on itself. Documentation is not available to the Bots, not available in BOTS and the Manager. There are 3 places for parameters where the available information is spread: BOTS, `Bots.md` and the bots' code. 
- ## Automatic Monitoring & Management: Handling full load situations _Task_: Create a solution to prevent system over-loading (only for Redis). @@ -124,7 +115,6 @@ _Background_: If too much data is ingested, collected or enriched, the system ca See also: https://github.com/certtools/intelmq/issues/709 - ## Making intelmq plug-able and getting rid of BOTS _Task_: Allow installation of IntelMQ bots, meaning the deprecation of the centralized BOTS file and a generated documentation. @@ -133,14 +123,12 @@ _Background_: Adapting IntelMQ to specific needs also means the development of s See also https://github.com/certtools/intelmq/issues/972 - ## Exposing a plug-in or hooking API _Task_: Provide an hooking API for the core classes. _Background_: Adapting IntelMQ to specific can require adaptions in the Core classes' code. Instead of making the changes/extensions in the core itself, we can provide a hook system allowing to call (or replace?) functions at specific steps. For example custom monitoring. - ## Grouping of events _Task_: Provide possibilities to assign an event to a group of events. diff --git a/docs/user/abuse-contacts.md b/docs/user/abuse-contacts.md index daa9dc4742..1111924335 100644 --- a/docs/user/abuse-contacts.md +++ b/docs/user/abuse-contacts.md @@ -3,7 +3,6 @@ SPDX-License-Identifier: AGPL-3.0-or-later --> - # Abuse-contact look-ups The right decision whom to contact about a specific incident is vital to get the incident resolved as quick as possible. Different types of events may required different abuse-contact to be selected. For example, issues about a device, e.g. a vulnerability in the operating system or an application, is better sent to the hoster which can inform the server administrator. For website-related issues, like defacements or phishing, the domain owner (maintaining the content of the website) could be the better and more direct contact. Additionally, different CERT's have different approaches and different contact databases. 
Multiple information sources have different information, and some sources are more accurate than others. IntelMQ can query multiple sources of abuse-contacts and combine them. Internal databases, like a Constituency Portal provide high-quality and first-hand contact information. The RIPE document [Sources of Abuse Contact Information for Abuse Handlers](https://www.ripe.net/publications/docs/ripe-658) contains a good summary of the complex of themes. diff --git a/docs/user/api.md b/docs/user/api.md index 68d9b76a64..209a1dc36b 100644 --- a/docs/user/api.md +++ b/docs/user/api.md @@ -3,7 +3,6 @@ SPDX-License-Identifier: AGPL-3.0-or-later --> - # Using IntelMQ API !!! bug @@ -11,7 +10,6 @@ ## Usage from programs - The IntelMQ API can also be used from programs, not just browsers. To do so, first send a POST-Request with JSON-formatted data to @@ -56,4 +54,4 @@ Here is a full example using **curl**: The same approach also works for *Ansible*, as you can see here: 1. -2. \ No newline at end of file +2. diff --git a/docs/user/bots.md b/docs/user/bots.md index 8b1b956419..4caf100941 100644 --- a/docs/user/bots.md +++ b/docs/user/bots.md @@ -1937,6 +1937,7 @@ If the input data did not contain the field `classification.type`, it is set to Supports multiple different modes: #### Input data is one event + Example: ```json { INTELMQ data... } @@ -1953,6 +1954,7 @@ Configuration: * `multiple_events`: False #### Input data is in JSON stream format + Example: ```json { INTELMQ data... } @@ -1965,6 +1967,7 @@ Configuration: * `multiple_events`: False #### Input data is a list of events + Example: ```json [ @@ -2787,6 +2790,7 @@ For a detailed description of the modes, see below. ### Modes #### IP Network + For each incoming event, the bots chooses one random IP network range (IPv4 or IPv6) from the configured data file. It set's the first IP address of the range as `source.ip` and the network itself as `source.network`. 
To adapt the `source.asn` field accordingly, use the [ASN Lookup Expert](#asn-lookup). @@ -2795,7 +2799,9 @@ For data consistency `source.network` will only be set if `source.ip` was set or If overwrite is false, `source.ip` was did not exist before but `source.network` existed before, `source.network` will still be overridden. #### Event fields + ##### Mode `random_single_value` + For any possible event field, the bot chooses a random value of the values in the `values` property. --- @@ -5425,7 +5431,6 @@ The parameters marked with 'PostgreSQL' will be sent to libpq via psycopg2. Chec (optional, boolean) Whether an error should cause the bot to fail (raise an exception) or otherwise rollback. If false, the bot eventually waits and re-try (e.g. re-connect) etc. to solve the issue. If true, the bot raises an exception and - depending on the IntelMQ error handling configuration - stops. Defaults to false. - ### STOMP This bot pushes data to any STOMP stream. STOMP stands for Streaming Text Oriented Messaging Protocol. See: diff --git a/docs/user/intro.md b/docs/user/intro.md index cb5944c6cb..0e12e0e1e8 100644 --- a/docs/user/intro.md +++ b/docs/user/intro.md @@ -47,4 +47,4 @@ The User Guide provides information on how to use installed IntelMQ and it's com - Individual bots as well as the complete pipeline can be configured, managed and monitored via: - Web interface called **IntelMQ Manager** (best suited for regular users). - Command line tool called **intelmqctl** (best suited for administrators). - - REST API provided by the **IntelMQ API** extension (best suited for other programs). \ No newline at end of file + - REST API provided by the **IntelMQ API** extension (best suited for other programs). 
diff --git a/docs/user/manager.md b/docs/user/manager.md index feb28da621..b67b6afae2 100644 --- a/docs/user/manager.md +++ b/docs/user/manager.md @@ -4,6 +4,7 @@ --> # Using IntelMQ Manager + **IntelMQ Manager** is a graphical interface to manage configurations for IntelMQ. It's goal is to provide an intuitive tool to allow non-programmers to specify the data flow in IntelMQ. ## Configuration Pages @@ -60,11 +61,10 @@ that bot and also the last 20 log lines of that single bot. ![Bot Monitor](../static/images/intelmq-manager/monitor2.png) - ## Keyboard Shortcuts Any underscored letter denotes access key shortcut. The needed shortcut-keyboard is different per Browser: - Firefox: ++ctrl+alt++ + Letter -- Chrome & Chromium: ++alt++ + Letter \ No newline at end of file +- Chrome & Chromium: ++alt++ + Letter diff --git a/intelmq/etc/feeds.yaml b/intelmq/etc/feeds.yaml index f132bfe65c..be74b2b205 100644 --- a/intelmq/etc/feeds.yaml +++ b/intelmq/etc/feeds.yaml @@ -13,8 +13,6 @@ providers: parameters: http_url: https://tracker.viriback.com/dump.php rate_limit: 86400 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.generic.csv_parser parameters: @@ -55,8 +53,6 @@ providers: parameters: http_url: https://lists.malwarepatrol.net/cgi/getfile?receipt={{ your API key }}&product=8&list=dansguardian rate_limit: 180000 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.malwarepatrol.parser_dansguardian parameters: @@ -83,8 +79,6 @@ providers: extract_files: false attach_regex: csv rate_limit: 3600 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.zoneh.parser parameters: @@ -104,8 +98,6 @@ providers: parameters: http_url: https://www.openphish.com/feed.txt rate_limit: 86400 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.openphish.parser parameters: @@ -126,8 +118,6 @@ providers: http_password: "{{ your password }}" http_username: "{{ your username }}" rate_limit: 
86400 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.openphish.parser_commercial parameters: @@ -144,8 +134,6 @@ providers: parameters: http_url: https://feodotracker.abuse.ch/downloads/ipblocklist.json rate_limit: 3600 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.abusech.parser_feodotracker parameters: @@ -166,8 +154,6 @@ providers: https://urlhaus.abuse.ch/feeds/country//, or https://urlhaus.abuse.ch/feeds/asn// rate_limit: 86400 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.generic.parser_csv parameters: @@ -209,8 +195,6 @@ providers: http_url: https://www.cymru.com/$certname/$certname_{time[%Y%m%d]}.txt http_username: "{{ your username }}" rate_limit: 86400 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.cymru.parser_cap_program parameters: @@ -232,8 +216,6 @@ providers: parameters: http_url: https://www.team-cymru.org/Services/Bogons/fullbogons-ipv4.txt rate_limit: 86400 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.cymru.parser_full_bogons parameters: @@ -255,8 +237,6 @@ providers: parameters: http_url: https://www.team-cymru.org/Services/Bogons/fullbogons-ipv6.txt rate_limit: 86400 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.cymru.parser_full_bogons parameters: @@ -277,8 +257,6 @@ providers: parameters: http_url: https://dataplane.org/sshclient.txt rate_limit: 3600 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.dataplane.parser parameters: @@ -298,8 +276,6 @@ providers: parameters: http_url: https://dataplane.org/sshpwauth.txt rate_limit: 3600 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.dataplane.parser parameters: @@ -319,8 +295,6 @@ providers: parameters: http_url: https://dataplane.org/sipquery.txt rate_limit: 3600 - name: __FEED__ - provider: __PROVIDER__ parser: module: 
intelmq.bots.parsers.dataplane.parser parameters: @@ -340,8 +314,6 @@ providers: parameters: http_url: https://dataplane.org/sipregistration.txt rate_limit: 3600 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.dataplane.parser parameters: @@ -361,8 +333,6 @@ providers: parameters: http_url: https://dataplane.org/dnsrd.txt rate_limit: 3600 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.dataplane.parser parameters: @@ -382,8 +352,6 @@ providers: parameters: http_url: https://dataplane.org/dnsrdany.txt rate_limit: 3600 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.dataplane.parser parameters: @@ -403,8 +371,6 @@ providers: parameters: http_url: https://dataplane.org/dnsversion.txt rate_limit: 3600 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.dataplane.parser parameters: @@ -422,8 +388,6 @@ providers: parameters: http_url: https://dataplane.org/proto41.txt rate_limit: 3600 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.dataplane.parser parameters: @@ -443,8 +407,6 @@ providers: parameters: http_url: https://dataplane.org/smtpgreet.txt rate_limit: 3600 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.dataplane.parser parameters: @@ -464,8 +426,6 @@ providers: parameters: http_url: https://dataplane.org/smtpdata.txt rate_limit: 3600 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.dataplane.parser parameters: @@ -485,8 +445,6 @@ providers: parameters: http_url: https://dataplane.org/telnetlogin.txt rate_limit: 3600 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.dataplane.parser parameters: @@ -506,8 +464,6 @@ providers: parameters: http_url: https://dataplane.org/vncrfb.txt rate_limit: 3600 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.dataplane.parser parameters: @@ -529,8 +485,6 @@ 
providers: parameters: http_url: https://view.sentinel.turris.cz/greylist-data/greylist-latest.csv rate_limit: 43200 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.turris.parser parameters: @@ -593,7 +547,6 @@ providers: parameters: http_url: https://www.turris.cz/greylist-data/greylist-latest.csv name: Greylist - provider: __PROVIDER__ rate_limit: 43200 signature_url: https://www.turris.cz/greylist-data/greylist-latest.csv.asc verify_pgp_signatures: true @@ -614,8 +567,6 @@ providers: http_url: https://dsi.ut-capitole.fr/blacklists/download/{collection name}.tar.gz extract_files: 'true' rate_limit: 43200 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.generic.parser_csv parameters: @@ -636,8 +587,6 @@ providers: parameters: http_url: http://danger.rulez.sk/projects/bruteforceblocker/blist.php rate_limit: 3600 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.danger_rulez.parser parameters: @@ -658,8 +607,6 @@ providers: parameters: http_url: https://www.spamhaus.org/drop/drop.txt rate_limit: 3600 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.spamhaus.parser_drop parameters: @@ -677,8 +624,6 @@ providers: parameters: http_url: https://www.spamhaus.org/drop/asndrop.txt rate_limit: 3600 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.spamhaus.parser_drop parameters: @@ -697,8 +642,6 @@ providers: parameters: http_url: https://www.spamhaus.org/drop/dropv6.txt rate_limit: 3600 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.spamhaus.parser_drop parameters: @@ -716,8 +659,6 @@ providers: parameters: http_url: "{{ your CERT portal URL }}" rate_limit: 3600 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.spamhaus.parser_cert parameters: @@ -735,8 +676,6 @@ providers: parameters: http_url: https://www.spamhaus.org/drop/edrop.txt rate_limit: 3600 - name: 
__FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.spamhaus.parser_drop parameters: @@ -755,8 +694,6 @@ providers: http_url: https://data.phishtank.com/data/{{ your API key }}/online-valid.json.gz extract_files: true rate_limit: 3600 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.phishtank.parser parameters: @@ -777,8 +714,6 @@ providers: parameters: http_url: http://cinsscore.com/list/ci-badguys.txt rate_limit: 3600 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.ci_army.parser parameters: @@ -795,8 +730,6 @@ providers: parameters: http_url: https://lists.blocklist.de/lists/ircbot.txt rate_limit: 86400 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.blocklistde.parser parameters: @@ -814,8 +747,6 @@ providers: parameters: http_url: https://lists.blocklist.de/lists/strongips.txt rate_limit: 86400 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.blocklistde.parser parameters: @@ -833,8 +764,6 @@ providers: parameters: http_url: https://lists.blocklist.de/lists/mail.txt rate_limit: 86400 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.blocklistde.parser parameters: @@ -853,8 +782,6 @@ providers: parameters: http_url: https://lists.blocklist.de/lists/apache.txt rate_limit: 86400 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.blocklistde.parser parameters: @@ -872,8 +799,6 @@ providers: parameters: http_url: https://lists.blocklist.de/lists/ftp.txt rate_limit: 86400 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.blocklistde.parser parameters: @@ -891,8 +816,6 @@ providers: parameters: http_url: https://lists.blocklist.de/lists/ssh.txt rate_limit: 86400 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.blocklistde.parser parameters: @@ -910,8 +833,6 @@ providers: parameters: http_url: 
https://lists.blocklist.de/lists/bruteforcelogin.txt rate_limit: 86400 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.blocklistde.parser parameters: @@ -931,8 +852,6 @@ providers: parameters: http_url: https://lists.blocklist.de/lists/bots.txt rate_limit: 86400 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.blocklistde.parser parameters: @@ -950,8 +869,6 @@ providers: parameters: http_url: https://lists.blocklist.de/lists/imap.txt rate_limit: 86400 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.blocklistde.parser parameters: @@ -970,8 +887,6 @@ providers: parameters: http_url: https://lists.blocklist.de/lists/sip.txt rate_limit: 86400 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.blocklistde.parser parameters: @@ -998,8 +913,6 @@ providers: rate_limit: 86400 subject_regex: ^\\[CB-Report#.* Malware infections (\\(Avalanche\\) )?in country folder: INBOX - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.generic.parser_csv parameters: @@ -1062,8 +975,6 @@ providers: auth_by_ssl_client_certificate: false username: "{insert your *n6* login, e.g. 
someuser@my.example.org}" password: "{insert your *n6* API key}" - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.n6.parser_n6stomp parameters: @@ -1080,8 +991,6 @@ providers: module: intelmq.bots.collectors.alienvault_otx.collector parameters: api_key: "{{ your API key }}" - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.alienvault.parser_otx parameters: @@ -1097,8 +1006,6 @@ providers: parameters: http_url: https://reputation.alienvault.com/reputation.data rate_limit: 3600 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.alienvault.parser parameters: @@ -1118,8 +1025,6 @@ providers: http_timeout_sec: 120 http_user_agent: "{{ your user agent }}" rate_limit: 86400 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.cleanmx.parser parameters: @@ -1138,8 +1043,6 @@ providers: http_timeout_sec: 120 http_user_agent: "{{ your user agent }}" rate_limit: 86400 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.cleanmx.parser parameters: @@ -1156,8 +1059,6 @@ providers: parameters: http_url: https://prod.cyberfeed.net/stream?key={{ your API key }} strip_lines: 'true' - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.anubisnetworks.parser parameters: @@ -1178,8 +1079,6 @@ providers: http_username: __USERNAME__ http_password: __PASSWORD__ rate_limit: 3600 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.bambenek.parser parameters: @@ -1198,8 +1097,6 @@ providers: http_username: __USERNAME__ http_password: __PASSWORD__ rate_limit: 3600 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.bambenek.parser parameters: @@ -1215,8 +1112,6 @@ providers: parameters: http_url: https://faf.bambenekconsulting.com/feeds/dga-feed.txt rate_limit: 3600 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.bambenek.parser parameters: @@ 
-1237,7 +1132,6 @@ providers: http_url: http://security-research.dyndns.org/pub/malware-feeds/ponmocup-infected-domains-CIF-latest.txt rate_limit: 10800 name: Infected Domains - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.dyn.parser parameters: @@ -1257,7 +1151,6 @@ providers: http_url: http://security-research.dyndns.org/pub/malware-feeds/ponmocup-infected-domains-shadowserver.csv rate_limit: 10800 name: Infected Domains - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.generic.parser_csv parameters: @@ -1291,8 +1184,6 @@ providers: parameters: http_url: https://www.dshield.org/block.txt rate_limit: 86400 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.dshield.parser_block parameters: @@ -1308,8 +1199,6 @@ providers: parameters: http_url: https://dshield.org/asdetailsascii.html?as={{ AS Number }} rate_limit: 86400 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.dshield.parser_asn parameters: @@ -1326,8 +1215,6 @@ providers: parameters: http_url: http://vxvault.net/URL_List.php rate_limit: 3600 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.vxvault.parser parameters: @@ -1351,8 +1238,6 @@ providers: rate_limit: 86400 subject_regex: __REGEX__ folder: INBOX - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.shadowserver.parser parameters: @@ -1372,7 +1257,6 @@ providers: http_password: "{{ your HTTP Authentication password or null }}" http_username: "{{ your HTTP Authentication username or null }}" password: __PASSWORD__ - provider: __PROVIDER__ rate_limit: 3600 search_not_older_than: "{{ relative time or null }}" search_owner: nobody @@ -1426,8 +1310,6 @@ providers: http_password: "{{ your password }}" http_username: "{{ your username }}" rate_limit: 10800 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.fraunhofer.parser_dga parameters: @@ -1444,8 +1326,6 @@ providers: parameters: 
http_url: https://www.malwareurl.com/ rate_limit: 86400 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.malwareurl.parser parameters: @@ -1465,8 +1345,6 @@ providers: not_older_than: "2 days" rate_limit: 3600 http_timeout_sec: 300 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.microsoft.parser_bingmurls parameters: @@ -1485,8 +1363,6 @@ providers: not_older_than: "2 days" rate_limit: 3600 http_timeout_sec: 300 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.microsoft.parser_ctip parameters: @@ -1502,8 +1378,6 @@ providers: parameters: connection_string: "{{ your connection string }}" container_name: "ctip-infected-summary" - name: __FEED__ - provider: __PROVIDER__ rate_limit: 3600 redis_cache_db: 5 redis_cache_host: 127.0.0.1 @@ -1524,8 +1398,6 @@ providers: parameters: connection_string: "{{ your connection string }}" container_name: "ctip-c2" - name: __FEED__ - provider: __PROVIDER__ rate_limit: 3600 redis_cache_db: 5 redis_cache_host: 127.0.0.1 @@ -1547,8 +1419,6 @@ providers: parameters: http_url: https://www.threatminer.org/ rate_limit: 86400 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.threatminer.parser parameters: @@ -1563,8 +1433,6 @@ providers: collector: module: intelmq.bots.collectors.calidog.collector_certstream parameters: - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.calidog.parser_certstream parameters: @@ -1598,8 +1466,6 @@ providers: parameters: http_url: https://cybercrime-tracker.net/index.php rate_limit: 86400 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.html_table.parser parameters: @@ -1626,8 +1492,6 @@ providers: parameters: http_url: https://precisionsec.com/threat-intelligence-feeds/agent-tesla/ rate_limit: 86400 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.html_table.parser parameters: @@ -1670,8 +1534,6 @@ 
providers: module: intelmq.bots.collectors.api.collector_api parameters: port: 5001 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.hibp.parser_callback parameters: @@ -1724,8 +1586,6 @@ providers: http_url_formatting: days: -1 rate_limit: 86400 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.cznic.parser_proki parameters: @@ -1780,8 +1640,6 @@ providers: api_key: countries: error_retry_delay: 0 - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.shodan.parser parameters: @@ -1819,8 +1677,6 @@ providers: module: intelmq.bots.collectors.http.collector_http parameters: http_url: http://benkow.cc/export.php - name: __FEED__ - provider: __PROVIDER__ parser: module: intelmq.bots.parsers.generic.parser_csv parameters: @@ -1840,3 +1696,53 @@ providers: - true - false - true + Stop Forum Spam: + Toxic IP Addresses: + description: IP Networks that are believed will only ever be used for abuse + documentation: https://www.stopforumspam.com/downloads + revision: 2025-09-21 + public: true + bots: + collector: + module: intelmq.bots.collectors.http.collector_http + parameters: + accuracy: 80 + http_url: https://www.stopforumspam.com/downloads/toxic_ip_cidr.txt + rate_limit: 86400 + parser: + module: intelmq.bots.parsers.generic.parser_csv + parameters: + columns: source.network + default_fields: + classification.type: blacklist + protocol.application: http + protocol.transport: tcp + event_description.target: web forums + event_description.text: web forum spam + event_description.url: https://www.stopforumspam.com/ + tlp: white +CERT.PL + Hole Domains v2: + description: Dangerous websites Warning List + documentation: https://cert.pl/en/warning-list/ + revision: 2025-09-23 + public: true + bots: + collector: + module: intelmq.bots.collectors.http.collector_http + parameters: + accuracy: 50 + rate_limit: 1800 + http_url: https://hole.cert.pl/domains/v2/domains.csv + parser: + module: 
intelmq.bots.parsers.generic.parser_csv + parameters: + columns: extra.certpl_register,source.fqdn,extra.first_seen,extra.expiration_date + default_fields: + classification.type: blacklist + protocol.application: http + protocol.transport: tcp + event_description.target: users + event_description.text: phishing + event_description.url: https://cert.pl/en/warning-list/ + tlp: white diff --git a/mkdocs.yml b/mkdocs.yml index 5c4be87a79..37ec7a182e 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -59,10 +59,6 @@ plugins: - glightbox # enlarging images - - redirects: - redirect_maps: # TODO add other redirects from old docs - 'en/latest/dev/data-format.html': 'dev/data-format.md' - extra: version: provider: mike diff --git a/scripts/generate-event-docs.py b/scripts/generate-event-docs.py index 8570460ad8..5c17f56667 100755 --- a/scripts/generate-event-docs.py +++ b/scripts/generate-event-docs.py @@ -19,7 +19,7 @@ @@ -65,12 +65,12 @@ | classification.taxonomy | Should | | time.source | Should | | time.observation | Should | -| source.ip | Should\* | -| source.fqdn | Should\* | -| source.url | Should\* | -| source.account | Should\* | +| source.ip | Should\\* | +| source.fqdn | Should\\* | +| source.url | Should\\* | +| source.account | Should\\* | -\* at least one of them +\\* at least one of them ## Classification @@ -96,7 +96,7 @@ | information-content-security | data-loss | Loss of data, e.g. caused by harddisk failure or physical theft. | | information-content-security | unauthorised-information-access | Unauthorized access to information, e.g. by abusing stolen login credentials for a system or application, intercepting traffic or gaining access to physical documents. | | information-content-security | unauthorised-information-modification | Unauthorised modification of information, e.g. by an attacker abusing stolen login credentials for a system or application or a ransomware encrypting data. 
| -| information-gathering | scanner | Attacks that send requests to a system to discover weaknesses. This also includes testing processes to gather information on hosts, services and accounts. Examples: fingerd, DNS querying, ICMP, SMTP (EXPN, RCPT, \...), port scanning. | +| information-gathering | scanner | Attacks that send requests to a system to discover weaknesses. This also includes testing processes to gather information on hosts, services and accounts. Examples: fingerd, DNS querying, ICMP, SMTP (EXPN, RCPT, ...), port scanning. | | information-gathering | sniffing | Observing and recording of network traffic (wiretapping). | | information-gathering | social-engineering | Gathering information from a human being in a non-technical way (e.g. lies, tricks, bribes, or threats). This IOC refers to a resource, which has been observed to perform brute-force attacks over a given application protocol. | | intrusion-attempts | brute-force | Multiple login attempts (Guessing/cracking of passwords, brute force). | @@ -160,7 +160,7 @@ - If an event describes IP address where a command and control server is running, the event's `classification.type` is `c2server`. The `malware.name` can have the full name, eg. `zeus_p2p`. - + ## Additional Information Information that do not fit into any of the event fields should be placed in the `extra` namespace.Therefore the keys must be prefixed `extra.` string. There are no other rules on key names and values for additional information. @@ -188,7 +188,7 @@ def main(): # f"[{value['type']}](#{value['type'].lower()})", # value['description']) output += f"""### `{key}`
\n\n""" - output += f"**Type:** [{value['type']}](#{value['type'].lower()})\n\n" + output += f"**Type:** [{value['type']}](../dev/data-format.md#{value['type'].lower()})\n\n" output += value['description'] output += "\n\n" diff --git a/scripts/generate-feeds-docs.py b/scripts/generate-feeds-docs.py index 74f8eb279b..c1b456c77b 100755 --- a/scripts/generate-feeds-docs.py +++ b/scripts/generate-feeds-docs.py @@ -1,6 +1,6 @@ #!/usr/bin/env python3 -# SPDX-FileCopyrightText: 2020 Sebastian Wagner, 2023 Filip Pokorný +# SPDX-FileCopyrightText: 2020 nic.at GmbH, 2023 Filip Pokorný, 2025 Institute for Common Good Technology # SPDX-License-Identifier: AGPL-3.0-or-later # This script generates the "feeds.md" documentation page. @@ -9,6 +9,7 @@ import json import os.path +from re import compile as re_compile, IGNORECASE from ruamel.yaml import YAML BASEDIR = os.path.join(os.path.dirname(__file__), '../') @@ -18,7 +19,7 @@ @@ -31,6 +32,8 @@ """ +FEED_SANITATION_PATTERN = re_compile('[^a-z.]', flags=IGNORECASE) + def info(key, value=""): return f"**{key.title()}:** {str(value).strip()}\n\n" @@ -73,14 +76,14 @@ def main(): if bot_info.get('parameters'): output += "parameters:\n" - for key, value in sorted(bot_info['parameters'].items(), key=lambda x: x[0]): - if value == "__FEED__": - value = feed_name - - if value == "__PROVIDER__": - value = provider + if bot == 'collector': + code = f"{FEED_SANITATION_PATTERN.sub('', provider)}-{FEED_SANITATION_PATTERN.sub('', feed_name)}".lower() + output += f" provider: {provider}\n" + output += f" name: {feed_name}\n" + output += f" code: {code}\n" + for key, value in sorted(bot_info['parameters'].items(), key=lambda x: x[0]): # format non-empty lists with double-quotes # single quotes are not conform JSON and not correctly detected/transformed by the manager if isinstance(value, (list, tuple)) and value: