Architecture for Anubis on containerized environments and multiple backends #536
Replies: 3 comments 3 replies
-
My current solution is:
-
@joshuaganger I have one ConfigMap for rules I do not want to let anything get past:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: anubis-config-general
  namespace: anubis
data:
  botPolicy.yaml: |
    bots:
      # Generic catchall rule
      - name: generic-browser
        remote_addresses: [0.0.0.0/0]
        action: CHALLENGE

    dnsbl: false

    # By default, send HTTP 200 back to clients that either get issued a challenge
    # or a denial. This seems weird, but this is load-bearing due to the fact that
    # the most aggressive scraper bots seem to really really want an HTTP 200 and
    # will stop sending requests once they get it.
    status_codes:
      CHALLENGE: 200
      DENY: 200
```

And one custom ConfigMap for the rules I am actually working with:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: anubis-config-custom
  namespace: anubis
data:
  botPolicy.yaml: |
    ## Anubis has the ability to let you import snippets of configuration into the main
    ## configuration file. This allows you to break up your config into smaller parts
    ## that get logically assembled into one big file.
    ##
    ## Of note, a bot rule can either have inline bot configuration or import a
    ## bot config snippet. You cannot do both in a single bot rule.
    ##
    ## Import paths can either be prefixed with (data) to import from the common/shared
    ## rules in the data folder in the Anubis source tree or will point to absolute/relative
    ## paths in your filesystem. If you don't have access to the Anubis source tree, check
    ## /usr/share/docs/anubis/data or in the tarball you extracted Anubis from.
    bots:
      # Pathological bots to deny
      # This correlates to data/bots/ai-robots-txt.yaml in the source tree
      - import: (data)/bots/ai-robots-txt.yaml
      - import: (data)/bots/cloudflare-workers.yaml
      - import: (data)/bots/headless-browsers.yaml
      - import: (data)/bots/us-ai-scraper.yaml
      # Allow common "keeping the internet working" routes (well-known, favicon, robots.txt)
      - import: (data)/common/keep-internet-working.yaml
      # Generic catchall rule
      - name: generic-browser
        user_agent_regex: >-
          Mozilla|Opera
        action: CHALLENGE
      - name: jellyfin
        user_agent_regex: >-
          'Ktor client'|JellyfinMediaPlayer|AppleCoreMedia|Dart
        action: ALLOW
      - name: gitlab
        user_agent_regex: >-
          git|RenovateBot|Gitlab|gitlab-runner|gitlab-kas
        action: ALLOW
      - name: authentik
        user_agent_regex: >-
          goauthentik.io
        action: ALLOW
      # Punish any bot with "bot" in the user-agent string
      # This is known to have a high false-positive rate, use at your own risk
      - name: generic-bot-catchall
        user_agent_regex: (?i:bot|crawler)
        action: CHALLENGE
        challenge:
          difficulty: 16  # impossible
          report_as: 4    # lie to the operator
          algorithm: slow # intentionally waste CPU cycles and time

    dnsbl: false

    # By default, send HTTP 200 back to clients that either get issued a challenge
    # or a denial. This seems weird, but this is load-bearing due to the fact that
    # the most aggressive scraper bots seem to really really want an HTTP 200 and
    # will stop sending requests once they get it.
    status_codes:
      CHALLENGE: 200
      DENY: 200
```
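For context, a minimal sketch of how a policy ConfigMap like the ones above could be wired into an Anubis Deployment. The Deployment name, image tag, and backend Service URL here are assumptions for illustration; `BIND`, `TARGET`, and `POLICY_FNAME` are the environment variables Anubis reads, as I understand its docs:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: anubis-general          # hypothetical name
  namespace: anubis
spec:
  replicas: 1
  selector:
    matchLabels: { app: anubis-general }
  template:
    metadata:
      labels: { app: anubis-general }
    spec:
      containers:
        - name: anubis
          image: ghcr.io/techarohq/anubis:latest
          env:
            - name: BIND
              value: ":8080"                       # listen on TCP, not a unix socket
            - name: TARGET
              value: "http://backend.default.svc"  # hypothetical upstream Service
            - name: POLICY_FNAME
              value: /data/cfg/botPolicy.yaml
          volumeMounts:
            - name: policy
              mountPath: /data/cfg
              readOnly: true
      volumes:
        - name: policy
          configMap:
            name: anubis-config-general
```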
Beta Was this translation helpful? Give feedback.
-
I am using one instance of Anubis per SNI; the other rule set is used by my "per-SNI" Anubis instances.
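One way the per-SNI wiring could look (all names and hostnames here are assumptions, not from the thread): each hostname gets its own Anubis Service, and the Ingress rule for that host points at it.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: anubis-gitlab            # hypothetical per-SNI instance
  namespace: anubis
spec:
  selector: { app: anubis-gitlab }
  ports:
    - port: 8080
      targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: gitlab
  namespace: anubis
spec:
  rules:
    - host: gitlab.example.com   # the SNI this instance serves
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: anubis-gitlab
                port: { number: 8080 }
```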
-
Hello Anubis Community,
I'm looking to enable Anubis on a containerized Kubernetes cluster, but with some specific restrictions.
Here's a diagram of the target architecture.
Looking at the documentation, especially the suggested architecture for running alongside nginx, it recommends using unix sockets to send requests back to nginx for proper routing to the backend. This is not possible in containerized environments, where sharing sockets across pods is not an option.
Is this architecture achievable today with Anubis? If not, is something like this being discussed as a future improvement?
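As a sketch of the alternative being asked about (all names assumed, and this is not the officially documented layout): since Anubis can bind to a TCP port, nginx could hand traffic to an Anubis ClusterIP Service with a plain `proxy_pass` instead of a unix socket, and Anubis would then forward verified requests to its configured backend over HTTP.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-conf               # hypothetical
  namespace: anubis
data:
  default.conf: |
    server {
      listen 80;
      server_name app.example.com;
      location / {
        # Hand traffic to the Anubis Service over TCP instead of a unix socket;
        # Anubis then proxies verified requests to its TARGET backend.
        proxy_pass http://anubis.anubis.svc.cluster.local:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
      }
    }
```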