Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
84 changes: 84 additions & 0 deletions statsd-exporter/include/mappings-gen2.yml
Original file line number Diff line number Diff line change
Expand Up @@ -268,6 +268,90 @@ mappings:
instance: "$1"
mount_path: "$2"

# ------------------------------------------------------------

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is relying heavily on ([^.]+) which is fine if you control sanitisation, but brittle otherwise.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This goes back to one of my questions in the PR description:

Do we prefer to use * or regex? I thought regex was safer to avoid over-matching, but it is harder to read. The existing mapping includes both.

If we're confident * is a better standard and wouldn't over-match, I'd be happy to remove the regex patterns ([^.]+)

# Cosmos-related metrics
# ------------------------------------------------------------

# The goal with Cosmos metrics is to capture information about how end-users are using Cosmos, so we can plan our roadmap, in particular, the Cosmos 2.0 release.
# We want to avoid capturing redundant information that is already being collected by other parts of the Astro stack.

# There are 2 types of Cosmos metrics:
# 1. Counters
# 2. Durations
# These are identified by the suffix ".counter" or ".duration"

# What is the name of the operator class used to run the task? Did the end-user subclass it?
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just be careful if you're using unsanitized values (e.g. users classes) as labels. Statsd is a very out-of-date and terrible way to move telemetry around and using period delimination for "labels" means that if any label has a period in it, you end up in a mess (e.g. airflow emits mapped tasks or task group tasks as dag.task_group.task_name instead of dag.task_name, and it breaks a lot of these rules. TLDR, just sanitize the labels please!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot, @stuart23. We are confident the label values we're collecting don't have periods, but we'll add some validation in the Cosmos codebase to avoid this from happening over time.

Are there any other better alternatives than period-delimited regex?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Unfortunately this is the only way for now - the vanilla statsd implementation does not officially support labels (ref) so we have to serialize them into the metric name.

# Which dbt command was used to run the task?
# What execution mode was used? What invocation mode was used?
- match: cosmos\.task\.operator_name\.([^.]+)\.is_subclass\.([^.]+)\.execution_mode\.([^.]+)\.invocation_mode\.([^.]+)\.dbt_command\.([^.]+)\.install_deps\.([^.]+)\.origin\.([^.]+)\.has_callback\.([^.]+)\.status\.([^.]+)\.counter$
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if possible, let's avoid the regex -- we should be able to test it. Also as a general point, do we need the names of the labels in the metric itself? This could probably become something similar to the below:

cosmos.task.counter.*.*.*.*.*.*.*.*.*

match_type: regex
name: "cosmos_task_overview"
labels:
operator_name: "$1" # Example: "DbtRunLocalOperator", "DbtProducerWatcherOperator"
is_subclass: "$2" # True or False
execution_mode: "$3" # Example: "local", "virtualenv", "watcher"
invocation_mode: "$4" # Subprocess or DbtRunner
dbt_command: "$5" # Example: "run", "build", "test"
install_deps: "$6" # True or False
origin: "$7" # DbtTaskGroup, DbtDag or StandaloneTask
has_callback: "$8" # True or False
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add is_mapped_task so we are consistent with the introduced argument: https://github.com/astronomer/astronomer-cosmos/pull/2195/changes#r2630935633

status: "$9" # success, failure

# Which database did the user run the transformations against?
# How did the user define the profile, via a ProfileMapping or YAML file?
# If Profile mapping, which class was used to define the profile?
- match: cosmos\.profile\.database\.([^.]+)\.profile_strategy\.([^.]+)\.profile_mapping_class\.([^.]+)\.counter$
match_type: regex
name: "cosmos_profile_overview"
labels:
database: "$1" # postgres, snowflake, databricks, bigquery, etc
profile_strategy: "$2" # yaml_file or mapping
profile_mapping_class: "$3" # None, SnowflakeEncryptedPrivateKeyPemProfileMapping, PostgresUserPasswordProfileMapping, DatabricksTokenProfileMapping, etc

# If using `DbtDag` or `DbtTaskGroup`, how did the user parse the dbt project? If using dbt_ls, was `dbt` in the same Python virtualenv as Airflow?
# Did the user specify a custom load converter?
# What were the source and node behaviors?
# How many dbt nodes were in the project?
# How many dbt nodes were selected by the user?
# How long did the parse operation take?
# How long did the filtering operation take?
- match: cosmos\.rendering\.used_automatic_load_mode\.([^.]+)\.actual_load_mode\.([^.]+)\.invocation_mode\.([^.]+)\.install_deps\.([^.]+)\.uses_node_converter\.([^.]+)\.test_behavior\.([^.]+)\.source_behavior\.([^.]+)\.total_dbt_models\.([^.]+)\.selected_dbt_models\.([^.]+)\.counter$
match_type: regex
name: "cosmos_rendering_overview"
labels:
used_automatic_load_mode: "$1" # True or False
actual_load_mode: "$2" # dbt_ls, dbt_ls_cache, dbt_ls_file, dbt_manifest, custom
invocation_mode: "$3" # subprocess or dbt_runner
install_deps: "$4" # True or False
uses_node_converter: "$5" # True or False
test_behavior: "$6" # after_each, after_all, none, build
source_behavior: "$7" # all, with_tests_or_freshness, none
total_dbt_models: "$8" # Total number of dbt models in the project
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having numerical values as labels isn't really great, because you can't use them in calculations at all, and if they change they end up having an impact on cardinality.

statsd also has a gauge type, could you emit total_dbt_models and selected_dbt_models as gauges please?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally makes sense, thanks a lot, @stuart23 , I'll do this

selected_dbt_models: "$9" # Total number of dbt models selected by the user

# How long did Cosmos take to parse the dbt project?
- match: cosmos\.rendering\.actual_load_mode\.([^.]+)\.dbt_nodes_parsing\.duration$
match_type: regex
name: "cosmos_rendering_dbt_nodes_parsing_duration"
labels:
actual_load_mode: "$1"

# How long did Cosmos take to filter the dbt project after it was parsed?
- match: cosmos\.rendering\.actual_load_mode\.([^.]+)\.dbt_nodes_filtering\.duration$
match_type: regex
name: "cosmos_rendering_dbt_nodes_filtering_duration"
labels:
actual_load_mode: "$1"

# How long did Cosmos take to build the Airflow DAG dynamically, after the dbt project has been parsed and filtered?
- match: cosmos\.rendering\.actual_load_mode\.([^.]+)\.airflow_dag_generation\.duration$
match_type: regex
name: "cosmos_rendering_airflow_dag_generation_duration"
labels:
actual_load_mode: "$1"

# ------------------------------------------------------------

# drop any metric not matched
- match: "."
match_type: regex
Expand Down
2 changes: 1 addition & 1 deletion statsd-exporter/version.txt
Original file line number Diff line number Diff line change
@@ -1 +1 @@
0.28.0-3
0.28.0-5