
Conversation

@sameeramin (Contributor) commented Sep 15, 2025

This PR implements enhanced structured logging utilities for Integrated Channels to improve Datadog monitoring and alerting, addressing log level inconsistencies and poor structured data parsing identified in ENT-8024.
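To illustrate the intent, here is a minimal sketch of the general shape such a utility can take (the helper name `log_structured` and its fields are hypothetical, not the actual API in this PR):

```python
import json
import logging

LOGGER = logging.getLogger(__name__)


def log_structured(level, message, **context):
    """
    Hypothetical helper, for illustration only (not the API in this PR).

    Serializes the message and its context into a single JSON object so a
    log pipeline can parse the fields instead of scraping free text.
    """
    LOGGER.log(level, json.dumps({"message": message, **context}))


# One machine-parseable line per event:
log_structured(
    logging.INFO,
    "Content metadata transmission finished.",
    channel_code="SAP",
    enterprise_customer_uuid="hypothetical-uuid",
)
```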

Merge checklist:

  • Any new requirements are in the right place (do not manually modify the requirements/*.txt files)
    • base.in if needed in production but edx-platform doesn't install it
    • test-master.in if edx-platform pins it, with a matching version
    • make upgrade && make requirements have been run to regenerate requirements
  • make static has been run to update webpack bundling if any static content was updated
  • ./manage.py makemigrations has been run
    • Check out the Database Migration Confluence page for helpful tips on creating migrations.
    • Note: This must be run if you modified any models.
      • It may or may not make a migration depending on exactly what you modified, but it should still be run.
    • This should be run from a venv with all the lms/edx-enterprise requirements installed; alternatively, if you checked out edx-enterprise into the src directory used by lms, you can run this command through an lms shell.
      • It would be ./manage.py lms makemigrations in the shell.
  • Version bumped
  • Changelog record added
  • Translations updated (see docs/internationalization.rst; note this isn't blocking for merge at the moment)

Post merge:

  • Tag pushed and a new version released
    • Note: Assets will be added automatically. You just need to provide a tag (it should match your version number), a title, and a description.
  • After the versioned build finishes in GitHub Actions, verify the version has been pushed to PyPI
    • Each step in the release build has a condition flag that checks whether the previous steps are done and, if so, deploys to PyPI.
      (In practice, within a minute or so of the build finishing, you should see the new version on PyPI.)
  • PR created in edx-platform to upgrade dependencies (including edx-enterprise)
    • Trigger the 'Upgrade one Python dependency' action against master in edx-platform with the new version number to generate the version-bump PR
    • This must be done after the version is visible on PyPI, since make upgrade in edx-platform looks for the latest version on PyPI.
    • Note: the edx-enterprise constraint in edx-platform must also be bumped to the latest version on PyPI (see the illustrative diff after this list).
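For reference, the constraint bump is a one-line change along these lines in edx-platform's requirements/constraints.txt (version numbers illustrative):

-edx-enterprise==5.0.0
+edx-enterprise==5.1.0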

@sameeramin marked this pull request as draft September 15, 2025 20:46
@sameeramin marked this pull request as ready for review September 18, 2025 14:21
@pwnage101 (Contributor) left a comment

Please do not add a layer of abstraction over functionality that already supports pluggable handlers. Using LOGGER.*() calls directly in production code is also more Pythonic, IMO.
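For example, a minimal sketch of the direct style (variable names illustrative):

```python
import logging

LOGGER = logging.getLogger(__name__)

# Illustrative values; in real code these come from the transmission context.
channel_code = "SAP"
enterprise_customer_uuid = "hypothetical-uuid"

# Call the stdlib logger directly: structure and routing are supplied by
# whatever handlers/formatters the deployment configures, not by a wrapper.
LOGGER.info(
    "Transmission finished. channel_code=%s enterprise_customer_uuid=%s",
    channel_code,
    enterprise_customer_uuid,
)
```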

@sameeramin (Contributor, Author)

Hey @pwnage101, thank you for reviewing the PR. I absolutely agree with the approach you proposed. Let me refactor my code.

@sameeramin marked this pull request as draft September 26, 2025 13:14
@pwnage101 (Contributor) left a comment

LGTM!

@pwnage101 (Contributor)

FYI, our current Datadog log-parsing configuration does NOT support JSON parsing. You need to come up with a proposal to add JSON parsing to the Datadog Grok parser config in a way that is still backwards-compatible. For example, just changing :keyvalue to :json will break log parsing for log lines formatted with key-value pairs.

[Two screenshots attached, dated Sep 29, 2025]

@pwnage101 (Contributor) commented Sep 29, 2025

I minimally tested this approach and it seems to work, but I highly recommend you do your own testing to validate the approach for more scenarios:

-_message_with_keyvalue %{regex("(?<message>.*)"):message_keys:keyvalue}
+_message_with_keyvalue (%{regex(".*(?= \\{.*\\})"):message} %{data:json_keys:json}|%{regex("(?<message>.*)"):message_keys:keyvalue})

This matches existing messages, optionally with key-value pairs; but if the message ends with JSON, it switches to the alternative pattern, which uses the json filter.
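For example, two hypothetical log lines (examples mine) and the branch each would take:

<preamble> Transmission finished. channel_code=SAP status=ok
<preamble> Transmission finished. {"channel_code": "SAP", "status": "ok"}

The first line has no trailing JSON, so it falls through to the keyvalue branch; the second ends with a JSON object, so the json branch captures it.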

Please perform testing on the actual Grok parser edit page (but do not hit save until you get peer review on the new regex): https://app.datadoghq.com/logs/pipelines/pipeline/of_y4sdqStWBHmooL2W3UQ/processors/edit/Wz-1KN_ISsyavH79U5dmgg

@pwnage101 (Contributor)

This one is even better, IMO:

_message_with_keyvalue (%{regex("\\{.*\\}")::json}|%{regex("(?<message>.*)"):message_keys:keyvalue})

Supports log lines with only a message:

<preamble> My custom message.
<preamble> {"message": "My custom message."}

And also supports log lines with extra structured data:

<preamble> My custom message. foo=bar bin=baz
<preamble> {"message": "My custom message.", "foo": "bar", "bin": "baz"}

@pwnage101 (Contributor)

Ignore my latest comments; I did not realize the entire log line would be JSON, which would trigger the JSON preprocessing step in the Datadog logs pipeline.
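To spell out the implication: when the whole raw line is one JSON object, e.g.

{"message": "Transmission finished.", "channel_code": "SAP", "status": "ok"}

Datadog parses it during JSON preprocessing, before any Grok processor runs, so no Grok parser change is needed.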
