LogFieldSanitizer for BigQuery #68

@petemounce

I can't recall whether I mentioned this, but at some point I ran into errors when I put a list of lists into a log field. The following processor works around it in my particular setup by flattening such fields one level:

"""Sanitize log-fields for backends"""

import structlog
from structlog.types import EventDict, Processor


class LogFieldSanitizer:
    """
    Google Logging can back onto Log Sinks, which in turn are stored in BigQuery.
    BigQuery has at least one limitation; it cannot store lists of lists.
    Reference: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#array_type

    This structlog processor adjusts for that if a log-field is a list of lists.
    """

    def setup(self) -> list[Processor]:
        return [self]

    def __call__(self, logger: structlog.typing.WrappedLogger, method_name: str, event_dict: EventDict) -> EventDict:
        del logger, method_name  # unused

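        for key in event_dict:
            if isinstance(event_dict[key], list):
                orig = event_dict[key]
                if any(isinstance(x, list) for x in orig):
                    # Flatten one level; wrap non-list items so mixed lists don't raise TypeError.
                    event_dict[key] = [x for xs in orig for x in (xs if isinstance(xs, list) else [xs])]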

        return event_dict
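
To make the effect on an event dict concrete, here is a rough standalone sketch of the same flattening idea. It deliberately avoids structlog so it runs on its own; `flatten_one_level` and `sanitize` are hypothetical names of mine, not part of the processor above, and the handling of mixed lists (some items are lists, some are not) is my assumption about the desired behaviour.

```python
from typing import Any


def flatten_one_level(value: Any) -> Any:
    """Flatten a list of lists by one level; return anything else unchanged."""
    if not isinstance(value, list):
        return value
    if not any(isinstance(item, list) for item in value):
        return value  # already a flat list, nothing to do
    flat = []
    for item in value:
        if isinstance(item, list):
            flat.extend(item)  # splice the inner list into the outer one
        else:
            flat.append(item)  # assumption: keep non-list items as they are
    return flat


def sanitize(event_dict: dict[str, Any]) -> dict[str, Any]:
    """Apply the flattening to every field, mirroring what a processor would do."""
    for key, value in event_dict.items():
        event_dict[key] = flatten_one_level(value)
    return event_dict


event = {
    "event": "batch done",
    "pairs": [[1, 2], [3, 4]],
    "mixed": [1, [2, 3]],
    "tags": ["a", "b"],
}
sanitized = sanitize(event)
print(sanitized["pairs"])  # [1, 2, 3, 4]
print(sanitized["mixed"])  # [1, 2, 3]
```

Note that this only removes one level of nesting, so a field like `[[[1]], [2]]` would still contain a nested list afterwards; repeating the flattening until no inner lists remain would be one way to cover that, if it ever comes up.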
