Skip to content

fix(tracing): calculate traceid in python not clickhouse to avoid nullable checks#52061

Merged
frankh merged 1 commit intomasterfrom
frank/traceid-in-python
Mar 24, 2026
Merged

fix(tracing): calculate traceid in python not clickhouse to avoid nullable checks#52061
frankh merged 1 commit intomasterfrom
frank/traceid-in-python

Conversation

@frankh
Copy link
Copy Markdown
Contributor

@frankh frankh commented Mar 24, 2026

Problem

hogql was wrapping the clickhouse comparison in ifNull checks, which breaks clickhouse's index analysis

assumeNotNull did not fix this - instead calculate in python so it's a constant check which should avoid the ifNull check

Changes

How did you test this code?

👉 Stay up-to-date with PostHog coding conventions for a smoother review.

Publish to changelog?

Docs update

@frankh frankh requested a review from jonmcwest March 24, 2026 11:02
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 24, 2026

Comments Outside Diff (1)

  1. products/tracing/backend/presentation/views.py, line 51 (link)

    P1 query endpoint missing traceId validation

    The trace endpoint now validates trace_id with a try/except (lines 87–91), but the query endpoint passes traceId from the request body directly to TraceSpansQuery without any equivalent validation. When logic.py then calls bytes.fromhex(self.query.traceId) at line 179, an invalid hex string (e.g. "xyz123") will raise an unhandled ValueError, resulting in a 500 error instead of a 400.

    Before this PR, the raw hex was passed as a ClickHouse constant to unhex(), which would silently return empty bytes for invalid input. The migration to Python-side conversion introduces this new failure path.

    Consider adding the same validation guard to the query action:

    traceId = query_data.get("traceId", None)
    if traceId:
        try:
            bytes.fromhex(traceId)
        except ValueError:
            return Response({"error": "Invalid traceId"}, status=status.HTTP_400_BAD_REQUEST)
    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: products/tracing/backend/presentation/views.py
    Line: 51
    
    Comment:
    **`query` endpoint missing `traceId` validation**
    
    The `trace` endpoint now validates `trace_id` with a `try/except` (lines 87–91), but the `query` endpoint passes `traceId` from the request body directly to `TraceSpansQuery` without any equivalent validation. When `logic.py` then calls `bytes.fromhex(self.query.traceId)` at line 179, an invalid hex string (e.g. `"xyz123"`) will raise an unhandled `ValueError`, resulting in a 500 error instead of a 400.
    
    Before this PR, the raw hex was passed as a ClickHouse constant to `unhex()`, which would silently return empty bytes for invalid input. The migration to Python-side conversion introduces this new failure path.
    
    Consider adding the same validation guard to the `query` action:
    
    ```python
    traceId = query_data.get("traceId", None)
    if traceId:
        try:
            bytes.fromhex(traceId)
        except ValueError:
            return Response({"error": "Invalid traceId"}, status=status.HTTP_400_BAD_REQUEST)
    ```
    
    How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: products/tracing/backend/presentation/views.py
Line: 51

Comment:
**`query` endpoint missing `traceId` validation**

The `trace` endpoint now validates `trace_id` with a `try/except` (lines 87–91), but the `query` endpoint passes `traceId` from the request body directly to `TraceSpansQuery` without any equivalent validation. When `logic.py` then calls `bytes.fromhex(self.query.traceId)` at line 179, an invalid hex string (e.g. `"xyz123"`) will raise an unhandled `ValueError`, resulting in a 500 error instead of a 400.

Before this PR, the raw hex was passed as a ClickHouse constant to `unhex()`, which would silently return empty bytes for invalid input. The migration to Python-side conversion introduces this new failure path.

Consider adding the same validation guard to the `query` action:

```python
traceId = query_data.get("traceId", None)
if traceId:
    try:
        bytes.fromhex(traceId)
    except ValueError:
        return Response({"error": "Invalid traceId"}, status=status.HTTP_400_BAD_REQUEST)
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: products/tracing/backend/presentation/views.py
Line: 87-91

Comment:
**Redundant computation — result is discarded**

The full `base64.b64encode(bytes.fromhex(trace_id)).decode("ascii")` conversion is computed here purely for validation, but the result is thrown away. `logic.py` then performs the exact same computation again. This is a mild OnceAndOnlyOnce violation.

If you only need to verify the hex is valid, `bytes.fromhex(trace_id)` alone is sufficient — no need to do the base64 step:

```python
        try:
            bytes.fromhex(trace_id)
        except ValueError:
            return Response(status=status.HTTP_400_BAD_REQUEST)
```

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "fix(tracing): calculate traceid in pytho..." | Re-trigger Greptile

Comment on lines +87 to +91
try:
# verify the trace_id is valid
base64.b64encode(bytes.fromhex(trace_id)).decode("ascii")
except ValueError:
return Response(status=status.HTTP_400_BAD_REQUEST)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Redundant computation — result is discarded

The full base64.b64encode(bytes.fromhex(trace_id)).decode("ascii") conversion is computed here purely for validation, but the result is thrown away. logic.py then performs the exact same computation again. This is a mild OnceAndOnlyOnce violation.

If you only need to verify the hex is valid, bytes.fromhex(trace_id) alone is sufficient — no need to do the base64 step:

        try:
            bytes.fromhex(trace_id)
        except ValueError:
            return Response(status=status.HTTP_400_BAD_REQUEST)
Prompt To Fix With AI
This is a comment left during a code review.
Path: products/tracing/backend/presentation/views.py
Line: 87-91

Comment:
**Redundant computation — result is discarded**

The full `base64.b64encode(bytes.fromhex(trace_id)).decode("ascii")` conversion is computed here purely for validation, but the result is thrown away. `logic.py` then performs the exact same computation again. This is a mild OnceAndOnlyOnce violation.

If you only need to verify the hex is valid, `bytes.fromhex(trace_id)` alone is sufficient — no need to do the base64 step:

```python
        try:
            bytes.fromhex(trace_id)
        except ValueError:
            return Response(status=status.HTTP_400_BAD_REQUEST)
```

How can I resolve this? If you propose a fix, please make it concise.

@frankh frankh merged commit 713c03d into master Mar 24, 2026
191 checks passed
@frankh frankh deleted the frank/traceid-in-python branch March 24, 2026 11:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants