Skip to content

python: read Py_Version when filename lacks a version#1124

Open
dkratunov wants to merge 6 commits intoopen-telemetry:mainfrom
dkratunov:pyversion-lookup
Open

python: read Py_Version when filename lacks a version#1124
dkratunov wants to merge 6 commits intoopen-telemetry:mainfrom
dkratunov:pyversion-lookup

Conversation

@dkratunov
Copy link
Copy Markdown

@dkratunov dkratunov commented Jan 27, 2026

Some Python distributions ship a statically linked libpython and a python3 binary without a version suffix, so regex-based version parsing fails. One such distribution is the 3.12.12 astral-sh standalone python releases as used by bazel-contrib/rules_python. That interpreter is statically compiled against libpython (no mapping to check) and the final binary is called python3, not python3.12.

When the path does not expose a version, read the Py_Version runtime symbol from the target process and derive major/minor from its hex value.

Some Python distributions ship a statically linked libpython and a `python3` binary without a version suffix, so regex-based version parsing fails. When the path does not expose a version, read the Py_Version runtime symbol from the target process and derive major/minor from its hex value. This restores version detection for standalone/embedded Python builds and keeps the loader from failing to compile due to an undeclared err variable.
@dkratunov dkratunov requested review from a team as code owners January 27, 2026 19:37
@linux-foundation-easycla
Copy link
Copy Markdown

linux-foundation-easycla bot commented Jan 27, 2026

CLA Signed

The committers listed above are authorized under a signed CLA.

@christos68k
Copy link
Copy Markdown
Member

@dkratunov Can you sign the CLA ?

@dkratunov
Copy link
Copy Markdown
Author

Sorry, yes, I'm just chasing it internally to be added to the EasyCLA group, we have a CNCF/LF corp CLA signed

@dkratunov
Copy link
Copy Markdown
Author

@christos68k done, sorry about the delay 😄

@dkratunov dkratunov requested a review from fabled March 3, 2026 17:03
@dkratunov
Copy link
Copy Markdown
Author

@fabled ping, this is still relevant :)

@florianl
Copy link
Copy Markdown
Member

@dkratunov can you please address the CI reports around code scanning: https://github.com/open-telemetry/opentelemetry-ebpf-profiler/pull/1124/checks?check_run_id=65674224800

@dkratunov
Copy link
Copy Markdown
Author

Done and my local codeql run is clean

Copy link
Copy Markdown
Contributor

@korniltsev-grafanista korniltsev-grafanista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, clean fix. Overall approach is sound — reading Py_Version from the ELF is reliable since it's a compile-time constant (PY_VERSION_HEX) baked into the binary's data section. The bit-shifting arithmetic correctly extracts major (bits 31:24) and minor (bits 23:16) per the CPython ABI spec.

A few observations:

major == 0 fallback is unreachable when regex matches

In Loader, when matches != nil the regex captures always produce a non-zero major (both regexes require at least one digit for the major, and no Python version has major 0). So the if major == 0 block after the else branch is dead code in normal operation. This is harmless but subtly misleading — a reader might wonder what scenario triggers this. Either remove it, or add a comment like // shouldn't happen — regex match always yields non-zero major; defensive fallback.

Silent int64 → int16 truncation for staticTLSOffset

Tls_offset: int16(d.staticTLSOffset),

staticTLSOffset is int64 internally but stored as int16 in the eBPF map. TLS offsets for thread-locals on x86-64 and aarch64 are small negative numbers that fit in 16 bits in practice, but the truncation is silent. A defensive check would improve debuggability:

if d.staticTLSOffset < math.MinInt16 || d.staticTLSOffset > math.MaxInt16 {
    log.Warnf("staticTLSOffset %d out of int16 range, clamping", d.staticTLSOffset)
}

getTLSOffsetFromAssembly and staticTLSOffset are unmentioned in the PR description

The PR description only covers the Py_Version fallback, but a notable chunk of the diff adds Python 3.13+ TLS offset extraction via assembly analysis and the staticTLSOffset field. These were presumably already reviewed in the upstream merge but it's worth calling out explicitly in the PR description so reviewers know what scope to cover, especially given the procInfoInserted deduplication fix that comes with it.

No test for the version fallback path itself

The new python-3.13.5-tls.json coredump test covers the TLS path (great), but there's no test exercising readPyVersionHex or the loader fallback when the filename is python3. A simple unit test with a minimal ELF fixture (or mocked ELF) would prevent future regressions. That said, if the coredump tooling makes this impractical, a comment in the test file noting the missing coverage is enough.

Minor: path.Base vs filepath.Base

if !strings.HasPrefix(path.Base(info.FileName()), "python") {

path.Base uses / as separator (POSIX path semantics). Since this is a Linux-only tool, this is fine, but using filepath.Base from path/filepath would be more idiomatic Go and self-documenting.


co-reviewed by Claude Opus 4.6

@korniltsev-grafanista
Copy link
Copy Markdown
Contributor

Correction to my earlier review: I accidentally diffed against the wrong base branch (grafana fork's main instead of upstream/main). Points 2 (int16 truncation), 3 (getTLSOffsetFromAssembly unmentioned), and 4 (missing test) were all already in upstream/main and unrelated to this PR. Apologies for the noise.

The only comments that actually apply to this PR's changes are:

major == 0 fallback is unreachable when regex matches

In Loader, when matches != nil the regex captures always produce a non-zero major (both regexes require at least one digit, and no real Python release has major 0). So the if major == 0 block is dead code for the matched path. Harmless but subtly misleading — could add a comment or just gate it more tightly.

Minor: path.Base vs filepath.Base

if !strings.HasPrefix(path.Base(info.FileName()), "python") {

path.Base uses / as separator. Linux-only tool so it works, but filepath.Base from path/filepath would be more idiomatic.


co-reviewed by Claude Opus 4.6

matches = pythonRegex.FindStringSubmatch(info.FileName())
if matches == nil {
}
if matches == nil {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we move readPyVersionHex invocation under this matches==nil case block and avoid major == 0 check ?

return major*0x100 + minor
}

func readPyVersionHex(ef *pfelf.File) (major uint8, minor uint8, err error) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add a coredump test hitting this logic?

Copy link
Copy Markdown
Contributor

@fabled fabled left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm with one clean up nit. Also the coredump test firing the new code path would be appreciated. Approving regardless.

Comment on lines +61 to +64
if versionHex == 0 {
return 0, 0, errors.New("Py_Version is 0")
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for this, the major check will detect this case also.

Suggested change
if versionHex == 0 {
return 0, 0, errors.New("Py_Version is 0")
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants