[bug] GithubIDContributions API undercounts PRs due to using dup_actor_login instead of dup_user_login #127

@qerogram

Description

The GithubIDContributions API returns incorrect PR counts for users whose PRs were merged by maintainers without further activity from the author.

Root Cause

In cmd/api/api.go:1263-1271, the PR count query uses dup_actor_login:

select count(distinct id) as prs
from gha_issues
where is_pull_request = true
  and lower(dup_actor_login) = $1

However, dup_actor_login represents the last event actor (e.g., the maintainer who merged), not the PR author. The correct column is dup_user_login. The same issue exists in the issues count query at lines 1253-1261.
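To illustrate the difference between the two columns, here is a minimal sketch using an in-memory SQLite table as a stand-in for gha_issues. The schema is reduced to the four columns the query touches, and the ids and logins ("alice", "maintainer") are invented for illustration; the real table and the Postgres query in api.go may differ in detail.

```python
import sqlite3

# Simplified, hypothetical stand-in for gha_issues: only the columns the
# count query touches. Ids and logins are invented for illustration.
ROWS = [
    # (id, is_pull_request, dup_actor_login, dup_user_login)
    (1, 1, "alice", "alice"),        # author acted on their own PR
    (2, 1, "maintainer", "alice"),   # only the merging maintainer acted
    (3, 1, "alice", "alice"),
    (3, 1, "maintainer", "alice"),   # same PR id, second row
    (4, 0, "alice", "alice"),        # plain issue, not a PR
]


def build_db():
    """Create the in-memory table and load the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table gha_issues ("
        "id integer, is_pull_request integer, "
        "dup_actor_login text, dup_user_login text)"
    )
    conn.executemany("insert into gha_issues values (?, ?, ?, ?)", ROWS)
    return conn


def count_prs(conn, column, login):
    """Count distinct PR ids whose `column` matches `login`, case-insensitively."""
    # The column name is interpolated, so restrict it to the two known columns.
    if column not in ("dup_actor_login", "dup_user_login"):
        raise ValueError(f"unexpected column: {column}")
    sql = (
        "select count(distinct id) from gha_issues "
        f"where is_pull_request = 1 and lower({column}) = ?"
    )
    return conn.execute(sql, (login.lower(),)).fetchone()[0]


conn = build_db()
by_actor = count_prs(conn, "dup_actor_login", "alice")  # current behavior
by_user = count_prs(conn, "dup_user_login", "alice")    # proposed behavior
print(by_actor, by_user)  # prints "2 3"
```

In this sample, PR 2 only ever appears with the maintainer as actor, so filtering on dup_actor_login misses it, while filtering on dup_user_login counts all three PRs.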

Impact

PRs that are "cleanly" merged (approve → merge with no further comments from the author) are not counted for the original author. This affects all CNCF projects.

Reproduction

Query the API:

curl -X POST https://devstats.cncf.io/api/v1 \
  -H "Content-Type: application/json" \
  -d '{"api":"GithubIDContributions","payload":{"github_id":"qerogram"}}'

Response: {"contributions":19,"issues":0,"prs":5}

But I have 6 merged PRs in envoyproxy/envoy:
#42447, #40578, #40534, #39937, #39829, #39702

Verified via Grafana query on https://envoy.devstats.cncf.io/explore:
-- Returns 5 rows (missing #40534)

  select
    distinct id,
    number, 
    title
  from
    gha_issues
  where
    is_pull_request = true
    and lower(dup_actor_login) = 'qerogram'

-- Returns 6 rows (correct)

  select
    distinct id,
    number, 
    title
  from
    gha_issues
  where
    is_pull_request = true
    and lower(dup_user_login) = 'qerogram'

PR #40534 has dup_actor_login = 'mattklein123' (the maintainer who merged it) but dup_user_login = 'qerogram' (the actual author).
