Support proportional language attribution via repository.languages #140

@shm11C3

Summary

Currently, language statistics are attributed only to each repository’s primaryLanguage.
As a result, every other language used in a multi-language repository is left out of the statistics.

Example:

  • A large repository / monorepo contains multiple languages (e.g., Language A 60%, Language B 40%)
  • Current behavior: 100% attributed to the primary language
  • Expected behavior (opt-in): attribute contributions proportionally across languages

This commonly occurs in monorepos and large repositories where multiple languages coexist.

Cause

Language stats are currently attributed to a single language per repository. As a result, other languages used in the same repository are ignored.

Proposed Solution

Introduce a new option to enable proportional language attribution, while keeping the current behavior as the default.

Option design

Use an explicit mode string rather than a boolean for clarity and extensibility:

language_mode: "primary" | "breakdown"
  • "primary" (default): current behavior — attribute all contributions to primaryLanguage
  • "breakdown": use repository.languages and split repo contributions proportionally
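
A minimal sketch of how the option could be typed and validated, assuming the setting arrives as an untyped value from a settings file. The names `LanguageMode` and `parseLanguageMode` are hypothetical and not part of the current codebase.

```typescript
// Hypothetical names for illustration only.
type LanguageMode = "primary" | "breakdown";

// Returns the configured mode, defaulting to "primary" when the option is absent.
// Unknown values throw so that a typo is not silently treated as the default.
function parseLanguageMode(value: unknown): LanguageMode {
    if (value === undefined || value === null) {
        return "primary";
    }
    if (value === "primary" || value === "breakdown") {
        return value;
    }
    throw new Error(`Unsupported language_mode: ${String(value)}`);
}
```

A string union keeps room for further modes later without changing the option's shape.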

Implementation approach

One concrete approach (verified in a fork) is to fetch a language breakdown from the GitHub API (e.g., repository.languages with size per language) and use the ratios to split the repository’s contribution counts.

commitContributionsByRepository(maxRepositories: 100) {
    repository {
        languages(first: 10, orderBy: {field: SIZE, direction: DESC}) {
            edges {
                size
                node { name, color }
            }
        }
    }
    contributions { totalCount }
}

Proof of concept (verified in a fork): https://github.com/shm11C3/github-profile-3d-contrib/blob/aacbea44a59f51e5074d506dc2abae8161435f9e/src/aggregate-user-info.ts#L53-L77
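
For illustration, the response to the query fragment above could be typed and flattened roughly as follows. The interface names and the nullability of each field are assumptions and should be checked against the GraphQL schema; this is not the code from the linked fork.

```typescript
// Assumed response shape for one commitContributionsByRepository entry.
interface LanguageEdge {
    size: number; // size reported for this language in the repository
    node: { name: string; color: string | null };
}

interface RepoContributionEntry {
    repository: { languages: { edges: LanguageEdge[] | null } | null };
    contributions: { totalCount: number };
}

// Hypothetical intermediate type: one (language, size) pair.
interface LangSize {
    language: string;
    size: number;
}

// Flattens the language edges of one entry into (language, size) pairs.
// Returns an empty list when no language data is present.
function toLangSizes(entry: RepoContributionEntry): LangSize[] {
    const languages = entry.repository.languages;
    const edges = languages && languages.edges ? languages.edges : [];
    return edges.map((edge) => ({ language: edge.node.name, size: edge.size }));
}
```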

Attribution algorithm

Given:

  • repoContrib = total contributions attributed to a repository
  • langSizes = list of (language, size)

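For each language, compute ratio = size / sum(size), then attribute repoContrib * ratio to that language. If sum(size) is 0 (no language data for the repository), the ratio is undefined, so that case needs separate handling.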

Example

  • Repo contributions: 10
  • Language A 60%, Language B 40%
  • => Language A 6, Language B 4
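
The algorithm and example above can be sketched as follows. `LangSize` and `attributeByLanguage` are hypothetical names, and returning an empty result when no size data exists is an assumption of this sketch rather than a decided behavior.

```typescript
// Hypothetical shape for illustration; not the project's actual types.
interface LangSize {
    language: string;
    size: number;
}

// Splits a repository's contribution count across its languages in
// proportion to each language's size. The returned values may be fractional.
function attributeByLanguage(repoContrib: number, langSizes: LangSize[]): Record<string, number> {
    const result: Record<string, number> = {};
    const totalSize = langSizes.reduce((sum, l) => sum + l.size, 0);
    if (totalSize <= 0) {
        // Assumption: without size data there is nothing to split.
        return result;
    }
    for (const { language, size } of langSizes) {
        const ratio = size / totalSize;
        result[language] = (result[language] || 0) + repoContrib * ratio;
    }
    return result;
}

// The example above: 10 contributions, Language A 60%, Language B 40%.
const shares = attributeByLanguage(10, [
    { language: "Language A", size: 600 },
    { language: "Language B", size: 400 },
]);
// shares is { "Language A": 6, "Language B": 4 }
```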

Notes / Considerations

API cost

The node cost increase is small: up to 10 additional language nodes per repository (capped by first: 10), so at most 1,000 additional nodes with maxRepositories: 100. That is well within GitHub's limit of 500,000 total nodes per query. No additional HTTP requests are needed compared to the current implementation.

Backward compatibility

  • Default behavior is unchanged (language_mode: "primary")
  • New behavior only when users explicitly set language_mode: "breakdown"
  • The GraphQL query should fetch both primaryLanguage and languages fields, switching the attribution logic based on the mode setting
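
One way the mode switch could look, as a sketch under stated assumptions: `RepoContribution` is a simplified, hypothetical view of the response, and falling back to primaryLanguage when a repository reports no language sizes is an assumption, not agreed behavior.

```typescript
type LanguageMode = "primary" | "breakdown";

// Hypothetical, simplified view of one repository in the API response.
interface RepoContribution {
    contributions: number;
    primaryLanguage: string | null;
    languages: { name: string; size: number }[];
}

// Aggregates contributions per language according to the selected mode.
function aggregateLanguages(repos: RepoContribution[], mode: LanguageMode): Record<string, number> {
    const totals: Record<string, number> = {};
    const add = (name: string, amount: number): void => {
        totals[name] = (totals[name] || 0) + amount;
    };
    for (const repo of repos) {
        const totalSize = repo.languages.reduce((sum, l) => sum + l.size, 0);
        if (mode === "breakdown" && totalSize > 0) {
            // Proportional split across all reported languages.
            for (const lang of repo.languages) {
                add(lang.name, repo.contributions * (lang.size / totalSize));
            }
        } else if (repo.primaryLanguage !== null) {
            // "primary" mode, or the assumed fallback when no size data exists.
            add(repo.primaryLanguage, repo.contributions);
        }
    }
    return totals;
}
```

Because both fields are fetched in one query, switching modes changes only this aggregation step and not the request itself.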

Fractional contributions

Proportional attribution produces fractional values (e.g., 6.37 contributions). Consider:

  • Using Math.round() when rendering to avoid displaying decimals
  • Accepting that per-language totals may not sum exactly to the repo total due to rounding
  • The values stored in type.LangInfo.contributions change from whole numbers to fractional ones, so downstream rendering and sorting should be verified to be unaffected
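
A small sketch of the rounding behavior described above. `formatContributions` is a hypothetical helper; the point is only to show that rounding at render time can make the displayed per-language values add up to less (or more) than the repository total.

```typescript
// Hypothetical helper: keep fractional values internally, round only for display.
function formatContributions(value: number): string {
    return String(Math.round(value));
}

// 10 contributions split across three equally sized languages.
const fractional = [10 / 3, 10 / 3, 10 / 3];

const displayed = fractional.map(formatContributions); // ["3", "3", "3"]
const displayedSum = fractional.reduce((sum, v) => sum + Math.round(v), 0); // 9, not 10

// Sorting with a numeric comparator works for fractional values as well.
const ranked = [6.37, 0.5, 3.13].sort((a, b) => b - a); // [6.37, 3.13, 0.5]
```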
