Summary
Currently, language statistics are attributed only to each repository’s primaryLanguage.
This makes multi-language repositories underrepresented.
Example:
- A large repository / monorepo contains multiple languages (e.g., Language A 60%, Language B 40%)
- Current behavior: 100% attributed to the primary language
- Expected behavior (opt-in): attribute contributions proportionally across languages
This commonly occurs in monorepos and large repositories where multiple languages coexist.
Cause
Language stats are currently attributed to a single language per repository. As a result, other languages used in the same repository are ignored.
Proposed Solution
Introduce a new option to enable proportional language attribution, while keeping the current behavior as the default.
Option design
Use an explicit mode string rather than a boolean for clarity and extensibility:
language_mode: "primary" | "breakdown"
"primary" (default): current behavior — attribute all contributions to primaryLanguage
"breakdown": use repository.languages and split repo contributions proportionally
Implementation approach
One concrete approach (verified in a fork) is to fetch a language breakdown from the GitHub API (e.g., repository.languages with size per language) and use the ratios to split the repository’s contribution counts.
commitContributionsByRepository(maxRepositories: 100) {
  repository {
    languages(first: 10, orderBy: {field: SIZE, direction: DESC}) {
      edges {
        size
        node { name, color }
      }
    }
  }
  contributions { totalCount }
}
Proof of concept (verified in a fork): https://github.com/shm11C3/github-profile-3d-contrib/blob/aacbea44a59f51e5074d506dc2abae8161435f9e/src/aggregate-user-info.ts#L53-L77
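As a rough sketch, the response to the query fragment above could be flattened into the (language, size) pairs used by the attribution step. The interface names and the toLangSizes helper are hypothetical and only mirror the fields requested in the fragment; the linked proof of concept may structure this differently.

```typescript
// Illustrative response types matching only the fields requested above.
// These names are assumptions, not the project's actual type definitions.
interface LanguageEdge {
  size: number; // bytes of code written in this language
  node: { name: string; color: string | null };
}

interface RepoEntry {
  repository: { languages: { edges: LanguageEdge[] | null } | null };
  contributions: { totalCount: number };
}

interface LangSize {
  language: string;
  size: number;
}

// Flatten the GraphQL edges into a plain list; missing data yields [].
function toLangSizes(entry: RepoEntry): LangSize[] {
  const edges = entry.repository.languages?.edges ?? [];
  return edges.map((edge) => ({ language: edge.node.name, size: edge.size }));
}
```

Returning an empty list for missing language data keeps the caller free to decide how to handle repositories without a breakdown.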
Attribution algorithm
Given:
repoContrib = total contributions attributed to a repository
langSizes = list of (language, size)
Compute ratio = size / sum(size), then attribute repoContrib * ratio to each language.
Example
- Repo contributions: 10
- Language A 60%, Language B 40%
- => Language A 6, Language B 4
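The algorithm above can be sketched in a few lines of TypeScript. The LangSize shape and the splitContributions name are hypothetical, chosen for illustration; the sizes 600 and 400 stand in for the 60% / 40% split in the example.

```typescript
// Hypothetical shape for illustration; the project's real types may differ.
interface LangSize {
  language: string;
  size: number;
}

// Attribute repoContrib to each language in proportion to its size.
function splitContributions(
  repoContrib: number,
  langSizes: LangSize[],
): Map<string, number> {
  const result = new Map<string, number>();
  const total = langSizes.reduce((sum, l) => sum + l.size, 0);
  // No language data (or all sizes zero): nothing to attribute here.
  if (total <= 0) return result;
  for (const { language, size } of langSizes) {
    const share = (repoContrib * size) / total; // repoContrib * ratio
    result.set(language, (result.get(language) ?? 0) + share);
  }
  return result;
}

const shares = splitContributions(10, [
  { language: "Language A", size: 600 },
  { language: "Language B", size: 400 },
]);
console.log(shares.get("Language A"), shares.get("Language B")); // prints 6 4
```

The empty-result case is an assumption; the issue does not say what should happen when a repository reports no languages, and a caller might fall back to primaryLanguage there.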
Notes / Considerations
API cost
The node cost increase is small: up to +10 language nodes per repository (capped by first: 10), so at most 1,000 extra nodes with maxRepositories: 100, well within GitHub's 500,000 node limit per query. No additional HTTP requests are needed compared to the current implementation.
Backward compatibility
- Default behavior is unchanged (language_mode: "primary")
- New behavior applies only when users explicitly set language_mode: "breakdown"
- The GraphQL query should fetch both the primaryLanguage and languages fields, switching the attribution logic based on the mode setting
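One way the mode switch could look, as a minimal sketch: an unset option resolves to "primary", and the aggregation picks the attribution rule per repository. The names parseLanguageMode, RepoContribution, and aggregateLanguages are hypothetical, and falling back to primaryLanguage when a repository has no language breakdown is an assumption rather than something the proposal specifies.

```typescript
type LanguageMode = "primary" | "breakdown";

const LANGUAGE_MODES: readonly LanguageMode[] = ["primary", "breakdown"];

// An unset option keeps the current behavior; unknown values are rejected.
function parseLanguageMode(raw: string | undefined): LanguageMode {
  if (raw === undefined || raw === "") return "primary";
  const match = LANGUAGE_MODES.find((m) => m === raw);
  if (match === undefined) {
    throw new Error(`Invalid language_mode "${raw}"`);
  }
  return match;
}

// Hypothetical per-repository input carrying both fields from the query.
interface RepoContribution {
  contributions: number;
  primaryLanguage: string | null;
  languages: { name: string; size: number }[];
}

function aggregateLanguages(
  repos: RepoContribution[],
  mode: LanguageMode,
): Map<string, number> {
  const totals = new Map<string, number>();
  const add = (name: string, value: number): void => {
    totals.set(name, (totals.get(name) ?? 0) + value);
  };
  for (const repo of repos) {
    const totalSize = repo.languages.reduce((sum, l) => sum + l.size, 0);
    if (mode === "breakdown" && totalSize > 0) {
      // Split this repository's contributions by language size ratio.
      for (const l of repo.languages) {
        add(l.name, (repo.contributions * l.size) / totalSize);
      }
    } else if (repo.primaryLanguage !== null) {
      // "primary" mode, or (assumed) fallback when no breakdown exists.
      add(repo.primaryLanguage, repo.contributions);
    }
  }
  return totals;
}
```

Because both fields are fetched regardless of mode, switching modes changes only the aggregation step, not the query.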
Fractional contributions
Proportional attribution produces fractional values (e.g., 6.37 contributions). Consider:
- Using Math.round() when rendering to avoid displaying decimals
- Accepting that per-language totals may not sum exactly to the repo total due to rounding
- The type.LangInfo.contributions field type changes from effectively integer to float; verify downstream rendering and sorting are unaffected
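If displayed totals should still add up, one alternative to rounding each value independently is largest-remainder rounding. This is not part of the proposal above, only a sketch of an option; roundPreservingTotal is a hypothetical helper and assumes non-negative inputs.

```typescript
// Largest-remainder rounding: floor every value, then hand out the
// leftover units to the entries with the largest fractional parts.
// Assumes non-negative values (contribution counts).
function roundPreservingTotal(values: number[]): number[] {
  const floors = values.map((v) => Math.floor(v));
  const target = Math.round(values.reduce((sum, v) => sum + v, 0));
  let remaining = target - floors.reduce((sum, v) => sum + v, 0);

  // Indices ordered by fractional part, largest first.
  const order = values
    .map((v, index) => ({ index, frac: v - Math.floor(v) }))
    .sort((a, b) => b.frac - a.frac);

  const result = [...floors];
  for (const { index } of order) {
    if (remaining <= 0) break;
    result[index] += 1;
    remaining -= 1;
  }
  return result;
}

console.log(roundPreservingTotal([6.37, 3.63])); // prints [ 6, 4 ]
```

With inputs such as 1.4, 1.4, and 1.2, Math.round() on each value gives 1 + 1 + 1 = 3, while this approach keeps the total at 4.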