feat(internal/librarian/golang): generate repo metadata for Go libraries#4055
feat(internal/librarian/golang): generate repo metadata for Go libraries#4055JoeWang1127 wants to merge 4 commits intomainfrom
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #4055 +/- ##
==========================================
+ Coverage 82.01% 87.02% +5.00%
==========================================
Files 77 35 -42
Lines 6504 3706 -2798
==========================================
- Hits 5334 3225 -2109
+ Misses 825 317 -508
+ Partials 345 164 -181 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Code Review
This pull request adds functionality to generate .repo-metadata.json files for Go libraries. However, several path traversal vulnerabilities were identified where configuration values such as api.Path and ClientDirectory are used to construct filesystem paths without proper validation. As per repository rules, inputs for code generation templates must be sanitized to prevent injection vulnerabilities. Strict path validation using absolute path prefix checks is recommended for all configuration-derived paths. Additionally, ensure the LibraryInfo struct is properly populated to avoid incomplete files, and consider a small refactoring in version.go to avoid a redundant function call.
| return err | ||
| } | ||
| info := &repometadata.LibraryInfo{} | ||
| outDir := filepath.Join(outdir, api.Path) |
There was a problem hiding this comment.
The api.Path from the library configuration is used to construct outDir via filepath.Join(outdir, api.Path) without validation. If api.Path contains directory traversal sequences (e.g., ../), an attacker could cause the tool to write .repo-metadata.json to an arbitrary location outside the intended output directory.
You should validate that the constructed outDir is within the intended outdir using a prefix check on the absolute paths, similar to the check performed for library.Name on lines 91-98.
References
- Inputs used in code generation, especially for constructing file paths, must be sanitized to prevent injection vulnerabilities like path traversal, even if the input source is considered trusted.
| func apiVersionPath(library *config.Library, apiPath string) string { | ||
| version := filepath.Base(apiPath) | ||
| clientDir := clientDirectory(library, apiPath) | ||
| return filepath.Join(library.Output, library.Name, clientDir, "api"+version) |
There was a problem hiding this comment.
The clientDir (derived from goAPI.ClientDirectory in the configuration) is used to construct the output directory for version files via filepath.Join without validation. An attacker could use directory traversal sequences in ClientDirectory to cause the tool to write version.go files to arbitrary locations on the filesystem.
Validate that the constructed path is within the intended output directory using a prefix check on the absolute paths.
References
- Inputs used in code generation, especially for constructing file paths, must be sanitized to prevent injection vulnerabilities like path traversal, even if the input source is considered trusted.
| if err != nil { | ||
| return err | ||
| } | ||
| info := &repometadata.LibraryInfo{} |
There was a problem hiding this comment.
The info struct is being initialized as empty. This will result in an incomplete .repo-metadata.json file, missing fields like distribution_name and release_level. It should be populated with information from the library configuration.
| info := &repometadata.LibraryInfo{} | |
| info := &repometadata.LibraryInfo{ | |
| Name: library.Name, | |
| ReleaseLevel: library.ReleaseLevel, | |
| DescriptionOverride: library.DescriptionOverride, | |
| } |
| } | ||
| if err := generateREADME(library, api, moduleRoot); err != nil { | ||
| return err | ||
| svcAPI, err := serviceconfig.Find(googleapisDir, api.Path, serviceconfig.LangGo) |
There was a problem hiding this comment.
The api.Path is passed to serviceconfig.Find, which uses it to construct file paths for reading service configurations. Since serviceconfig.validateAPI allows any path starting with google/cloud/, an attacker could use a path like google/cloud/../../../../etc/passwd to cause the tool to attempt to read arbitrary files from the filesystem where the tool is running.
Consider sanitizing or validating api.Path to ensure it does not contain directory traversal sequences before passing it to serviceconfig.Find.
References
- Inputs used in code generation, especially for constructing file paths, must be sanitized to prevent injection vulnerabilities like path traversal, even if the input source is considered trusted.
No description provided.