Walkthrough

Adds a Git scraper: API types for Git/file rules, schema and deepcopy updates, a go-git-based scraper that clones repos, builds rule-filtered file trees, detects commit/tag/branch changes with per-file strategies (ignore/track/diff), and test coverage + fixtures for integration and unit tests.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant Scraper as Git Scraper
    participant Repo as Git Repo (go-git)
    participant Cache as TempCache
    participant Detector as Change Detector
    Client->>Scraper: Scrape(ctx, ScraperSpec)
    activate Scraper
    Scraper->>Repo: Clone / Open repository
    activate Repo
    Repo-->>Scraper: Repository object (refs, HEAD)
    deactivate Repo
    Scraper->>Cache: Load previous state (branches, tags, SHAs)
    activate Cache
    Cache-->>Scraper: previous state
    deactivate Cache
    Scraper->>Scraper: Build file tree (apply file rules)
    Scraper->>Detector: generateCommitChanges(branchRef, rules)
    activate Detector
    Detector->>Repo: Walk commits, diff trees
    Detector-->>Scraper: []ChangeResult (commits with diffs)
    deactivate Detector
    Scraper->>Detector: generateTagChanges(since)
    activate Detector
    Detector->>Repo: Resolve tags
    Detector-->>Scraper: []ChangeResult (tags)
    deactivate Detector
    Scraper->>Scraper: generateBranchChanges(current, previous)
    Scraper->>Cache: Store current state
    activate Cache
    Cache-->>Scraper: OK
    deactivate Cache
    Scraper-->>Client: ScrapeResults (repositories, branches, changes, files)
    deactivate Scraper
```
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (warning)
Actionable comments posted: 7
🧹 Nitpick comments (6)
scrapers/git/changes.go (3)
213-224: Consider using the actual tag timestamp for creation events.

For newly detected tags, `CreatedAt` is set to `time.Now()` (line 214), but the actual tag creation time is available in `tag.when`. Using the real timestamp would provide more accurate historical data.

♻️ Proposed fix
```diff
 if _, existed := previousTags[name]; !existed {
-    now := time.Now()
+    createdAt := tag.when
     changes = append(changes, v1.ChangeResult{
         ExternalID:       repoExternalID,
         ConfigType:       "Git::Repository",
         ExternalChangeID: fmt.Sprintf("tag/%s/%s", name, tag.sha),
         ChangeType:       "Tag",
         Summary:          fmt.Sprintf("Tag %s created", name),
         Source:           "git",
-        CreatedAt:        &now,
+        CreatedAt:        &createdAt,
         Details:          map[string]any{"tag": name, "sha": tag.sha},
     })
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scrapers/git/changes.go` around lines 213 - 224, The CreatedAt for new tag change events is using time.Now() which loses the actual tag timestamp; update the code that appends to changes (the v1.ChangeResult for "Tag" created when checking previousTags[name]) to use the tag's real time (tag.when) converted to a *time.Time (or pointer) instead of time.Now(), ensuring CreatedAt is set from tag.when (or its pointer) so the change record reflects the tag's actual creation time.
20-23: Consider logging errors instead of silently returning empty results.

When `repo.Log()` fails, the function silently returns an empty slice, which could hide repository access issues or configuration problems. Consider logging the error or returning it to the caller for better observability.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scrapers/git/changes.go` around lines 20 - 23, The function currently swallows errors from repo.Log (the call that assigns iter) and returns an empty changes slice; update the implementation so failures are observable by either (a) logging the error before returning (use the repository/component logger, e.g., log.Printf or your package logger) or (b) changing the function signature to return ([]Change, error) and propagate the error up instead of returning changes; locate the repo.Log call and the local vars iter and changes and implement one of these two fixes (log the error with context mentioning repo.Log/branchRef.Hash() or return the error to the caller).
291-311: Potential performance concern for large repositories.

The `isAncestor` function walks the entire commit history from `head` to find `candidate`. For repositories with long histories, this could be slow. Consider caching or limiting the search depth if this becomes a bottleneck.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scrapers/git/changes.go` around lines 291 - 311, The isAncestor function currently does a full history walk from head which can be very slow on long repos; update isAncestor(repo *git.Repository, candidate, head plumbing.Hash) to either (a) short-circuit with a maxDepth parameter/constant and stop the iterator after that many commits, or (b) implement a simple ancestry cache (map[plumbing.Hash]bool or an LRU) keyed by head/candidate to avoid repeated full walks; locate the logic in isAncestor, add a depth check inside the iter.ForEach loop (or check/populate the cache before walking), and ensure you return false when the depth limit is reached while still allowing cache hits to return true without walking.

scrapers/git/git_suite_test.go (1)
30-32: Directory change assumption may be fragile.

The `os.Chdir("../..")` call assumes tests are run from the `scrapers/git/` directory. This can cause issues when running tests from the repository root or other locations. Consider using a more robust method to locate the repository root.

♻️ Alternative approach using runtime to find the test file location
```go
func init() {
	_, filename, _, _ := runtime.Caller(0)
	dir := filepath.Dir(filename)
	// Navigate to repo root
	repoRoot := filepath.Join(dir, "..", "..")
	os.Chdir(repoRoot)
}
```

Or use an environment variable / build tag to specify the root path.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scrapers/git/git_suite_test.go` around lines 30 - 32, Replace the fragile os.Chdir("../..") call in git_suite_test.go with a deterministic repo-root resolution: move the change into an init() or setup helper, use runtime.Caller to get the current test file path, compute filepath.Dir(filename) and then walk/join to the repository root (or respect an env var override) and call os.Chdir to that resolved path; reference os.Chdir, runtime.Caller, filepath.Dir and the init/test setup function to locate and update the code.

fixtures/git.yaml (1)
7-11: Clarify the relationship between `branch` and `branches` fields.

The fixture specifies both `branch: main` (line 8) and `branches: [main]` (lines 10-11). The `branch` field appears to come from `GitConnection` while `branches` is the scraper-specific field for tracking. Consider documenting which field controls what behavior, or removing the redundancy if they serve the same purpose.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@fixtures/git.yaml` around lines 7 - 11, The fixture currently defines both branch and branches which are ambiguous; update the code and fixture so their relationship is explicit: decide whether GitConnection.branch (the single-revision field) or the scraper-specific branches list controls checkout behavior, then either remove the redundant field from the fixture or document and enforce precedence (e.g., GitConnection.branch as default, scraper's branches overrides it). Update the schema/docs and add validation in the loader that checks consistency between GitConnection.branch and the scraper's branches field (or auto-populates one from the other) and adjust the fixture to match the chosen behavior; reference the branch and branches fields and the GitConnection struct / scraper config when making changes.scrapers/git/filetree.go (1)
104-109: Per-file history lookup will scale poorly on large repositories.

Running `lastModifyingCommit` for every file can dominate scrape time. Consider a single-pass commit traversal that fills last-modified metadata for unresolved paths instead of N log walks.

Also applies to: 121-131
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scrapers/git/filetree.go` around lines 104 - 109, Per-file lastModifyingCommit calls (invoked where lastModifyingCommit(repo, commitHash, f.Name) sets entry.LastModified/LastModifiedSHA/LastModifiedBy) will not scale; instead implement a single-pass commit traversal that assigns last-modified metadata for unresolved entries in bulk. Replace the per-file calls with a preprocessing step (e.g., traverseCommitsFillLastModified or fillLastModifiedForEntries) that walks commits once from newest to oldest, checks each commit's tree for paths in the entries collection (use the same key used when calling lastModifyingCommit), and when a path is first seen populate entry.LastModified, entry.LastModifiedSHA and entry.LastModifiedBy; stop updating an entry once populated. Update usages where lastModifyingCommit was called (the spots setting LastModified, LastModifiedSHA, LastModifiedBy) to rely on the populated metadata instead of calling lastModifyingCommit per file.
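The single-pass idea can be sketched independently of go-git: walk commits newest to oldest and record the first commit that touched each wanted path, stopping once every path is resolved. The `commit` type and field names below are illustrative assumptions, not the scraper's actual types:

```go
package main

import "fmt"

// commit is a stand-in for a go-git commit: its SHA plus the paths its
// diff against the parent touched.
type commit struct {
	sha     string
	touched []string
}

// lastModified walks commits once, newest first, and records the first
// (i.e. most recent) commit that touched each wanted path. It stops
// early once every path is resolved, so the whole file tree costs one
// traversal instead of one log walk per file.
func lastModified(commits []commit, paths []string) map[string]string {
	want := make(map[string]bool, len(paths))
	for _, p := range paths {
		want[p] = true
	}
	out := make(map[string]string, len(paths))
	for _, c := range commits {
		for _, p := range c.touched {
			if want[p] {
				out[p] = c.sha // first sighting wins; never overwritten
				delete(want, p)
			}
		}
		if len(want) == 0 {
			break // all paths resolved; skip older history
		}
	}
	return out
}

func main() {
	history := []commit{ // newest first
		{sha: "c3", touched: []string{"README.md"}},
		{sha: "c2", touched: []string{"main.go", "README.md"}},
		{sha: "c1", touched: []string{"main.go"}},
	}
	fmt.Println(lastModified(history, []string{"main.go", "README.md"}))
}
```

In the real scraper the `touched` set would come from diffing each commit's tree against its parent, and the resolved map would populate `LastModified`, `LastModifiedSHA` and `LastModifiedBy` on the file entries.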
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
- api/v1/git.go
- api/v1/scrapeconfig_types.go
- api/v1/types.go
- api/v1/zz_generated.deepcopy.go
- fixtures/git.yaml
- go.mod
- scrapers/common.go
- scrapers/git/changes.go
- scrapers/git/filetree.go
- scrapers/git/git_integration_test.go
- scrapers/git/git_suite_test.go
- scrapers/git/scraper.go
- scrapers/git/scraper_test.go
```
github.com/flanksource/is-healthy v1.0.82
github.com/flanksource/ketall v1.1.9
github.com/flanksource/kopper v1.0.13
github.com/go-git/go-git/v5 v5.16.5
```
🧩 Analysis chain
🌐 Web query:
go-git/go-git v5.16.5 latest version releases
💡 Result:
As of March 1, 2026, the latest github.com/go-git/go-git/v5 release is:
- v5.17.0 (released Feb 25, 2026) — marked Latest on GitHub Releases. [1]
The prior patch you mentioned:
- v5.16.5 (released Feb 9, 2026) — includes the fix for GHSA-37cx-329c-33x3 / CVE-2026-25934 (fixed in 5.16.5). [2][3]
To upgrade:
```shell
go get github.com/go-git/go-git/v5@v5.17.0
```

Sources: [1] [2] [3]
🏁 Script executed:
```shell
find . -path "*/scrapers/git/*" -name "*.go" -type f | head -20
```

Repository: flanksource/config-db
Length of output: 246
🏁 Script executed:
rg "github.com/go-git/go-git" --type go -A 2 -B 2Repository: flanksource/config-db
Length of output: 1467
Upgrade to go-git v5.17.0 (latest).
The go-git/go-git/v5 dependency promotion from indirect to direct is correct—the code imports it directly in scrapers/git/. However, v5.17.0 was released on Feb 25, 2026 and is now the latest stable version. v5.16.5 (Feb 9, 2026) remains valid with its security fix (CVE-2026-25934), but upgrading to v5.17.0 is recommended.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@go.mod` at line 60, Update the go-git dependency in go.mod from
github.com/go-git/go-git/v5 v5.16.5 to v5.17.0 and ensure module files are
refreshed; specifically, bump the version for the module identifier
github.com/go-git/go-git/v5 (used by code in scrapers/git/) and then run the
module update/tidy step so go.sum is regenerated and builds pick up v5.17.0.
```go
if d.patch != "" {
	diffContent.WriteString(fmt.Sprintf("--- %s ---\n%s\n", d.path, d.patch))
}
```
Fix lint error: Use fmt.Fprintf instead of WriteString(fmt.Sprintf(...)).
The static analysis tool flagged this as QF1012. Using fmt.Fprintf directly is more efficient.
🔧 Proposed fix
if d.patch != "" {
- diffContent.WriteString(fmt.Sprintf("--- %s ---\n%s\n", d.path, d.patch))
+ fmt.Fprintf(&diffContent, "--- %s ---\n%s\n", d.path, d.patch)
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```go
if d.patch != "" {
	fmt.Fprintf(&diffContent, "--- %s ---\n%s\n", d.path, d.patch)
}
```
🧰 Tools
🪛 GitHub Actions: Lint
[error] 79-79: QF1012: Use fmt.Fprintf(...) instead of WriteString(fmt.Sprintf(...)) (staticcheck)
🪛 GitHub Check: lint
[failure] 79-79:
QF1012: Use fmt.Fprintf(...) instead of WriteString(fmt.Sprintf(...)) (staticcheck)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scrapers/git/changes.go` around lines 78 - 80, Replace the
WriteString(fmt.Sprintf(...)) call in the block that appends per-file patches to
diffContent: instead of diffContent.WriteString(fmt.Sprintf("--- %s ---\n%s\n",
d.path, d.patch)), call fmt.Fprintf on diffContent with the same format and
arguments (use diffContent as the io.Writer and format string "--- %s ---\n%s\n"
with d.path and d.patch) so the code uses fmt.Fprintf directly for efficiency;
ensure you import fmt if not already present and keep the same output formatting
and variables (diffContent, d.path, d.patch).
```go
ginkgo.It("should scrape a public git repository", func() {
	scrapeConfig := getConfigSpec("git")
	scraperCtx := api.NewScrapeContext(DefaultContext).WithScrapeConfig(&scrapeConfig)

	scraper := Scraper{}
	results = scraper.Scrape(scraperCtx)
```
This integration test is coupled to live external Git state.
Using a public repo directly makes CI brittle (network outages, branch/history drift, force-pushes). Please gate this as explicit integration-only or run against a deterministic local mirror/fixture repo.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scrapers/git/git_integration_test.go` around lines 13 - 18, The test "should
scrape a public git repository" (the ginkgo.It block that builds scrapeConfig
via getConfigSpec("git"), creates scraperCtx with
api.NewScrapeContext(DefaultContext).WithScrapeConfig, and calls
Scraper{}.Scrape) depends on live external Git state; change it to either be
gated behind an explicit integration-only flag or to run against a deterministic
local fixture/mirror: add a check at the start of the test that skips unless an
integration environment variable/flag (e.g., TEST_INTEGRATION_GIT) is set, or
replace getConfigSpec("git") with a config pointing to a local test repo/mirror
committed or created in test setup so Scraper.Scrape uses deterministic data.
Ensure the gating flag or local-repo setup is documented in the test comments.
```go
configMap := repo.Config.(map[string]any)
files := configMap["files"].([]FileEntry)
```
Guard type assertions to avoid panic-driven test failures.
Use checked assertions here so failures are reported as expectations, not panics.
Proposed fix
```diff
-configMap := repo.Config.(map[string]any)
-files := configMap["files"].([]FileEntry)
+configMap, ok := repo.Config.(map[string]any)
+Expect(ok).To(BeTrue(), "repo config should be a map")
+files, ok := configMap["files"].([]FileEntry)
+Expect(ok).To(BeTrue(), "files should be a []FileEntry")
```

🤖 Prompt for AI Agents
+ Expect(ok).To(BeTrue(), "files should be a []FileEntry")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scrapers/git/git_integration_test.go` around lines 119 - 120, The test
currently uses unchecked type assertions on repo.Config and on
configMap["files"], which can panic; change them to checked assertions: first
assert that repo.Config is a map[string]any (e.g. v, ok :=
repo.Config.(map[string]any)) and fail the test with a clear message if not,
then retrieve filesVal, ok := configMap["files"] and assert its presence, then
assert filesSlice, ok := filesVal.([]FileEntry) and fail with t.Fatalf or
require/require.IsType/require.NotNil so failures are reported as test errors
instead of panics; update the uses of configMap and files to use these checked
variables.
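Independent of Gomega, the comma-ok assertion pattern the comment asks for looks like this; the config value below is a made-up stand-in for `repo.Config`:

```go
package main

import "fmt"

func main() {
	var config any = map[string]any{"files": []string{"a.go"}}

	// Checked assertion: ok becomes false instead of panicking on a
	// type mismatch or a missing key.
	configMap, ok := config.(map[string]any)
	if !ok {
		fmt.Println("config is not a map")
		return
	}
	files, ok := configMap["files"].([]string)
	if !ok {
		fmt.Println("files missing or wrong type")
		return
	}
	fmt.Println(files)
}
```

In the test itself each `!ok` branch would report a failure (`Expect(ok).To(BeTrue(), ...)` or `t.Fatalf`) rather than printing.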
```go
func getConfigSpec(name string) v1.ScrapeConfig {
	configs, err := v1.ParseConfigs("fixtures/" + name + ".yaml")
	if err != nil {
		Fail(fmt.Sprintf("Failed to parse config: %v", err))
	}
	return configs[0]
}
```
Add bounds check before accessing configs[0].
The function returns configs[0] without verifying that the slice is non-empty. If the fixture file contains no valid configs, this will cause a panic.
🛡️ Proposed fix
```diff
 func getConfigSpec(name string) v1.ScrapeConfig {
 	configs, err := v1.ParseConfigs("fixtures/" + name + ".yaml")
 	if err != nil {
 		Fail(fmt.Sprintf("Failed to parse config: %v", err))
 	}
+	if len(configs) == 0 {
+		Fail(fmt.Sprintf("No configs found in fixtures/%s.yaml", name))
+	}
 	return configs[0]
 }
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```go
func getConfigSpec(name string) v1.ScrapeConfig {
	configs, err := v1.ParseConfigs("fixtures/" + name + ".yaml")
	if err != nil {
		Fail(fmt.Sprintf("Failed to parse config: %v", err))
	}
	if len(configs) == 0 {
		Fail(fmt.Sprintf("No configs found in fixtures/%s.yaml", name))
	}
	return configs[0]
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scrapers/git/git_suite_test.go` around lines 39 - 45, getConfigSpec currently
returns configs[0] without ensuring the slice has elements; update getConfigSpec
to check the length of the configs slice returned by v1.ParseConfigs and call
Fail with a clear message if len(configs) == 0 (or propagate an error) before
accessing configs[0]; reference the getConfigSpec function, the configs
variable, and the v1.ParseConfigs call and ensure the error/empty-slice path
uses Fail(fmt.Sprintf(...)) so tests don't panic.
```go
previousTags := loadPreviousTags(ctx, repoExternalID)
previousBranches := loadPreviousBranches(ctx, repoExternalID)
```
Branch lifecycle delta is currently non-stateful.
Because loadPreviousBranches always returns nil, branch changes are recomputed without a real baseline each run.
Proposed fix
```diff
-// Store current tags in repo config for next run
-repoConfig["tags"] = currentTags
+// Store current state in repo config for next run
+repoConfig["tags"] = currentTags
+repoConfig["branches"] = branchNames(trackedBranches)
@@
-func loadPreviousBranches(_ api.ScrapeContext, _ string) []string {
-	// FIXME: implement branch tracking from TempCache
-	return nil
-}
+func loadPreviousBranches(ctx api.ScrapeContext, repoExternalID string) []string {
+	configMap := loadCachedConfig(ctx, repoExternalID, "Git::Repository")
+	if configMap == nil {
+		return nil
+	}
+
+	raw, ok := configMap["branches"]
+	if !ok {
+		return nil
+	}
+
+	switch v := raw.(type) {
+	case []string:
+		return append([]string(nil), v...)
+	case []any:
+		out := make([]string, 0, len(v))
+		for _, item := range v {
+			if s, ok := item.(string); ok {
+				out = append(out, s)
+			}
+		}
+		return out
+	default:
+		return nil
+	}
+}
```

If you want, I can also draft a small regression test that verifies branch create/delete deltas are stable across consecutive scrapes.
Also applies to: 100-103, 118-120, 267-270
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scrapers/git/scraper.go` around lines 61 - 63, loadPreviousBranches currently
returns nil so branch lifecycle deltas are recomputed with no baseline;
implement loadPreviousBranches to load the persisted branch snapshot (same
pattern as loadPreviousTags) by reading the repo's stored state (DB/metadata)
and returning the slice of previous branch names/objects for the given
repoExternalID, then ensure all callers (the branch delta computation sites in
scraper.go that call loadPreviousBranches) use the returned value to compute
create/delete deltas; also update any other occurrences of the placeholder
implementation (the other call sites flagged in the comment) so branch deltas
are stable across consecutive scrapes.
```go
repoResult.Tags = v1.JSONStringMap{
	"url":           gitClient.URL,
	"defaultBranch": gitClient.Branch,
}
```
Preserve BaseScraper tags instead of replacing them.
NewScrapeResult already carries base tags, but these assignments overwrite the entire map and discard user-configured tags.
Proposed fix
```diff
-repoResult.Tags = v1.JSONStringMap{
-	"url":           gitClient.URL,
-	"defaultBranch": gitClient.Branch,
-}
+if repoResult.Tags == nil {
+	repoResult.Tags = v1.JSONStringMap{}
+}
+repoResult.Tags["url"] = gitClient.URL
+repoResult.Tags["defaultBranch"] = gitClient.Branch
@@
-branchResult.Tags = v1.JSONStringMap{
-	"branch": branchName,
-	"url":    gitClient.URL,
-}
+if branchResult.Tags == nil {
+	branchResult.Tags = v1.JSONStringMap{}
+}
+branchResult.Tags["branch"] = branchName
+branchResult.Tags["url"] = gitClient.URL
```

Also applies to: 144-147
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scrapers/git/scraper.go` around lines 87 - 90, The code is replacing the
entire repoResult.Tags map (created by NewScrapeResult/BaseScraper) which
discards user-configured tags; instead merge the new keys into the existing
repoResult.Tags map (e.g., set repoResult.Tags["url"]=gitClient.URL and
repoResult.Tags["defaultBranch"]=gitClient.Branch) so you preserve any
pre-existing entries; apply the same merging approach for the other occurrence
that assigns repoResult.Tags at lines around the second block (the assignment
near where repoResult.Tags is set again).
Tags whose target commit is older than the oldest commit in the depth window are now excluded from change detection.
- Use Name <email> format for LastModifiedBy in file tree
- Promote go-git/go-git to direct dependency
- Update integration test assertions for Commit change type
- Add tag and author/committer assertions to integration tests
- Switch test fixture to mission-control repo
Force-pushed c4caa36 to cee32dd.
Actionable comments posted: 5
♻️ Duplicate comments (3)
scrapers/git/scraper.go (2)
87-90: ⚠️ Potential issue | 🟠 Major

Avoid replacing `Tags`; merge into existing map to preserve BaseScraper tags.

These assignments overwrite pre-existing tags and can drop user-configured metadata.
🔧 Proposed fix
```diff
-repoResult.Tags = v1.JSONStringMap{
-	"url":           gitClient.URL,
-	"defaultBranch": gitClient.Branch,
-}
+if repoResult.Tags == nil {
+	repoResult.Tags = v1.JSONStringMap{}
+}
+repoResult.Tags["url"] = gitClient.URL
+repoResult.Tags["defaultBranch"] = gitClient.Branch
@@
-branchResult.Tags = v1.JSONStringMap{
-	"branch": branchName,
-	"url":    gitClient.URL,
-}
+if branchResult.Tags == nil {
+	branchResult.Tags = v1.JSONStringMap{}
+}
+branchResult.Tags["branch"] = branchName
+branchResult.Tags["url"] = gitClient.URL
```

Also applies to: 144-147
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scrapers/git/scraper.go` around lines 87 - 90, The current code assigns repoResult.Tags = v1.JSONStringMap{...}, which overwrites existing BaseScraper tags and loses metadata; instead, merge the new keys into the existing repoResult.Tags map (creating it if nil) and set "url" to gitClient.URL and "defaultBranch" to gitClient.Branch so existing tags are preserved; apply the same change for the other occurrence that sets repoResult.Tags (the block around the second occurrence) so both places merge rather than replace.
61-63: ⚠️ Potential issue | 🟠 Major — Branch lifecycle deltas are still non-stateful.
`loadPreviousBranches` still returns `nil`, and current branches are not persisted in `repoConfig`, so create/delete branch changes are recomputed without a stable baseline.
🔧 Proposed fix
```diff
 // Store current tags in repo config for next run
 repoConfig["tags"] = currentTags
+repoConfig["branches"] = branchNames(trackedBranches)
@@
-func loadPreviousBranches(_ api.ScrapeContext, _ string) []string {
-	// FIXME: implement branch tracking from TempCache
-	return nil
+func loadPreviousBranches(ctx api.ScrapeContext, repoExternalID string) []string {
+	configMap := loadCachedConfig(ctx, repoExternalID, "Git::Repository")
+	if configMap == nil {
+		return nil
+	}
+
+	raw, ok := configMap["branches"]
+	if !ok {
+		return nil
+	}
+
+	switch v := raw.(type) {
+	case []string:
+		return append([]string(nil), v...)
+	case []any:
+		out := make([]string, 0, len(v))
+		for _, item := range v {
+			if s, ok := item.(string); ok {
+				out = append(out, s)
+			}
+		}
+		return out
+	default:
+		return nil
+	}
 }
```
Also applies to: 118-120, 267-270
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scrapers/git/scraper.go` around lines 61 - 63, loadPreviousBranches currently returns nil and current branch state isn't persisted, so branch create/delete deltas are recomputed without a stable baseline; update loadPreviousBranches to read the saved branch list from repoConfig (or storage used by loadPreviousTags), and after computing currentBranches in the scraper persist the current branch list back into repoConfig (the same place loadPreviousBranches reads from) so subsequent runs have a stable baseline; ensure the same fix is applied wherever loadPreviousBranches is called (e.g., alongside loadPreviousTags usage and the other two call sites mentioned) and keep the saved format consistent with loadPreviousTags to avoid schema mismatches.scrapers/git/changes.go (1)
78-80: ⚠️ Potential issue | 🟡 Minor — Use `fmt.Fprintf` instead of `WriteString(fmt.Sprintf(...))` to satisfy lint and avoid an extra allocation.
This is still failing staticcheck (QF1012).
🔧 Proposed fix
```diff
 case v1.GitFileStrategyDiff:
 	diffedFiles = append(diffedFiles, d.path)
 	if d.patch != "" {
-		diffContent.WriteString(fmt.Sprintf("--- %s ---\n%s\n", d.path, d.patch))
+		fmt.Fprintf(&diffContent, "--- %s ---\n%s\n", d.path, d.patch)
 	}
 }
```
```bash
#!/bin/bash
rg -n 'WriteString\(fmt\.Sprintf\('
```
Verify each finding against the current code and only fix it if needed. In `@scrapers/git/changes.go` around lines 78 - 80, Replace the WriteString(fmt.Sprintf(...)) call to avoid the extra allocation and satisfy staticcheck QF1012: change the usage of diffContent.WriteString(fmt.Sprintf("--- %s ---\n%s\n", d.path, d.patch)) to a direct fmt.Fprintf call (e.g. fmt.Fprintf(&diffContent, "--- %s ---\n%s\n", d.path, d.patch)) so that diffContent is written to directly; keep the same format string and use d.path and d.patch as arguments.
🧹 Nitpick comments (4)
scrapers/git/filetree.go (1)
104-109: Per-file commit-history lookup is likely to become expensive on large repositories.
`lastModifyingCommit` is invoked for each file, each with a fresh log traversal. This can heavily degrade scrape latency as the repo size grows.
♻️ Suggested direction
```diff
-	lastCommit, err := lastModifyingCommit(repo, commitHash, f.Name)
+	// TODO: avoid per-file history walks; build once-per-commit metadata cache
+	// for touched paths, or make last-modified enrichment optional/configurable.
+	lastCommit, err := lastModifyingCommit(repo, commitHash, f.Name)
```
scrapers/git/scraper_test.go (1)
323-323: Remove unused variable assignment.
The `_ = dir` assignment on line 323 appears to be a leftover from debugging or refactoring. Since `dir` is not used in the remainder of the test, consider removing this line or using the blank identifier directly in the `initTestRepo` call.
🧹 Proposed cleanup
```diff
-	_ = dir
-
-	changes, currentTags := generateTagChanges(repo, "git/test", map[string]string{}, time.Time{})
+	changes, currentTags := generateTagChanges(repo, "git/test", map[string]string{}, time.Time{})
```
Or modify the `initTestRepo` call:
```diff
-	dir, repo := initTestRepo(t)
+	_, repo := initTestRepo(t)
```
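Returning to the per-file history walk flagged for filetree.go above: the once-per-commit cache idea can be sketched in plain Go, independent of go-git. `commitMeta` and `lastModified` here are illustrative names, not types from this PR — the point is a single newest-first pass that records the most recent commit touching each path, replacing one log traversal per file with one traversal per scrape.

```go
package main

import "fmt"

// commitMeta is a hypothetical, simplified view of a commit:
// its SHA plus the set of paths that commit touched.
type commitMeta struct {
	sha     string
	touched []string
}

// lastModified walks commits once (ordered newest first) and returns
// a path -> SHA map of the most recent commit touching each path.
func lastModified(history []commitMeta) map[string]string {
	out := make(map[string]string)
	for _, c := range history {
		for _, p := range c.touched {
			if _, seen := out[p]; !seen {
				out[p] = c.sha // first hit wins: most recent commit
			}
		}
	}
	return out
}

func main() {
	history := []commitMeta{
		{sha: "c3", touched: []string{"README.md"}},
		{sha: "c2", touched: []string{"main.go", "README.md"}},
		{sha: "c1", touched: []string{"main.go", "go.mod"}},
	}
	fmt.Println(lastModified(history))
	// README.md -> c3, main.go -> c2, go.mod -> c1
}
```

With go-git, the touched-path sets would come from diffing each commit against its parent during the same walk, so the cost stays proportional to history size rather than files × history.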
Verify each finding against the current code and only fix it if needed. In `@scrapers/git/scraper_test.go` at line 323, Remove the no-op assignment "_ = dir" in the test; since the variable dir is never used after initTestRepo, either pass the blank identifier directly to initTestRepo or eliminate the dir declaration entirely and rely on initTestRepo's return being ignored — locate the test where initTestRepo is called and the unused variable dir is declared and delete the "_ = dir" statement (or replace the variable with _ in the initTestRepo call).config/schemas/scrape_config.schema.json (2)
1631-1633: Minor: Missing description for `type` field.
The `type` field in the Git definition lacks a description, while the same field in other scraper definitions (e.g., AWS, Azure) includes the description: "A static value or JSONPath expression to use as the type for the resource."
📝 Suggested fix
```diff
 "type": {
-	"type": "string"
+	"type": "string",
+	"description": "A static value or JSONPath expression to use as the type for the resource."
 },
```
Verify each finding against the current code and only fix it if needed. In `@config/schemas/scrape_config.schema.json` around lines 1631 - 1633, Add a "description" property to the "type" field in the Git scraper definition so it matches other scrapers: set the description to "A static value or JSONPath expression to use as the type for the resource." Locate the "type" field inside the Git definition in config/schemas/scrape_config.schema.json (the property named "type") and add the description string alongside the existing "type": "string" entry.
1761-1776: Consider adding an enum constraint for the `strategy` field.
The `strategy` field accepts any string, but based on the Go type definition, only "track", "diff", and "ignore" are valid values. Adding an enum constraint would provide better validation feedback at the schema level.
📝 Suggested enhancement
```diff
 "strategy": {
-	"type": "string"
+	"type": "string",
+	"enum": ["track", "diff", "ignore"],
+	"description": "File handling strategy: track (metadata only), diff (include content changes), ignore (exclude from scraping)"
 }
```
Verify each finding against the current code and only fix it if needed. In `@config/schemas/scrape_config.schema.json` around lines 1761 - 1776, The GitFileRule schema allows any string for the "strategy" property; restrict it to valid values by adding an enum to the "strategy" property listing "track", "diff", and "ignore". Update the "GitFileRule" object schema so its "properties"."strategy" includes an "enum" array with those three strings and keep "type": "string" and existing required/additionalProperties settings unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api/v1/git.go`:
- Around line 5-16: The API currently allows any string for
GitFileRule.Strategy; add explicit validation to reject unknown values by
implementing a strict unmarshaler/validator for GitFileStrategy that checks
incoming JSON/YAML against the allowed constants (GitFileStrategyIgnore,
GitFileStrategyTrack, GitFileStrategyDiff) and returns an error for anything
else, and ensure GitFileRule uses that by either adding a Validate() method on
GitFileRule or relying on GitFileStrategy.UnmarshalJSON/UnmarshalYAML so parsing
fails on invalid inputs; update callers/tests to expect and handle the
validation error.
In `@api/v1/types.go`:
- Line 70: ScraperSpec.IsEmpty() currently ignores the new Git field so a spec
that only sets Git can be reported empty; update ScraperSpec.IsEmpty() to treat
Git as non-empty when s.Git is non-nil and has elements (e.g., check len(s.Git)
> 0) or, if there is a Git.IsEmpty() helper on the element type, iterate and
treat the spec as non-empty if any Git element is non-empty; modify the IsEmpty
logic in ScraperSpec.IsEmpty() to include this Git presence check alongside the
existing field checks.
In `@config/schemas/config_git.schema.json`:
- Around line 295-297: Replace the loose "type": "string" for
GitFileRule.strategy with a JSON Schema enum of the exact supported strategy
values used by the runtime (e.g., the constants/enum used by the Git file
handling code — reference the runtime symbol that defines allowed strategies
such as GitFileRuleStrategy or the constant list in the codebase) so only valid
strategies are accepted; update the schema to use "type":"string" plus
"enum":[...allowed values...] and add a short "description" noting it must match
the runtime's strategy enum.
In `@config/schemas/config_github.schema.json`:
- Around line 337-343: Update the "description" value for the JSONStringMap
schema entry to correct the typo "defiend" to "defined" (entry symbol:
JSONStringMap); locate the JSONStringMap object and replace the misspelled word
in its "description" field so the sentence reads correctly while leaving the
rest of the schema unchanged.
In `@scrapers/git/changes.go`:
- Around line 207-213: The problem is that currentTags is only populated after
applying the since filter so comparing previousTags against this partial map
produces false "Tag deleted" events; move the assignment currentTags[name] =
tag.sha so it runs for every tag (before the if !since.IsZero() &&
tag.when.Before(since) check) and keep the since filter only for emitting change
events (e.g., skip creating "Tag added" or other events when tag.when is before
since). Apply the same fix to the other similar block (the code around lines
229-243) so both deletion detection and event emission use the full currentTags
map while the since filter only controls which events are produced.
---
Duplicate comments:
In `@scrapers/git/changes.go`:
- Around line 78-80: Replace the WriteString(fmt.Sprintf(...)) call to avoid the
extra allocation and satisfy staticcheck QF1012: change the usage of
diffContent.WriteString(fmt.Sprintf("--- %s ---\n%s\n", d.path, d.patch)) to a
direct fmt.Fprintf call (e.g. fmt.Fprintf(&diffContent, "--- %s ---\n%s\n",
d.path, d.patch)) so that diffContent is written to directly; keep the same
format string and use d.path and d.patch as arguments.
In `@scrapers/git/scraper.go`:
- Around line 87-90: The current code assigns repoResult.Tags =
v1.JSONStringMap{...}, which overwrites existing BaseScraper tags and loses
metadata; instead, merge the new keys into the existing repoResult.Tags map
(creating it if nil) and set "url" to gitClient.URL and "defaultBranch" to
gitClient.Branch so existing tags are preserved; apply the same change for the
other occurrence that sets repoResult.Tags (the block around the second
occurrence) so both places merge rather than replace.
- Around line 61-63: loadPreviousBranches currently returns nil and current
branch state isn't persisted, so branch create/delete deltas are recomputed
without a stable baseline; update loadPreviousBranches to read the saved branch
list from repoConfig (or storage used by loadPreviousTags), and after computing
currentBranches in the scraper persist the current branch list back into
repoConfig (the same place loadPreviousBranches reads from) so subsequent runs
have a stable baseline; ensure the same fix is applied wherever
loadPreviousBranches is called (e.g., alongside loadPreviousTags usage and the
other two call sites mentioned) and keep the saved format consistent with
loadPreviousTags to avoid schema mismatches.
---
Nitpick comments:
In `@config/schemas/scrape_config.schema.json`:
- Around line 1631-1633: Add a "description" property to the "type" field in the
Git scraper definition so it matches other scrapers: set the description to "A
static value or JSONPath expression to use as the type for the resource." Locate
the "type" field inside the Git definition in
config/schemas/scrape_config.schema.json (the property named "type") and add the
description string alongside the existing "type": "string" entry.
- Around line 1761-1776: The GitFileRule schema allows any string for the
"strategy" property; restrict it to valid values by adding an enum to the
"strategy" property listing "track", "diff", and "ignore". Update the
"GitFileRule" object schema so its "properties"."strategy" includes an "enum"
array with those three strings and keep "type": "string" and existing
required/additionalProperties settings unchanged.
In `@scrapers/git/scraper_test.go`:
- Line 323: Remove the no-op assignment "_ = dir" in the test; since the
variable dir is never used after initTestRepo, either pass the blank identifier
directly to initTestRepo or eliminate the dir declaration entirely and rely on
initTestRepo's return being ignored — locate the test where initTestRepo is
called and the unused variable dir is declared and delete the "_ = dir"
statement (or replace the variable with _ in the initTestRepo call).
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (20)
- api/v1/git.go
- api/v1/scrapeconfig_types.go
- api/v1/types.go
- api/v1/zz_generated.deepcopy.go
- chart/crds/configs.flanksource.com_scrapeconfigs.yaml
- config/schemas/config_git.schema.json
- config/schemas/config_github.schema.json
- config/schemas/config_http.schema.json
- config/schemas/scrape_config.schema.json
- fixtures/git.yaml
- go.mod
- scrapers/aws/cloudtrail_test.go
- scrapers/common.go
- scrapers/git/changes.go
- scrapers/git/filetree.go
- scrapers/git/git_integration_test.go
- scrapers/git/git_suite_test.go
- scrapers/git/scraper.go
- scrapers/git/scraper_test.go
- scrapers/trivy/trivy.go
💤 Files with no reviewable changes (1)
- scrapers/trivy/trivy.go
✅ Files skipped from review due to trivial changes (2)
- scrapers/aws/cloudtrail_test.go
- chart/crds/configs.flanksource.com_scrapeconfigs.yaml
🚧 Files skipped from review as they are similar to previous changes (6)
- go.mod
- scrapers/common.go
- api/v1/scrapeconfig_types.go
- api/v1/zz_generated.deepcopy.go
- scrapers/git/git_suite_test.go
- scrapers/git/git_integration_test.go
```go
type GitFileStrategy string

const (
	GitFileStrategyIgnore GitFileStrategy = "ignore"
	GitFileStrategyTrack  GitFileStrategy = "track"
	GitFileStrategyDiff   GitFileStrategy = "diff"
)

type GitFileRule struct {
	Pattern  string          `json:"pattern" yaml:"pattern"`
	Strategy GitFileStrategy `json:"strategy" yaml:"strategy"`
}
```
Constrain GitFileStrategy values at the API layer.
GitFileRule.Strategy currently accepts arbitrary strings. Invalid values can silently bypass expected file handling logic.
🛡️ Proposed fix
```diff
+// +kubebuilder:validation:Enum=ignore;track;diff
 type GitFileStrategy string
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```go
// +kubebuilder:validation:Enum=ignore;track;diff
type GitFileStrategy string

const (
	GitFileStrategyIgnore GitFileStrategy = "ignore"
	GitFileStrategyTrack  GitFileStrategy = "track"
	GitFileStrategyDiff   GitFileStrategy = "diff"
)

type GitFileRule struct {
	Pattern  string          `json:"pattern" yaml:"pattern"`
	Strategy GitFileStrategy `json:"strategy" yaml:"strategy"`
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@api/v1/git.go` around lines 5 - 16, The API currently allows any string for
GitFileRule.Strategy; add explicit validation to reject unknown values by
implementing a strict unmarshaler/validator for GitFileStrategy that checks
incoming JSON/YAML against the allowed constants (GitFileStrategyIgnore,
GitFileStrategyTrack, GitFileStrategyDiff) and returns an error for anything
else, and ensure GitFileRule uses that by either adding a Validate() method on
GitFileRule or relying on GitFileStrategy.UnmarshalJSON/UnmarshalYAML so parsing
fails on invalid inputs; update callers/tests to expect and handle the
validation error.
```go
GCP  []GCP  `json:"gcp,omitempty" yaml:"gcp,omitempty"`
AWS  []AWS  `json:"aws,omitempty" yaml:"aws,omitempty"`
File []File `json:"file,omitempty" yaml:"file,omitempty"`
Git  []Git  `json:"git,omitempty" yaml:"git,omitempty"`
```
ScraperSpec.IsEmpty() is now inconsistent with the new Git field.
Git-only scraper specs can be misclassified as empty because IsEmpty() still ignores Git.
🔧 Proposed fix
```diff
 func (c ScraperSpec) IsEmpty() bool {
-	return len(c.AWS) == 0 && len(c.File) == 0
+	return len(c.AWS) == 0 && len(c.File) == 0 && len(c.Git) == 0
 }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@api/v1/types.go` at line 70, ScraperSpec.IsEmpty() currently ignores the new
Git field so a spec that only sets Git can be reported empty; update
ScraperSpec.IsEmpty() to treat Git as non-empty when s.Git is non-nil and has
elements (e.g., check len(s.Git) > 0) or, if there is a Git.IsEmpty() helper on
the element type, iterate and treat the spec as non-empty if any Git element is
non-empty; modify the IsEmpty logic in ScraperSpec.IsEmpty() to include this Git
presence check alongside the existing field checks.
```json
"strategy": {
  "type": "string"
}
```
Restrict GitFileRule.strategy to supported values.
Using plain "type": "string" allows invalid strategies that runtime logic won’t handle consistently.
🛡️ Proposed fix
```diff
 "strategy": {
-	"type": "string"
+	"type": "string",
+	"enum": ["ignore", "track", "diff"]
 }
```
📝 Committable suggestion
```json
"strategy": {
  "type": "string",
  "enum": ["ignore", "track", "diff"]
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@config/schemas/config_git.schema.json` around lines 295 - 297, Replace the
loose "type": "string" for GitFileRule.strategy with a JSON Schema enum of the
exact supported strategy values used by the runtime (e.g., the constants/enum
used by the Git file handling code — reference the runtime symbol that defines
allowed strategies such as GitFileRuleStrategy or the constant list in the
codebase) so only valid strategies are accepted; update the schema to use
"type":"string" plus "enum":[...allowed values...] and add a short "description"
noting it must match the runtime's strategy enum.
```json
"JSONStringMap": {
  "additionalProperties": {
    "type": "string"
  },
  "type": "object",
  "description": "JSONStringMap defiend JSON data type, need to implements driver.Valuer, sql.Scanner interface"
},
```
Minor typo in description.
The description contains a typo: "defiend" should be "defined".
📝 Suggested fix
```diff
 "JSONStringMap": {
 	"additionalProperties": {
 		"type": "string"
 	},
 	"type": "object",
-	"description": "JSONStringMap defiend JSON data type, need to implements driver.Valuer, sql.Scanner interface"
+	"description": "JSONStringMap defined JSON data type, needs to implement driver.Valuer, sql.Scanner interface"
 },
```
📝 Committable suggestion
```json
"JSONStringMap": {
  "additionalProperties": {
    "type": "string"
  },
  "type": "object",
  "description": "JSONStringMap defined JSON data type, needs to implement driver.Valuer, sql.Scanner interface"
},
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@config/schemas/config_github.schema.json` around lines 337 - 343, Update the
"description" value for the JSONStringMap schema entry to correct the typo
"defiend" to "defined" (entry symbol: JSONStringMap); locate the JSONStringMap
object and replace the misspelled word in its "description" field so the
sentence reads correctly while leaving the rest of the schema unchanged.
```go
if !since.IsZero() && tag.when.Before(since) {
	return nil
}

currentTags[name] = tag.sha

if _, existed := previousTags[name]; !existed {
```
Tag deletion detection is incorrect when since is set.
You only add tags to currentTags after the since filter, then compare previousTags against this partial map. That can emit false “Tag deleted” changes for tags that still exist but are older than since.
🐛 Proposed fix
```diff
 tag, ok := resolveTag(repo, ref)
 if !ok {
 	return nil
 }
+// Always track full current state for accurate deletion detection.
+currentTags[name] = tag.sha
+
 if !since.IsZero() && tag.when.Before(since) {
 	return nil
 }
-currentTags[name] = tag.sha
-
 if _, existed := previousTags[name]; !existed {
 	now := time.Now()
 	changes = append(changes, v1.ChangeResult{
```
Also applies to: 229-243
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scrapers/git/changes.go` around lines 207 - 213, The problem is that
currentTags is only populated after applying the since filter so comparing
previousTags against this partial map produces false "Tag deleted" events; move
the assignment currentTags[name] = tag.sha so it runs for every tag (before the
if !since.IsZero() && tag.when.Before(since) check) and keep the since filter
only for emitting change events (e.g., skip creating "Tag added" or other events
when tag.when is before since). Apply the same fix to the other similar block
(the code around lines 229-243) so both deletion detection and event emission
use the full currentTags map while the since filter only controls which events
are produced.
Summary by CodeRabbit
New Features
Documentation
Tests
Chores