test: replace JSON fixtures with YAML fixture-folder for extraction tests#1948
test: replace JSON fixtures with YAML fixture-folder for extraction tests#1948
Conversation
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (69)
WalkthroughThis PR introduces a comprehensive configuration extraction and normalization pipeline for scrapers. It restructures public API types (AnalysisResult, ExternalConfigAccess, ExternalID) to support JSON serialization, adds a DB-backed resolver for entity synchronization, implements an end-to-end extraction workflow with full-mode support, and updates scrapers and orchestration to use the new extract package. Changes
Sequence Diagram(s)sequenceDiagram
actor Client
participant RunScraper as RunScraper<br/>(Main)
participant Scraper as Scraper<br/>(e.g., GCP)
participant Extract as extract<br/>Package
participant Resolver as Resolver<br/>(DB-backed)
participant DB as Database
Client->>RunScraper: RunScraper(ctx)
RunScraper->>Scraper: Execute scraper logic
Scraper->>RunScraper: Return ScrapeResults
loop For each result
RunScraper->>Extract: ExtractConfigChangesFromConfig(config)
Extract->>Extract: Parse & validate config<br/>structure
Extract->>RunScraper: Return ExtractedConfig
RunScraper->>Extract: SyncEntities(result)
Extract->>Resolver: SyncExternalUsers/Groups/Roles
Resolver->>DB: Lookup/create external entities
DB->>Resolver: Return IDs & mappings
Resolver->>Extract: Return synchronized entities
RunScraper->>Extract: ResolveAccess(result)
Extract->>Resolver: FindUserIDByAliases(aliases)
Resolver->>DB: Query entity by alias
DB->>Resolver: Return entity ID
Resolver->>Extract: Return resolved IDs
Extract->>Extract: Update access configs<br/>with resolved IDs
RunScraper->>Extract: ApplyAnalysisRules(result)
Extract->>Extract: Enrich with rule-based<br/>Category & Severity
RunScraper->>Extract: SummarizeChanges(changes)
Extract->>Extract: Extract JSON patch paths<br/>into summary text
end
RunScraper->>DB: Save aggregated results
RunScraper->>Client: Return ScrapeOutput
Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 15
🧹 Nitpick comments (5)
scrapers/extract/testdata/changes_with_details.yaml (1)
12-18: Consider assertingsummary.changes.skipped == 0for stronger guardrails.Current checks are good; adding skipped-count validation would make regressions in filtering logic easier to catch.
✅ Suggested assertion addition
assertions: - size(changes) == 1 - changes[0].config_type == 'VM' - changes[0].details.old_size == '100GB' - changes[0].details.new_size == '200GB' - changes[0].source == 'azure' - summary.changes.scraped == 1 + - summary.changes.skipped == 0🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scrapers/extract/testdata/changes_with_details.yaml` around lines 12 - 18, Add an assertion to the existing test assertions block to validate that no changes were skipped by asserting summary.changes.skipped == 0; update the assertions list (the same block containing size(changes), changes[0].config_type, etc.) to include this new check so the test fails if any items are filtered out unexpectedly.scrapers/extract/testdata/config_ref_shorthand.yaml (1)
24-38: Use one canonical key for config ref fields in this fixture.This file mixes
config_external_id(expected block) andexternal_config_id(assertions). Please standardize on one key name here to avoid ambiguous fixture semantics.Proposed cleanup
- - config_access[0].external_config_id.config_type == 'Organization' - - config_access[1].external_config_id.config_type == 'Database' + - config_access[0].config_external_id.config_type == 'Organization' + - config_access[1].config_external_id.config_type == 'Database'🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scrapers/extract/testdata/config_ref_shorthand.yaml` around lines 24 - 38, The fixture mixes two keys for the same concept—config_external_id and external_config_id—causing assertion mismatch; pick one canonical key (e.g., external_config_id) and update all occurrences so the data and the assertions refer to the same field: change the config blocks under config_access (the entries currently using config_external_id) to the chosen key name and/or update the assertion lines that reference config_access[...] .external_config_id to match the chosen key, ensuring both the config objects and the assertions use the identical identifier.scrapers/extract/analysis.go (1)
10-12: Add a nil guard forresultto harden this helper.This is a small defensive improvement to prevent accidental panics if a nil
*v1.ScrapeResultis ever passed.Suggested fix
func ApplyAnalysisRules(result *v1.ScrapeResult) { - if result.AnalysisResult == nil { + if result == nil || result.AnalysisResult == nil { return }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scrapers/extract/analysis.go` around lines 10 - 12, Add a nil guard at the start of ApplyAnalysisRules to return early if the passed-in pointer is nil: check if result == nil (in addition to the existing result.AnalysisResult check) and return when nil to avoid panics when a nil *v1.ScrapeResult is supplied; update the function ApplyAnalysisRules (and its existing guard for result.AnalysisResult) accordingly.scrapers/extract/extract_test.go (1)
16-19: Don’t ignore env conversion errors in test helpers.At Line 16 and Line 18, JSON errors are discarded, which can mask root causes and surface as misleading assertion/CEL failures later. Return an error from
resultToEnvand assert it before evaluating expressions.♻️ Proposed patch
-func resultToEnv(result ExtractedConfig) map[string]any { - raw, _ := json.Marshal(result) - var env map[string]any - _ = json.Unmarshal(raw, &env) - return env +func resultToEnv(result ExtractedConfig) (map[string]any, error) { + raw, err := json.Marshal(result) + if err != nil { + return nil, err + } + var env map[string]any + if err := json.Unmarshal(raw, &env); err != nil { + return nil, err + } + return env, nil } @@ - env := resultToEnv(result) + env, err := resultToEnv(result) + Expect(err).ToNot(HaveOccurred())Also applies to: 73-73
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scrapers/extract/extract_test.go` around lines 16 - 19, The helper resultToEnv currently swallows JSON errors (json.Marshal and json.Unmarshal), which can hide root causes; change resultToEnv to return (map[string]any, error), propagate and return any errors from json.Marshal and json.Unmarshal instead of discarding them, update callers (tests referencing resultToEnv) to check and assert the returned error before using the map, and apply the same fix for the other occurrence noted around line 73; use the function name resultToEnv to locate and modify the helper and all its call sites.scrapers/extract/schema.yaml (1)
315-323: Enforce non-emptyexternal_configsto match the documented rule.Description says this must be a non-empty array, but the schema currently allows
[].Suggested fix
external_configs: type: array + minItems: 1 description: > Related config references. Alternative to single external_id.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scrapers/extract/schema.yaml` around lines 315 - 323, The schema allows external_configs to be an empty array but the description requires a non-empty array; update the external_configs definition in schema.yaml (the external_configs property that references `#/`$defs/ExternalID) to enforce at least one element by adding "minItems: 1" (and optionally "type: array" is already present) so the array cannot be empty and remains mutually exclusive with external_id as documented.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api/v1/interface.go`:
- Around line 216-223: The Pretty rendering unconditionally prepends a .Space()
before saved/skipped/deleted segments, causing a leading space when those are
the first segment; in EntitySummary.Pretty() (the code that builds t using
.Appendf, .AddText and .Space), change the logic to only call .Space() when
there's prior content (e.g., check t is non-empty or maintain a local "first"
flag) before appending each of the Saved, Skipped, and Deleted segments so the
first appended segment is not preceded by whitespace.
In `@db/update.go`:
- Around line 405-409: The direct cache lookup currently runs when only
result.AnalysisResult.ExternalID is set, which can misassociate analyses; change
the condition so ctx.TempCache().Find(...) is called only if both
result.AnalysisResult.ExternalID and result.AnalysisResult.ConfigType are
non-empty (i.e., guard with a check that ConfigType != "" && ExternalID != ""),
and apply the same guard to the second similar block (the code handling lines
415-424) so the safer fallback loop is used whenever ConfigType is missing.
In `@fixtures/data/extraction/alias_keys.yaml`:
- Around line 1-35: This YAML fixture file alias_keys.yaml is an orphan not used
by the extraction test harness and duplicates a working fixture; delete this
duplicate fixture to avoid confusion. If you must keep it instead, move it into
the test harness's testdata directory and populate it with the required
non-empty assertions and any pre-population data so it mirrors the canonical
alias_keys.yaml used by the tests. Ensure the fixture's top-level keys (input,
expected) remain correct and that assertions are present so
scrapers/extract/extract_test.go will load and execute it.
In `@fixtures/data/extraction/canonical_priority.yaml`:
- Around line 1-17: The YAML fixture 'canonical_priority.yaml' in
fixtures/data/extraction is orphaned (a duplicate exists in the active test
fixtures) — remove this unused file from the repository or move it into the
active test fixtures directory where scrapers/extract/testdata fixtures live so
there is only one canonical copy; ensure you keep the existing tested version
(the duplicate under scrapers/extract/testdata) and update/remove any references
to the orphaned fixture to keep the extraction tests unchanged.
In `@scrapers/extract/changes.go`:
- Line 31: The append to changes[i].Summary concatenates strings.Join(paths, ",
") without a separator; update the logic around changes[i].Summary so if
changes[i].Summary is non-empty you first append a separator (e.g., a single
space or " ") before adding strings.Join(paths, ", "), otherwise just set it to
the joined paths; reference changes[i].Summary and the paths variable when
making the conditional append to ensure output is not merged into the previous
text.
In `@scrapers/extract/extract.go`:
- Around line 214-221: The sync is happening before alias resolution so
resolveConfigAccessAliases (which can create users/roles/groups) appends new
entities after SyncEntities, allowing duplicate identities; change the flow so
resolver-created entities are synced: either call ResolveAccess(scraperID,
&result) before SyncEntities(resolver, scraperID, &result) or invoke
SyncEntities again immediately after ResolveAccess (ensure you update the code
paths in the extract flow that call SyncEntities and ResolveAccess so that any
new entities from resolveConfigAccessAliases are included in the resolver sync).
Target symbols: SyncEntities, ResolveAccess, resolveConfigAccessAliases; ensure
only one final sync runs after alias resolution to prevent duplicates.
- Around line 600-614: The validation loops over result.ConfigAccess (type
v1.ExternalConfigAccess) and result.AccessLogs (type v1.ExternalConfigAccessLog)
currently accept entries when ConfigID is nil but only check ExternalID != "";
update both checks to use the full external-ref emptiness helper (e.g. call
!ConfigExternalID.IsEmpty() on the respective struct) instead of comparing
ExternalID to an empty string so entries missing ConfigType are rejected; modify
the conditions in the loops that build validAccess and validLogs (the checks
around ca and al) to use !<struct>.ConfigExternalID.IsEmpty().
In `@scrapers/extract/fullmode.go`:
- Around line 12-18: The loop re-uses originalConfig for every scraped item
causing duplicate extraction; change the call to ExtractConfigChangesFromConfig
to use each item’s own config (scraped[i].Config) with a fallback to
originalConfig when nil. Concretely, inside the for i := range scraped loop
compute cfg := scraped[i].Config; if cfg == nil { cfg = originalConfig } then
call ExtractConfigChangesFromConfig(resolver, scraperID, cfg), keep assigning
scraped[i].Error on error and continue, and then merge the returned extracted
into all as before (all.Merge).
In `@scrapers/extract/README.md`:
- Around line 384-386: The markdown fenced code blocks in
scrapers/extract/README.md (the blocks containing "ConfigID == uuid.Nil AND
(ConfigType == "" OR ExternalID == "")" and the block starting with
"uuid.Parse(value) → ...") are missing language identifiers; update both fenced
blocks to include a language tag such as text (e.g., change ``` to ```text) so
markdownlint MD040 warnings are resolved, and repeat the same change for the
other occurrence at lines ~411-414.
In `@scrapers/extract/schema.yaml`:
- Around line 159-163: The schema's ChangeResult.required currently forces
external_id/external_change_id even when configID lookup should be allowed;
replace the flat required array with a oneOf (or anyOf) that enforces either {
required: [ "configID" ] } OR { required: [ "external_id", "external_change_id"
] } (include config_type in the external lookup branch if that field is needed),
and apply the same oneOf change to the second similar block (the section
referenced around the other occurrence). Update the ChangeResult schema
definition (and the matching schema at the other occurrence) so the validator
accepts configID-only payloads while still requiring
external_id+external_change_id for the external-lookup variant.
- Around line 47-53: The schema for the config property currently restricts it
to object (the "config" key in schema.yaml), which contradicts the extractor
contract that allows arrays, scalars and null; update the "config" schema to
accept all valid JSON/YAML value types (e.g., allow multiple types such as
object, array, string, number, boolean, null or remove the specific type
constraint) so the extractor can persist non-object payloads; ensure the change
is made on the "config" field definition so validation no longer rejects valid
non-object configs.
- Around line 183-188: The schema mistakenly applies the ChangeAction enum to
external_change_id (external_change_id currently references $defs/ChangeAction),
which constrains IDs to values like "delete|ignore|move"; change
external_change_id to be a plain string (remove the $ref to $defs/ChangeAction)
and add a separate field (e.g., action) that references $defs/ChangeAction for
the intended enum; update the object definition near external_change_id and
change_type to remove the enum constraint from external_change_id and introduce
action: $ref: "#/$defs/ChangeAction".
In `@scrapers/extract/testdata/changes_and_access.yaml`:
- Around line 31-35: The test summary is missing an assertion for
changes.skipped; update the YAML assertions block to include a line asserting
summary.changes.skipped == 0 alongside summary.changes.scraped == 1 so the
summary validation covers skipped counts for changes as it does for
config_access and access_logs.
In `@scrapers/extract/testdata/config_ref_defaults.yaml`:
- Around line 24-36: The fixture mixes two key names; use a single canonical key
(config_external_id) everywhere: update the entries that currently use
external_config_id (e.g., the records with ids a2 and a3 and the duplicate
occurrence around lines 40-43) to use config_external_id so the fixture keys
match the expected assertions and avoid ambiguity.
In `@scrapers/gcp/security_center.go`:
- Line 115: Set analysis.ConfigType whenever setting analysis.ExternalID so
upsertAnalysis can resolve by both keys; update the code around the assignment
to analysis.ExternalID (the block that sets analysis.Resource.Name) to also set
analysis.ConfigType to the appropriate constant/enum/value used for Security
Center resources (the same ConfigType expected by upsertAnalysis). Ensure the
value matches the key used in upsertAnalysis lookup so findings are not dropped.
---
Nitpick comments:
In `@scrapers/extract/analysis.go`:
- Around line 10-12: Add a nil guard at the start of ApplyAnalysisRules to
return early if the passed-in pointer is nil: check if result == nil (in
addition to the existing result.AnalysisResult check) and return when nil to
avoid panics when a nil *v1.ScrapeResult is supplied; update the function
ApplyAnalysisRules (and its existing guard for result.AnalysisResult)
accordingly.
In `@scrapers/extract/extract_test.go`:
- Around line 16-19: The helper resultToEnv currently swallows JSON errors
(json.Marshal and json.Unmarshal), which can hide root causes; change
resultToEnv to return (map[string]any, error), propagate and return any errors
from json.Marshal and json.Unmarshal instead of discarding them, update callers
(tests referencing resultToEnv) to check and assert the returned error before
using the map, and apply the same fix for the other occurrence noted around line
73; use the function name resultToEnv to locate and modify the helper and all
its call sites.
In `@scrapers/extract/schema.yaml`:
- Around line 315-323: The schema allows external_configs to be an empty array
but the description requires a non-empty array; update the external_configs
definition in schema.yaml (the external_configs property that references
`#/`$defs/ExternalID) to enforce at least one element by adding "minItems: 1" (and
optionally "type: array" is already present) so the array cannot be empty and
remains mutually exclusive with external_id as documented.
In `@scrapers/extract/testdata/changes_with_details.yaml`:
- Around line 12-18: Add an assertion to the existing test assertions block to
validate that no changes were skipped by asserting summary.changes.skipped == 0;
update the assertions list (the same block containing size(changes),
changes[0].config_type, etc.) to include this new check so the test fails if any
items are filtered out unexpectedly.
In `@scrapers/extract/testdata/config_ref_shorthand.yaml`:
- Around line 24-38: The fixture mixes two keys for the same
concept—config_external_id and external_config_id—causing assertion mismatch;
pick one canonical key (e.g., external_config_id) and update all occurrences so
the data and the assertions refer to the same field: change the config blocks
under config_access (the entries currently using config_external_id) to the
chosen key name and/or update the assertion lines that reference
config_access[...] .external_config_id to match the chosen key, ensuring both
the config objects and the assertions use the identical identifier.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: a491afca-53a2-48ed-8caf-487fad7c2dc1
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (61)
api/v1/interface.goapi/v1/types.godb/external_cache.godb/external_resolver.godb/update.gofixtures/data/extraction/alias_keys.yamlfixtures/data/extraction/canonical_keys.yamlfixtures/data/extraction/canonical_priority.yamlfixtures/data/extraction/shorthand_fields.yamlgo.modscrapers/extract/README.mdscrapers/extract/analysis.goscrapers/extract/changes.goscrapers/extract/extract.goscrapers/extract/extract_suite_test.goscrapers/extract/extract_test.goscrapers/extract/fullmode.goscrapers/extract/mock_resolver_test.goscrapers/extract/schema.yamlscrapers/extract/testdata/access_invalid_ref.yamlscrapers/extract/testdata/access_logs_basic.yamlscrapers/extract/testdata/access_logs_shorthand_external.yamlscrapers/extract/testdata/access_logs_shorthand_uuid.yamlscrapers/extract/testdata/access_skipped_config_not_found.yamlscrapers/extract/testdata/alias_keys.yamlscrapers/extract/testdata/analysis_basic.yamlscrapers/extract/testdata/analysis_default_from_parent.yamlscrapers/extract/testdata/analysis_invalid_ref.yamlscrapers/extract/testdata/analysis_no_parent.yamlscrapers/extract/testdata/canonical_keys.yamlscrapers/extract/testdata/canonical_priority.yamlscrapers/extract/testdata/changes_and_access.yamlscrapers/extract/testdata/changes_basic.yamlscrapers/extract/testdata/changes_default_from_parent.yamlscrapers/extract/testdata/changes_invalid_ref.yamlscrapers/extract/testdata/changes_no_parent.yamlscrapers/extract/testdata/changes_with_details.yamlscrapers/extract/testdata/combined_aliases.yamlscrapers/extract/testdata/config_ref_defaults.yamlscrapers/extract/testdata/config_ref_shorthand.yamlscrapers/extract/testdata/config_ref_uuid.yamlscrapers/extract/testdata/drop_nil_user_groups.yamlscrapers/extract/testdata/empty_config_access.yamlscrapers/extract/testdata/external_config_id.yamlscrapers/extract/testdata/group_aliases.yamlscrapers/extract/testdata/json_roundtrip.yamlscrapers/extract/testdata/missing_config_access.yamlscrapers/extract/testdata/remap_user_groups.yamlscrapers/extract/testdata/resolve_after_sync.yamlscrapers/extract/testdata/resolve_aliases.yamlscrapers/extract/testdata/reuse_existing_ids.yamlscrapers/extract/testdata/role_aliases.yamlscrapers/extract/testdata/shorthand_fields.yamlscrapers/extract/testdata/stub_entity_creation.yamlscrapers/extract/testdata/unmarshal_all_aliases.yamlscrapers/extract/testdata/user_aliases.yamlscrapers/extraction_unit_test.goscrapers/gcp/security_center.goscrapers/kubernetes/informers_test.goscrapers/run.goscrapers/trivy/trivy.go
💤 Files with no reviewable changes (1)
- scrapers/extraction_unit_test.go
| t = t.Space().Appendf("%d", e.Saved).AddText(" saved", "success") | ||
| } | ||
| if e.Skipped > 0 { | ||
| parts = append(parts, fmt.Sprintf("skipped=%d", e.Skipped)) | ||
| t = t.Space().Appendf("%d", e.Skipped).AddText(" skipped", "warning") | ||
| } | ||
| if e.Deleted > 0 { | ||
| parts = append(parts, fmt.Sprintf("deleted=%d", e.Deleted)) | ||
| t = t.Space().Appendf("%d", e.Deleted).AddText(" deleted", "error") | ||
| } |
There was a problem hiding this comment.
Prevent leading whitespace in EntitySummary.Pretty().
At Line 216/219/222, .Space() is unconditional for saved/skipped/deleted segments. If the first rendered segment is not scraped, output starts with a leading space.
♻️ Proposed patch
func (e EntitySummary) Pretty() api.Text {
t := clicky.Text("")
+ first := true
+ appendPart := func(count int, label, style string) {
+ if count == 0 {
+ return
+ }
+ if !first {
+ t = t.Space()
+ }
+ t = t.Appendf("%d", count).AddText(" "+label, style)
+ first = false
+ }
- if e.Scraped > 0 {
- t = t.Appendf("%d", e.Scraped).AddText(" scraped", "muted")
- }
- if e.Saved > 0 {
- t = t.Space().Appendf("%d", e.Saved).AddText(" saved", "success")
- }
- if e.Skipped > 0 {
- t = t.Space().Appendf("%d", e.Skipped).AddText(" skipped", "warning")
- }
- if e.Deleted > 0 {
- t = t.Space().Appendf("%d", e.Deleted).AddText(" deleted", "error")
- }
+ appendPart(e.Scraped, "scraped", "muted")
+ appendPart(e.Saved, "saved", "success")
+ appendPart(e.Skipped, "skipped", "warning")
+ appendPart(e.Deleted, "deleted", "error")
return t
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@api/v1/interface.go` around lines 216 - 223, The Pretty rendering
unconditionally prepends a .Space() before saved/skipped/deleted segments,
causing a leading space when those are the first segment; in
EntitySummary.Pretty() (the code that builds t using .Appendf, .AddText and
.Space), change the logic to only call .Space() when there's prior content
(e.g., check t is non-empty or maintain a local "first" flag) before appending
each of the Saved, Skipped, and Deleted segments so the first appended segment
is not preceded by whitespace.
| input: | ||
| config: | ||
| name: Test Org | ||
| access: | ||
| - id: a1 | ||
| external_config_id: | ||
| config_type: Org | ||
| external_id: org-1 | ||
| user: alice | ||
| users: | ||
| - name: Alice | ||
| aliases: [alice] | ||
| roles: | ||
| - name: Admin | ||
| aliases: [admin] | ||
| groups: | ||
| - name: Devs | ||
| aliases: [devs] | ||
| user_groups: | ||
| - external_user_id: "00000000-0000-0000-0000-000000000001" | ||
| external_group_id: "00000000-0000-0000-0000-000000000002" | ||
|
|
||
| expected: | ||
| config_access: | ||
| - id: a1 | ||
| external_user_aliases: [alice] | ||
| external_users: | ||
| - name: Alice | ||
| external_roles: | ||
| - name: Admin | ||
| external_groups: | ||
| - name: Devs | ||
| external_user_groups: | ||
| - external_user_id: "00000000-0000-0000-0000-000000000001" | ||
| external_group_id: "00000000-0000-0000-0000-000000000002" |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, find the fixture file and check its location
find . -name "alias_keys.yaml" -type f
# Search for references to this fixture in the codebase
rg -n "alias_keys" --type go --type yaml
# Search for test harnesses that load fixtures from fixtures/data/extraction
rg -n "fixtures/data/extraction" --type go
# Search for test patterns related to fixture loading
rg -n "assertions" --type go | head -20Repository: flanksource/config-db
Length of output: 532
🏁 Script executed:
# Check if there are any tests or code that load from fixtures/data/extraction
rg -n "fixtures/data/extraction" --type go --type yaml
# Check what's actually in the scrapers/extract/testdata version
head -40 "./scrapers/extract/testdata/alias_keys.yaml"
# See if fixtures/data/extraction is used by any other test suite
fd -t f -path "*/extraction/*" "*.yaml" | head -20
# Check if there's any test loading from fixtures/data/extraction
rg -n "fixtures" scrapers/extract/extract_test.go | head -20Repository: flanksource/config-db
Length of output: 1292
🏁 Script executed:
# Check if there's any other test that loads from fixtures/data/extraction
rg -rn "fixtures/data/extraction" --type go
# Check if there's any test in other directories that might use fixtures/data/extraction
find . -name "*_test.go" -type f -exec grep -l "fixtures/data/extraction" {} \;
# Check the test file to see the fixture loading pattern more clearly
sed -n '30,60p' scrapers/extract/extract_test.goRepository: flanksource/config-db
Length of output: 1054
🏁 Script executed:
# Check if there are any other fixtures test harnesses or loaders
find . -name "*_test.go" -type f | xargs grep -l "fixtures/data"
# Check what test files exist in scrapers/extract
ls -la scrapers/extract/*_test.go
# Double-check: are there any references to fixtures/data/extraction in Go code
rg "fixtures/data/extraction" --type goRepository: flanksource/config-db
Length of output: 322
This fixture is not executed by the extraction test harness and should be removed or moved.
The test harness in scrapers/extract/extract_test.go loads fixtures only from testdata/*.yaml using filepath.Glob("testdata/*.yaml"). This fixture lives in fixtures/data/extraction/ and is never referenced. Additionally, the test explicitly requires assertions to be non-empty; this fixture has none. An identical, properly configured version exists in scrapers/extract/testdata/alias_keys.yaml with assertions and pre-population data. Remove the duplicate orphaned fixture to avoid future maintenance confusion.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@fixtures/data/extraction/alias_keys.yaml` around lines 1 - 35, This YAML
fixture file alias_keys.yaml is an orphan not used by the extraction test
harness and duplicates a working fixture; delete this duplicate fixture to avoid
confusion. If you must keep it instead, move it into the test harness's testdata
directory and populate it with the required non-empty assertions and any
pre-population data so it mirrors the canonical alias_keys.yaml used by the
tests. Ensure the fixture's top-level keys (input, expected) remain correct and
that assertions are present so scrapers/extract/extract_test.go will load and
execute it.
| continue | ||
| } | ||
|
|
||
| changes[i].Summary += strings.Join(paths, ", ") |
There was a problem hiding this comment.
Missing separator when appending to non-empty Summary.
If change.Summary already contains text, the joined paths are concatenated directly without any separator, resulting in malformed output like "existing texta, b" instead of "existing text a, b".
Proposed fix
- changes[i].Summary += strings.Join(paths, ", ")
+ pathStr := strings.Join(paths, ", ")
+ if changes[i].Summary != "" {
+ changes[i].Summary += " "
+ }
+ changes[i].Summary += pathStr📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| changes[i].Summary += strings.Join(paths, ", ") | |
| pathStr := strings.Join(paths, ", ") | |
| if changes[i].Summary != "" { | |
| changes[i].Summary += " " | |
| } | |
| changes[i].Summary += pathStr |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scrapers/extract/changes.go` at line 31, The append to changes[i].Summary
concatenates strings.Join(paths, ", ") without a separator; update the logic
around changes[i].Summary so if changes[i].Summary is non-empty you first append
a separator (e.g., a single space or " ") before adding strings.Join(paths, ",
"), otherwise just set it to the joined paths; reference changes[i].Summary and
the paths variable when making the conditional append to ensure output is not
merged into the previous text.
| Mutual exclusivity: configID (direct UUID lookup) vs external_id + config_type | ||
| (external lookup). If configID is set, external_id/config_type are ignored for lookup. | ||
| required: | ||
| - external_id | ||
| - external_change_id |
There was a problem hiding this comment.
ChangeResult.required conflicts with the documented configID lookup path.
The schema says configID can bypass external_id + config_type, but Line [162] always requires external_id. That invalidates legitimate configID-only payloads.
Suggested fix
- required:
- - external_id
- - external_change_id
- - change_type
+ required:
+ - external_change_id
+ - change_type
+ oneOf:
+ - required: [configID]
+ - required: [external_id, config_type]Also applies to: 225-231
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scrapers/extract/schema.yaml` around lines 159 - 163, The schema's
ChangeResult.required currently forces external_id/external_change_id even when
configID lookup should be allowed; replace the flat required array with a oneOf
(or anyOf) that enforces either { required: [ "configID" ] } OR { required: [
"external_id", "external_change_id" ] } (include config_type in the external
lookup branch if that field is needed), and apply the same oneOf change to the
second similar block (the section referenced around the other occurrence).
Update the ChangeResult schema definition (and the matching schema at the other
occurrence) so the validator accepts configID-only payloads while still
requiring external_id+external_change_id for the external-lookup variant.
| - summary.config_access.scraped == 1 | ||
| - summary.config_access.skipped == 0 | ||
| - summary.changes.scraped == 1 | ||
| - summary.access_logs.scraped == 1 | ||
| - summary.access_logs.skipped == 0 |
There was a problem hiding this comment.
Add the missing changes.skipped summary assertion.
You assert skipped == 0 for config_access and access_logs, but not for changes (Lines 31–35). This leaves a small regression gap in summary validation.
✅ Suggested assertion update
assertions:
- summary.config_access.scraped == 1
- summary.config_access.skipped == 0
- summary.changes.scraped == 1
+ - summary.changes.skipped == 0
- summary.access_logs.scraped == 1
- summary.access_logs.skipped == 0🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scrapers/extract/testdata/changes_and_access.yaml` around lines 31 - 35, The
test summary is missing an assertion for changes.skipped; update the YAML
assertions block to include a line asserting summary.changes.skipped == 0
alongside summary.changes.scraped == 1 so the summary validation covers skipped
counts for changes as it does for config_access and access_logs.
| config_external_id: | ||
| config_type: Organization | ||
| external_id: org-1 | ||
| external_user_aliases: [alice] | ||
| - id: a2 | ||
| config_external_id: | ||
| config_type: Organization | ||
| external_id: org-1 | ||
| external_user_aliases: [bob] | ||
| - id: a3 | ||
| config_external_id: | ||
| config_type: Database | ||
| external_id: db-1 |
There was a problem hiding this comment.
Use one canonical key name for config refs in this fixture.
expected uses config_external_id, while assertions use external_config_id. Keep these aligned to avoid ambiguity in future fixture consumers.
Suggested fix
- config_external_id:
+ external_config_id:
config_type: Organization
external_id: org-1
@@
- config_external_id:
+ external_config_id:
config_type: Organization
external_id: org-1
@@
- config_external_id:
+ external_config_id:
config_type: Database
external_id: db-1Also applies to: 40-43
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scrapers/extract/testdata/config_ref_defaults.yaml` around lines 24 - 36, The
fixture mixes two key names; use a single canonical key (config_external_id)
everywhere: update the entries that currently use external_config_id (e.g., the
records with ids a2 and a3 and the duplicate occurrence around lines 40-43) to
use config_external_id so the fixture keys match the expected assertions and
avoid ambiguity.
| } | ||
|
|
||
| analysis.ExternalConfigs = append(analysis.ExternalConfigs, v1.ExternalID{ExternalID: finding.Resource.Name}) | ||
| analysis.ExternalID = finding.Resource.Name |
There was a problem hiding this comment.
Set ConfigType alongside ExternalID to avoid dropping findings.
upsertAnalysis resolves by {ConfigType, ExternalID}. With only ExternalID set here, many findings won’t match a config item and can be silently skipped.
🛠️ Proposed fix
- analysis.ExternalID = finding.Resource.Name
+ analysis.ExternalID = finding.Resource.Name
+ analysis.ConfigType = finding.Resource.Type📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| analysis.ExternalID = finding.Resource.Name | |
| analysis.ExternalID = finding.Resource.Name | |
| analysis.ConfigType = finding.Resource.Type |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scrapers/gcp/security_center.go` at line 115, Set analysis.ConfigType
whenever setting analysis.ExternalID so upsertAnalysis can resolve by both keys;
update the code around the assignment to analysis.ExternalID (the block that
sets analysis.Resource.Name) to also set analysis.ConfigType to the appropriate
constant/enum/value used for Security Center resources (the same ConfigType
expected by upsertAnalysis). Ensure the value matches the key used in
upsertAnalysis lookup so findings are not dropped.
There was a problem hiding this comment.
Actionable comments posted: 5
♻️ Duplicate comments (5)
scrapers/extract/README.md (1)
390-392:⚠️ Potential issue | 🟡 MinorAdd language identifiers to fenced code blocks (MD040).
Both blocks are missing a fence language and will keep triggering markdownlint warnings.
🧹 Suggested fix
-``` +```text ConfigID == uuid.Nil AND (ConfigType == "" OR ExternalID == "")@@
-
+text
uuid.Parse(value) → if err == nil: treat as ConfigID (UUID)
→ if err != nil: treat as ExternalID (string)Also applies to: 417-420
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scrapers/extract/README.md` around lines 390 - 392, The markdown code fences in the README are missing language identifiers (MD040); update the two fenced blocks containing the expressions "ConfigID == uuid.Nil AND (ConfigType == "" OR ExternalID == "")" and the uuid.Parse example so they start with a language tag (e.g., ```text) instead of just ``` — locate the blocks with those exact contents (the ConfigID expression and the uuid.Parse(value) → ... example) and change their opening fences to ```text to satisfy markdownlint.scrapers/extract/schema.yaml (1)
158-163:⚠️ Potential issue | 🟠 Major
ChangeResultis still internally contradictory onconfigIDvsexternal_id.The description says
configIDcan bypass external lookup, but Line [161] still requiresexternal_id. This blocks or contradicts the configID-only path.🧭 Suggested schema correction
ChangeResult: type: object @@ - required: - - external_id - - external_change_id - - change_type + required: + - external_change_id + - change_type + oneOf: + - required: [configID] + - required: [external_id, config_type]Also applies to: 225-231
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scrapers/extract/schema.yaml` around lines 158 - 163, The ChangeResult schema is contradictory: it documents mutual exclusivity between configID and external_id+config_type but still lists external_id in required; update the schema to express the mutual-exclusion properly (e.g., replace the flat required list with a oneOf/anyOf that defines two valid shapes: one object requiring configID only, and another requiring external_id and config_type (and external_change_id/change_type as needed)), and apply the same fix to the duplicate occurrence referenced around the other block (lines ~225-231). Ensure the schema names (ChangeResult, configID, external_id, config_type, external_change_id, change_type) are used so the validator accepts either the configID-only path or the external lookup path but not both.api/v1/interface.go (1)
210-223:⚠️ Potential issue | 🟡 Minor
EntitySummary.Pretty()can still emit leading whitespace.When
Scraped == 0andSaved/Skipped/Deleted > 0, Line [216]/[219]/[222] prepend.Space()before the first token.♻️ Suggested fix
func (e EntitySummary) Pretty() api.Text { t := clicky.Text("") + first := true + appendPart := func(count int, label, style string) { + if count == 0 { + return + } + if !first { + t = t.Space() + } + t = t.Appendf("%d", count).AddText(" "+label, style) + first = false + } - if e.Scraped > 0 { - t = t.Appendf("%d", e.Scraped).AddText(" scraped", "muted") - } - if e.Saved > 0 { - t = t.Space().Appendf("%d", e.Saved).AddText(" saved", "success") - } - if e.Skipped > 0 { - t = t.Space().Appendf("%d", e.Skipped).AddText(" skipped", "warning") - } - if e.Deleted > 0 { - t = t.Space().Appendf("%d", e.Deleted).AddText(" deleted", "error") - } + appendPart(e.Scraped, "scraped", "muted") + appendPart(e.Saved, "saved", "success") + appendPart(e.Skipped, "skipped", "warning") + appendPart(e.Deleted, "deleted", "error") return t }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/v1/interface.go` around lines 210 - 223, EntitySummary.Pretty() can emit a leading space because Saved/Skipped/Deleted branches always call .Space() before the first token; change each branch (the code inside EntitySummary.Pretty for Saved, Skipped, Deleted) to only prepend a space when the existing text is non-empty—i.e. detect whether t is empty (or maintain a firstToken boolean) and if empty call Appendf(...).AddText(...) directly, otherwise call t = t.Space().Appendf(...).AddText(...); apply this change to the Saved, Skipped and Deleted branches so no leading whitespace is produced.scrapers/extract/extract.go (2)
612-624:⚠️ Potential issue | 🟠 MajorValidate config references with
IsEmpty()instead ofExternalIDonly.Line [612] and Line [623] accept/reject records based only on
ConfigExternalID.ExternalID, which allows incomplete refs and ignores other valid external-ref forms.Suggested patch
- if ca.ConfigID != uuid.Nil || ca.ConfigExternalID.ExternalID != "" { + if ca.ConfigID != uuid.Nil || !ca.ConfigExternalID.IsEmpty() { validAccess = append(validAccess, ca) continue } @@ - if al.ConfigID != uuid.Nil || al.ConfigExternalID.ExternalID != "" { + if al.ConfigID != uuid.Nil || !al.ConfigExternalID.IsEmpty() { validLogs = append(validLogs, al) continue }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scrapers/extract/extract.go` around lines 612 - 624, The validation currently checks only ConfigID != uuid.Nil or ConfigExternalID.ExternalID != "" (in the ca and al loops), which misses other non-empty external ref forms; update both checks to use the ConfigExternalID.IsEmpty() helper instead (i.e., treat a non-empty ref as !ConfigExternalID.IsEmpty()). Concretely, in the first loop guarding variable ca (type with ConfigID and ConfigExternalID) and in the second loop for al (v1.ExternalConfigAccessLog), replace the "ConfigExternalID.ExternalID != \"\"" condition with "!ConfigExternalID.IsEmpty()" so all valid external-ref shapes are recognized. Ensure you update both occurrences so result.Summary.ConfigAccess.Skipped and result.AddWarning semantics remain unchanged.
214-221:⚠️ Potential issue | 🟠 MajorRun resolver sync after alias materialization to avoid unsynced/duplicate entities.
At Line [215],
SyncEntitiesruns before Line [218]ResolveAccess, but alias misses inresolveConfigAccessAliases(Lines [334], [351], [368]) append new users/roles/groups afterward. Those additions are not synced in this flow and can duplicate identities within the batch.Also applies to: 325-376
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scrapers/extract/extract.go` around lines 214 - 221, The current flow calls SyncEntities(resolver, scraperID, &result) before ResolveAccess, but resolveConfigAccessAliases (used by ResolveAccess) appends new users/roles/groups later, causing unsynced/duplicate identities; fix by ensuring entity sync happens after alias materialization: move the SyncEntities call to run after ResolveAccess has completed (or invoke a second SyncEntities pass immediately after ResolveAccess/resolveConfigAccessAliases), updating calls around SyncEntities and ResolveAccess so SyncEntities runs once aliases are materialized (references: SyncEntities, ResolveAccess, resolveConfigAccessAliases).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api/v1/interface.go`:
- Around line 54-68: AnalysisResult has new fields ExternalID, ConfigType, and
ExternalConfigs that are not persisted because persistence uses
ToConfigAnalysis(); update the conversion/persistence flow so these fields are
carried through: modify ToConfigAnalysis() (and any callers in db/update.go) to
include ExternalID, ConfigType, and ExternalConfigs in the produced config
analysis struct or ensure an alternative conversion function is used that
preserves them; adjust db/update.go persistence logic to consume the extended
conversion output so the new targeting fields are stored.
In `@api/v1/types.go`:
- Line 191: ExternalID's emptiness check includes ConfigID (so IsEmpty() returns
false for ConfigID-only IDs) but ExternalID.Find() still doesn't use ConfigID in
its lookup, causing configID-only references to fail resolution; update the
lookup logic in ExternalID.Find (and any helper it calls) to include ConfigID as
a searchable key when ConfigID is non-empty (i.e., add a filter/branch that
queries by ConfigID or combines ConfigID with existing criteria), or
alternatively change ExternalID.IsEmpty to ignore ConfigID—prefer updating
ExternalID.Find to honor ConfigID so that ExternalID.IsEmpty, ExternalID.Find,
and any call paths using them are consistent.
In `@scrapers/extract/extract.go`:
- Around line 545-553: The loop over result.AccessLogs currently only checks
result.AccessLogs[i].ConfigID before applying top-level defaults, which can
overwrite an existing per-entry ConfigExternalID; update the condition so you
skip setting defaults if either ConfigID is set (ConfigID != uuid.Nil) OR
ConfigExternalID is set (!ConfigExternalID.IsEmpty()), and when assigning
defaults only set ConfigID if defaultConfigID != uuid.Nil and only set
ConfigExternalID if defaultExternalID is non-empty, using the existing symbols
result.AccessLogs, ConfigID, ConfigExternalID, defaultConfigID, and
defaultExternalID.
In `@scrapers/extract/README.md`:
- Line 229: The link fragment in the table row targeting
`#changeaction-allowed-values` is invalid because the actual heading is `###
ChangeAction Values`; fix by making the fragment match the heading (change the
link to `#changeaction-values`) or rename the heading to `ChangeAction Allowed
Values` so the fragment `#changeaction-allowed-values` is valid; update the
table entry referencing `ChangeAction`/`ChangeAction Values` accordingly to keep
the anchor consistent.
- Around line 381-385: The README docs for ExternalID.ConfigID are out of sync:
update the field table to match the code where ConfigID is defined as type
string (not uuid.UUID) and note emptiness is checked with ConfigID == "" rather
than uuid.Nil; ensure the same correction is applied to the other occurrence of
the ExternalID docs (the second table block mentioned) and keep the JSON tag
`config_id,omitempty` and the description meaning (direct config identifier)
consistent with the api/v1/types.go definition of ConfigID and its emptiness
rule.
---
Duplicate comments:
In `@api/v1/interface.go`:
- Around line 210-223: EntitySummary.Pretty() can emit a leading space because
Saved/Skipped/Deleted branches always call .Space() before the first token;
change each branch (the code inside EntitySummary.Pretty for Saved, Skipped,
Deleted) to only prepend a space when the existing text is non-empty—i.e. detect
whether t is empty (or maintain a firstToken boolean) and if empty call
Appendf(...).AddText(...) directly, otherwise call t =
t.Space().Appendf(...).AddText(...); apply this change to the Saved, Skipped and
Deleted branches so no leading whitespace is produced.
In `@scrapers/extract/extract.go`:
- Around line 612-624: The validation currently checks only ConfigID != uuid.Nil
or ConfigExternalID.ExternalID != "" (in the ca and al loops), which misses
other non-empty external ref forms; update both checks to use the
ConfigExternalID.IsEmpty() helper instead (i.e., treat a non-empty ref as
!ConfigExternalID.IsEmpty()). Concretely, in the first loop guarding variable ca
(type with ConfigID and ConfigExternalID) and in the second loop for al
(v1.ExternalConfigAccessLog), replace the "ConfigExternalID.ExternalID != \"\""
condition with "!ConfigExternalID.IsEmpty()" so all valid external-ref shapes
are recognized. Ensure you update both occurrences so
result.Summary.ConfigAccess.Skipped and result.AddWarning semantics remain
unchanged.
- Around line 214-221: The current flow calls SyncEntities(resolver, scraperID,
&result) before ResolveAccess, but resolveConfigAccessAliases (used by
ResolveAccess) appends new users/roles/groups later, causing unsynced/duplicate
identities; fix by ensuring entity sync happens after alias materialization:
move the SyncEntities call to run after ResolveAccess has completed (or invoke a
second SyncEntities pass immediately after
ResolveAccess/resolveConfigAccessAliases), updating calls around SyncEntities
and ResolveAccess so SyncEntities runs once aliases are materialized
(references: SyncEntities, ResolveAccess, resolveConfigAccessAliases).
In `@scrapers/extract/README.md`:
- Around line 390-392: The markdown code fences in the README are missing
language identifiers (MD040); update the two fenced blocks containing the
expressions "ConfigID == uuid.Nil AND (ConfigType == "" OR ExternalID == "")"
and the uuid.Parse example so they start with a language tag (e.g., ```text)
instead of just ``` — locate the blocks with those exact contents (the ConfigID
expression and the uuid.Parse(value) → ... example) and change their opening
fences to ```text to satisfy markdownlint.
In `@scrapers/extract/schema.yaml`:
- Around line 158-163: The ChangeResult schema is contradictory: it documents
mutual exclusivity between configID and external_id+config_type but still lists
external_id in required; update the schema to express the mutual-exclusion
properly (e.g., replace the flat required list with a oneOf/anyOf that defines
two valid shapes: one object requiring configID only, and another requiring
external_id and config_type (and external_change_id/change_type as needed)), and
apply the same fix to the duplicate occurrence referenced around the other block
(lines ~225-231). Ensure the schema names (ChangeResult, configID, external_id,
config_type, external_change_id, change_type) are used so the validator accepts
either the configID-only path or the external lookup path but not both.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 97141654-741d-4a4d-9ec8-55aae9469402
📒 Files selected for processing (14)
api/v1/interface.goapi/v1/types.gocmd/run.goscrapers/extract/README.mdscrapers/extract/extract.goscrapers/extract/schema.yamlscrapers/extract/testdata/changes_and_access.yamlscrapers/extract/testdata/changes_basic.yamlscrapers/extract/testdata/changes_default_from_parent.yamlscrapers/extract/testdata/changes_invalid_ref.yamlscrapers/extract/testdata/changes_missing_change_id.yamlscrapers/extract/testdata/changes_missing_change_type.yamlscrapers/extract/testdata/changes_no_parent.yamlscrapers/extract/testdata/changes_with_details.yaml
✅ Files skipped from review due to trivial changes (2)
- scrapers/extract/testdata/changes_missing_change_type.yaml
- cmd/run.go
🚧 Files skipped from review as they are similar to previous changes (3)
- scrapers/extract/testdata/changes_with_details.yaml
- scrapers/extract/testdata/changes_and_access.yaml
- scrapers/extract/testdata/changes_no_parent.yaml
| } | ||
|
|
||
| type ExternalID struct { | ||
| ConfigID string `json:"config_id,omitempty"` |
There was a problem hiding this comment.
ConfigID is now part of emptiness semantics, but lookup still ignores it.
After Line [244], ExternalID{ConfigID: ...} is considered non-empty, so call paths that branch on IsEmpty() can invoke external-ID lookup. But ExternalID.Find() still does not query by ConfigID, so configID-only references can fail resolution.
🔧 Proposed follow-up
func (e ExternalID) Find(db *gorm.DB) *gorm.DB {
- query := db.Limit(1).Order("updated_at DESC").Where("deleted_at IS NULL").Where("? = ANY(external_id)", strings.ToLower(e.ExternalID))
+ query := db.Limit(1).Order("updated_at DESC").Where("deleted_at IS NULL")
+ if e.ConfigID != "" {
+ return query.Where("id = ?", e.ConfigID)
+ }
+ query = query.Where("? = ANY(external_id)", strings.ToLower(e.ExternalID))
if e.ConfigType != "" {
query = query.Where("type = ?", e.ConfigType)
}
...
}
func (e ExternalID) Key() string {
- return strings.ToLower(fmt.Sprintf("%s%s%s", e.ConfigType, e.ExternalID, e.ScraperID))
+ if e.ConfigID != "" {
+ return strings.ToLower("config_id:" + e.ConfigID)
+ }
+ return strings.ToLower(fmt.Sprintf("%s%s%s", e.ConfigType, e.ExternalID, e.ScraperID))
}Also applies to: 243-245
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@api/v1/types.go` at line 191, ExternalID's emptiness check includes ConfigID
(so IsEmpty() returns false for ConfigID-only IDs) but ExternalID.Find() still
doesn't use ConfigID in its lookup, causing configID-only references to fail
resolution; update the lookup logic in ExternalID.Find (and any helper it calls)
to include ConfigID as a searchable key when ConfigID is non-empty (i.e., add a
filter/branch that queries by ConfigID or combines ConfigID with existing
criteria), or alternatively change ExternalID.IsEmpty to ignore ConfigID—prefer
updating ExternalID.Find to honor ConfigID so that ExternalID.IsEmpty,
ExternalID.Find, and any call paths using them are consistent.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
scrapers/extract/fullmode.go (1)
57-61: Add a defensive guard before indexingscraped[0].
Line 59assumesscrapedis non-empty. It’s safe with current invariants, but a small guard makes this panic-proof against future refactors.Suggested patch
+ if len(scraped) == 0 { + return scraped + } + for _, a := range appenders { if a.hasItems { result := v1.NewScrapeResult(scraped[0].BaseScraper) a.apply(result) scraped = append(scraped, *result) } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scrapers/extract/fullmode.go` around lines 57 - 61, Add a defensive guard before indexing scraped[0] in the loop over appenders: check len(scraped) > 0 (or otherwise ensure scraped is non-empty) before calling v1.NewScrapeResult(scraped[0].BaseScraper); if scraped is empty, skip the appender (or handle the error) so a.apply(result) and scraped = append(scraped, *result) never cause a panic. Locate this in the loop iterating over appenders and protect the v1.NewScrapeResult(...) call and subsequent a.apply(...) usage.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@scrapers/extract/fullmode.go`:
- Around line 57-61: Add a defensive guard before indexing scraped[0] in the
loop over appenders: check len(scraped) > 0 (or otherwise ensure scraped is
non-empty) before calling v1.NewScrapeResult(scraped[0].BaseScraper); if scraped
is empty, skip the appender (or handle the error) so a.apply(result) and scraped
= append(scraped, *result) never cause a panic. Locate this in the loop
iterating over appenders and protect the v1.NewScrapeResult(...) call and
subsequent a.apply(...) usage.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d417b15c-7e16-4066-8fda-dbf3ff7ad02b
📒 Files selected for processing (7)
Makefiledb/external_entities.gojobs/jobs_test.goscrapers/exec/exec_backstage_integration_test.goscrapers/external_entities_test.goscrapers/extract/fullmode.goscrapers/run.go
💤 Files with no reviewable changes (1)
- db/external_entities.go
ed3f349 to
1549ee5
Compare
…ests Replace separate JSON test fixtures with YAML files in fixtures/data/extraction/ that contain both input and expected output.
When extracted config has no "config" key but has entities/access, clear Config and ID so the item is skipped as a config item.
- Add ExternalUserAliases field to ExternalConfigAccessLog struct - Expand "user" shorthand in expandAccessLogShorthand - Resolve user aliases to ExternalUserID in ResolveAccess - Validate user reference in validateConfigRefs
- Fix ExternalConfigAccess struct: remove incorrect embedding claim, add missing fields (id, audit timestamps), update file reference - Fix ExternalConfigAccessLog: add missing ExternalUserAliases field and user shorthand documentation - Add access_logs shorthand expansion table (user, config ref fields) - Add missing validations to summary: HasPrincipal() for config_access, user reference for access_logs - Add user shorthand to access_logs example in section 10 - Update schema.yaml: add id/external_config_id/audit fields to ConfigAccess, add user shorthand/external_user_aliases/external_config_id to ConfigAccessLog, update descriptions with validation warnings
1549ee5 to
db35b35
Compare
db35b35 to
7348cd4
Compare
Summary by CodeRabbit
Release Notes
New Features
Bug Fixes
Tests