
feat: add last_scrape_summary to template env #1924

Open
yashmehrotra wants to merge 2 commits into main from tmpl-last-scrape-summary

Conversation


@yashmehrotra yashmehrotra commented Feb 28, 2026

Fixes: #1917

Summary by CodeRabbit

  • New Features

    • Scrapers can now access the previous scrape summary in rules and custom scripts for improved state tracking and change detection.
    • Added caching of historical scrape data to optimize performance.
  • Tests

    • Added comprehensive unit tests for last scrape summary functionality and context propagation.
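To make the feature summary concrete, here is a minimal sketch of what a change-detection rule over the previous scrape summary could look like. It uses Go's text/template as a stand-in for the scrapers' actual template engine, and the "count" key inside the summary is a hypothetical example, not a documented field:

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// evalRule renders a rule expression against an env shaped like the scrapers'
// template environment: current result fields alongside last_scrape_summary.
func evalRule(current, previous int) string {
	env := map[string]any{
		"count":               current,
		"last_scrape_summary": map[string]any{"count": previous},
	}
	tmpl := template.Must(template.New("rule").Parse(
		`{{ if gt .count (index .last_scrape_summary "count") }}grew{{ else }}stable{{ end }}`))
	var out bytes.Buffer
	if err := tmpl.Execute(&out, env); err != nil {
		panic(err)
	}
	return out.String()
}

func main() {
	fmt.Println(evalRule(15, 12)) // more items than the last scrape
	fmt.Println(evalRule(10, 12)) // fewer items than the last scrape
}
```

The point is only the env shape: rules can now compare the current run against `last_scrape_summary` instead of maintaining external state.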


coderabbitai bot commented Feb 28, 2026

Walkthrough

This PR adds support for templating the last scrape summary data. It extends ScrapeContext to carry scrape summary information, implements caching for efficient retrieval from job history, and exposes the summary to various templating engines.

Changes

Context Extension (api/context.go, api/context_test.go)
Adds a lastScrapeSummary field to ScrapeContext with accessor methods WithLastScrapeSummary() and LastScrapeSummary(), and extends WithValue() to propagate the field. Tests cover default behavior, retrieval, and preservation through context operations.

Caching and Retrieval (scrapers/cron.go)
Introduces an in-memory cache (scraperSummaryCache) of per-scraper last scrape summaries. Adds a getLastScrapeSummary() helper that checks the cache first, then queries the job_history table for the most recent run and deserializes its summary. Updates the scraper job runner to populate the context with the cached summary and to store the new summary after a successful scrape.

Template Integration (scrapers/changes/rules.go, scrapers/processors/json.go, scrapers/processors/script.go)
Makes last_scrape_summary available in the template environments for change rules, JSON processor results, and script processor templates, enabling access to historical scrape data during template evaluation.
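The context extension above can be sketched as follows. This is a simplified stand-in, not the actual api/context.go code: the real ScrapeContext carries many more fields, and only the accessor names mirror the PR.

```go
package main

import "fmt"

// ScrapeContext here is a hypothetical, trimmed-down mirror of the PR's type.
type ScrapeContext struct {
	lastScrapeSummary map[string]any
}

// WithLastScrapeSummary returns a copy of the context carrying the summary.
// The value receiver makes the clone-and-extend pattern cheap and explicit.
func (ctx ScrapeContext) WithLastScrapeSummary(s map[string]any) ScrapeContext {
	ctx.lastScrapeSummary = s
	return ctx
}

// LastScrapeSummary returns the stored summary, or an empty map when unset,
// so template lookups never read from a nil map.
func (ctx ScrapeContext) LastScrapeSummary() map[string]any {
	if ctx.lastScrapeSummary == nil {
		return map[string]any{}
	}
	return ctx.lastScrapeSummary
}

func main() {
	ctx := ScrapeContext{}.WithLastScrapeSummary(map[string]any{"added": 3})
	fmt.Println(ctx.LastScrapeSummary()["added"])
	fmt.Println(len(ScrapeContext{}.LastScrapeSummary()))
}
```

The empty-map default is what the review's test suggestions exercise, and the copy-on-clone behavior is exactly what the WithValue comment below is guarding.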

Sequence Diagram(s)

sequenceDiagram
    participant Scraper as Scraper Job
    participant Cache as Summary Cache
    participant DB as Job History DB
    participant Context as ScrapeContext
    participant Template as Template Engine
    participant Processor as Result Processor

    Scraper->>Cache: Check for last scrape summary
    alt Summary in cache
        Cache-->>Scraper: Return cached summary
    else Cache miss
        Scraper->>DB: Query job_history for latest run
        DB-->>Scraper: Return historical summary
        Scraper->>Cache: Store in cache
    end

    Scraper->>Context: WithLastScrapeSummary(summary)
    Context-->>Scraper: Return updated context

    Scraper->>Processor: Run scraper with context
    Processor->>Template: Evaluate template with last_scrape_summary
    Template-->>Processor: Return evaluated result
    Processor-->>Scraper: Return processed results

    Scraper->>DB: Store scrape_summary in job history
    Scraper->>Cache: Update cached summary
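The cache-then-DB flow in the diagram can be sketched as below. queryJobHistory is a stub for the real job_history query, and the cache key and value shapes are assumptions rather than the exact types in scrapers/cron.go:

```go
package main

import (
	"fmt"
	"sync"
)

// scraperSummaryCache holds the last known summary per scraper ID.
var scraperSummaryCache sync.Map

// queryJobHistory stands in for the GORM query that loads the most recent
// run's scrape_summary from the job_history table.
func queryJobHistory(scraperID string) map[string]any {
	return map[string]any{"config_count": 12}
}

func getLastScrapeSummary(scraperID string) map[string]any {
	if v, ok := scraperSummaryCache.Load(scraperID); ok {
		return v.(map[string]any) // cache hit: no DB round trip
	}
	summary := queryJobHistory(scraperID)
	scraperSummaryCache.Store(scraperID, summary) // warm the cache for the next run
	return summary
}

func main() {
	fmt.Println(getLastScrapeSummary("scraper-1")["config_count"]) // miss: falls through to the DB stub
	scraperSummaryCache.Store("scraper-1", map[string]any{"config_count": 99})
	fmt.Println(getLastScrapeSummary("scraper-1")["config_count"]) // hit: served from the cache
}
```

sync.Map suits this access pattern (written once per scrape, read by concurrent jobs), though as the review notes below, entries also need eviction when a scraper is deleted.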
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Docstring Coverage: ⚠️ Warning. Docstring coverage is 14.29%, below the required threshold of 80.00%. Write docstrings for the functions that are missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Description Check: ✅ Passed. Check skipped because CodeRabbit’s high-level summary is enabled.

Title Check: ✅ Passed. The title accurately describes the main change, adding last_scrape_summary to the template environment, which is the primary objective of this pull request.

Linked Issues Check: ✅ Passed. The PR implements the feature requested in issue #1917 by exposing last_scrape_summary in template contexts across multiple processors and the cron job handler.

Out of Scope Changes Check: ✅ Passed. All changes are directly related to adding last_scrape_summary to template environments; no unrelated modifications were detected.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (3)
scrapers/processors/json.go (1)

468-489: Build and reuse one env map per result in this loop.

Line 468 injects into one AsMap() call, while subsequent template/filter calls use fresh AsMap() calls. Reusing a single env here makes last_scrape_summary usage deterministic and reduces repeated map construction.

💡 Proposed refactor
 		for _, result := range ongoingInput {
-			result.AsMap()["last_scrape_summary"] = ctx.LastScrapeSummary()
+			env := result.AsMap()
+			env["last_scrape_summary"] = ctx.LastScrapeSummary()
 			for i, configProperty := range result.BaseScraper.Properties {
 				if configProperty.Filter != "" {
-					if response, err := gomplate.RunTemplate(result.AsMap(), gomplate.Template{Expression: configProperty.Filter}); err != nil {
+					if response, err := gomplate.RunTemplate(env, gomplate.Template{Expression: configProperty.Filter}); err != nil {
 						result.Errorf("failed to parse filter: %v", err)
 						continue
 					} else if boolVal, err := strconv.ParseBool(response); err != nil {
@@
 				templater := gomplate.StructTemplater{
-					Values:         result.AsMap(),
+					Values:         env,
 					ValueFunctions: true,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/processors/json.go` around lines 468 - 489, Build a single env map
once per result and reuse it inside the properties loop instead of calling
result.AsMap() repeatedly: create env := result.AsMap(); set
env["last_scrape_summary"] = ctx.LastScrapeSummary() before iterating over
result.BaseScraper.Properties, then pass that env into gomplate.RunTemplate (for
filters) and templater.StructTemplater{Values: env, ...} so all
templates/filters consistently see the same map; keep the existing copy of Links
(configProperty.Links) and other logic unchanged, just replace every in-loop
result.AsMap() usage with the env variable.
api/context_test.go (1)

12-63: Good coverage for LastScrapeSummary; add a WithValue propagation regression test.

Current tests validate summary propagation well, but there’s no guard for context-state propagation when cloning via WithValue (especially incremental mode).

🧪 Suggested test case
 func TestScrapeContext_LastScrapeSummary(t *testing.T) {
+	t.Run("preserves incremental mode through WithValue", func(t *testing.T) {
+		ctx := ScrapeContext{}.
+			AsIncrementalScrape().
+			WithValue("k", "v")
+
+		assert.True(t, ctx.IsIncrementalScrape())
+	})
+
 	t.Run("returns empty map when unset", func(t *testing.T) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/context_test.go` around lines 12 - 63, Add a regression test that
verifies ScrapeContext.LastScrapeSummary is preserved when the context is cloned
via WithValue, especially in incremental mode: create a ScrapeContext, set a
summary with WithLastScrapeSummary, then call WithValue (and also call
AsIncrementalScrape then WithValue in a sub-case) and assert the returned
context's LastScrapeSummary still contains the same entries (use ScrapeContext,
WithLastScrapeSummary, WithValue, AsIncrementalScrape, and LastScrapeSummary to
locate the code).
scrapers/cron.go (1)

36-39: Add cache eviction path for scraperSummaryCache.

scraperSummaryCache is introduced at Line 38, but entries are never removed when a scraper job is deleted. This can leave stale cache growth over time.

💡 Proposed fix
 func DeleteScrapeJob(id string) {
 	logger.Debugf("deleting scraper job for %s", id)
+	scraperSummaryCache.Delete(id)
 
 	if j, ok := scrapeJobs.Load(id); ok {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/cron.go` around lines 36 - 39, The scraperSummaryCache (sync.Map)
never evicts entries when a scraper job is removed, causing stale growth; update
the job-deletion flow (the function that removes entries from scrapeJobs / stops
cron jobs — e.g., DeleteScrapeJob or the handler that calls scrapeJobs.Delete)
to also remove related cache entries by calling
scraperSummaryCache.Delete(jobID) for the job's key (or iterate
scraperSummaryCache.Range and Delete keys that match the job id pattern if keys
are composite), ensuring cache entries are removed when a scraper job is
deleted.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 72b1a58 and 83b4a6c.

📒 Files selected for processing (6)
  • api/context.go
  • api/context_test.go
  • scrapers/changes/rules.go
  • scrapers/cron.go
  • scrapers/processors/json.go
  • scrapers/processors/script.go

Comment on lines 93 to 101

 func (ctx ScrapeContext) WithValue(key, val any) ScrapeContext {
 	return ScrapeContext{
-		Context:      ctx.Context.WithValue(key, val),
-		temp:         ctx.temp,
-		namespace:    ctx.namespace,
-		jobHistory:   ctx.jobHistory,
-		scrapeConfig: ctx.scrapeConfig,
+		Context:           ctx.Context.WithValue(key, val),
+		temp:              ctx.temp,
+		namespace:         ctx.namespace,
+		jobHistory:        ctx.jobHistory,
+		scrapeConfig:      ctx.scrapeConfig,
+		lastScrapeSummary: ctx.lastScrapeSummary,
 	}

⚠️ Potential issue | 🟡 Minor

Preserve incremental mode when cloning context with WithValue.

WithValue rebuilds ScrapeContext but currently drops isIncremental. If AsIncrementalScrape() is called before WithValue, incremental state is lost.

💡 Proposed fix
 func (ctx ScrapeContext) WithValue(key, val any) ScrapeContext {
 	return ScrapeContext{
 		Context:           ctx.Context.WithValue(key, val),
 		temp:              ctx.temp,
+		isIncremental:     ctx.isIncremental,
 		namespace:         ctx.namespace,
 		jobHistory:        ctx.jobHistory,
 		scrapeConfig:      ctx.scrapeConfig,
 		lastScrapeSummary: ctx.lastScrapeSummary,
 	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

func (ctx ScrapeContext) WithValue(key, val any) ScrapeContext {
	return ScrapeContext{
		Context:           ctx.Context.WithValue(key, val),
		temp:              ctx.temp,
		isIncremental:     ctx.isIncremental,
		namespace:         ctx.namespace,
		jobHistory:        ctx.jobHistory,
		scrapeConfig:      ctx.scrapeConfig,
		lastScrapeSummary: ctx.lastScrapeSummary,
	}
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/context.go` around lines 93 - 101, WithValue currently reconstructs a
ScrapeContext but omits the isIncremental field, causing incremental mode set by
AsIncrementalScrape() to be lost; update the ScrapeContext literal returned by
WithValue to copy the isIncremental boolean from the receiver
(ctx.isIncremental) so the incremental state is preserved when cloning the
context.

Comment on lines +360 to +370
Where("resource_id = ? AND resource_type = ?", scraperID, job.ResourceTypeScraper).
Where("status IN ?", []string{models.StatusSuccess, models.StatusWarning, models.StatusFailed}).
Order("time_end DESC").First(&history).Error
if err != nil {
return nil
}

raw, ok := history.Details["scrape_summary"]
if !ok {
return nil
}

⚠️ Potential issue | 🟠 Major

Latest failed run can hide the previous valid summary.

At Line 361-362, the query includes models.StatusFailed. At Line 367-370, if that failed row has no scrape_summary, this returns nil and drops incremental context (notably after process restart when cache is empty).

💡 Proposed fix
 	err := ctx.DB().
 		Where("resource_id = ? AND resource_type = ?", scraperID, job.ResourceTypeScraper).
-		Where("status IN ?", []string{models.StatusSuccess, models.StatusWarning, models.StatusFailed}).
+		Where("status IN ?", []string{models.StatusSuccess, models.StatusWarning}).
 		Order("time_end DESC").First(&history).Error
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

		Where("resource_id = ? AND resource_type = ?", scraperID, job.ResourceTypeScraper).
		Where("status IN ?", []string{models.StatusSuccess, models.StatusWarning}).
		Order("time_end DESC").First(&history).Error
	if err != nil {
		return nil
	}

	raw, ok := history.Details["scrape_summary"]
	if !ok {
		return nil
	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/cron.go` around lines 360 - 370, The current DB query fetches the
latest run including StatusFailed which can mask a previous successful summary;
change the logic in scrapers/cron.go so you first query only for
models.StatusSuccess and models.StatusWarning (using the same Where/Order/First
against history), and if that returns nil, run a second query that includes
models.StatusFailed as a fallback; update the code paths that read
history.Details["scrape_summary"] to use the history from the first
successful/warning query when available, and only use the failed-run history
when no prior success/warning exists (refer to the variables/history query
block, scraperID, job.ResourceTypeScraper, and history.Details access).


Development

Successfully merging this pull request may close these issues.

Template last scrape summary
