Scanner improvements #208

@r2dedios

Describe the solution you'd like
The scanner will no longer run as a pod generated by a CronJob. Instead, it will run as a short-lived Kubernetes Job, created on demand or by a scheduled trigger. The Agent will launch these Jobs through its existing Action model, which gains a new action type for triggering scanner Jobs. The scan itself will be performed by the same scanner container, configured via environment variables or arguments.

Functional requirements (FR):

  • FR-1. The Agent will support a new action type: ScanAction, implementing ActionOperation.
  • FR-2. When a scan is scheduled (via API or frontend), a new action will be created in the DB and dispatched to the Agent.
  • FR-3. Upon receiving a ScanAction, the Agent will launch a Kubernetes Job in the cluster using the Scanner container image.
  • FR-4. The Job will receive the list of account IDs (account_id) to scan and, optionally, a flag that enables billing, via env vars or args.
  • FR-5. The Job will execute the scan and terminate after processing the data and reporting the result.
  • FR-6. The scan schedule will be persisted in the database.
  • FR-7. Scheduled scans will be managed (create, update, delete, list) via new API endpoints:
    • [POST] /scans/now — Triggers an on-demand scan. Body: list of accounts + billing flag.
    • [GET] /scans/schedule — Lists all scheduled scans.
    • [POST] /scans/schedule — Creates a new scheduled scan.
    • [PATCH] /scans/schedule/:id — Modifies an existing schedule.
    • [DELETE] /scans/schedule/:id — Deletes an existing schedule.
  • FR-8. The backend will include a lightweight scheduler (using a cron library) that periodically checks which scheduled scans must be triggered. This logic will reside in the API service.
  • FR-9. The scanner Job will report the result of the scan back to the API or directly to the DB, including:
    • status (success/error)
    • accounts scanned
    • duration
    • scan_run_id
  • FR-10. Rate limits and maintenance windows will be validated before launching the scan job, based on predefined configuration.
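
To make FR-1, FR-3 and FR-4 concrete, here is a minimal sketch of how a ScanAction could be turned into a Kubernetes Job manifest. It is written in Python for brevity and is not the Agent's actual code: the `ActionOperation` method shown, the env var names (`SCAN_RUN_ID`, `SCAN_ACCOUNT_IDS`, `SCAN_BILLING_ENABLED`), the `scanner-<id>` naming and the `backoffLimit` value are all assumptions made for illustration. Only the `batch/v1` Job fields themselves are standard Kubernetes.

```python
from dataclasses import dataclass
from typing import Protocol


class ActionOperation(Protocol):
    """Hypothetical stand-in for the Agent's existing action interface."""

    def action_type(self) -> str: ...


@dataclass
class ScanAction:
    """Carries what FR-4 says the Job receives, plus the run ID used for labels."""

    scan_run_id: str
    account_ids: list[str]
    billing: bool = False

    def action_type(self) -> str:
        return "ScanAction"

    def to_job_manifest(self, image: str, namespace: str) -> dict:
        # Env var names are assumptions; the issue only says "env vars or args".
        env = [
            {"name": "SCAN_RUN_ID", "value": self.scan_run_id},
            {"name": "SCAN_ACCOUNT_IDS", "value": ",".join(self.account_ids)},
            {"name": "SCAN_BILLING_ENABLED", "value": "true" if self.billing else "false"},
        ]
        return {
            "apiVersion": "batch/v1",
            "kind": "Job",
            "metadata": {
                "name": f"scanner-{self.scan_run_id}",
                "namespace": namespace,
                # Labels support tracing and log collection (NFR-2).
                "labels": {"app": "scanner", "scan_run_id": self.scan_run_id},
            },
            "spec": {
                # Assumption: a failed scan is reported, not retried by Kubernetes.
                "backoffLimit": 0,
                "template": {
                    "metadata": {"labels": {"scan_run_id": self.scan_run_id}},
                    "spec": {
                        "restartPolicy": "Never",
                        "containers": [{"name": "scanner", "image": image, "env": env}],
                    },
                },
            },
        }
```

The resulting dict could be serialized to YAML or passed to a Kubernetes client. Note that `scan_run_id` would need to be a valid DNS label and label value for the manifest to be accepted by the cluster.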

Non-functional requirements (NFR):

  • NFR-1. The scanner container must be stateless and self-contained, and must receive its configuration only through env vars or command-line args.
  • NFR-2. The scanner Jobs will be labeled with scan_run_id and other metadata to facilitate tracing and log collection.
  • NFR-3. The Agent will require RBAC permissions to create Jobs within a specified namespace.
  • NFR-4. The concurrency of scan jobs will be limited via configuration (e.g., MAX_CONCURRENT_JOBS) and enforced by the API when scheduling.
  • NFR-5. API rate-limiting policies per cloud provider must be respected. Throttling logic may be implemented at the backend scheduler level.
  • NFR-6. Maintenance windows will be stored in the DB and consulted before launching any scan job.
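
The pre-launch checks from FR-10, NFR-4 and NFR-6 could be combined into a single gate evaluated before a scan Job is created. The sketch below is one possible shape, again in Python for brevity. `MaintenanceWindow`, `can_launch` and the reason strings are hypothetical names. It assumes the number of running Jobs and the stored windows have already been loaded, and it leaves out per-provider rate limiting (NFR-5).

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class MaintenanceWindow:
    """Hypothetical in-memory form of a maintenance window stored in the DB."""

    start: datetime
    end: datetime

    def contains(self, moment: datetime) -> bool:
        # Half-open interval: the window covers start <= moment < end.
        return self.start <= moment < self.end


def can_launch(
    now: datetime,
    running_jobs: int,
    max_concurrent_jobs: int,
    windows: list[MaintenanceWindow],
) -> tuple[bool, str]:
    """Return (allowed, reason) for launching one more scan Job."""
    # NFR-4: enforce the configured concurrency cap (e.g. MAX_CONCURRENT_JOBS).
    if running_jobs >= max_concurrent_jobs:
        return False, "concurrency limit reached"
    # NFR-6: refuse to launch while any maintenance window is active.
    if any(w.contains(now) for w in windows):
        return False, "inside maintenance window"
    return True, "ok"
```

Returning a reason alongside the boolean lets the API record why a scheduled scan was skipped instead of dropping it silently.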

Additional considerations

  • 1. Review and update the compose files used for development and for CI/CD
  • 2. To keep the schedule persistent, store it in the DB. Create the necessary DB tables to hold the scanner schedule. Proposal:
    -- Scheduled scans
    CREATE TABLE IF NOT EXISTS schedule (
      id BIGINT GENERATED ALWAYS AS IDENTITY NOT NULL,
      time TIMESTAMP WITH TIME ZONE,
      cron_exp TEXT,
      target_accounts TEXT REFERENCES accounts(id) ON DELETE CASCADE,
      status TEXT REFERENCES action_status(name)
    );
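
As an illustration of how the scheduler from FR-8 might consume rows of this table, the following sketch decides whether a schedule row is due at a given moment. It is a sketch under several assumptions: that `time` holds a one-off execution time and `cron_exp` a standard five-field cron expression, that `ScheduleRow` mirrors the proposed columns, and that a hand-written matcher stands in for whichever cron library is eventually chosen. The matcher is deliberately simplified: it supports `*`, numbers, lists, ranges and steps, requires every field to match, and does not accept names or `7` for Sunday.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ScheduleRow:
    """Hypothetical in-memory form of one row of the proposed schedule table."""

    id: int
    time: Optional[datetime]
    cron_exp: Optional[str]
    target_accounts: str


def _field_matches(expr: str, value: int, lo: int, hi: int) -> bool:
    """Match one cron field against a value; lo/hi bound the '*' range."""
    for part in expr.split(","):
        rng, _, step_s = part.partition("/")
        step = int(step_s) if step_s else 1
        if rng == "*":
            start, end = lo, hi
        elif "-" in rng:
            a, b = rng.split("-")
            start, end = int(a), int(b)
        else:
            start = int(rng)
            # "5/10" means "from 5 onwards, every 10"; a bare "5" means exactly 5.
            end = hi if step_s else start
        if start <= value <= end and (value - start) % step == 0:
            return True
    return False


def cron_matches(cron_exp: str, moment: datetime) -> bool:
    """Check 'minute hour day-of-month month day-of-week' against a moment."""
    minute, hour, dom, month, dow = cron_exp.split()
    cron_dow = (moment.weekday() + 1) % 7  # cron uses 0 = Sunday
    return (
        _field_matches(minute, moment.minute, 0, 59)
        and _field_matches(hour, moment.hour, 0, 23)
        and _field_matches(dom, moment.day, 1, 31)
        and _field_matches(month, moment.month, 1, 12)
        and _field_matches(dow, cron_dow, 0, 6)
    )


def is_due(row: ScheduleRow, now: datetime) -> bool:
    """Recurring rows follow cron_exp; one-off rows are due once time has passed."""
    if row.cron_exp:
        return cron_matches(row.cron_exp, now)
    return row.time is not None and row.time <= now
```

A one-off row stays due after its time has passed, so the caller would need to update its `status` (or delete it) once the scan has been dispatched.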
    

Labels: enhancement, help wanted
