Review k8s feedback and fix #160

@vredchenko

Critical Issues Found:

  1. Data Loss Risk: All databases use emptyDir volumes, so their data is lost whenever a pod is deleted,
    evicted, or rescheduled
  2. Inadequate Resources: Current allocations (128Mi-256Mi RAM) are far too small for processing
    1000+ images/hour
  3. Security Vulnerabilities: Hardcoded secrets, latest tags, no network policies
  4. No Monitoring: Missing health checks, metrics, and observability
  5. Single Points of Failure: No backup strategy or high availability
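
As one illustration of the missing health checks in item 4, here is a minimal sketch of readiness and liveness probes on a Deployment. Everything specific in it is an assumption for illustration, not taken from this repository: the name example-api, the image reference, port 8000, and the /healthz endpoint (which the service would have to expose).

```yaml
# Hypothetical API Deployment; names, image, port, and path are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-api
spec:
  replicas: 2                      # more than one replica avoids a single point of failure
  selector:
    matchLabels:
      app: example-api
  template:
    metadata:
      labels:
        app: example-api
    spec:
      containers:
        - name: api
          image: registry.example.com/example-api:1.4.2   # placeholder, pinned tag
          ports:
            - containerPort: 8000
          # Readiness gates traffic: the pod receives requests only while this passes.
          readinessProbe:
            httpGet:
              path: /healthz       # assumed endpoint
              port: 8000
            initialDelaySeconds: 5
            periodSeconds: 10
          # Liveness restarts the container after repeated failures.
          livenessProbe:
            httpGet:
              path: /healthz       # assumed endpoint
              port: 8000
            initialDelaySeconds: 15
            periodSeconds: 20
            failureThreshold: 3
```

The probe timings are starting values only; a service with slow startup may need a longer initial delay or a separate startupProbe.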

Priority Fixes Needed:

Immediate (Phase 1):

  • Replace emptyDir with persistent volumes for all databases
  • Remove hardcoded secrets from dev script
  • Pin container versions instead of using latest
  • Increase resource limits (API: 2-4Gi RAM, Workers: 4-8Gi RAM)
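
The Phase 1 items could look roughly like the sketch below for one database. It assumes PostgreSQL purely for illustration (the issue does not say which databases are used), and the names example-db and example-db-credentials, the image version, the storage size, and the database resource figures are all placeholders. The 2-4Gi and 4-8Gi figures above apply to the API and workers, not to this database.

```yaml
# Hypothetical database StatefulSet; PostgreSQL and all names are assumptions.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: example-db
spec:
  serviceName: example-db          # assumes a headless Service with this name exists
  replicas: 1
  selector:
    matchLabels:
      app: example-db
  template:
    metadata:
      labels:
        app: example-db
    spec:
      containers:
        - name: postgres
          image: postgres:16.4     # pinned tag instead of latest; version is illustrative
          env:
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:      # read from a Secret rather than hardcoding the value
                  name: example-db-credentials   # hypothetical Secret, created separately
                  key: password
            - name: PGDATA         # subdirectory avoids lost+found issues on fresh volumes
              value: /var/lib/postgresql/data/pgdata
          resources:
            requests:
              memory: 1Gi          # placeholder values, size to the real workload
              cpu: 500m
            limits:
              memory: 2Gi
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  # Each replica gets its own PersistentVolumeClaim, replacing the emptyDir volume.
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 20Gi          # placeholder size
```

A StatefulSet with volumeClaimTemplates is one option; a Deployment that mounts a separately defined PersistentVolumeClaim would also remove the emptyDir dependency.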

Short-term (Phases 2-3):

  • Add comprehensive monitoring stack (Prometheus/Grafana)
  • Implement database backups via CronJobs
  • Configure network policies for security
  • Optimize scaling for scientific computing bursts
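
For the backup and network policy items, a minimal sketch follows. It again assumes PostgreSQL and uses hypothetical names throughout (example-db, example-api, example-db-backup, example-db-credentials); the schedule, the database user, and the backup PersistentVolumeClaim are placeholders that would need to exist or be adjusted.

```yaml
# Nightly logical backup of a hypothetical PostgreSQL instance.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: example-db-backup
spec:
  schedule: "0 2 * * *"            # 02:00 every day, placeholder schedule
  concurrencyPolicy: Forbid        # do not start a new run while one is still active
  jobTemplate:
    spec:
      template:
        metadata:
          labels:
            app: example-db-backup
        spec:
          restartPolicy: OnFailure
          containers:
            - name: backup
              image: postgres:16.4 # illustrative; should match the server's major version
              command: ["/bin/sh", "-c"]
              args:
                - pg_dump -h example-db -U postgres -Fc -f /backup/db-$(date +%F).dump postgres
              env:
                - name: PGPASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: example-db-credentials   # hypothetical Secret
                      key: password
              volumeMounts:
                - name: backup
                  mountPath: /backup
          volumes:
            - name: backup
              persistentVolumeClaim:
                claimName: example-db-backup         # hypothetical PVC for dump files
---
# Allow only the API and the backup job to reach the database port.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: example-db-allow-clients
spec:
  podSelector:
    matchLabels:
      app: example-db
  policyTypes: ["Ingress"]
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: example-api
        - podSelector:
            matchLabels:
              app: example-db-backup
      ports:
        - protocol: TCP
          port: 5432
```

NetworkPolicy objects only take effect if the cluster's network plugin enforces them, and backups kept on a volume in the same cluster do not protect against losing the cluster itself, so copying dumps to external storage is worth considering.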

The agent provided detailed YAML examples for each recommendation. Would you like me to help implement any of
these fixes, starting with the critical data persistence issues?

Labels: bugfix (Get working something that doesn't), refactor (Housekeeping and bits), refine (Needs more speccing), test