grafana-operator v5.x API Incompatibility - Deployment Failures #1236

@vimalkansal

OpenShift version

4.16, 4.19

Problem description

Issue Summary

Pelorus master branch and all tagged releases (v2.0.0-v2.0.12) are incompatible with grafana-operator v5.x due to outdated API versions in Helm chart templates. Deployments fail with "API version not found" errors.

Environment

  • OpenShift Version: 4.16. The version combo box in the issue form does not go beyond OCP 4.13, so I chose 4.13 there.
  • Pelorus Version: master branch (commit 7413e5b, Sept 5 2024) and v2.0.12
  • grafana-operator Available: v5.18.0 (lab), v5.19.4 (production)
  • Deployment Method: ArgoCD ApplicationSet with Helm charts

Problem Description

What's Happening

Pelorus Helm charts reference deprecated grafana-operator v4.x APIs, and the v4.x operator is no longer available in the OpenShift operator catalogs (community-operators, redhat-operators). Modern clusters only provide grafana-operator v5.x, which serves a different, incompatible API group and version.

API Version Mismatch

Pelorus Templates Use:          grafana-operator v5.x Provides:
integreatly.org/v1alpha1    ≠   grafana.integreatly.org/v1beta1

Affected Resources

  • charts/pelorus/templates/grafana.yaml - uses integreatly.org/v1alpha1
  • charts/pelorus/templates/grafana-thanos-datasource.yaml - uses integreatly.org/v1alpha1
  • All GrafanaDashboard templates - use integreatly.org/v1alpha1
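
To enumerate the affected templates in a local checkout, a recursive grep for the old API group is a reasonable starting point. The following is a minimal sketch, not part of Pelorus; the function name find_legacy_grafana_templates is hypothetical, and it assumes each template declares its apiVersion on its own line.

```shell
#!/bin/sh
# List files under a directory that still declare the v4.x API group.
# Usage: find_legacy_grafana_templates <dir>
find_legacy_grafana_templates() {
  # -r: recurse, -l: print matching file names only, -E: extended regex.
  # The pattern is anchored on "apiVersion:" so that
  # grafana.integreatly.org/v1beta1 is not reported.
  grep -rlE '^[[:space:]]*apiVersion:[[:space:]]*integreatly\.org/v1alpha1' "$1" | sort
}
```

From the repository root this would be called as `find_legacy_grafana_templates charts/pelorus/templates`. An empty result means no template in that directory still uses the old group.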

Error Messages

The Kubernetes API could not find integreatly.org/GrafanaDataSource for requested resource.
Make sure the "GrafanaDataSource" CRD is installed on the destination cluster.

The Kubernetes API could not find version "v1alpha1" of integreatly.org/GrafanaDashboard.
Version "v1beta1" of grafana.integreatly.org/GrafanaDashboard is installed on the destination cluster.

Root Cause Analysis

Incomplete Migration in Master Branch

Commit 7413e5b (Sept 5, 2024) updated charts/operators/values.yaml:

grafana_subscription_version: 5.12.0  # ✓ Updated

BUT template files were NOT updated:

# charts/pelorus/templates/grafana.yaml
apiVersion: integreatly.org/v1alpha1  # ✗ Still v4.x API
kind: Grafana

This created an inconsistency: the operators chart requests a v5.x operator, but the Pelorus chart templates still use v4.x APIs.

Catalog Reality

OpenShift operator catalogs (community-operators, redhat-operators) have:

  • ✅ grafana-operator v5.x (channel: v5) - provides grafana.integreatly.org/v1beta1
  • ❌ grafana-operator v4.8.0 - deprecated and removed from catalogs

No backwards compatibility exists - v5.x CRDs only support v1beta1.

Impact

Deployment Status

  • v2.0.12: Requests unavailable v4.8.0, deployment fails
  • master branch: Requests available v5.x but templates incompatible, deployment fails
  • Result: No Pelorus version works on modern OpenShift clusters

Organizations Affected

Any OpenShift cluster with:

  • OpenShift 4.14+ (grafana-operator v5.x default)
  • Updated operator catalogs (v4.x removed)
  • GitOps deployments (ArgoCD/Flux)

Proposed Solution

Required Changes

Migrate all Grafana-related templates to v1beta1 APIs:

  1. Update API versions:

    # OLD
    apiVersion: integreatly.org/v1alpha1
    
    # NEW
    apiVersion: grafana.integreatly.org/v1beta1
  2. Update API group references:

    • integreatly.org/v1alpha1 → grafana.integreatly.org/v1beta1
  3. Review field schema changes:

  4. Update affected files:

    • charts/pelorus/templates/grafana.yaml
    • charts/pelorus/templates/grafana-thanos-datasource.yaml
    • All GrafanaDashboard template files
    • charts/pelorus/templates/prometheus-grafana-datasource.yaml
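
The mechanical part of steps 1 and 2 could be scripted roughly as follows. This is a sketch under stated assumptions, not a tested migration: the function name migrate_grafana_api_version is hypothetical, and it rewrites only the apiVersion line. The field schema review in step 3 still has to be done by hand, since the v5.x CRDs do not accept the v4.x spec layout unchanged.

```shell
#!/bin/sh
# Rewrite the apiVersion line in place for every YAML file under a directory.
# Usage: migrate_grafana_api_version <dir>
migrate_grafana_api_version() {
  find "$1" -type f \( -name '*.yaml' -o -name '*.yml' \) | while IFS= read -r f; do
    # Write to a temporary file and move it back instead of using "sed -i",
    # which behaves differently between GNU and BSD sed.
    sed 's|^\([[:space:]]*apiVersion:[[:space:]]*\)integreatly\.org/v1alpha1|\1grafana.integreatly.org/v1beta1|' \
      "$f" > "$f.tmp" && mv "$f.tmp" "$f"
  done
}
```

Running it a second time changes nothing, because the pattern only matches when integreatly.org/v1alpha1 directly follows "apiVersion:". Reviewing the result with `git diff` before committing is advisable.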

Testing Recommendations

  • Test against grafana-operator v5.12.0+ (minimum)
  • Verify on OpenShift 4.14+
  • Validate dashboard rendering and datasource connectivity

Workarounds (Not Recommended)

We evaluated but rejected these approaches:

  1. Downgrade to v4.8.0: Security risk, removed from catalogs, no support
  2. Custom catalog source: Maintenance overhead, technical debt
  3. Manual patching: Breaks upgrade path, unsupported

Additional Context

GitOps Pattern Validation

We successfully implemented ArgoCD ApplicationSet:

  • ✅ Multiple sources (Helm chart + values overlay)
  • ✅ Values override for operator versions
  • ✅ Operators installed successfully
  • ❌ Pelorus chart deployment blocked by API incompatibility

Request

Please prioritize updating Pelorus Helm charts for grafana-operator v5.x compatibility. This is blocking adoption on modern OpenShift clusters.

Steps to reproduce

Prerequisites

  • OpenShift 4.14+ cluster with grafana-operator v5.x available in catalog
  • ArgoCD or Helm 3 installed

Reproduction Steps

  1. Verify grafana-operator v5.x is available:

    oc get packagemanifest grafana-operator -n openshift-marketplace \
      -o jsonpath='{.status.channels[?(@.name=="v5")].currentCSV}'
    # Should show: grafana-operator.v5.x.x
  2. Install Pelorus operators chart:

    git clone https://github.com/dora-metrics/pelorus.git
    cd pelorus
    git checkout master
    
    helm install pelorus-operators charts/operators \
      --namespace pelorus --create-namespace
  3. Wait for operators to install:

    oc get csv -n pelorus | grep grafana
    # Should show: grafana-operator.v5.x.x Succeeded
  4. Install Pelorus chart:

    helm install pelorus charts/pelorus --namespace pelorus
  5. Observe deployment failure:

    helm status pelorus -n pelorus
    # Shows: deployed but not ready
    
    oc get grafana -n pelorus
    # Error: no matches for kind "Grafana" in version "integreatly.org/v1alpha1"
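
The mismatch can also be detected before step 4, without touching the cluster, by rendering the chart and looking for documents that still use the old API group. A minimal sketch follows; the function name report_legacy_grafana_kinds is hypothetical, and it assumes apiVersion appears before kind in each rendered document, as it does in the grafana.yaml excerpt above.

```shell
#!/bin/sh
# Read rendered manifests on stdin and print the kind of every document
# whose apiVersion is the legacy integreatly.org/v1alpha1.
report_legacy_grafana_kinds() {
  awk '
    /^---/ { legacy = 0; next }                    # new YAML document
    /^apiVersion:[ \t]*integreatly\.org\/v1alpha1[ \t]*$/ { legacy = 1; next }
    /^apiVersion:/ { legacy = 0; next }            # any other API version
    legacy && /^kind:/ { print $2; legacy = 0 }    # report the kind once
  '
}
```

A possible invocation is `helm template pelorus charts/pelorus | report_legacy_grafana_kinds | sort | uniq -c`, which prints a count per affected kind.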

Expected Behavior

Pelorus deploys successfully with Grafana instance, dashboards, and datasources created.

Actual Behavior

Deployment fails with API version not found errors. Resources remain in pending/failed state.

Minimal Reproduction

The issue occurs immediately when applying any Grafana-related resource:

kubectl apply -f charts/pelorus/templates/grafana.yaml
# Error: no matches for kind "Grafana" in version "integreatly.org/v1alpha1"

Current behavior

Deployment Fails with API Errors

When deploying Pelorus charts on clusters with grafana-operator v5.x:

  1. Operators install successfully:

    oc get csv -n pelorus
    # grafana-operator.v5.19.4    Succeeded
    # prometheusoperator.0.56.3   Succeeded
  2. Pelorus chart deployment fails:

    helm install pelorus charts/pelorus -n pelorus
    # Error: INSTALLATION FAILED: unable to build kubernetes objects from release manifest
  3. Specific error messages:

    The Kubernetes API could not find integreatly.org/GrafanaDataSource for requested resource pelorus/prometheus-grafana-datasource.
    Make sure the "GrafanaDataSource" CRD is installed on the destination cluster.
    
    The Kubernetes API could not find version "v1alpha1" of integreatly.org/GrafanaDashboard for requested resource pelorus/dashboard-sdp.
    Version "v1beta1" of grafana.integreatly.org/GrafanaDashboard is installed on the destination cluster.
    
    The Kubernetes API could not find version "v1alpha1" of integreatly.org/Grafana for requested resource pelorus/grafana-oauth.
    Version "v1beta1" of grafana.integreatly.org/Grafana is installed on the destination cluster.
    
  4. No Grafana resources created:

    oc get grafana -n pelorus
    # Error: the server doesn't have a resource type "grafana" in group "integreatly.org"
    
    oc api-resources | grep grafana
    # All show: grafana.integreatly.org/v1beta1 (not integreatly.org/v1alpha1)
  5. Helm/ArgoCD shows perpetual sync failures:

    • ArgoCD Application status: OutOfSync - Progressing (never completes)
    • Sync fails with "API version not found" errors on every retry
    • Resources stuck in SyncFailed state
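
The check in step 4 can be turned into a small pre-sync gate for a pipeline. The sketch below is an illustration only: the function name classify_grafana_api_groups is hypothetical, and it assumes the input is the output of `oc api-resources -o name`, one `<plural>.<group>` entry per line.

```shell
#!/bin/sh
# Count Grafana resources served under the v4.x group (integreatly.org)
# and under the v5.x group (grafana.integreatly.org).
# Input: one "<plural>.<group>" entry per line on stdin.
classify_grafana_api_groups() {
  awk '
    /^grafana[a-z]*\.grafana\.integreatly\.org$/ { v5++; next }
    /^grafana[a-z]*\.integreatly\.org$/          { v4++ }
    END { printf "v4_resources=%d v5_resources=%d\n", v4, v5 }
  '
}
```

For example, `oc api-resources -o name | classify_grafana_api_groups` on a cluster that reports v4_resources=0 would confirm that charts using integreatly.org/v1alpha1 cannot sync there.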

Impact on GitOps Workflows

  • ArgoCD ApplicationSets cannot deploy Pelorus automatically
  • Continuous sync failures trigger alerting
  • Manual intervention cannot resolve the failure (the requested API simply doesn't exist on the cluster)
  • Blocks automated DORA metrics collection for organizations using GitOps

Expected behavior

Pelorus Helm charts should deploy successfully on OpenShift 4.14+ clusters with grafana-operator v5.x. All Grafana resources (Grafana instance, GrafanaDashboards, GrafanaDataSource) should be created using the v1beta1 API version. Grafana should be accessible with DORA metrics dashboards loaded and Prometheus datasource configured. ArgoCD/GitOps deployments should sync successfully without API version errors.

In more detail:

  1. Helm install completes successfully: helm install pelorus charts/pelorus -n pelorus returns STATUS: deployed
  2. All Grafana CRs created: oc get grafana,grafanadashboard,grafanadatasource -n pelorus shows all resources in Ready state
  3. Grafana instance accessible via Route with DORA dashboards loaded
  4. Prometheus datasource auto-configured and functional
  5. ArgoCD Application shows: Synced - Healthy status
  6. Compatible with grafana-operator v5.x (v5.12.0+) available in OpenShift 4.14+ catalogs
