[Internal]: Add Kubernetes security policy troubleshooting guidance for ECK #3929

@damianpfister

Description

Add comprehensive guidance for diagnosing and resolving Kubernetes security policy issues that block ECK pod creation.

What: We are adding a new section that explains how Kubernetes security policies (PodSecurityAdmission, PodSecurityPolicy, admission webhooks) prevent ECK pods from being created, with specific guidance for each ECK component's security requirements.

Why: Security policy violations are the most common "invisible" blocker for ECK deployments. Symptoms are misleading (operator errors, no pods) and root cause isn't obvious without checking namespace events. Different ECK components have different security requirements (Fleet Server requires root, Elasticsearch needs specific fsGroup, etc.).

Details users need to know:

  • How to identify security policy blocks (UP-TO-DATE: 0, FailedCreate events)
  • ECK component security requirements vs. common policies
  • How to check current namespace security level (K8s 1.23+ vs legacy)
  • Three solution options: adjust namespace policy, configure securityContext, or get exemptions
  • Fleet Server specifically requires runAsUser: 0 and cannot run under restricted policies

Proposed Content

Section Title: Kubernetes Security Policies Blocking Pod Creation

Location: Add as new section under "Common problems" or as standalone major section

Target Page:
https://www.elastic.co/docs/troubleshoot/deployments/cloud-on-k8s/kubernetes

Content:

## Kubernetes Security Policies Blocking Pod Creation

### Symptoms

Kubernetes security policies (PodSecurityAdmission, PodSecurityPolicy, admission webhooks) can silently block ECK pod creation:

- Deployments show `UP-TO-DATE: 0` 
- No pods appear despite successful resource creation
- Operator logs show connection errors (401, 503)

### Diagnosis

```bash
# Check deployment status
kubectl get deployment -n <namespace>
# UP-TO-DATE: 0 can indicate that Kubernetes admission is rejecting the pods

# Check events for security violations
kubectl get events -n <namespace> | grep -iE "forbidden|violates|denied"
```

Example `FailedCreate` event:

```text
Warning  FailedCreate  replicaset/fleet-server-xyz
Error creating: pods "fleet-server-xyz" is forbidden:
violates PodSecurity "restricted:latest": runAsUser=0
```
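The event check above can be wrapped in a small helper. This is a minimal sketch, not part of ECK or kubectl: the function names `filter_policy_events` and `policy_events` are hypothetical, and the filter reads event text on stdin so it can be tried without a cluster.

```shell
# Hypothetical helper: keep only event lines that look like admission rejections.
# Reads `kubectl get events` output on stdin; prints nothing if no line matches.
filter_policy_events() {
  grep -iE 'forbidden|violates|denied' || true
}

# Hypothetical wrapper: list FailedCreate events in a namespace, oldest first,
# then narrow them to likely security policy rejections.
policy_events() {
  local namespace="$1"
  kubectl get events -n "$namespace" \
    --field-selector reason=FailedCreate \
    --sort-by=.lastTimestamp | filter_policy_events
}
```

Usage: `policy_events <namespace>`. Filtering on `reason=FailedCreate` first keeps unrelated warnings that happen to contain "denied" out of the result.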

### ECK Component Security Requirements

| Component | Security Requirement | Restricted Policy Impact |
| --- | --- | --- |
| Fleet Server | Requires `runAsUser: 0` | ❌ Blocked by `restricted` |
| Elasticsearch | Requires specific `fsGroup` | ⚠️ May conflict |
| Kibana | Flexible | ✅ Compatible |
| Elastic Agent | Varies by monitoring target | ⚠️ Depends on use case |

### Check Current Security Policy

Kubernetes 1.23+ (PodSecurityAdmission):

```bash
kubectl get namespace <namespace> -o jsonpath='{.metadata.labels}' | grep pod-security
```

Legacy (PodSecurityPolicy, removed in Kubernetes 1.25):

```bash
kubectl get podsecuritypolicy
```
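To read a single Pod Security label rather than grepping the whole label map, the label key can be addressed directly in the JSONPath expression by escaping its dots. This is a sketch under that assumption; the helper names `pss_label` and `level_or_default` are hypothetical.

```shell
# Hypothetical helper: print one Pod Security label of a namespace.
# mode is one of: enforce, audit, warn. Prints nothing if the label is not set.
pss_label() {
  local namespace="$1" mode="$2"
  kubectl get namespace "$namespace" \
    -o jsonpath="{.metadata.labels.pod-security\.kubernetes\.io/$mode}"
}

# Hypothetical helper: make an empty result explicit. With no label on the
# namespace, the level configured for the cluster's admission controller applies.
level_or_default() {
  echo "${1:-unset (cluster default applies)}"
}
```

Usage: `level_or_default "$(pss_label <namespace> enforce)"`.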

### Solutions

#### Option 1: Adjust Namespace Security Level (Recommended)

Set the namespace to `baseline` or `privileged` for ECK workloads:

```bash
kubectl label namespace <namespace> \
  pod-security.kubernetes.io/enforce=baseline --overwrite
```
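Before changing the enforced level, a server-side dry run of the same `kubectl label` command reports warnings for existing pods that would violate the new level, without modifying the namespace. The sketch below assumes that workflow; `valid_level` and `preview_enforce_level` are hypothetical helper names.

```shell
# Hypothetical helper: accept only the three Pod Security Standards levels.
valid_level() {
  case "$1" in
    privileged|baseline|restricted) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: preview the effect of a new enforce level.
# --dry-run=server evaluates the change on the API server and persists nothing.
preview_enforce_level() {
  local namespace="$1" level="$2"
  valid_level "$level" || { echo "invalid level: $level" >&2; return 1; }
  kubectl label --dry-run=server --overwrite namespace "$namespace" \
    "pod-security.kubernetes.io/enforce=$level"
}
```

Usage: `preview_enforce_level <namespace> baseline`, then run the real `kubectl label` command once the preview shows no unexpected warnings.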

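#### Option 2: Configure Security Context (If Policy Cannot Change)

Only for components that don't require root. For Elasticsearch, `podTemplate` sits under each entry in `spec.nodeSets`; for Kibana, it sits directly under `spec`:

```yaml
spec:
  nodeSets:
  - name: <nodeset-name>
    podTemplate:
      spec:
        securityContext:
          runAsNonRoot: true
          runAsUser: 1000
          fsGroup: 1000
          # The restricted level also requires a seccomp profile
          seccompProfile:
            type: RuntimeDefault
        containers:
        - name: elasticsearch
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop: ["ALL"]
```

Note: Fleet Server requires `runAsUser: 0` and cannot use this option.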

#### Option 3: Request Cluster Admin Exemption

Work with cluster administrators to create policy exemptions for ECK namespaces.

### Prevention

Check namespace security policy before deploying:

```bash
kubectl get namespace <namespace> -o yaml | grep pod-security
```

Guidelines:

- Fleet Server: Requires `baseline` or `privileged` policy
- Elasticsearch/Kibana: Work with `baseline` policy
- Avoid `restricted` policy for ECK workloads unless the specific component supports it
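The guidelines and the component table can be turned into a pre-deployment check. The sketch below only encodes what this section states. The function name `policy_verdict`, the component identifiers, and the `check` verdict for combinations the section leaves open are all assumptions for illustration.

```shell
# Hypothetical helper: map a component and an enforced level to a verdict.
# Prints one of: ok, blocked, check (needs manual review), unknown.
policy_verdict() {
  local component="$1" level="$2"
  case "$level:$component" in
    privileged:*) echo ok ;;
    baseline:fleet-server|baseline:elasticsearch|baseline:kibana) echo ok ;;
    baseline:elastic-agent) echo check ;;      # not covered by the guidelines
    restricted:fleet-server) echo blocked ;;   # requires runAsUser: 0
    restricted:kibana) echo ok ;;
    restricted:elasticsearch|restricted:elastic-agent) echo check ;;
    *) echo unknown ;;
  esac
}
```

Usage: `policy_verdict fleet-server restricted` prints `blocked`, which signals that the namespace level needs to change before Fleet Server is deployed.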

---

## Rationale

**Problem:** Security policy violations do not surface clearly in pod logs or operator logs. Users often don't know to check namespace events, or which security settings each ECK component requires.

**Impact:** Provides clear diagnostic steps and solution paths for the most common Kubernetes-layer blocker. Includes component-specific requirements table for quick reference.

**Placement:** On the Kubernetes troubleshooting page, where users with K8s-specific issues will find it.


### Resources

N/A

### Which documentation set does this change impact?

Elastic On-Prem only

### Feature differences

N/A

### What release is this request related to?

N/A

### Serverless release

N/A

### Collaboration model

The documentation team

### Point of contact.

**Main contact:** @eedugon 

**Stakeholders:** @damianpfister 
