Commit 85dd747

feat: Add optional entity operator wait skip for Kind (#397)
Add SKIP_ENTITY_OPERATOR_WAIT environment variable to work around entity operator timeout issues in some environments (Linux + Podman).

When set to 'true':
- Checks only Kafka broker pod readiness (not full Kafka resource)
- Polls Kafka broker directly to verify topic creation

This is an opt-in workaround that doesn't affect standard deployments.

Also improves user messaging:
- Added timeout information (10 minutes)
- Added proactive hints about workaround if timeout occurs
- Set expectations for topic creation timing
1 parent 15fa456 commit 85dd747
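
The commit message describes an opt-in flag that takes effect only when set to 'true'. As a minimal sketch of that gate, the check could be wrapped in one function. The name `skip_entity_operator_wait` is hypothetical and not part of the commit, and the `:-false` default is an assumption added so the expansion stays safe if the script were ever run under `set -u`.

```bash
# Hypothetical helper (not in the commit): succeeds only when the flag is exactly "true".
# The :-false default is an assumption; it keeps the expansion safe under `set -u`.
skip_entity_operator_wait() {
    [ "${SKIP_ENTITY_OPERATOR_WAIT:-false}" = "true" ]
}

if skip_entity_operator_wait; then
    echo "broker-only checks"
else
    echo "full Kafka resource checks"
fi
```

Because the comparison is an exact string match, values such as `TRUE` or `1` leave the standard path in place, which matches the opt-in behaviour the message describes.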

2 files changed: +90 -21 lines changed


examples/cloud-deployment/README.md

Lines changed: 16 additions & 1 deletion
@@ -90,7 +90,22 @@ cd scripts
 ./deploy.sh --container-tool podman
 ```
 
-Note that using Kind with Podman on Linux may have some occasional issues due to Kind's experimental support for Podman. In our testing, a reboot normally solves this.
+Note that using Kind with Podman on Linux may have some occasional issues due to Kind's experimental support for Podman. In our testing, a reboot normally solves this.
+
+**Troubleshooting entity operator timeout:**
+
+In some environments (particularly Linux with Podman), the Kafka entity operator may not start properly, causing deployment to timeout while waiting for Kafka to be ready. If you encounter this issue, you can skip the entity operator wait:
+
+```bash
+export SKIP_ENTITY_OPERATOR_WAIT=true
+./deploy.sh --container-tool podman
+```
+
+This tells the script to:
+- Check only the Kafka broker pod (not the full Kafka resource with entity operator)
+- Poll the Kafka broker directly to verify topic creation (instead of waiting for the topic operator)
+
+The entity operator manages topic and user resources, but the broker handles the actual message streaming. Skipping the entity operator wait does not affect the demo's core functionality.
 
 The script will:
 - Create Kind cluster with local registry support (if not already exists)
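
The README addition uses `export`, which leaves the variable set for the rest of the shell session. A possible alternative, sketched here as an assumption about usage rather than something the commit documents, is a prefix assignment that scopes the variable to a single command. In the snippet, `sh -c` is only a stand-in for `./deploy.sh` so that it runs anywhere.

```bash
# Illustrative only: `sh -c` stands in for ./deploy.sh.
unset SKIP_ENTITY_OPERATOR_WAIT

# Prefix assignment: the variable is placed only in the child process's environment.
child_view=$(SKIP_ENTITY_OPERATOR_WAIT=true sh -c 'echo "${SKIP_ENTITY_OPERATOR_WAIT:-unset}"')

# The calling shell itself never has the variable set.
parent_view="${SKIP_ENTITY_OPERATOR_WAIT:-unset}"

echo "child sees:  $child_view"
echo "parent sees: $parent_view"
```

With the real script, the equivalent invocation would be `SKIP_ENTITY_OPERATOR_WAIT=true ./deploy.sh --container-tool podman`, so a later run in the same terminal falls back to the standard wait.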

examples/cloud-deployment/scripts/deploy.sh

Lines changed: 74 additions & 20 deletions
@@ -219,34 +219,88 @@ echo -e "${GREEN}✓ PostgreSQL deployed${NC}"
 echo ""
 echo "Deploying Kafka..."
 kubectl apply -f ../k8s/02-kafka.yaml
-echo "Waiting for Kafka to be ready (using KRaft mode, typically 2-3 minutes)..."
+echo "Waiting for Kafka to be ready (using KRaft mode, typically 2-3 minutes. Timeout is 10 minutes)..."
 
-# Monitor progress while waiting
-for i in {1..60}; do
-    echo "Checking Kafka status (attempt $i/60)..."
-    kubectl get kafka -n kafka -o wide 2>/dev/null || true
-    kubectl get pods -n kafka -l strimzi.io/cluster=a2a-kafka 2>/dev/null || true
+# Check if we should skip entity operator wait (workaround for some environments)
+if [ "${SKIP_ENTITY_OPERATOR_WAIT}" = "true" ]; then
+    echo -e "${YELLOW}⚠ SKIP_ENTITY_OPERATOR_WAIT is set - checking broker pod only${NC}"
 
-    if kubectl wait --for=condition=Ready kafka/a2a-kafka -n kafka --timeout=10s 2>/dev/null; then
-        echo -e "${GREEN}✓ Kafka deployed${NC}"
-        break
-    fi
+    # Wait for broker pod to be ready (skip entity operator check)
+    for i in {1..60}; do
+        echo "Checking Kafka broker status (attempt $i/60)..."
+        kubectl get pods -n kafka -l strimzi.io/cluster=a2a-kafka 2>/dev/null || true
 
-    if [ $i -eq 60 ]; then
-        echo -e "${RED}ERROR: Timeout waiting for Kafka${NC}"
-        kubectl describe kafka/a2a-kafka -n kafka
-        kubectl get events -n kafka --sort-by='.lastTimestamp'
-        exit 1
-    fi
-done
+        if kubectl wait --for=condition=Ready pod/a2a-kafka-broker-0 -n kafka --timeout=5s 2>/dev/null; then
+            echo -e "${GREEN}✓ Kafka broker pod is ready${NC}"
+            echo -e "${YELLOW}⚠ Entity operator may not be ready, but this does not affect functionality${NC}"
+            break
+        fi
+
+        if [ $i -eq 60 ]; then
+            echo -e "${RED}ERROR: Timeout waiting for Kafka broker${NC}"
+            kubectl get pods -n kafka -l strimzi.io/cluster=a2a-kafka
+            kubectl describe pod a2a-kafka-broker-0 -n kafka 2>/dev/null || true
+            exit 1
+        fi
+
+        sleep 5
+    done
+else
+    echo -e "${YELLOW}   If waiting for Kafka times out, run ./cleanup.sh, and retry having set 'SKIP_ENTITY_OPERATOR_WAIT=true'${NC}"
+    # Standard wait for full Kafka resource (includes entity operator)
+    for i in {1..60}; do
+        echo "Checking Kafka status (attempt $i/60)..."
+        kubectl get kafka -n kafka -o wide 2>/dev/null || true
+        kubectl get pods -n kafka -l strimzi.io/cluster=a2a-kafka 2>/dev/null || true
+
+        if kubectl wait --for=condition=Ready kafka/a2a-kafka -n kafka --timeout=10s 2>/dev/null; then
+            echo -e "${GREEN}✓ Kafka deployed${NC}"
+            break
+        fi
+
+        if [ $i -eq 60 ]; then
+            echo -e "${RED}ERROR: Timeout waiting for Kafka${NC}"
+            kubectl describe kafka/a2a-kafka -n kafka
+            kubectl get events -n kafka --sort-by='.lastTimestamp'
+            exit 1
+        fi
+    done
+fi
 
 # Create Kafka Topic for event replication
 echo ""
 echo "Creating Kafka topic for event replication..."
 kubectl apply -f ../k8s/03-kafka-topic.yaml
-echo "Waiting for Kafka topic to be ready..."
-kubectl wait --for=condition=Ready kafkatopic/a2a-replicated-events -n kafka --timeout=60s
-echo -e "${GREEN}✓ Kafka topic created${NC}"
+
+if [ "${SKIP_ENTITY_OPERATOR_WAIT}" = "true" ]; then
+    echo -e "${YELLOW}⚠ SKIP_ENTITY_OPERATOR_WAIT is set - polling Kafka broker for topic${NC}"
+    echo "  Topic operator may not be ready, waiting for broker to create topic. This check can take several minutes..."
+
+    # Wait for topic to actually exist in Kafka broker (not just CRD)
+    for i in {1..30}; do
+        if kubectl exec a2a-kafka-broker-0 -n kafka -- \
+            /opt/kafka/bin/kafka-topics.sh --list --bootstrap-server localhost:9092 2>/dev/null | \
+            grep -q "a2a-replicated-events"; then
+            echo -e "${GREEN}✓ Topic exists in Kafka broker${NC}"
+            break
+        fi
+        if [ $i -eq 30 ]; then
+            echo -e "${RED}ERROR: Topic not found in broker after 30 attempts${NC}"
+            exit 1
+        fi
+        sleep 2
+    done
+else
+    echo "Waiting for Kafka topic to be ready..."
+    if kubectl wait --for=condition=Ready kafkatopic/a2a-replicated-events -n kafka --timeout=60s; then
+        echo -e "${GREEN}✓ Kafka topic created${NC}"
+    else
+        echo -e "${RED}ERROR: Timeout waiting for Kafka topic${NC}"
+        echo -e "${YELLOW}The topic operator may not be ready in this environment.${NC}"
+        echo -e "${YELLOW}Run ./cleanup.sh, then retry with: export SKIP_ENTITY_OPERATOR_WAIT=true${NC}"
+        exit 1
+    fi
+fi
 
 # Deploy Agent ConfigMap
 echo ""
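
The hunk above contains three polling loops with the same shape: try a command, stop on success, fail after a fixed number of attempts. As a sketch of how that pattern might be factored out, the helper below runs any command with an attempt budget. The `retry` function and its argument order are hypothetical and not part of the commit. Unlike the loops in the diff, it leaves progress output and failure diagnostics to the caller.

```bash
# Hypothetical helper (not in the commit): run a command until it succeeds
# or the attempt budget is used up.
# Usage: retry <max_attempts> <sleep_seconds> <command> [args...]
retry() {
    local max_attempts="$1"
    local delay="$2"
    shift 2

    local attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Skip the sleep after the final failed attempt.
        if [ "$attempt" -lt "$max_attempts" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# Example shape of a call site (commented out because it needs a cluster):
# retry 60 5 kubectl wait --for=condition=Ready pod/a2a-kafka-broker-0 -n kafka --timeout=5s \
#     || { echo "ERROR: Timeout waiting for Kafka broker"; exit 1; }
```

The command is passed as separate arguments and executed with `"$@"`, so no `eval` is needed and quoting is preserved. A pipeline such as the `kafka-topics.sh | grep -q` check would need to be wrapped in a small function before being handed to `retry`.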
