Commit f3d87dc

Merge pull request #51 from Prashanth684/ai-mg
Add a must-gather plugin for basic must-gather parsing and analysis
2 parents f958889 + 0c8758d commit f3d87dc

13 files changed: 2677 additions, 0 deletions
.claude-plugin/marketplace.json

Lines changed: 5 additions & 0 deletions
@@ -58,6 +58,11 @@
       "name": "yaml",
       "source": "./plugins/yaml",
       "description": "YAML documentation and utilities"
+    },
+    {
+      "name": "must-gather",
+      "source": "./plugins/must-gather",
+      "description": "A plugin to analyze and report on must-gather data"
     }
   ]
 }
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
{
  "name": "must-gather",
  "description": "A plugin to analyze and report on must-gather data",
  "version": "0.0.1",
  "author": {
    "name": "openshift"
  }
}

plugins/must-gather/README.md

Lines changed: 330 additions & 0 deletions
@@ -0,0 +1,330 @@
# Must-Gather Analyzer Plugin

Claude Code plugin for analyzing OpenShift must-gather diagnostic data.

## Overview

This plugin provides tools to analyze must-gather data collected from OpenShift clusters, displaying resource status in a familiar `oc`-like format and identifying cluster issues.

## Features

### Skills

- **Must-Gather Analyzer** - Comprehensive analysis of cluster operators, pods, nodes, and network components
  - Parses YAML resources from must-gather dumps
  - Displays output similar to `oc get` commands
  - Identifies and categorizes issues
  - Provides actionable diagnostics

### Analysis Scripts

All scripts are located in `skills/must-gather-analyzer/scripts/`:

#### `analyze_clusterversion.py`
Analyzes cluster version, update status, and capabilities.

```bash
./analyze_clusterversion.py <must-gather-path>
```

Output format matches `oc get clusterversion`:
```
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.20.0-0.okd-scos-2025-08-18-130459 False 65d
```

Also provides detailed information:
- Cluster ID and version hash
- Current and desired versions
- Conditions (Available, Progressing, Failing, etc.)
- Update history
- Available updates
- Enabled capabilities
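
The `SINCE` and `AGE` columns in the sample outputs use compact ages such as `65d` or `149m`. The following is a minimal sketch of how such a value could be derived from a condition's `lastTransitionTime`; the function name `compact_age` and the unit thresholds are assumptions made for illustration, and they do not reproduce the exact rounding rules that `oc` or the plugin's scripts apply.

```python
from datetime import datetime, timezone


def compact_age(since_iso, now):
    """Render the time elapsed since an RFC 3339 UTC timestamp as a short age.

    `since_iso` looks like "2025-08-18T13:04:59Z"; `now` is a timezone-aware
    datetime. The thresholds below are a simplification (assumption).
    """
    since = datetime.strptime(since_iso, "%Y-%m-%dT%H:%M:%SZ")
    since = since.replace(tzinfo=timezone.utc)
    # Clamp to zero so a timestamp slightly in the future does not go negative.
    seconds = max(0, int((now - since).total_seconds()))
    if seconds < 60:
        return f"{seconds}s"
    if seconds < 3 * 3600:        # under three hours: whole minutes
        return f"{seconds // 60}m"
    if seconds < 2 * 86400:       # under two days: whole hours
        return f"{seconds // 3600}h"
    return f"{seconds // 86400}d"
```

Because a must-gather is a snapshot, passing `now` explicitly lets a caller compute ages relative to the collection time instead of the current wall clock.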

#### `analyze_clusteroperators.py`
Analyzes cluster operator status and health.

```bash
./analyze_clusteroperators.py <must-gather-path>
```

Output format matches `oc get clusteroperators`:
```
NAME             VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication   4.18.26   True        False         False      149m
baremetal        4.18.26   True        False         False      169m
```
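
As a rough illustration of how a row like the ones above can be derived, the sketch below reads the `Available`, `Progressing`, and `Degraded` conditions from a ClusterOperator resource that has already been parsed into a dictionary (for example with PyYAML). The helper names are hypothetical and are not taken from `analyze_clusteroperators.py`; only the `status.conditions` and `status.versions` field layout comes from the ClusterOperator API.

```python
def condition_status(resource, cond_type):
    """Return the status string of one condition, or "Unknown" if absent."""
    conditions = (resource.get("status") or {}).get("conditions") or []
    for cond in conditions:
        if cond.get("type") == cond_type:
            return cond.get("status", "Unknown")
    return "Unknown"


def operator_version(resource):
    """Return the version reported under the "operator" entry, if any."""
    versions = (resource.get("status") or {}).get("versions") or []
    for entry in versions:
        if entry.get("name") == "operator":
            return entry.get("version", "")
    return ""


def operator_row(resource):
    """Build (NAME, VERSION, AVAILABLE, PROGRESSING, DEGRADED) for one operator."""
    name = (resource.get("metadata") or {}).get("name", "")
    return (
        name,
        operator_version(resource),
        condition_status(resource, "Available"),
        condition_status(resource, "Progressing"),
        condition_status(resource, "Degraded"),
    )


def is_problem(resource):
    """Treat an operator as unhealthy if it is not Available or is Degraded."""
    return (
        condition_status(resource, "Available") != "True"
        or condition_status(resource, "Degraded") == "True"
    )
```

Returning `"Unknown"` for a missing condition keeps an incomplete dump from raising an exception and still flags the operator for review.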

#### `analyze_pods.py`
Analyzes pod status across all namespaces.

```bash
# All pods in all namespaces
./analyze_pods.py <must-gather-path>

# Specific namespace
./analyze_pods.py <must-gather-path> --namespace openshift-etcd

# Only problematic pods
./analyze_pods.py <must-gather-path> --problems-only
```

Output format matches `oc get pods -A`:
```
NAMESPACE                  NAME                      READY   STATUS             RESTARTS   AGE
openshift-kube-apiserver   kube-apiserver-master-0   4/4     Running            0          5d
openshift-etcd             etcd-master-1             1/1     CrashLoopBackOff   15         2h
```
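
The `READY`, `STATUS`, and `RESTARTS` columns can be computed from a pod's `spec.containers` and `status.containerStatuses`. The sketch below shows one plausible way to do that on an already-parsed Pod dictionary; `pod_row` and `is_problem_pod` are hypothetical names, and the real script may handle more cases (init containers, terminated states) than this simplified version does.

```python
def pod_row(pod):
    """Build (NAMESPACE, NAME, READY, STATUS, RESTARTS) for one pod dict."""
    meta = pod.get("metadata") or {}
    status = pod.get("status") or {}
    container_statuses = status.get("containerStatuses") or []
    total = len((pod.get("spec") or {}).get("containers") or [])

    ready = sum(1 for cs in container_statuses if cs.get("ready"))
    restarts = sum(cs.get("restartCount", 0) for cs in container_statuses)

    # Start from the pod phase, then prefer a container's waiting reason
    # (for example CrashLoopBackOff), which is usually more informative.
    display = status.get("phase", "Unknown")
    for cs in container_statuses:
        waiting = (cs.get("state") or {}).get("waiting") or {}
        if waiting.get("reason"):
            display = waiting["reason"]
            break

    return (meta.get("namespace", ""), meta.get("name", ""),
            f"{ready}/{total}", display, restarts)


def is_problem_pod(row):
    """Assumed filter for --problems-only: bad status or not fully ready."""
    _, _, ready, display, _ = row
    if display not in ("Running", "Succeeded"):
        return True
    ready_count, total = ready.split("/")
    return display == "Running" and ready_count != total
```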

#### `analyze_nodes.py`
Analyzes node status and conditions.

```bash
# All nodes
./analyze_nodes.py <must-gather-path>

# Only nodes with issues
./analyze_nodes.py <must-gather-path> --problems-only
```

Output format matches `oc get nodes`:
```
NAME                   STATUS                 ROLES    AGE   VERSION
master-0.example.com   Ready                  master   10d   v1.27.0+1234
worker-1.example.com   Ready,MemoryPressure   worker   10d   v1.27.0+1234
```
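
The sample output appends active pressure conditions to the `STATUS` column (`Ready,MemoryPressure`). A minimal sketch of that behaviour, assuming a Node resource already parsed into a dictionary, might look like the following. The function names are hypothetical; the `node-role.kubernetes.io/` label prefix and the condition types are standard Kubernetes conventions.

```python
ROLE_PREFIX = "node-role.kubernetes.io/"
PRESSURE_TYPES = ("MemoryPressure", "DiskPressure", "PIDPressure",
                  "NetworkUnavailable")


def node_status(node):
    """Return e.g. "Ready" or "Ready,MemoryPressure" for one node dict."""
    conditions = (node.get("status") or {}).get("conditions") or []
    by_type = {c.get("type"): c.get("status") for c in conditions}

    parts = ["Ready" if by_type.get("Ready") == "True" else "NotReady"]
    # Pressure-style conditions indicate a problem when they are "True".
    parts.extend(t for t in PRESSURE_TYPES if by_type.get(t) == "True")
    if (node.get("spec") or {}).get("unschedulable"):
        parts.append("SchedulingDisabled")
    return ",".join(parts)


def node_roles(node):
    """Derive roles from node-role.kubernetes.io/<role> labels."""
    labels = (node.get("metadata") or {}).get("labels") or {}
    roles = sorted(
        key[len(ROLE_PREFIX):]
        for key in labels
        if key.startswith(ROLE_PREFIX) and len(key) > len(ROLE_PREFIX)
    )
    return ",".join(roles) or "<none>"
```

Under this sketch, a `--problems-only` filter could simply keep nodes whose status string is anything other than `Ready`.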

#### `analyze_network.py`
Analyzes network configuration and health.

```bash
./analyze_network.py <must-gather-path>
```

Shows:
- Network type (OVN-Kubernetes, OpenShift SDN)
- Network operator status
- OVN pod health
- PodNetworkConnectivityCheck results

#### `analyze_events.py`
Analyzes cluster events sorted by last occurrence.

```bash
# Recent events (last 100)
./analyze_events.py <must-gather-path>

# Warning events only
./analyze_events.py <must-gather-path> --type Warning

# Events in specific namespace
./analyze_events.py <must-gather-path> --namespace openshift-etcd

# Show last 50 events
./analyze_events.py <must-gather-path> --count 50
```

Output format:
```
NAMESPACE                  LAST SEEN   TYPE      REASON      OBJECT                          MESSAGE
openshift-etcd             64d         Warning   Unhealthy   Pod/etcd-guard-ip-10-0-90-209   Readiness probe failed
openshift-kube-apiserver   64d         Normal    Started     Pod/kube-apiserver-master-0     Started container
```
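
The `--type`, `--namespace`, and `--count` options describe a filter, sort, and truncate pipeline. The sketch below illustrates that pipeline on a list of already-parsed Event dictionaries. `recent_events` is a hypothetical helper, and sorting oldest-first on `lastTimestamp` is an assumption about what "sorted by last occurrence" means here.

```python
def recent_events(events, event_type=None, namespace=None, count=100):
    """Filter events, order them by last occurrence, and keep the newest ones."""
    selected = [
        e for e in events
        if (event_type is None or e.get("type") == event_type)
        and (namespace is None
             or (e.get("metadata") or {}).get("namespace") == namespace)
    ]
    # RFC 3339 UTC timestamps in the same format sort chronologically as
    # plain strings; events without a lastTimestamp sort first.
    selected.sort(key=lambda e: e.get("lastTimestamp") or "")
    # Keep only the `count` most recent entries (the tail of the list).
    return selected[-count:] if count > 0 else []
```

For example, `recent_events(events, event_type="Warning", count=50)` would correspond to combining `--type Warning` with `--count 50`.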

#### `analyze_etcd.py`
Analyzes etcd cluster health from the `etcd_info` directory.

```bash
./analyze_etcd.py <must-gather-path>
```

Shows:
- Member health status
- Member list with IDs and URLs
- Endpoint status (leader, version, DB size)
- Quorum status and summary

Output includes:
```
ETCD CLUSTER SUMMARY
Total Members: 3
Healthy Members: 3/3
✅ All members healthy
✅ Quorum achieved (3/2)
```
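
etcd needs a strict majority of members to be healthy in order to keep quorum, which is `floor(n / 2) + 1` for `n` members. The sketch below shows that calculation; the `3/2` in the sample summary appears to read as healthy members over members required, although that interpretation is an assumption. `quorum_summary` and its input shape (member name mapped to a health flag) are hypothetical.

```python
def quorum_summary(member_health):
    """Summarise quorum from a mapping of member name -> healthy (bool)."""
    total = len(member_health)
    healthy = sum(1 for ok in member_health.values() if ok)
    # A strict majority of all members must be healthy to keep quorum.
    needed = total // 2 + 1
    return {
        "total": total,
        "healthy": healthy,
        "needed": needed,
        "has_quorum": total > 0 and healthy >= needed,
    }
```

With three members the cluster tolerates one failure: two healthy members still meet the required majority, while one does not.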

#### `analyze_pvs.py`
Analyzes PersistentVolumes and PersistentVolumeClaims.

```bash
# All PVs and PVCs
./analyze_pvs.py <must-gather-path>

# PVCs in specific namespace
./analyze_pvs.py <must-gather-path> --namespace openshift-monitoring
```

Output format:
```
PERSISTENT VOLUMES
NAME                                       CAPACITY   ACCESS MODES    RECLAIM   STATUS   CLAIM
pvc-3d4a0119-b2f2-44fa-9b2f-b11c611c74f2   20Gi       ReadWriteOnce   Delete    Bound    openshift-monitoring/prometheus-data-pro

PERSISTENT VOLUME CLAIMS
NAMESPACE              NAME                           STATUS   VOLUME                                     CAPACITY
openshift-monitoring   prometheus-data-prometheus-0   Bound    pvc-3d4a0119-b2f2-44fa-9b2f-b11c611c74f2   20Gi
```

### Slash Commands

#### `/analyze-mg [path]`
Runs comprehensive analysis of must-gather data.

```
/analyze-mg ./must-gather.local.123456789
```

Executes all analysis scripts and provides:
- Executive summary of cluster health
- Critical issues and warnings
- Actionable recommendations
- Suggested logs to review

## Installation

### From Local Repository

If you're working in the must-gather repository:

1. The plugin is already available in `.claude-plugin/`
2. Claude Code will automatically detect project plugins

### Manual Installation

To use this plugin in other projects:

1. Copy the `.claude-plugin/` directory to your desired location
2. Add to Claude Code:
   ```bash
   /plugin marketplace add /path/to/.claude-plugin
   /plugin install must-gather
   ```

## Usage Examples

### Analyzing Cluster Version

Ask Claude:
- "What version is this cluster running?"
- "Show me the cluster version"
- "What's the update status?"
- "What capabilities are enabled?"

### Analyzing Cluster Operators

Ask Claude:
- "Analyze the cluster operators in this must-gather"
- "Which operators are degraded?"
- "Show me operator status"

Claude will automatically use the Must-Gather Analyzer skill and run `analyze_clusteroperators.py`.

### Finding Pod Issues

Ask Claude:
- "What pods are failing in this must-gather?"
- "Show me crashlooping pods"
- "Analyze pods in openshift-etcd namespace"

### Analyzing Events

Ask Claude:
- "Show me warning events from this must-gather"
- "What events occurred in openshift-etcd namespace?"
- "Show me the last 50 events"

### Checking etcd Health

Ask Claude:
- "Check etcd cluster health"
- "What's the etcd member status?"
- "Is etcd quorum healthy?"

### Analyzing Storage

Ask Claude:
- "Show me PersistentVolumes and PVCs"
- "What storage resources exist?"
- "Are there any pending PVCs?"

### Complete Cluster Analysis

```
/analyze-mg ./must-gather.local.5464029130631179436
```

This runs all analysis scripts and provides comprehensive diagnostics.

## Requirements

- Python 3.6+
- PyYAML library (`pip install pyyaml`)

## Must-Gather Directory Structure

Expected directory structure from `oc adm must-gather` output:

```
must-gather.local.*/
├── cluster-scoped-resources/
│   ├── config.openshift.io/
│   │   └── clusteroperators/
│   └── core/
│       └── nodes/
├── namespaces/
│   └── <namespace>/
│       └── core/
│           └── pods/
└── network_logs/
```
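
A quick way to confirm that a dump matches this layout is to check for the expected directories before running any analysis. The sketch below does that with `pathlib`; the function names are hypothetical, and the set of paths is taken only from the tree above, so a real must-gather may contain additional or differently nested directories.

```python
from pathlib import Path


def expected_paths(root):
    """Map a short label to each directory shown in the expected layout."""
    root = Path(root)
    cluster = root / "cluster-scoped-resources"
    return {
        "clusteroperators": cluster / "config.openshift.io" / "clusteroperators",
        "nodes": cluster / "core" / "nodes",
        "namespaces": root / "namespaces",
        "network_logs": root / "network_logs",
    }


def missing_paths(root):
    """Return the labels of expected directories that do not exist."""
    return sorted(
        label
        for label, path in expected_paths(root).items()
        if not path.is_dir()
    )
```

An empty result from `missing_paths` suggests the path points at the right level of the dump; a long list usually means the path is one directory too high or too low.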

## Development

### Adding New Analysis Scripts

1. Create script in `skills/must-gather-analyzer/scripts/`
2. Follow the output format pattern (matching `oc get` commands)
3. Update `SKILL.md` with usage instructions
4. Add to `/analyze-mg` command workflow

### Output Format Guidelines

All scripts should:
- Use tabular output matching `oc` command format
- Handle missing resources gracefully
- Print "No resources found." when appropriate
- Support common flags like `--namespace`, `--problems-only`
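
A new script could follow these guidelines with a skeleton along the lines below. This is only a sketch: `build_parser` and `format_table` are hypothetical helpers, not functions from the existing scripts, and the three-space column gap is an assumption chosen to resemble `oc` output.

```python
import argparse


def build_parser():
    """Command-line interface with the common flags named in the guidelines."""
    parser = argparse.ArgumentParser(
        description="Analyze one resource type in a must-gather directory")
    parser.add_argument("must_gather_path", help="path to the must-gather dump")
    parser.add_argument("--namespace", help="limit output to one namespace")
    parser.add_argument("--problems-only", action="store_true",
                        help="show only resources with issues")
    return parser


def format_table(headers, rows):
    """Render rows as left-aligned columns, or the standard empty message."""
    if not rows:
        return "No resources found."
    # Each column is as wide as its longest cell, header included.
    widths = [max(len(str(cell)) for cell in column)
              for column in zip(headers, *rows)]
    lines = []
    for row in [headers, *rows]:
        cells = (str(cell).ljust(width) for cell, width in zip(row, widths))
        lines.append("   ".join(cells).rstrip())
    return "\n".join(lines)
```

A script's `main()` would then parse arguments, load and filter resources, and print `format_table(...)`, so that an empty or missing directory yields "No resources found." instead of a traceback.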

## Troubleshooting

### "No resources found"
- Verify must-gather path is correct
- Check that must-gather completed successfully
- Ensure directory structure matches expected format

### Scripts not executing
- Verify scripts are executable: `chmod +x scripts/*.py`
- Check Python 3 is available
- Install dependencies: `pip install pyyaml`

## Contributing

When adding new analysis capabilities:
1. Follow existing script patterns
2. Match `oc` command output format
3. Include error handling for missing data
4. Update this README with new features

## License

This plugin is part of the openshift/must-gather repository.
