Commit b62b85f

Author: Patrick Bareiss
Message: merge with master
1 parent fdb3a3b, commit b62b85f

5 files changed: +609 -0 lines changed

.github/VALIDATION_WORKFLOWS.md

Lines changed: 245 additions & 0 deletions
@@ -0,0 +1,245 @@
# Attack Data Validation Workflows

This document explains the GitHub Actions workflows that automatically validate attack data YAML files on every pull request and push to ensure data quality and consistency.

## Overview

The validation system consists of four main workflows that work together to ensure all attack data meets the required schema and quality standards:

1. **validate-pr.yml** - Full validation on all PRs
2. **validate-changed-files.yml** - Optimized validation for only changed files
3. **validate-push.yml** - Validation on pushes to main branches
4. **required-checks.yml** - Status checks and YAML linting

## Workflows Description

### 1. Validate Attack Data on PR (`validate-pr.yml`)

**Triggers:** Pull requests to `master` or `main` branches
**Purpose:** Comprehensive validation of all dataset YAML files

**Features:**
- Runs on PR open, synchronize, and reopen events
- Validates all YAML files in the `datasets/` directory
- Uses the validation script at `bin/validate.py`
- Comments on PR with success/failure status
- Only triggers when relevant files are changed

**Path filters:**
- `datasets/**/*.yml`
- `datasets/**/*.yaml`
- `bin/validate.py`
- `bin/dataset_schema.json`
- `bin/requirements.txt`
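
To make the effect of these path filters concrete, the sketch below approximates the matching in Python. It is not GitHub's implementation. The function names `matches_path_filter` and `triggers_validation` are hypothetical, and the logic assumes that `datasets/**/*.yml` matches a YAML file at any depth under `datasets/`.

```python
from pathlib import PurePosixPath
from typing import Iterable

# Exact paths listed in the path filters above.
EXACT_PATHS = {"bin/validate.py", "bin/dataset_schema.json", "bin/requirements.txt"}
YAML_SUFFIXES = {".yml", ".yaml"}


def matches_path_filter(path: str) -> bool:
    """Approximate check: would this changed path match one of the filters?"""
    if path in EXACT_PATHS:
        return True
    p = PurePosixPath(path)
    # Assumption: datasets/**/*.yml covers YAML files at any depth under datasets/.
    return len(p.parts) >= 2 and p.parts[0] == "datasets" and p.suffix in YAML_SUFFIXES


def triggers_validation(changed_paths: Iterable[str]) -> bool:
    """The workflow is expected to run if at least one changed path matches."""
    return any(matches_path_filter(p) for p in changed_paths)
```

Under these assumptions, a PR that only edits `README.md` would not start the workflow, while a PR that edits `bin/dataset_schema.json` would.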

### 2. Validate Changed Attack Data Files (`validate-changed-files.yml`)

**Triggers:** Pull requests to `master` or `main` branches
**Purpose:** Fast validation of only changed YAML files

**Features:**
- Optimized for performance - only validates changed files
- Uses `tj-actions/changed-files` to detect modifications
- Provides detailed feedback on which files passed/failed
- Automatically skips if no YAML files were changed
- Comments on PR with detailed results

**Benefits:**
- Faster execution for large repositories
- Clear visibility into which specific files have issues
- Reduces CI/CD time for PRs with few changes
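
The per-file feedback described above could be produced along the following lines. This is a minimal sketch, not the workflow's actual logic: `select_dataset_yaml` and `summarize` are hypothetical names, the `validate` callable stands in for whatever check `bin/validate.py` performs, and restricting the selection to `datasets/` is an assumption.

```python
from typing import Callable, Iterable, List

YAML_SUFFIXES = (".yml", ".yaml")


def select_dataset_yaml(changed: Iterable[str]) -> List[str]:
    """Keep only changed YAML files under datasets/ (assumed scope)."""
    return [p for p in changed if p.startswith("datasets/") and p.endswith(YAML_SUFFIXES)]


def summarize(changed: Iterable[str], validate: Callable[[str], bool]) -> str:
    """Validate each changed YAML file and build a per-file summary."""
    files = select_dataset_yaml(changed)
    if not files:
        # Mirrors the "automatically skips" behaviour described above.
        return "No YAML files changed; validation skipped."
    lines = []
    failed = 0
    for path in files:
        ok = validate(path)
        failed += 0 if ok else 1
        lines.append(f"{'PASS' if ok else 'FAIL'} {path}")
    lines.append(f"{len(files) - failed} passed, {failed} failed")
    return "\n".join(lines)
```

The returned string is the kind of text that could be posted as a PR comment. How the real workflow formats its comment is not shown in this document.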

### 3. Validate Attack Data on Push (`validate-push.yml`)

**Triggers:** Pushes to `master` or `main` branches
**Purpose:** Safety net to catch validation failures that reach main branches

**Features:**
- Validates all dataset files after merge
- Creates GitHub issues automatically if validation fails
- Provides detailed error reporting
- Labels issues with appropriate tags for triage

**Issue Creation:**
- Creates issues labeled with `bug`, `validation-failure`, `high-priority`
- Includes commit hash and workflow run links
- Provides action items for resolution
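
As an illustration of what such an issue might contain, the sketch below assembles a title, body, and label list. The function name `build_issue` and the wording of the title and body are hypothetical. Only the three labels, the commit hash, and the workflow run link come from the description above.

```python
from typing import Dict, List


def build_issue(repo: str, sha: str, run_id: int, errors: List[str]) -> Dict[str, object]:
    """Assemble the title, body, and labels for a validation-failure issue."""
    # Workflow run pages live under /actions/runs/<run id> in the repository.
    run_url = f"https://github.com/{repo}/actions/runs/{run_id}"
    body = "\n".join(
        [
            f"Validation failed on commit `{sha}`.",
            f"Workflow run: {run_url}",
            "",
            "Errors:",
            *[f"- {e}" for e in errors],
            "",
            "Action items:",
            "- [ ] Fix the files listed above",
            "- [ ] Re-run validation locally with `bin/validate.py`",
        ]
    )
    return {
        "title": f"Attack data validation failed on {sha[:7]}",  # hypothetical wording
        "body": body,
        "labels": ["bug", "validation-failure", "high-priority"],
    }
```

The dictionary keys match the `title`, `body`, and `labels` fields accepted when creating an issue through the GitHub REST API, so the result could be passed to whichever client the workflow uses.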

### 4. Required Status Checks (`required-checks.yml`)

**Triggers:** Pull requests to `master` or `main` branches
**Purpose:** Enforce validation requirements and provide additional checks

**Features:**
- Basic YAML syntax linting with `yamllint`
- Status check requirement enforcement
- Configuration for branch protection rules

## Setup Instructions

### 1. Branch Protection Rules

To enforce these validations, configure branch protection rules in your GitHub repository:

1. Go to **Settings** → **Branches**
2. Add a rule for your main branch (`master` or `main`)
3. Enable **Require status checks to pass before merging**
4. Add these required status checks:
   - `validate-attack-data` (from validate-pr.yml)
   - `validate-changed-files` (from validate-changed-files.yml)
   - `validation-status` (from required-checks.yml)
   - `yaml-lint` (from required-checks.yml)

### 2. Repository Secrets

No additional secrets are required for the validation workflows. They use the default `GITHUB_TOKEN` for commenting on PRs and creating issues.

### 3. Dependencies

The workflows automatically install Python dependencies from `bin/requirements.txt`:
- `pyyaml`
- `jsonschema`
- Other dependencies as needed

## Validation Rules

The validation process checks:

### Schema Validation
- All YAML files must conform to the JSON schema in `bin/dataset_schema.json`
- Required fields must be present and properly formatted
- Data types must match schema specifications

### Custom Validations
- **UUID Format**: The `id` field must be a valid UUID
- **Date Format**: The `date` field must follow YYYY-MM-DD format
- **File Naming**: Template files and files with 'old' in the name are excluded
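
The three custom rules can be sketched with the Python standard library as shown below. This is an illustration of the rules as stated, not the code in `bin/validate.py`. The helper names are hypothetical, both format checks assume the field arrives as a string, and the substring test in `is_excluded` is only one possible reading of the file-naming rule.

```python
import uuid
from datetime import datetime
from pathlib import PurePosixPath


def is_valid_uuid(value: str) -> bool:
    """True if value is a UUID in canonical hyphenated form."""
    try:
        # uuid.UUID also accepts braces and unhyphenated hex, so compare
        # against the canonical string to reject extra formatting.
        return str(uuid.UUID(value)) == value.lower()
    except (ValueError, AttributeError, TypeError):
        return False


def is_valid_date(value: str) -> bool:
    """True if value is a real calendar date written as YYYY-MM-DD."""
    try:
        # Round-tripping rejects unpadded forms such as 2024-1-15.
        return datetime.strptime(value, "%Y-%m-%d").strftime("%Y-%m-%d") == value
    except (ValueError, TypeError):
        return False


def is_excluded(path: str) -> bool:
    """Assumed rule: skip files whose name contains 'template' or 'old'."""
    name = PurePosixPath(path).name.lower()
    # Note: a plain substring test also matches names like 'golden.yml'.
    return "template" in name or "old" in name
```

One caveat with the string assumption: PyYAML loads an unquoted value such as `2024-01-15` as a date object rather than a string, so a real validator would need to handle that case as well.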

### YAML Syntax
- Valid YAML syntax
- Proper indentation (2 spaces)
- Line length limits (120 characters)
- Consistent formatting
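
For a quick local approximation of two of these rules, the sketch below flags lines over 120 characters and tabs used for indentation. It is a simplified stand-in written for illustration, not `yamllint` itself. It does not parse YAML or check the 2-space indentation rule, and the function name `basic_lint` is hypothetical.

```python
from typing import List, Tuple

MAX_LINE_LENGTH = 120  # limit stated in the rules above


def basic_lint(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, message) pairs for two simple style problems."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if len(line) > MAX_LINE_LENGTH:
            problems.append((number, f"line too long ({len(line)} > {MAX_LINE_LENGTH})"))
        # Leading whitespace only: tabs elsewhere in a line are not flagged.
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            problems.append((number, "tab character in indentation"))
    return problems
```

An empty result only means these two checks passed. The full rule set is still enforced by `yamllint` in the `yaml-lint` job.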

## Workflow Outputs

### Success Scenarios
- ✅ PR comments indicating successful validation
- ✅ Green status checks in PR interface
- ✅ Detailed file-by-file validation results

### Failure Scenarios
- ❌ PR comments with error details
- ❌ Failed status checks blocking merge
- 🚨 Automatic issue creation for main branch failures
- 📝 Detailed error logs in workflow runs

## Troubleshooting

### Common Issues

1. **Schema Validation Errors**
   - Check that all required fields are present
   - Verify field data types match schema
   - Ensure proper YAML formatting

2. **UUID Format Errors**
   - Generate valid UUIDs using tools like `uuidgen`
   - Ensure no extra characters or formatting

3. **Date Format Errors**
   - Use YYYY-MM-DD format (e.g., 2024-01-15)
   - Avoid time components or other formats

4. **YAML Syntax Errors**
   - Use a YAML validator or linter
   - Check indentation (use spaces, not tabs)
   - Verify string quoting when needed

### Debugging Workflows

1. **Check Workflow Logs**
   - Go to Actions tab in GitHub
   - Click on the failed workflow run
   - Review step-by-step execution logs

2. **Local Testing**
   ```bash
   cd bin
   python validate.py ../datasets
   ```

3. **File-Specific Testing**
   ```bash
   cd bin
   python validate.py path/to/specific/file.yml
   ```

## Best Practices

### For Contributors

1. **Test Locally First**
   - Run validation script before pushing
   - Use the same schema and validation rules

2. **Keep Changes Small**
   - Smaller PRs are easier to validate and review
   - Changed-files workflow provides faster feedback

3. **Follow Schema Requirements**
   - Always include required fields
   - Use proper data types and formats
   - Reference schema documentation

### For Maintainers

1. **Monitor Validation Health**
   - Review failed workflows regularly
   - Update schema as requirements evolve
   - Keep dependencies updated

2. **Branch Protection**
   - Enforce status checks on main branches
   - Require reviews in addition to validation
   - Consider additional quality gates

3. **Issue Triage**
   - Address validation failures on main branches quickly
   - Create hotfix procedures for critical issues
   - Maintain schema documentation

## Files Structure

```
.github/
├── workflows/
│   ├── validate-pr.yml              # Full PR validation
│   ├── validate-changed-files.yml   # Changed files validation
│   ├── validate-push.yml            # Push validation
│   └── required-checks.yml          # Status checks & linting
└── VALIDATION_WORKFLOWS.md          # This documentation

bin/
├── validate.py                      # Main validation script
├── dataset_schema.json              # JSON schema definition
└── requirements.txt                 # Python dependencies

datasets/                            # Attack data files
└── **/*.yml, **/*.yaml              # Files to validate
```

## Support

For issues with validation workflows:

1. Check this documentation first
2. Review workflow logs in GitHub Actions
3. Test validation locally using the `validate.py` script
4. Create an issue if problems persist

For schema-related questions:
- Review `bin/dataset_schema.json`
- Check existing valid examples in `datasets/`
- Refer to attack data documentation

.github/workflows/required-checks.yml

Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
name: Required Status Checks

on:
  pull_request:
    branches: [ master, main ]
    types: [opened, synchronize, reopened]

jobs:
  # This job ensures all validation workflows are required for PRs
  validation-status:
    runs-on: ubuntu-latest
    steps:
      - name: Check validation status
        run: |
          echo "This workflow ensures that all validation checks are required for PR merging."
          echo "Required workflows:"
          echo " ✅ Validate Attack Data on PR"
          echo " ✅ Validate Changed Attack Data Files"
          echo ""
          echo "This job will always pass, but the other validation workflows must complete successfully."
          echo "Configure branch protection rules to require these status checks before merging."

  # Lint YAML syntax (basic check)
  yaml-lint:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.9'

      - name: Install yamllint
        run: pip install yamllint

      - name: Lint YAML files
        run: |
          # Create a yamllint configuration
          cat > .yamllint.yml << 'EOF'
          extends: default
          rules:
            line-length:
              max: 120
            indentation:
              spaces: 2
            comments:
              min-spaces-from-content: 1
            document-start: disable
            truthy: disable
          EOF

          # Find and lint all YAML files in datasets
          find datasets -name "*.yml" -o -name "*.yaml" | while read file; do
            echo "Linting: $file"
            yamllint -c .yamllint.yml "$file"
          done
