Summary
The current Vale setup generates roughly 23,000 lines of output and more than 11,000 rule violations across the documentation. The vast majority of these are low-value suggestions that authors routinely ignore, which undermines the effectiveness of the linter. This proposal recommends a configuration that catches real issues while eliminating noise.
Current State Analysis
Problem Summary
| Issue | Impact |
|---|---|
| 3,695 E-Prime violations | write-good.E-Prime flags every use of "is", "are", "was" — impractical for any documentation |
| 2,690 duplicate acronym warnings | Google.Acronyms + Microsoft.Acronyms both fire for the same acronyms |
| 942 "complex words" suggestions | Flags "provide" → "give", "determine" → "decide" — often wrong for technical context |
| 582 heading capitalization issues | Microsoft sentence-case rule conflicts with Crossplane's title-case headings |
| 563 vocabulary suggestions | "Verify 'allow' with the A-Z word list" — not actionable |
| 7 overlapping style packages | alex, Google, Microsoft, proselint, write-good, gitlab, Crossplane — creates duplicate/conflicting rules |
Root Causes
- Too many style packages: Using 7 packages creates massive overlap and conflicting rules
- No severity differentiation: Warnings treated the same as errors, leading to "alert fatigue"
- Rules not tuned for technical docs: E-Prime (avoiding "to be") is for creative writing, not technical documentation
- Missing Crossplane-specific acronym exceptions: CNCF, AWS, GCP, CRD, XRD not in acronym exception lists
- Outdated style packages: Packages are vendored and not updated via `vale sync`
- CI fails on ANY output: Even suggestions block PRs, so authors add `<!-- vale off -->` everywhere
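One way to confirm the overlap between packages is to look for spans that more than one rule flags. The following is a minimal sketch, assuming the report has the shape produced by `vale --output=JSON` (a mapping from file path to a list of alerts with `Check`, `Line`, and `Span` fields). The function name and the sample data are hypothetical.

```python
from collections import defaultdict


def find_overlapping_alerts(report):
    """Return locations flagged by two or more rules.

    `report` maps a file path to a list of alert dicts, each carrying
    at least "Check", "Line", and "Span" keys.
    """
    by_location = defaultdict(set)
    for path, alerts in report.items():
        for alert in alerts:
            key = (path, alert["Line"], tuple(alert["Span"]))
            by_location[key].add(alert["Check"])
    # Keep only the locations where several rules fired on the same span.
    return {
        key: sorted(checks)
        for key, checks in by_location.items()
        if len(checks) > 1
    }


# Hypothetical sample, not real output from the Crossplane docs.
sample_report = {
    "docs/intro.md": [
        {"Check": "Google.Acronyms", "Line": 4, "Span": [10, 13]},
        {"Check": "Microsoft.Acronyms", "Line": 4, "Span": [10, 13]},
        {"Check": "write-good.E-Prime", "Line": 7, "Span": [5, 6]},
    ]
}

overlaps = find_overlapping_alerts(sample_report)
for (path, line, span), checks in overlaps.items():
    print(f"{path}:{line} {span} flagged by {', '.join(checks)}")
```

Rules that show up together on most of their spans are candidates for deduplication.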
Proposed Solution
1. Simplify Style Package Selection
Current: alex, Google, Microsoft, proselint, write-good, gitlab, Crossplane (7 packages)
Proposed: Google, Crossplane, plus alex for inclusive language (3 packages + selective rules from others)
Rationale:
- Google Developer Documentation Style Guide is the most appropriate for technical docs
- Consolidating to one primary style eliminates rule conflicts
- Custom Crossplane rules handle project-specific needs
- Cherry-pick individual valuable rules from other packages rather than importing everything
2. Implement Tiered Severity Levels
Following GitLab's best practice:
| Level | Purpose | CI Behavior | Examples |
|---|---|---|---|
| Error | Branding, broken content, serious issues | Fails PR | Spelling errors, deprecated terms, broken shortcodes |
| Warning | Style guide violations worth fixing | Displays, doesn't fail | Passive voice, sentence length, contractions |
| Suggestion | Nice-to-have improvements | Local editor only, ignored in CI | Word choice preferences |
CI Change: Only fail on errors, display warnings for awareness.
3. Disable High-Noise, Low-Value Rules
Rules to disable entirely:
| Rule | Reason |
|---|---|
| `write-good.E-Prime` | Banning "is/are/was" is impractical; generates 3,695 violations |
| `Microsoft.Acronyms` | Duplicate of Google.Acronyms |
| `Microsoft.ComplexWords` | "provide" → "give" is wrong for technical docs |
| `Microsoft.Vocab` | Not actionable ("verify with A-Z word list") |
| `Microsoft.Headings` | Conflicts with Crossplane's title-case heading style |
| `gitlab.ReadingLevel` | Grade-level scoring not meaningful for technical docs |
| `gitlab.FutureTense` | Over-aggressive for docs that describe future behavior |
| `gitlab.Uppercase` | 511 violations, mostly false positives on proper nouns |
| `Google.Parens` | "Use parentheses judiciously" is too vague to be actionable |
| `write-good.Passive` | Duplicate of Google.Passive |
| `proselint.*` | Overlap with Google rules; disable entire package |
4. Expand Crossplane Acronym Exceptions
Add to Google/Acronyms.yml exceptions or create Crossplane/Acronyms.yml:
exceptions:
  # Cloud providers
  - AWS
  - GCP
  - EKS
  - GKE
  - AKS
  # Kubernetes ecosystem
  - CRD
  - RBAC
  - CNI
  - CSI
  # Crossplane-specific
  - XRD
  - XRC
  - MRC
  - UXP
  # Organizations
  - CNCF
5. Modernize Package Management
Current: Vendored packages in utils/vale/styles/ with no version tracking
Proposed: Use Vale's native package management:
# .vale.ini
StylesPath = styles
MinAlertLevel = warning
Packages = Google, alex
[*.md]
BasedOnStyles = Google, Crossplane
Then run `vale sync` to download/update packages. This enables:
- Easy updates via `vale sync`
- Version pinning in CI
- Smaller repository size
6. Restructure CI Workflow
Current behavior: Any Vale output (including suggestions) fails the PR
Proposed:
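- name: Run Vale
  run: |
    # Vale returns a non-zero exit code only for error-level alerts,
    # so warnings are printed without failing the job.
    vale --config="./utils/vale/.vale.ini" \
      --minAlertLevel=warning \
      ${{ steps.changed-files.outputs.all_changed_files }}
This approach:
- Only fails PRs on actual errors
- Still shows warnings for author awareness, while suggestions stay in local editors
- Reduces friction while maintaining quality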
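The tiered policy from section 2 can be expressed as a small gating function. This is an illustrative sketch only, not part of the proposed workflow: it assumes alerts shaped like Vale's JSON output (with `Check`, `Line`, and `Severity` fields), and `summarize_for_ci` is a hypothetical helper name.

```python
def summarize_for_ci(report):
    """Apply the tiered policy to a Vale-style report.

    Returns (exit_code, lines): errors fail the run, warnings are
    listed but do not fail it, and suggestions are dropped entirely.
    """
    lines = []
    has_error = False
    for path, alerts in sorted(report.items()):
        for alert in alerts:
            severity = alert["Severity"]
            if severity == "suggestion":
                continue  # local editor only, ignored in CI
            if severity == "error":
                has_error = True
            lines.append(f"{path}:{alert['Line']}: {severity}: {alert['Check']}")
    return (1 if has_error else 0), lines


# Hypothetical sample with one alert per tier.
sample_report = {
    "docs/guide.md": [
        {"Check": "Crossplane.Spelling", "Line": 9, "Severity": "error"},
        {"Check": "Google.Contractions", "Line": 2, "Severity": "warning"},
        {"Check": "Google.Passive", "Line": 3, "Severity": "suggestion"},
    ]
}

exit_code, output_lines = summarize_for_ci(sample_report)
for line in output_lines:
    print(line)
print("exit code:", exit_code)
```

With the sample above, the spelling error sets the exit code to 1, the contraction warning is listed, and the passive-voice suggestion never appears.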
7. Create a Crossplane-Specific Style Package
Consolidate custom rules into a proper Crossplane style:
utils/vale/styles/Crossplane/
├── Spelling.yml # (existing, enhanced)
├── Acronyms.yml # New: Crossplane/K8s acronyms
├── Terminology.yml # New: Enforce correct product names
├── Headings.yml # New: Crossplane title-case standard
├── vocab/
│ ├── accept.txt # Consolidated word lists
│ └── reject.txt # Terms to avoid
Example Terminology.yml:
extends: substitution
message: "Use '%s' instead of '%s'."
level: error
ignorecase: true
swap:
  crossplane: Crossplane
  kubernetes: Kubernetes
  k8s: Kubernetes
  helm chart: Helm chart
  providerconfig: ProviderConfig
Proposed .vale.ini Configuration
StylesPath = styles
MinAlertLevel = suggestion
Packages = Google, alex
[*.md]
BasedOnStyles = Google, alex, Crossplane
# === DISABLED RULES ===
# Write-good: Disable impractical rules
write-good.E-Prime = NO
write-good.Passive = NO # Covered by Google.Passive
write-good.TooWordy = NO
write-good.Weasel = NO
# Google: Disable overly pedantic rules
Google.Parens = NO
Google.We = NO # First person acceptable in tutorials
# Microsoft: Don't use (overlaps with Google)
# Not loaded
# GitLab: Don't use (too opinionated for our needs)
# Not loaded
# Proselint: Don't use (overlaps with Google)
# Not loaded
# === RULE LEVEL ADJUSTMENTS ===
# Demote to suggestion (won't fail CI)
Google.Passive = suggestion
Google.Acronyms = suggestion
# Keep as warning (displayed but won't fail CI)
Google.Contractions = warning
alex.Gendered = warning
# Keep as error (will fail CI)
Crossplane.Spelling = error
Crossplane.Terminology = error
# === TOKEN/BLOCK IGNORES ===
TokenIgnores = (\(#.*\)),(\]\(),(http.*),({{<\s*\/?.*>}}),(\!\[),(\d.*px),(\d\.\d\.\d)
BlockIgnores = (\{\{<\s*\/?.*>\}\}),(\{\{\<\/\*.*\*\/\>\}\})
Implementation Plan
Phase 1: Reduce Noise (Immediate)
- Disable `write-good.E-Prime` and other high-volume, low-value rules
- Remove duplicate rules (Microsoft.Acronyms, write-good.Passive)
- Expand acronym exceptions for Kubernetes/Crossplane terms
- Update CI to only fail on errors
Phase 2: Consolidate Packages (Short-term)
- Remove gitlab, Microsoft, proselint, write-good packages
- Keep Google as primary style, alex for inclusive language
- Create `Crossplane/Terminology.yml` for project-specific terms
- Migrate to `vale sync` package management
Phase 3: Clean Existing Content (Medium-term)
- Fix remaining errors in existing docs (now manageable count)
- Remove scattered `<!-- vale off -->` comments where possible
- Document which rules are enforced and why
Phase 4: Continuous Improvement (Ongoing)
- Review rule violations quarterly
- Add custom rules for recurring issues
- Update packages via `vale sync`
Expected Outcomes
| Metric | Current | After Phase 1 | After Phase 2 |
|---|---|---|---|
| Total violations | ~11,000 | ~1,500 | ~500 |
| Rules that block PRs | All | Errors only | Errors only |
| Actionable violations | ~5% | ~60% | ~90% |
| Style packages | 7 | 7 (with disabled rules) | 3 |
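As a rough cross-check of the Phase 1 estimate, the per-rule counts from the appendix can be summed by proposed action. This is a back-of-the-envelope sketch: the counts are copied from the appendix, the "disable"/"keep" labels follow its annotations, and the list covers only the top rules rather than every violation.

```python
# (rule, violation count, proposed action) taken from the appendix.
TOP_VIOLATIONS = [
    ("write-good.E-Prime", 3695, "disable"),
    ("Microsoft.Acronyms", 1345, "disable"),
    ("Google.Acronyms", 1345, "keep"),        # add exceptions
    ("Microsoft.ComplexWords", 942, "disable"),
    ("Microsoft.Headings", 582, "disable"),
    ("Google.Parens", 567, "disable"),
    ("Microsoft.Vocab", 563, "disable"),
    ("gitlab.Uppercase", 511, "disable"),
    ("Microsoft.Passive", 415, "disable"),
    ("Google.Passive", 415, "keep"),          # demote to suggestion
    ("gitlab.SentenceLength", 345, "keep"),   # demote to suggestion
]

listed_total = sum(count for _, count, _ in TOP_VIOLATIONS)
removed = sum(count for _, count, action in TOP_VIOLATIONS if action == "disable")
remaining = listed_total - removed

print(f"listed: {listed_total}, removed: {removed}, remaining: {remaining}")
# prints "listed: 10725, removed: 8620, remaining: 2105"
```

Disabling rules alone leaves about 2,100 of the listed violations, so the ~1,500 figure presumably also depends on the expanded acronym exceptions cutting into the 1,345 Google.Acronyms hits.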
Appendix: Rule Violation Breakdown
Top violations in current setup:
3695 write-good.E-Prime ← DISABLE (impractical)
1345 Microsoft.Acronyms ← DISABLE (duplicate)
1345 Google.Acronyms ← ADD EXCEPTIONS
942 Microsoft.ComplexWords ← DISABLE (wrong for tech docs)
582 Microsoft.Headings ← DISABLE (conflicts with style)
567 Google.Parens ← DISABLE (not actionable)
563 Microsoft.Vocab ← DISABLE (not actionable)
511 gitlab.Uppercase ← DISABLE (false positives)
415 Microsoft.Passive ← DISABLE (duplicate)
415 Google.Passive ← DEMOTE to suggestion
345 gitlab.SentenceLength ← DEMOTE to suggestion
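A breakdown like the one above can be regenerated from Vale's JSON output. The following is a minimal sketch, assuming the report was produced with `vale --output=JSON` and that each alert carries a `Check` field. The helper name and the sample data are hypothetical.

```python
import json
from collections import Counter


def count_by_rule(report):
    """Count alerts per rule across every file in a Vale-style report."""
    return Counter(
        alert["Check"]
        for alerts in report.values()
        for alert in alerts
    )


# Hypothetical sample standing in for a real report file.
sample_json = """
{
  "docs/a.md": [
    {"Check": "write-good.E-Prime", "Line": 1, "Severity": "warning"},
    {"Check": "Google.Acronyms", "Line": 2, "Severity": "suggestion"}
  ],
  "docs/b.md": [
    {"Check": "write-good.E-Prime", "Line": 5, "Severity": "warning"}
  ]
}
"""

report = json.loads(sample_json)
rule_counts = count_by_rule(report)
for rule, count in rule_counts.most_common():
    print(f"{count:5d} {rule}")
```

Re-running the same tally after each phase gives a like-for-like comparison with the numbers in this appendix.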