Proposal: Create a Crossplane-specific Vale configuration #1029

@jeanduplessis

Summary

The current Vale setup generates ~23,000 lines of output with more than 11,000 rule violations across the documentation. Most of these are low-value suggestions that authors routinely ignore, which undermines the linter's effectiveness. This proposal recommends a configuration that catches real issues and eliminates the noise.

Current State Analysis

Problem Summary

| Issue | Impact |
|---|---|
| 3,695 E-Prime violations | write-good.E-Prime flags every use of "is", "are", "was"; impractical for any documentation |
| 2,690 duplicate acronym warnings | Google.Acronyms + Microsoft.Acronyms both fire for the same acronyms |
| 942 "complex words" suggestions | Flags "provide" → "give", "determine" → "decide"; often wrong for technical context |
| 582 heading capitalization issues | Microsoft sentence-case rule conflicts with Crossplane's title-case headings |
| 563 vocabulary suggestions | "Verify 'allow' with the A-Z word list"; not actionable |
| 7 overlapping style packages | alex, Google, Microsoft, proselint, write-good, gitlab, Crossplane; creates duplicate/conflicting rules |

Root Causes

  1. Too many style packages: Using 7 packages creates massive overlap and conflicting rules
  2. No severity differentiation: Warnings treated the same as errors, leading to "alert fatigue"
  3. Rules not tuned for technical docs: E-Prime (avoiding "to be") is for creative writing, not technical documentation
  4. Missing Crossplane-specific acronym exceptions: CNCF, AWS, GCP, CRD, XRD not in acronym exception lists
  5. Outdated style packages: Packages are vendored and not updated via vale sync
  6. CI fails on ANY output: Even suggestions block PRs, so authors add <!-- vale off --> everywhere

Proposed Solution

1. Simplify Style Package Selection

Current: alex, Google, Microsoft, proselint, write-good, gitlab, Crossplane (7 packages)

Proposed: Google, alex, Crossplane (3 packages + selective rules from others)

Rationale:

  • Google Developer Documentation Style Guide is the most appropriate for technical docs
  • Consolidating to one primary style eliminates rule conflicts
  • Custom Crossplane rules handle project-specific needs
  • Cherry-pick individual valuable rules from other packages rather than importing everything

2. Implement Tiered Severity Levels

Following GitLab's best practice:

| Level | Purpose | CI Behavior | Examples |
|---|---|---|---|
| Error | Branding, broken content, serious issues | Fails PR | Spelling errors, deprecated terms, broken shortcodes |
| Warning | Style guide violations worth fixing | Displays, doesn't fail | Passive voice, sentence length, contractions |
| Suggestion | Nice-to-have improvements | Local editor only, ignored in CI | Word choice preferences |

CI Change: Only fail on errors, display warnings for awareness.

3. Disable High-Noise, Low-Value Rules

Rules to disable entirely:

| Rule | Reason |
|---|---|
| write-good.E-Prime | Banning "is/are/was" is impractical; generates 3,695 violations |
| Microsoft.Acronyms | Duplicate of Google.Acronyms |
| Microsoft.ComplexWords | "provide" → "give" is wrong for technical docs |
| Microsoft.Vocab | Not actionable ("verify with A-Z word list") |
| Microsoft.Headings | Conflicts with Crossplane's title-case heading style |
| Microsoft.Passive | Duplicate of Google.Passive |
| gitlab.ReadingLevel | Grade-level scoring not meaningful for technical docs |
| gitlab.FutureTense | Over-aggressive for docs that describe future behavior |
| gitlab.Uppercase | 511 violations, mostly false positives on proper nouns |
| Google.Parens | "Use parentheses judiciously" is too vague to be actionable |
| write-good.Passive | Duplicate of Google.Passive |
| proselint.* | Overlap with Google rules; disable entire package |

4. Expand Crossplane Acronym Exceptions

Add to Google/Acronyms.yml exceptions or create Crossplane/Acronyms.yml:

exceptions:
  # Cloud providers
  - AWS
  - GCP
  - EKS
  - GKE
  - AKS
  # Kubernetes ecosystem  
  - CRD
  - RBAC
  - CNI
  - CSI
  # Crossplane-specific
  - XRD
  - XRC
  - MRC
  - UXP
  # Organizations
  - CNCF
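
If the exceptions go into a separate Crossplane/Acronyms.yml, the whole rule could look roughly like the sketch below. It assumes the rule follows the same shape as Google.Acronyms as commonly distributed (a `conditional` check that pairs an acronym with its spelled-out definition); the file path, message text, and both regular expressions are illustrative assumptions, not the existing Crossplane configuration.

```yaml
# utils/vale/styles/Crossplane/Acronyms.yml (hypothetical file, sketch only)
extends: conditional
message: "Spell out '%s' on first use if it may be unfamiliar to readers."
level: suggestion
ignorecase: false
# 'first' matches a bare acronym; 'second' matches its definition,
# for example "Custom Resource Definition (CRD)".
first: '\b([A-Z]{3,5})\b'
second: '(?:\b[A-Z][a-z]+ )+\(([A-Z]{3,5})\)'
# Acronyms that never need to be spelled out in Crossplane docs.
exceptions:
  # Cloud providers
  - AWS
  - GCP
  - EKS
  - GKE
  - AKS
  # Kubernetes ecosystem
  - CRD
  - RBAC
  - CNI
  - CSI
  # Crossplane-specific
  - XRD
  - XRC
  - MRC
  - UXP
  # Organizations
  - CNCF
```

If both this rule and Google.Acronyms stay enabled, each acronym would likely be reported twice, so one of the two would need to be switched off (for example with `Google.Acronyms = NO` in .vale.ini).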

5. Modernize Package Management

Current: Vendored packages in utils/vale/styles/ with no version tracking

Proposed: Use Vale's native package management:

# .vale.ini
StylesPath = styles
MinAlertLevel = warning

Packages = Google, alex

[*.md]
BasedOnStyles = Google, alex, Crossplane

Then run vale sync to download/update packages. This enables:

  • Easy updates via vale sync
  • Version pinning in CI
  • Smaller repository size

6. Restructure CI Workflow

Current behavior: Any Vale output (including suggestions) fails the PR

Proposed:

- name: Run Vale
  run: |
    vale --config="./utils/vale/.vale.ini" \
         --minAlertLevel=error \
         ${{ steps.changed-files.outputs.all_changed_files }}

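This approach:

  • Only fails PRs on actual errors
  • Stops warnings and suggestions from blocking PRs. Note that --minAlertLevel=error also filters them out of the CI output, so showing warnings for author awareness needs a separate non-blocking run at --minAlertLevel=warning
  • Reduces friction while maintaining quality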
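
One way to get both behaviors (warnings visible, only errors blocking) is to run Vale twice. The sketch below is an assumption about how the workflow could be laid out, not the current Crossplane CI: it presumes the `vale` binary is already installed on the runner, reuses the `changed-files` step ID from the snippet above, and relies on Vale's `--no-exit` flag to keep the first run from failing the job. The step names are made up for illustration.

```yaml
# Sketch only: step names are hypothetical.
- name: Sync Vale packages
  # Run from the directory that holds .vale.ini so Vale finds it.
  working-directory: utils/vale
  run: vale sync

- name: Show warnings (non-blocking)
  run: |
    vale --config="./utils/vale/.vale.ini" \
         --minAlertLevel=warning \
         --no-exit \
         ${{ steps.changed-files.outputs.all_changed_files }}

- name: Fail on errors
  run: |
    vale --config="./utils/vale/.vale.ini" \
         --minAlertLevel=error \
         ${{ steps.changed-files.outputs.all_changed_files }}
```

Splitting the run keeps the blocking step's log short: it lists only the alerts an author must fix, while the warnings stay available in the earlier step for anyone who wants them.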

7. Create a Crossplane-Specific Style Package

Consolidate custom rules into a proper Crossplane style:

utils/vale/styles/Crossplane/
├── Spelling.yml           # (existing, enhanced)
├── Acronyms.yml           # New: Crossplane/K8s acronyms
├── Terminology.yml        # New: Enforce correct product names
├── Headings.yml           # New: Crossplane title-case standard
└── vocab/
    ├── accept.txt         # Consolidated word lists
    └── reject.txt         # Terms to avoid

Example Terminology.yml:

extends: substitution
message: "Use '%s' instead of '%s'."
level: error
ignorecase: true
swap:
  crossplane: Crossplane
  kubernetes: Kubernetes
  k8s: Kubernetes
  helm chart: Helm chart
  providerconfig: ProviderConfig
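
The Headings.yml entry in the layout above could be sketched with Vale's `capitalization` check, in the same spirit as the Terminology.yml example. Which title-case convention Crossplane follows is not stated in this proposal, so `style: AP` is a placeholder assumption, and the exception entries are illustrative only.

```yaml
# utils/vale/styles/Crossplane/Headings.yml (hypothetical file, sketch only)
extends: capitalization
message: "'%s' should use title-case capitalization."
level: warning
# Only check headings, not body text.
scope: heading
match: $title
# Title-case convention to apply; AP is an assumed placeholder.
style: AP
# Terms whose casing should be left as written (illustrative entries).
exceptions:
  - kubectl
  - ProviderConfig
```

Starting at `warning` rather than `error` would let the rule be tuned against existing headings before it can block a PR.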

Proposed .vale.ini Configuration

StylesPath = styles
MinAlertLevel = suggestion

Packages = Google, alex

[*.md]
BasedOnStyles = Google, alex, Crossplane

# === DISABLED RULES ===

# Write-good: Disable impractical rules
# (only takes effect while write-good is still listed in BasedOnStyles, i.e. during Phase 1)
write-good.E-Prime = NO
write-good.Passive = NO  # Covered by Google.Passive
write-good.TooWordy = NO
write-good.Weasel = NO

# Google: Disable overly pedantic rules  
Google.Parens = NO
Google.We = NO  # First person acceptable in tutorials

# Microsoft: Don't use (overlaps with Google)
# Not loaded

# GitLab: Don't use (too opinionated for our needs)
# Not loaded

# Proselint: Don't use (overlaps with Google)
# Not loaded

# === RULE LEVEL ADJUSTMENTS ===

# Demote to suggestion (won't fail CI)
Google.Passive = suggestion
Google.Acronyms = suggestion

# Keep as warning (displayed but won't fail CI)
Google.Contractions = warning
alex.Gendered = warning

# Keep as error (will fail CI)
Crossplane.Spelling = error
Crossplane.Terminology = error

# === TOKEN/BLOCK IGNORES ===
TokenIgnores = (\(#.*\)),(\]\(),(http.*),({{<\s*\/?.*>}}),(\!\[),(\d.*px),(\d\.\d\.\d)
BlockIgnores = (\{\{<\s*\/?.*>\}\}),(\{\{\<\/\*.*\*\/\>\}\})

Implementation Plan

Phase 1: Reduce Noise (Immediate)

  1. Disable write-good.E-Prime and other high-volume, low-value rules
  2. Remove duplicate rules (Microsoft.Acronyms, write-good.Passive)
  3. Expand acronym exceptions for Kubernetes/Crossplane terms
  4. Update CI to only fail on errors

Phase 2: Consolidate Packages (Short-term)

  1. Remove gitlab, Microsoft, proselint, write-good packages
  2. Keep Google as primary style, alex for inclusive language
  3. Create Crossplane/Terminology.yml for project-specific terms
  4. Migrate to vale sync package management

Phase 3: Clean Existing Content (Medium-term)

  1. Fix remaining errors in existing docs (now manageable count)
  2. Remove scattered <!-- vale off --> comments where possible
  3. Document which rules are enforced and why

Phase 4: Continuous Improvement (Ongoing)

  1. Review rule violations quarterly
  2. Add custom rules for recurring issues
  3. Update packages via vale sync

Expected Outcomes

| Metric | Current | After Phase 1 | After Phase 2 |
|---|---|---|---|
| Total violations | ~11,000 | ~1,500 | ~500 |
| Rules that block PRs | All | Errors only | Errors only |
| Actionable violations | ~5% | ~60% | ~90% |
| Style packages | 7 | 7 (with disabled rules) | 3 |

Appendix: Rule Violation Breakdown

Top violations in current setup:

3695 write-good.E-Prime      ← DISABLE (impractical)
1345 Microsoft.Acronyms      ← DISABLE (duplicate)
1345 Google.Acronyms         ← ADD EXCEPTIONS
 942 Microsoft.ComplexWords  ← DISABLE (wrong for tech docs)
 582 Microsoft.Headings      ← DISABLE (conflicts with style)
 567 Google.Parens           ← DISABLE (not actionable)
 563 Microsoft.Vocab         ← DISABLE (not actionable)
 511 gitlab.Uppercase        ← DISABLE (false positives)
 415 Microsoft.Passive       ← DISABLE (duplicate)
 415 Google.Passive          ← DEMOTE to suggestion
 345 gitlab.SentenceLength   ← DEMOTE to suggestion
