|
| 1 | +# AGENTS.md |
| 2 | + |
| 3 | +This file provides instructions for AI coding agents working on the **OWASP Noir Passive Rules** repository. |
| 4 | + |
| 5 | +## Repository Overview |
| 6 | + |
| 7 | +This repository contains YAML-based passive scan rules consumed by [OWASP Noir](https://github.com/owasp-noir/noir). Each rule defines patterns (keywords and regular expressions) to detect secrets, API keys, tokens, and other sensitive data in source code during static analysis. |
| 8 | + |
| 9 | +### Directory Layout |
| 10 | + |
| 11 | +``` |
| 12 | +. |
| 13 | +├── secrets/ # YAML rule definition files (main content) |
| 14 | +├── spec/ # Crystal test suite |
| 15 | +│ ├── spec_helper.cr # Rule parser & matcher engine |
| 16 | +│ └── secrets_spec.cr # Tests for every rule |
| 17 | +├── .github/ |
| 18 | +│ ├── workflows/ |
| 19 | +│ │ ├── ci.yml # Push-to-main: Crystal spec, contributors, revision |
| 20 | +│ │ └── yamllint.yml # PR: YAML lint check |
| 21 | +│ ├── yamllint.yml # yamllint configuration |
| 22 | +│ └── labeler.yml # PR auto-labeling rules |
| 23 | +├── AGENTS.md # ← You are here |
| 24 | +├── README.md |
| 25 | +├── LICENSE |
| 26 | +├── revision # Auto-updated timestamp |
| 27 | +└── CONTRIBUTORS.svg # Auto-generated contributor list |
| 28 | +``` |
| 29 | + |
| 30 | +## Rule File Schema |
| 31 | + |
| 32 | +Every rule file lives under `secrets/` and follows this structure: |
| 33 | + |
| 34 | +```yaml |
| 35 | +--- |
| 36 | +id: unique-rule-id |
| 37 | +info: |
| 38 | + name: Human readable name |
| 39 | + author: [author names] |
| 40 | + severity: critical|high|medium|low |
| 41 | + description: Description of what this rule detects |
| 42 | + reference: ['https://reference-url'] |
| 43 | +matchers-condition: or|and |
| 44 | +matchers: |
| 45 | + - type: word |
| 46 | + patterns: [KEYWORD_ONE, KEYWORD_TWO] |
| 47 | + condition: or|and |
| 48 | + - type: regex |
| 49 | + patterns: |
| 50 | + - 'regex_pattern_here' |
| 51 | + condition: or|and |
| 52 | +category: secret |
| 53 | +techs: ['*'] |
| 54 | +``` |
| 55 | + |
| 56 | +### Required Fields |
| 57 | + |
| 58 | +| Field | Description | |
| 59 | +|---|---| |
| 60 | +| `id` | Unique identifier, lowercase with hyphens (e.g. `github-token`) | |
| 61 | +| `info.name` | Human-readable rule name | |
| 62 | +| `info.author` | List of authors | |
| 63 | +| `info.severity` | One of: `critical`, `high`, `medium`, `low` | |
| 64 | +| `info.description` | What the rule detects | |
| 65 | +| `info.reference` | List of reference URLs (use `['']` if none) | |
| 66 | +| `matchers-condition` | How multiple matcher blocks combine: `or` or `and` | |
| 67 | +| `matchers` | At least one matcher block with `type`, `patterns`, `condition` | |
| 68 | +| `category` | Always `secret` | |
| 69 | +| `techs` | Always `['*']` for language-agnostic rules | |
| 70 | + |
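Before running the full spec, a quick shell check can confirm that a rule file declares the top-level keys from the table above. This is a rough sketch, not part of the repository: `missing_keys` is a hypothetical helper, and it only greps for keys at column 0. It does not parse YAML or inspect the nested `info.*` fields, which the Crystal structure tests cover.

```shell
# Hypothetical helper: print each required top-level key that is absent
# from the given rule file. Returns 0 only when nothing is missing.
missing_keys() {
  rule_file=$1
  rc=0
  for key in id info matchers-condition matchers category techs; do
    # Anchor at line start so nested keys such as "  name:" are ignored.
    # The trailing colon keeps "matchers" from matching "matchers-condition".
    if ! grep -qE "^${key}:" "$rule_file"; then
      echo "$key"
      rc=1
    fi
  done
  return "$rc"
}
```

Usage: `missing_keys secrets/your-new-rule.yaml` prints nothing for a complete file and lists the missing keys otherwise.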
| 71 | +### Matcher Types |
| 72 | + |
| 73 | +- **`word`** — Exact substring match. Use for environment variable names, fixed prefixes, and URL stems. |
| 74 | +- **`regex`** — Regular expression match. Use for structured token formats with known character classes and lengths. |
| 75 | + |
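The difference between the two matcher types can be illustrated with a rough shell analogy. This is an illustration only: the real matching happens in Noir and in `spec/spec_helper.cr`, and `grep -E` uses POSIX extended syntax, which may differ in details from the regex engine behind the rules. The helper names and the `DEMO_`/`demo_` strings are hypothetical.

```shell
# word: fixed-substring match, no characters are special.
word_match() { printf '%s\n' "$2" | grep -qF -- "$1"; }

# regex: the pattern is interpreted as an extended regular expression.
regex_match() { printf '%s\n' "$2" | grep -qE -- "$1"; }

word_match 'DEMO_API_KEY' 'export DEMO_API_KEY=abc' && echo "word hit"
regex_match 'demo_[A-Za-z0-9]{8}' 'token = demo_a1b2c3d4' && echo "regex hit"

# A dot is literal for word matching but a wildcard for regex matching.
word_match 'a.c' 'abc' || echo "word miss"
regex_match 'a.c' 'abc' && echo "regex hit on wildcard"
```

The last two lines show why fixed prefixes and variable names fit `word`, while token formats with character classes and lengths need `regex`.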
| 76 | +## Adding a New Rule — Step by Step |
| 77 | + |
| 78 | +### 1. Create the YAML Rule File |
| 79 | + |
| 80 | +Copy an existing rule as a starting point: |
| 81 | + |
| 82 | +```bash |
| 83 | +cp secrets/github-token.yaml secrets/your-new-rule.yaml |
| 84 | +``` |
| 85 | + |
| 86 | +Edit the file. Replace `id`, `info`, and `matchers` with your new rule's content. |
| 87 | + |
| 88 | +**CRITICAL**: The file MUST end with exactly one newline character. |
| 89 | + |
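The trailing-newline requirement is easy to get wrong by hand, so it can help to check it mechanically. The function below is a minimal sketch, assuming a POSIX shell with `tail`; `ends_with_single_newline` is a hypothetical helper, not something the repository ships.

```shell
# Hypothetical helper: succeed only if the file ends with exactly one newline.
ends_with_single_newline() {
  target=$1
  # Reject missing or empty files.
  [ -s "$target" ] || return 1
  # Command substitution strips trailing newlines, so an empty result
  # means the last byte is a newline.
  [ -z "$(tail -c 1 "$target")" ] || return 1
  # If the last two bytes were both newlines, this would be empty as well.
  [ -n "$(tail -c 2 "$target")" ]
}
```

Usage: `ends_with_single_newline secrets/your-new-rule.yaml || echo "fix the end of the file"`.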
| 90 | +### 2. Validate YAML Syntax |
| 91 | + |
| 92 | +```bash |
| 93 | +yamllint -c .github/yamllint.yml secrets/your-new-rule.yaml |
| 94 | +``` |
| 95 | + |
| 96 | +Expected: no output, exit code 0. This completes in under 1 second. |
| 97 | + |
| 98 | +### 3. Add Tests to `spec/secrets_spec.cr` |
| 99 | + |
| 100 | +Every rule **must** have corresponding tests. Add a new `describe` block for your rule inside the `"Passive Secret Rules"` describe block. Each rule needs at minimum: |
| 101 | + |
| 102 | +- **Positive tests** — one per word pattern and one per regex pattern, confirming `rule.match?` returns `true`. |
| 103 | +- **Negative tests** — at least one benign string that must NOT match, confirming `rule.match?` returns `false`. |
| 104 | +- **Boundary tests** — strings that are close to matching but should not (e.g. too short, wrong prefix). |
| 105 | + |
| 106 | +Example structure: |
| 107 | + |
| 108 | +```crystal |
| 109 | +describe "your-new-rule" do |
| 110 | + rule = Rule.from_file(File.join(SECRETS_DIR, "your-new-rule.yaml")) |
| 111 | +
|
| 112 | + it "matches YOUR_ENV_VAR keyword" do |
| 113 | + rule.match?("YOUR" + "_ENV_VAR=something").should be_true |
| 114 | + end |
| 115 | +
|
| 116 | + it "matches token regex pattern" do |
| 117 | + rule.match?("pre" + "fix_" + "A" * 40).should be_true |
| 118 | + end |
| 119 | +
|
| 120 | + it "does not match unrelated text" do |
| 121 | + rule.match?("This is normal text with no secrets").should be_false |
| 122 | + end |
| 123 | +
|
| 124 | + it "does not match too-short token" do |
| 125 | + rule.match?("pre" + "fix_short").should be_false |
| 126 | + end |
| 127 | +end |
| 128 | +``` |
| 129 | + |
| 130 | +### 4. Run Tests |
| 131 | + |
| 132 | +```bash |
| 133 | +crystal spec/secrets_spec.cr |
| 134 | +``` |
| 135 | + |
| 136 | +All tests must pass with 0 failures. |
| 137 | + |
| 138 | +### 5. Final Validation |
| 139 | + |
| 140 | +```bash |
| 141 | +yamllint -c .github/yamllint.yml secrets/*.yaml |
| 142 | +crystal spec/secrets_spec.cr |
| 143 | +``` |
| 144 | + |
| 145 | +Both commands must succeed before committing. |
| 146 | + |
| 147 | +## ⚠️ GitHub Push Protection — CRITICAL |
| 148 | + |
| 149 | +GitHub Push Protection scans pushed code for patterns that look like real secrets. Since this repository tests secret detection patterns, **test strings can trigger push protection and block your push entirely**. |
| 150 | + |
| 151 | +### The Problem |
| 152 | + |
| 153 | +If you write a test string like this: |
| 154 | + |
| 155 | +```crystal |
| 156 | +# ❌ BAD — GitHub will block the push |
| 157 | +rule.match?("ghp_ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghij").should be_true |
| 158 | +``` |
| 159 | + |
| 160 | +GitHub's static scanner sees `ghp_ABCDEF...` as a real GitHub Personal Access Token and rejects the push with error `GH013: Repository rule violations found`. |
| 161 | + |
| 162 | +### The Solution — Runtime String Assembly |
| 163 | + |
| 164 | +**Never write a complete secret-like literal in source code.** Instead, assemble it at runtime using string concatenation so no single string literal matches a known secret pattern. |
| 165 | + |
| 166 | +```crystal |
| 167 | +# ✅ GOOD — Assembled at runtime, invisible to static scanners |
| 168 | +rule.match?("gh" + "p_" + "ABCDEFGHIJKLMNOPQRSTUVWXYZ" + "abcdefghij").should be_true |
| 169 | +``` |
| 170 | + |
| 171 | +### Using the `FakeSecrets` Module |
| 172 | + |
| 173 | +`spec/secrets_spec.cr` contains a `FakeSecrets` module that centralises all fake secret strings. When adding new rules: |
| 174 | + |
| 175 | +1. **Add a new method** to `FakeSecrets` that builds the fake token with concatenation: |
| 176 | + |
| 177 | +```crystal |
| 178 | +module FakeSecrets |
| 179 | + # ... existing methods ... |
| 180 | +
|
| 181 | + def self.your_new_token |
| 182 | + "your_pre" + "fix_" + "A" * 40 |
| 183 | + end |
| 184 | +end |
| 185 | +``` |
| 186 | + |
| 187 | +2. **Use it in tests**: |
| 188 | + |
| 189 | +```crystal |
| 190 | +it "matches token regex" do |
| 191 | + rule.match?(FakeSecrets.your_new_token).should be_true |
| 192 | +end |
| 193 | +``` |
| 194 | + |
| 195 | +### What to Split |
| 196 | + |
| 197 | +Split the string at the **signature prefix** that secret scanners look for. Common split points: |
| 198 | + |
| 199 | +| Secret Type | Literal to Avoid | How to Split | |
| 200 | +|---|---|---| |
| 201 | +| GitHub PAT | `ghp_XXXX` | `"gh" + "p_" + "XXXX"` | |
| 202 | +| GitLab PAT | `glpat-XXXX` | `"glp" + "at-" + "XXXX"` | |
| 203 | +| AWS Key ID | `AKIAXXXX` | `"AKI" + "AXXXX"` | |
| 204 | +| OpenAI Key | `sk-XXXX` | `"sk" + "-" + "XXXX"` | |
| 205 | +| Stripe Key | `sk_live_XXXX` | `"sk" + "_live_" + "XXXX"` | |
| 206 | +| Webhook URL | `https://hooks.slack.com/services/TXXX` | `"https://hooks" + ".slack.com/services/" + "TXXX"` | |
| 207 | +| PEM Header | `-----BEGIN RSA PRIVATE KEY-----` | `"-----BEGIN " + "RSA PRIVATE KEY-----"` | |
| 208 | +| Environment variable | `AWS_SECRET_ACCESS_KEY` | `"AWS_SECRET" + "_ACCESS_KEY"` | |
| 209 | + |
| 210 | +**Rule of thumb**: if `grep -E 'known_prefix_pattern'` would match the string as it is written in the spec file, it needs to be split. |
| 211 | + |
| 212 | +### Verification Before Push |
| 213 | + |
| 214 | +Run this command to check for accidental secret-like literals in test files: |
| 215 | + |
| 216 | +```bash |
| 217 | +grep -rnE '(ghp_[A-Za-z0-9]{20,}|gho_[A-Za-z0-9]{20,}|ghu_[A-Za-z0-9]{20,}|ghs_[A-Za-z0-9]{20,}|ghr_[A-Za-z0-9]{20,}|github_pat_[A-Za-z0-9_]{20,}|glpat-[A-Za-z0-9_-]{15,}|glptt-[A-Za-z0-9_-]{15,}|sk_live_[A-Za-z0-9]{20,}|rk_live_[A-Za-z0-9]{20,}|AKIA[0-9A-Z]{16}|sk-[a-zA-Z0-9]{40,}|AIzaSy[A-Za-z0-9\-_]{30,}|xai-[A-Za-z0-9]{80,})' spec/ |
| 218 | +``` |
| 219 | + |
| 220 | +Expected: **no output**. `grep` exits with code 1 when nothing matches, so exit code 1 is the passing case here. If anything matches, split the offending string. |
| 221 | + |
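Because `grep` exits with 1 when nothing matches, the check reads backwards in scripts and hooks: a clean tree looks like a failure. A small wrapper can invert that. This is a sketch under assumptions: `check_no_secret_literals` is a hypothetical helper that is not in the repository, and the caller supplies the pattern (for example the alternation from the command above).

```shell
# Hypothetical helper: fail if any file under DIR contains a literal
# matching PATTERN; succeed when the tree is clean.
check_no_secret_literals() {
  pattern=$1
  dir=$2
  rc=0
  grep -rnE -- "$pattern" "$dir" || rc=$?
  case $rc in
    0) echo "secret-like literal found; split it with + concatenation" >&2
       return 1 ;;
    1) return 0 ;;   # grep found nothing: this is the passing case
    *) echo "grep failed with exit code $rc" >&2
       return 2 ;;
  esac
}

# Example call (shortened pattern, taken from the full command above):
# check_no_secret_literals 'ghp_[A-Za-z0-9]{20,}|AKIA[0-9A-Z]{16}' spec/
```

Treating grep's error status (2 or higher) separately avoids reporting a clean result when the directory is missing or unreadable.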
| 222 | +## Test Architecture |
| 223 | + |
| 224 | +### `spec/spec_helper.cr` |
| 225 | + |
| 226 | +Defines two structs: |
| 227 | + |
| 228 | +- **`Matcher`** — Represents one matcher block. Supports `word` (substring match) and `regex` (Regex match) types with `and`/`or` condition logic. |
| 229 | +- **`Rule`** — Represents a full YAML rule. Parses via `Rule.from_file(path)`, evaluates via `Rule#match?(text)`. |
| 230 | + |
| 231 | +### `spec/secrets_spec.cr` |
| 232 | + |
| 233 | +Organised into sections: |
| 234 | + |
| 235 | +| Section | Purpose | |
| 236 | +|---|---| |
| 237 | +| **YAML structure validation** | Validates every `.yaml` file has all required fields | |
| 238 | +| **Regex validity** | Ensures all regex patterns compile without error | |
| 239 | +| **Per-rule matching tests** | Positive, negative, and boundary tests for each rule | |
| 240 | +| **Cross-rule false positive checks** | Confirms benign strings match zero rules | |
| 241 | +| **Matchers-condition semantics** | Validates structural conventions (all use `or`, category `secret`) | |
| 242 | +| **Severity validation** | Ensures severity is one of the four allowed values | |
| 243 | + |
| 244 | +The YAML structure and regex validity tests are **automatically applied to all rule files** via `Dir.glob`, so new rules get basic validation for free. Per-rule matching tests, however, must be written manually. |
| 245 | + |
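Since per-rule tests are manual, a rule can slip in without any. The sketch below lists rule files that have no matching `describe` block. It rests on an assumption: each rule's block is named after the file's basename, as in the `describe "your-new-rule"` example earlier in this file. `rules_without_tests` is a hypothetical helper, not part of the repository.

```shell
# Hypothetical helper: print the basename of every rule file that never
# appears in a describe "<name>" line of the spec file.
rules_without_tests() {
  rules_dir=$1
  spec_file=$2
  for path in "$rules_dir"/*.yaml; do
    [ -e "$path" ] || continue          # the glob matched nothing
    name=$(basename "$path" .yaml)
    if ! grep -qF -- "describe \"$name\"" "$spec_file"; then
      echo "$name"
    fi
  done
}
```

Usage: `rules_without_tests secrets spec/secrets_spec.cr` prints nothing when every rule has a block under that naming assumption.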
| 246 | +## CI Workflows |
| 247 | + |
| 248 | +### On Push to `main` (`.github/workflows/ci.yml`) |
| 249 | + |
| 250 | +| Job | What It Does | |
| 251 | +|---|---| |
| 252 | +| `test` | Installs Crystal, runs `crystal spec/secrets_spec.cr` | |
| 253 | +| `contributors` | Updates `CONTRIBUTORS.svg` | |
| 254 | +| `revision` | Updates `revision` timestamp and pushes | |
| 255 | + |
| 256 | +### On Pull Request (`.github/workflows/yamllint.yml`) |
| 257 | + |
| 258 | +Runs `yamllint` against all `secrets/*.yaml` files. |
| 259 | + |
| 260 | +## Checklist for New Patterns |
| 261 | + |
| 262 | +Use this checklist every time you add or modify a rule: |
| 263 | + |
| 264 | +- [ ] YAML file created/modified in `secrets/` with all required fields |
| 265 | +- [ ] `yamllint -c .github/yamllint.yml secrets/your-rule.yaml` passes |
| 266 | +- [ ] File ends with exactly one newline character |
| 267 | +- [ ] `FakeSecrets` method added in `spec/secrets_spec.cr` using string concatenation |
| 268 | +- [ ] `describe` block added with positive, negative, and boundary tests |
| 269 | +- [ ] All test strings use runtime assembly — no complete secret literals |
| 270 | +- [ ] `grep -rnE '<secret patterns>' spec/` returns no matches |
| 271 | +- [ ] `crystal spec/secrets_spec.cr` passes with 0 failures |
| 272 | +- [ ] `yamllint -c .github/yamllint.yml secrets/*.yaml` passes for all files |
| 273 | + |
| 274 | +## Common Pitfalls |
| 275 | + |
| 276 | +| Pitfall | Fix | |
| 277 | +|---|---| |
| 278 | +| YAML file missing trailing newline | Ensure exactly one `\n` at end of file | |
| 279 | +| YAML indentation with tabs | Use 2 spaces, never tabs | |
| 280 | +| Regex special chars unescaped in YAML | Wrap regex in single quotes: `'pattern'` | |
| 281 | +| Push rejected by GitHub Push Protection | Split secret-like test strings with `+` concatenation | |
| 282 | +| Test passes locally but fails in CI | Ensure `spec/spec_helper.cr` and rule file are both committed | |
| 283 | +| New rule not covered by structure tests | Structure/regex tests auto-discover via glob — just ensure the file is in `secrets/` | |
| 284 | +| New rule missing matching tests | Always add a dedicated `describe` block with positive and negative cases | |