Commit 26fa4d6

leaklens initial commit
1 parent 93876fe commit 26fa4d6

43 files changed: 2650 additions, 0 deletions

.github/workflows/leaklens.yml

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
name: LeakLens CI Scan

on:
  pull_request:
  push:
    branches:
      - main

jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -e '.[dev]'

      - name: Run tests
        run: pytest

      - name: Run LeakLens report (SARIF)
        run: leaklens report --format sarif --output leaklens.sarif

      - name: Upload SARIF
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: leaklens.sarif

.gitignore

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
__pycache__/
*.py[cod]
*.egg-info/
.pytest_cache/
.ruff_cache/
.venv/
.leaklens-cache.json
.leaklens-baseline.json

.leaklensignore

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
# Local samples and generated assets
examples/generated/**
*.min.js

.pre-commit-config.yaml

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
repos:
  - repo: local
    hooks:
      - id: leaklens
        name: leaklens secret scan
        entry: leaklens scan --staged
        language: system
        pass_filenames: false

Makefile

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
.PHONY: install dev test lint format scan scan-staged

install:
	python -m pip install -e .

dev:
	python -m pip install -e '.[dev]'

test:
	pytest

lint:
	ruff check src tests

format:
	ruff format src tests

scan:
	python -m leaklens scan .

scan-staged:
	python -m leaklens scan --staged

README.md

Lines changed: 253 additions & 0 deletions
@@ -0,0 +1,253 @@
# LeakLens

LeakLens is a production-focused credential and secret detection tool for Git repositories.

It is designed to be stronger than regex-only scanners by combining:

- regex detection for known secret formats
- entropy detection for unknown/random secrets
- contextual code analysis for suspicious hardcoded values
- developer-friendly remediation guidance
- safe autofix suggestions (advisory only, no automatic file mutation)

## Why this exists

Most leaked credentials are introduced during normal development and slip past code review.
LeakLens helps developers catch those leaks early:

- locally via CLI
- before commit via pre-commit hook
- in CI/CD via GitHub Actions

## Features

- `leaklens scan .`
- `leaklens scan --staged`
- `leaklens scan --commit <hash>`
- `leaklens scan --diff <base> <head>`
- `leaklens rules list`
- `leaklens report --format json`
- `leaklens report --format sarif`

Deterministic CI behavior:

- stable sort order for findings and JSON/SARIF output
- non-zero exit code when findings meet or exceed the `--fail-on` severity (or the configured threshold)
- redacted previews only (never full secret output)

Detection pipeline:

1. Regex detectors (AWS/GitHub/GitLab/Slack/Stripe/OpenAI/Google/JWT/private keys/.env/db URLs)
2. Entropy detector using Shannon entropy over candidate literals
3. Context detector for suspicious assignments and auth-adjacent literals
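
As an illustration of step 2, the sketch below computes Shannon entropy in bits per character and compares it with a threshold. It is not taken from the LeakLens source: the function names and the `min_length` parameter are hypothetical, and the `4.2` default simply mirrors the `entropy_threshold` value in the configuration example.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of ``text`` in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    # H = -sum(p * log2(p)) over the observed character frequencies.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_random(literal: str, threshold: float = 4.2, min_length: int = 20) -> bool:
    """Hypothetical helper: flag long literals whose entropy meets the threshold."""
    return len(literal) >= min_length and shannon_entropy(literal) >= threshold
```

A string of one repeated character scores 0 bits, while a string of 32 distinct characters scores 5 bits, so the threshold separates repetitive identifiers from random-looking tokens.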

Output includes:

- finding type
- file path and line number
- redacted preview
- detector source(s)
- confidence score
- severity (`low|medium|high|critical`)
- risk explanation
- safer alternative
- remediation guidance
- autofix suggestion
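
One way to picture a finding record and the stable sort order mentioned earlier is the sketch below. The `Finding` class, its field names, and the sort key are assumptions made for illustration and cover only a subset of the fields listed; the actual `models.py` may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical record holding a subset of the output fields."""

    secret_type: str
    path: str
    line: int
    preview: str  # redacted preview, never the full secret
    detectors: tuple[str, ...]
    confidence: float
    severity: str  # one of: low, medium, high, critical


def sort_findings(findings: list[Finding]) -> list[Finding]:
    """Order findings by a fixed key so repeated runs emit identical output."""
    return sorted(findings, key=lambda f: (f.path, f.line, f.secret_type, f.preview))
```

Sorting on a key built only from finding content, rather than discovery order, keeps JSON and SARIF reports diff-friendly across runs.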

## Installation

```bash
pip install -e .
```

Development setup:

```bash
pip install -e '.[dev]'
```

## Usage

Command quick reference:

| Command | Purpose |
| --- | --- |
| `leaklens scan .` | Full repository scan |
| `leaklens scan --staged` | Staged changes scan |
| `leaklens scan --commit <hash>` | Single commit scan |
| `leaklens scan --diff <base> <head>` | Commit-range diff scan |
| `leaklens rules list` | List active rules |
| `leaklens report --format json` | CI JSON report |
| `leaklens report --format sarif` | SARIF report for code scanning |

Scan repository:

```bash
leaklens scan .
```

Scan staged changes:

```bash
leaklens scan --staged
```

Scan specific commit:

```bash
leaklens scan --commit <hash>
```

Scan diff range:

```bash
leaklens scan --diff main HEAD
```

List rules:

```bash
leaklens rules list
```

CI JSON report:

```bash
leaklens report --format json
```

SARIF report:

```bash
leaklens report --format sarif --output leaklens.sarif
```

Version:

```bash
leaklens --version
```

Fail threshold override:

```bash
leaklens scan . --fail-on high
```

Run as module:

```bash
python -m leaklens scan .
```

Exit code semantics:

- `0`: no findings at/above fail threshold
- `1`: findings at/above fail threshold
- `2`: CLI usage/configuration errors
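
The exit code rules above can be sketched as a small function. This is an illustration under stated assumptions, not the LeakLens implementation: `exit_code` is a hypothetical name, severities are assumed to be ranked `low < medium < high < critical`, and every finding severity is assumed to be one of those four values.

```python
SEVERITY_ORDER = ("low", "medium", "high", "critical")


def exit_code(severities: list[str], fail_on: str) -> int:
    """Map finding severities and a fail threshold to a process exit code."""
    if fail_on not in SEVERITY_ORDER:
        return 2  # treated as a usage/configuration error
    threshold = SEVERITY_ORDER.index(fail_on)
    # Any finding at or above the threshold fails the run.
    failed = any(SEVERITY_ORDER.index(s) >= threshold for s in severities)
    return 1 if failed else 0
```

With `fail_on="high"`, a run containing only `low` and `medium` findings would exit `0`, while a single `critical` finding would exit `1`.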

## Configuration

Default config file: `leaklens.yml`

Example:

```yaml
entropy_threshold: 4.2
severity_threshold: medium
enabled_detectors: [regex, entropy, context]
ignored_paths:
  - "node_modules/**"
allowlist:
  values: ["example-secret"]
  patterns: ["^dummy_"]
rules:
  - name: custom_internal_token
    regex: "inttok_[A-Za-z0-9]{24}"
    secret_type: "Internal API Token"
    severity: high
    confidence: 0.9
baseline_file: .leaklens-baseline.json
```
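
To show how the custom rule and allowlist entries in this example might interact, here is a minimal sketch using Python's `re` module. The helper names are hypothetical, and applying allowlist patterns with `re.search` is an assumption; the configuration format itself does not say how they are matched.

```python
import re

# Values copied from the example configuration above.
RULE_REGEX = re.compile(r"inttok_[A-Za-z0-9]{24}")
ALLOW_VALUES = {"example-secret"}
ALLOW_PATTERNS = [re.compile(r"^dummy_")]


def is_allowlisted(value: str) -> bool:
    """Return True when a value matches an exact allowlist entry or pattern."""
    return value in ALLOW_VALUES or any(p.search(value) for p in ALLOW_PATTERNS)


def match_custom_rule(line: str) -> list[str]:
    """Return custom-rule matches on a line that are not allowlisted."""
    return [
        m.group(0)
        for m in RULE_REGEX.finditer(line)
        if not is_allowlisted(m.group(0))
    ]
```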

## Ignore and baseline support

- `.leaklensignore` for path patterns
- inline ignore markers: `leaklens:ignore`
- allowlist values and patterns in config
- baseline suppression via fingerprints
- legacy compatibility: `.aicredleakignore` and `aicredleak:ignore` are also accepted
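
A rough sketch of path and inline ignores follows, using `fnmatch` from the standard library. The patterns are the ones in this commit's `.leaklensignore`; the function names are hypothetical, and real matching may follow gitignore-style semantics that differ from `fnmatch` (where `*` also crosses `/`).

```python
from fnmatch import fnmatchcase

# Patterns as they appear in the repository's .leaklensignore.
IGNORE_PATTERNS = ["examples/generated/**", "*.min.js"]
# Current and legacy inline markers listed above.
INLINE_MARKERS = ("leaklens:ignore", "aicredleak:ignore")


def is_path_ignored(path: str, patterns: list[str] = IGNORE_PATTERNS) -> bool:
    """Return True when a repository-relative path matches any ignore pattern."""
    normalized = path.replace("\\", "/")
    return any(fnmatchcase(normalized, pattern) for pattern in patterns)


def has_inline_ignore(line: str) -> bool:
    """Return True when a source line carries an inline ignore marker."""
    return any(marker in line for marker in INLINE_MARKERS)
```

Whether a marker suppresses findings on its own line or on the following line is not specified here, so the sketch only detects the marker.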

Generate baseline from current findings:

```bash
leaklens scan . --write-baseline .leaklens-baseline.json
```
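
Baseline suppression via fingerprints could work along the lines of the sketch below. The fingerprint recipe (rule name, path, and a hash of the secret) is an assumption chosen so that no raw secret is stored; the actual format of `.leaklens-baseline.json` is not shown in this commit excerpt.

```python
import hashlib


def fingerprint(rule_name: str, path: str, secret: str) -> str:
    """Build a stable identifier for a finding without storing the secret."""
    secret_digest = hashlib.sha256(secret.encode("utf-8")).hexdigest()
    payload = "\x00".join((rule_name, path, secret_digest))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def filter_new(findings: list[tuple[str, str, str]], baseline: set[str]) -> list[tuple[str, str, str]]:
    """Keep only findings whose fingerprint is absent from the baseline."""
    return [f for f in findings if fingerprint(*f) not in baseline]
```

Leaving the line number out of the fingerprint is a deliberate choice in this sketch: a baselined finding stays suppressed when unrelated edits shift it up or down the file.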

## Safe redaction

LeakLens never prints full secret values. Example previews:

- `ghp_****ABCD`
- `sk-****XYZ`
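
Previews of this shape could be produced by a function like the one below. It is a sketch, not the LeakLens redaction code: the `redact` name, the prefix heuristic (keep text up to the first `_` or `-` within the first eight characters), and the `keep_tail` parameter are all assumptions.

```python
def redact(secret: str, keep_tail: int = 4) -> str:
    """Return a preview that keeps a vendor prefix and a short tail only."""
    prefix = ""
    # Keep a short vendor prefix such as "ghp_" or "sk-" when one is present.
    for i, ch in enumerate(secret[:8]):
        if ch in "_-":
            prefix = secret[: i + 1]
            break
    body = secret[len(prefix):]
    # Short bodies are masked entirely so the tail cannot reveal most of them.
    if len(body) <= keep_tail * 2:
        return prefix + "****"
    return f"{prefix}****{body[-keep_tail:]}"
```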

## Pre-commit setup

Use the included `.pre-commit-config.yaml` hook:

```yaml
repos:
  - repo: local
    hooks:
      - id: leaklens
        name: leaklens secret scan
        entry: leaklens scan --staged
        language: system
        pass_filenames: false
```

## GitHub Actions setup

Use `.github/workflows/leaklens.yml`.

The workflow:

- installs dependencies
- runs tests
- generates SARIF via `leaklens report --format sarif`
- uploads SARIF to GitHub Code Scanning
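
For readers unfamiliar with SARIF, the sketch below builds a minimal SARIF 2.1.0 document from finding dictionaries. The input keys (`rule`, `severity`, `message`, `path`, `line`) and the severity-to-level mapping are assumptions for illustration; the reporter shipped in `reporters/` may structure its output differently.

```python
import json

# Assumed mapping from LeakLens severities to SARIF result levels.
SARIF_LEVELS = {"low": "note", "medium": "warning", "high": "error", "critical": "error"}


def to_sarif(findings: list[dict]) -> dict:
    """Convert finding dictionaries into a minimal SARIF 2.1.0 log."""
    ordered = sorted(findings, key=lambda f: (f["path"], f["line"], f["rule"]))
    results = [
        {
            "ruleId": f["rule"],
            "level": SARIF_LEVELS[f["severity"]],
            "message": {"text": f["message"]},
            "locations": [
                {
                    "physicalLocation": {
                        "artifactLocation": {"uri": f["path"]},
                        "region": {"startLine": f["line"]},
                    }
                }
            ],
        }
        for f in ordered
    ]
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [{"tool": {"driver": {"name": "LeakLens"}}, "results": results}],
    }


if __name__ == "__main__":
    sample = [{"rule": "github_token", "severity": "high",
               "message": "Possible GitHub token (ghp_****ABCD)",
               "path": "app.py", "line": 3}]
    print(json.dumps(to_sarif(sample), indent=2, sort_keys=True))
```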

## Project structure

```text
src/leaklens/
  cli.py
  config.py
  engine.py
  rules.py
  models.py
  detectors/
  reporters/
tests/
examples/
.github/workflows/
```

## Limitations

- No live credential validity checks by default (offline-safe behavior)
- Context detection is heuristic and may produce false positives in edge cases
- Binary and generated minified assets are intentionally skipped
- LeakLens does not rewrite source files automatically; autofix output is advisory guidance

## Roadmap

- Optional AI review stage for borderline findings
- PR comment bot integration for developer feedback loops
- Secret validity verification integrations (cloud/vendor APIs)
- Exposure timeline analysis across commit history and branches

## Quality checks

```bash
pytest
ruff check src tests
ruff format src tests
```

examples/vulnerable_repo/.env

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
GITHUB_TOKEN=ghp_1234567890abcdefghijklmnopqrstuvwxyzABCD
GITLAB_TOKEN=glpat-AbCdEfGhIjKlMnOpQrStUvWx

examples/vulnerable_repo/README.md

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
This directory intentionally contains fake secrets for LeakLens testing.
Do not use these values in production.

examples/vulnerable_repo/app.py

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
"""Intentionally vulnerable sample for scanner validation."""

aws_access_key_id = "AKIAIOSFODNN7EXAMPLE"
stripe_key = "sk_live_1234567890abcdefghijklmnop"
openai_api_key = "sk-proj-abcdefghijklmnopqrstuvwxyz123456"
db_url = "postgres://admin:supersecret@db.internal/prod"

# leaklens:ignore
password = "example"

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
service:
  auth_token: "xoxb-123456789012-123456789012-abcdefghijklmnop"
