| 1 | +# GitHub Copilot Instructions for OpenAEV Collectors |
| 2 | + |
| 3 | +## Repository Overview |
| 4 | + |
| 5 | +**OpenAEV collectors** - Python integrations for security tools (EDR, XDR, SIEM, etc.) that collect data for the OpenAEV platform. Monorepo with 15 collectors. |
| 6 | + |
| 7 | +**Key Facts:** |
| 8 | +- **Language**: Python 3.11+ (CI: Python 3.13) |
| 9 | +- **Package Manager**: Poetry 2.1.3+ |
| 10 | +- **CI/CD**: CircleCI |
| 11 | +- **Collectors**: 8 in root pyproject.toml (atomic-red-team, crowdstrike, microsoft-defender, microsoft-entra, microsoft-sentinel, mitre-attack, nvd-nist-cve, tanium-threat-response), 7 standalone (aws-resources, google-workspace, microsoft-azure, microsoft-intune, openaev, sentinelone, splunk-es) |
| 12 | + |
| 13 | +## Critical Build Requirements |
| 14 | + |
| 15 | +### Poetry and Dependency Management |
| 16 | + |
| 17 | +**IMPORTANT**: The `pyoaev` dependency uses **mutually exclusive extra markers**. The extra you select determines where `pyoaev` is installed from. |
| 18 | + |
| 19 | +**Installation modes:** |
| 20 | +- **Production**: `poetry install --extras prod` (PyPI) |
| 21 | +- **Development**: `poetry install --extras dev` (local `../client-python`) |
| 22 | + |
| 23 | +**Expected dev structure:** |
| 24 | +``` |
| 25 | +/home/runner/work/ |
| 26 | +├── client-python/ # pyoaev library |
| 27 | +└── collectors/ # This repo |
| 28 | +``` |
| 29 | + |
| 30 | +**NEVER** use both `dev` and `prod` extras simultaneously. |
| 31 | + |
| 32 | +**Common issue**: `Path for pyoaev does not exist` - clone `client-python` or use `--extras prod`. |
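
The choice between the two extras can be automated. The following is a minimal sketch, assuming the local `pyoaev` source sits in a sibling `client-python` directory as in the expected dev structure; the helper names `choose_extra` and `install_command` are hypothetical and not part of this repository.

```python
from pathlib import Path


def choose_extra(repo_root: Path) -> str:
    """Return "dev" when a sibling client-python checkout exists, else "prod"."""
    # Assumption: the local pyoaev source lives at ../client-python
    # relative to the repository root.
    sibling = repo_root.resolve().parent / "client-python"
    return "dev" if sibling.is_dir() else "prod"


def install_command(repo_root: Path) -> list[str]:
    """Build the poetry command with exactly one extra, never both."""
    return ["poetry", "install", "--extras", choose_extra(repo_root)]
```

Calling `install_command(Path.cwd())` from the repository root yields the argument list to pass to `subprocess.run`. Falling back to `prod` when the sibling directory is absent avoids the `Path for pyoaev does not exist` error.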
| 33 | + |
| 34 | +## Code Quality and Linting |
| 35 | + |
| 36 | +**CI requires three checks:** |
| 37 | +1. **isort** - Import sorting (with black profile) |
| 38 | +2. **black** - Code formatting |
| 39 | +3. **flake8** - Linting |
| 40 | + |
| 41 | +**Run before committing:** |
| 42 | +```bash |
| 43 | +pip install black isort flake8 |
| 44 | +isort --profile black --check . |
| 45 | +black --check . |
| 46 | +flake8 --ignore=E,W . # Match CI behavior |
| 47 | +``` |
| 48 | + |
| 49 | +**Auto-fix:** `isort --profile black .` and `black .` |
| 50 | + |
| 51 | +**Config notes:** |
| 52 | +- **isort**: Must use `--profile black` |
| 53 | +- **flake8**: CI uses `--ignore=E,W` (overrides `.flake8` file) |
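
The three checks can also be run from one script so that every failure is reported in a single pass. This is a sketch only: the command lines are copied from the "Run before committing" block, while the `run_checks` helper and its injectable `runner` parameter are hypothetical and assume the tools are already installed on `PATH`.

```python
import subprocess

# Commands copied from the "Run before committing" block.
CHECKS = [
    ["isort", "--profile", "black", "--check", "."],
    ["black", "--check", "."],
    ["flake8", "--ignore=E,W", "."],
]


def run_checks(runner=subprocess.run) -> list[str]:
    """Run every check and return the names of the tools that failed."""
    failed = []
    for cmd in CHECKS:
        # check=False keeps going after a failure so all tools are run.
        result = runner(cmd, check=False)
        if result.returncode != 0:
            failed.append(cmd[0])
    return failed
```

An empty list from `run_checks()` means all three tools passed. The `runner` argument exists only so the function can be exercised without invoking the real tools.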
| 54 | + |
| 55 | +## Testing |
| 56 | + |
| 57 | +**Collectors with tests:** crowdstrike, sentinelone, splunk-es, nvd-nist-cve |
| 58 | + |
| 59 | +**Run tests (crowdstrike example):** |
| 60 | +```bash |
| 61 | +cd crowdstrike |
| 62 | +poetry install --extras prod |
| 63 | +poetry run pip install --force-reinstall git+https://github.com/OpenAEV-Platform/client-python.git@main |
| 64 | +poetry run python -m unittest |
| 65 | +``` |
| 66 | + |
| 67 | +## CI/CD Pipeline (CircleCI) |
| 68 | + |
| 69 | +**Job order:** |
| 70 | +1. **ensure_formatting** - black and isort checks |
| 71 | +2. **linter** - flake8 |
| 72 | +3. **test** - crowdstrike collector tests (unittest) |
| 73 | +4. **build_docker_images** - All collectors (python:3.13-alpine, Poetry 2.1.3) |
| 74 | +5. **publish_images** - Docker Hub (main/release/tags) |
| 75 | + |
| 76 | +**Branch strategy:** |
| 77 | +- **main**: Rolling tag |
| 78 | +- **release/current**: Prerelease tag |
| 79 | +- **tags (vX.Y.Z)**: Version tag |
| 80 | + |
| 81 | +## Collector Architecture |
| 82 | + |
| 83 | +**Standard structure:** |
| 84 | +``` |
| 85 | +collector-name/ |
| 86 | +├── collector_name/ # Python package |
| 87 | +│ └── openaev_<name>.py # Entry point |
| 88 | +├── test/ or tests/ # Tests (unittest) |
| 89 | +├── Dockerfile # python:3.13-alpine, Poetry 2.1.3 |
| 90 | +├── pyproject.toml # Dependencies with mutually exclusive extras |
| 91 | +└── README.md |
| 92 | +``` |
| 93 | + |
| 94 | +**Run collector:** |
| 95 | +- Poetry: `cd <collector> && poetry install --extras prod && poetry run python -m <collector_name>.openaev_<collector_name>` |
| 96 | +- Docker: `cd <collector> && docker build -t collector . && docker compose up -d` |
| 97 | + |
| 98 | +**Common env vars:** `OPENAEV_URL`, `OPENAEV_TOKEN`, `COLLECTOR_ID`, `COLLECTOR_NAME`, `COLLECTOR_PERIOD`, `COLLECTOR_LOG_LEVEL`, `COLLECTOR_PLATFORM` |
| 99 | + |
| 100 | +## Making Changes |
| 101 | + |
| 102 | +**Modify collector:** |
| 103 | +1. Make changes in collector's package directory |
| 104 | +2. **ALWAYS run linters:** `black .`, `isort --profile black .`, `flake8 --ignore=E,W .` |
| 105 | +3. Run tests if they exist: `poetry run python -m unittest` |
| 106 | +4. Test locally if possible |
| 107 | + |
| 108 | +**Add new collector:** Run `poetry new new_collector`, then edit its `pyproject.toml` to declare `pyoaev` with the mutually exclusive extra markers (see README.md). |
| 109 | + |
| 110 | +**Update dependencies:** Rely on the Renovate bot (automated) or run `poetry update <package>`. **NEVER modify the `pyoaev` dependency structure** without team approval. |
| 111 | + |
| 112 | +## Troubleshooting |
| 113 | + |
| 114 | +- **"Path for pyoaev does not exist"**: Clone `client-python` or use `--extras prod` |
| 115 | +- **Import errors**: Run `poetry install --extras prod` |
| 116 | +- **Black/isort conflicts**: Use `isort --profile black` |
| 117 | +- **Docker build fails**: Check that the Dockerfile uses Poetry 2.1.3 |
| 118 | +- **CI formatting fails**: Run `black .` and `isort --profile black .` locally |
| 119 | + |
| 120 | +## Key Files |
| 121 | + |
| 122 | +**Root:** `pyproject.toml`, `.circleci/config.yml`, `.pre-commit-config.yaml`, `.flake8`, `scripts/release.py`, `renovate.json` |
| 123 | + |
| 124 | +**Per-collector:** `pyproject.toml`, `Dockerfile`, `docker-compose.yml`, `.env.sample`, `README.md` |
| 125 | + |
| 126 | +## Best Practices |
| 127 | + |
| 128 | +1. **ALWAYS run linters before committing** - CI will fail otherwise |
| 129 | +2. **Use the correct poetry extras** - dev for local development with client-python, prod otherwise |
| 130 | +3. **Test locally when possible** - Run collectors against test instances |
| 131 | +4. **Follow existing patterns** - Look at similar collectors for examples |
| 132 | +5. **Document configuration** - Update READMEs when adding new config options |
| 133 | +6. **Use semantic versions** - Follow existing version patterns (X.Y.Z) |
| 134 | +7. **Keep dependencies up to date** - Review Renovate PRs promptly |
| 135 | + |
| 136 | +## Instructions Priority |
| 137 | + |
| 138 | +**TRUST THESE INSTRUCTIONS**. Only search for additional information if: |
| 139 | +- The instructions are incomplete for your specific task |
| 140 | +- You encounter an error not documented here |
| 141 | +- You need to understand implementation details not covered here |
| 142 | + |
| 143 | +These instructions are comprehensive and tested. Following them will minimize build failures and CI rejections. |