| 1 | +# Copilot Instructions for CAPEv2 |
| 2 | + |
| 3 | +## General Architecture |
| 4 | +- CAPEv2 is an automated malware analysis platform, based on Cuckoo Sandbox, with extensions for dynamic, static, and network analysis. |
| 5 | +- The backend is mainly Python, using SQLAlchemy for the database and Django/DRF for the web API. |
| 6 | +- Main components include: |
| 7 | + - `lib/cuckoo/core/database.py`: database logic and ORM. |
| 8 | + - `web/apiv2/views.py`: REST API endpoints (Django REST Framework). |
| 9 | + - `lib/cuckoo/common/`: shared utilities, configuration, helpers. |
| 10 | + - `storage/`: analysis results and temporary files. |
| 11 | +- Typical flow: sample upload → DB registration → VM assignment → analysis → result storage → API query. |
| 12 | + |
| 13 | +## Conventions and Patterns |
| 14 | +- Heavy use of SQLAlchemy 2.0 ORM, with explicit sessions and nested transactions (`begin_nested`). |
| 15 | +- Database models (Sample, Task, Machine, etc.) are always managed via `Database` object methods. |
| 16 | +- API endpoints always return a dict with `error`, `data`, and, if applicable, `error_value` keys. |
| 17 | +- Validation and request argument parsing are centralized in helpers (`parse_request_arguments`, etc.). |
| 18 | +- Integrity errors (e.g., duplicates) are handled by catching `IntegrityError` and fetching the existing object instead. |
| 19 | +- Tags are managed as comma-separated strings and normalized before being associated with models. |
| 20 | +- Code avoids mutable global variables; configuration is accessed via `Config` objects. |
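The response convention listed above can be sketched as a small helper. This is only an illustration: `build_payload` is a hypothetical name, not a function from the codebase, and keeping `data` present (as `None`) on errors is an assumption drawn from the wording of the convention.

```python
from typing import Any, Dict, Optional


def build_payload(data: Any = None, error_value: Optional[str] = None) -> Dict[str, Any]:
    """Hypothetical helper: build the dict an endpoint would return."""
    # "error" is True exactly when an error description was supplied.
    payload: Dict[str, Any] = {"error": error_value is not None, "data": data}
    if error_value is not None:
        # Attach error_value only when there is an error to describe.
        payload["error_value"] = error_value
    return payload


success = build_payload(data={"task_id": 1})
failure = build_payload(error_value="Task not found")
```

An endpoint would then wrap the result, for example `Response(build_payload(data=result))`.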
| 21 | + |
| 22 | +## Developer Workflows |
| 23 | +- No Makefile or standard build scripts; dependency management is usually via `poetry` or `pip`. |
| 24 | +- For testing, use virtual environments and run scripts manually. |
| 25 | +- Typical backend startup is via Django (`manage.py runserver`), and analysis workers are launched separately. |
| 26 | +- Database changes require manual migrations (see Alembic comments in `database.py`). |
| 27 | + |
| 28 | +## Integrations and Dependencies |
| 29 | +- Optional integration with MongoDB and Elasticsearch, controlled by configuration (`reporting.conf`). |
| 30 | +- The system can use different compression tools (zlib, 7zip) depending on config. |
| 31 | +- Sample analysis may invoke external utilities (e.g., Sflock, PE parsers). |
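As a rough sketch of configuration-controlled integrations, the snippet below reads an INI-style text and lists the sections that are switched on. The section names, the `enabled` key, and the `enabled_backends` helper are assumptions made for illustration; the project's own `Config` objects should be used in real code.

```python
import configparser
from typing import List

# Illustrative content only; the real reporting.conf may differ.
SAMPLE_REPORTING_CONF = """
[mongodb]
enabled = no

[elasticsearchdb]
enabled = yes
"""


def enabled_backends(conf_text: str) -> List[str]:
    """Hypothetical helper: return the sections whose `enabled` flag is true."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return [
        section
        for section in parser.sections()
        # A missing `enabled` key is treated as disabled.
        if parser.getboolean(section, "enabled", fallback=False)
    ]


backends = enabled_backends(SAMPLE_REPORTING_CONF)
```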
| 32 | + |
| 33 | +## Key Pattern Examples |
| 34 | +- IntegrityError handling example: |
| 35 | + ```python |
| 36 | + try: |
| 37 | +     with self.session.begin_nested(): |
| 38 | +         self.session.add(sample) |
| 39 | + except IntegrityError: |
| 40 | +     sample = self.session.scalar(select(Sample).where(Sample.md5 == file_md5)) |
| 41 | + ``` |
| 42 | +- API response example: |
| 43 | + ```python |
| 44 | + return Response({"error": False, "data": result}) |
| 45 | + ``` |
| 46 | +- Tag assignment example: |
| 47 | + ```python |
| 48 | + tags = ",".join(set(_tags)) |
| 49 | + ``` |
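A slightly fuller sketch of tag normalization, assuming that normalization means trimming whitespace, dropping empty entries, and removing duplicates. `normalize_tags` is a hypothetical helper. Unlike `set`, `dict.fromkeys` keeps first-seen order, so the output is deterministic.

```python
def normalize_tags(raw: str) -> str:
    """Hypothetical helper: clean a comma-separated tag string."""
    # Split on commas and trim surrounding whitespace from each tag.
    cleaned = (tag.strip() for tag in raw.split(","))
    # Drop empty entries; dict.fromkeys de-duplicates while preserving order.
    unique = dict.fromkeys(tag for tag in cleaned if tag)
    return ",".join(unique)


tags = normalize_tags(" x64, win10 ,x64,,")
```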
| 50 | + |
| 51 | +## Key Files |
| 52 | +- `lib/cuckoo/core/database.py`: database logic, sample/task registration, machine management. |
| 53 | +- `web/apiv2/views.py`: REST endpoints, validation, high-level business logic. |
| 54 | +- `lib/cuckoo/common/`: utilities, helpers, configuration. |
| 55 | + |
| 56 | +--- |
| 57 | + |
| 58 | +If you introduce new endpoints, helpers, or models, follow the validation, error-handling, and standard response patterns described above. See the files listed under Key Files for implementation examples. |