Commit e95f1be

Merge pull request #5 from RHEcosystemAppEng/add-integration-tests
feat: Add integration tests
2 parents 99cf6aa + d5563b0 commit e95f1be

14 files changed

Lines changed: 1066 additions & 86 deletions

README.md

Lines changed: 153 additions & 32 deletions
@@ -1,22 +1,33 @@
1-
# Vulnerability Automation Test Script
1+
# Vulnerability Automation Test Scripts
22

3-
A Python automation script that reads CVE scan requests from `scan.json`, sends them to a vulnerability service, and saves the results to results folder.
3+
## Confusion Matrix Script
4+
A Python automation script that reads CVE scan requests from `scan.json`, sends them to a vulnerability-analysis service, and saves the results to the results folder.
45

5-
## Features
6+
## Integration Tests Script
7+
A Python automation script that reads analysis-request test entries from the `scan-it.json` file, sends them to the vulnerability-analysis service, and matches the results against each test entry's expectations.
8+
9+
## Common Features
610

7-
- Reads `scan.json` from a configurable input directory
811
- Generates payloads from templates for different languages/ecosystems
912
- Sends POST requests to the vulnerability service endpoint
1013
- Saves results to `{scan_id}_{vuln_id}_{iteration}.json` in a configurable output directory
11-
- Extracts data from result files and exports to CSV format
12-
- Analyzes results against expected results and generates confusion matrices
14+
- Extracts data from result files
1315
- Archives reports into timestamped tar files
1416
- Supports command-line arguments and environment variables
1517
- Comprehensive error handling and logging
16-
- Google Sheets integration for reading input data and writing analysis results
1718
- Containerized for use in Tekton CI/CD pipelines
1819
- Automated Docker image builds via GitHub Actions with push and manual trigger options
1920

21+
## Confusion Matrix Automation Features
22+
- Reads `scan.json` from a configurable input directory to generate the confusion matrix
23+
- Exports data extracted from the analysis results to a CSV file
24+
- Analyzes results against expected results and generates confusion matrices
25+
- Google Sheets integration for reading input data and writing analysis results
26+
## Integration Tests Automation Features
27+
- Reads `scan-it.json` from a configurable input directory to run the integration tests
28+
- Runs test cases concurrently using a thread pool of 3 workers
29+
- Applies semantic matching logic to compare actual results with each test's expectations
30+
- Color-coded logging based on test case success or failure
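
The concurrent execution mentioned above could look roughly like the sketch below. It assumes the standard-library `concurrent.futures.ThreadPoolExecutor`; `run_test` and `run_all` are hypothetical names used only for illustration and are not taken from the repository.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

MAX_WORKERS = 3  # pool size stated in the feature list


def run_test(test: dict) -> dict:
    """Hypothetical stand-in for sending one test entry to the service."""
    return {"vuln_id": test["vuln_id"], "ok": True}


def run_all(tests: list) -> list:
    """Run every test entry on a pool of MAX_WORKERS threads."""
    results = []
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        futures = [pool.submit(run_test, test) for test in tests]
        # Collect results in completion order, not submission order.
        for future in as_completed(futures):
            results.append(future.result())
    return results
```

Because results arrive in completion order, any report that needs a stable ordering would have to sort them afterwards, for example by `vuln_id`.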
2031
## Requirements
2132

2233
- Python 3.9+
@@ -26,6 +37,7 @@ A Python automation script that reads CVE scan requests from `scan.json`, sends
2637
- `scikit-learn` library (for confusion matrix calculations)
2738
- `gspread` library (for Google Sheets integration)
2839
- `google-auth-oauthlib` library (for Google Sheets authentication)
40+
- `colorama` library (for colored console output)
2941

3042
## Installation
3143

@@ -78,6 +90,8 @@ The built image is pushed to: `quay.io/ecosystem-appeng/auto-cm-testing:latest`
7890

7991
### Local Development
8092

93+
94+
#### Confusion Matrix Automation
8195
Basic usage with default directories (`src/input` and `src/reports`):
8296
```bash
8397
python src/vulnerability_main_automation.py
@@ -99,7 +113,7 @@ export SERVICE_URL=http://localhost:26466/generate
99113
python src/vulnerability_main_automation.py --input-dir /path/to/input
100114
```
101115

102-
### Command-line Options
116+
##### Command-line Options
103117

104118
- `--input-dir`: Input directory containing `scan.json` (default: `src/input` or `INPUT_DIR` env var)
105119
- `--output-dir`: Output directory for result JSON files (default: `src/reports` or `OUTPUT_DIR` env var)
@@ -113,7 +127,7 @@ python src/vulnerability_main_automation.py --input-dir /path/to/input
113127
- `--gsheets-service-account-file`: Path to Google service account JSON file (required if `gsheets-mode` is not `none`)
114128
- `--gsheets-tag`: Tag/label for the run (optional, for future use)
115129

116-
### File Structure
130+
##### File Structure
117131

118132
The script expects:
119133
- **Input**: `scan.json` (or `scan_generated.json` if generated from Google Sheets) in the input directory
@@ -147,36 +161,136 @@ Example `scan.json` structure:
147161
}
148162
```
149163

164+
#### Integration Tests Automation
165+
Basic usage with default directories (`src/input` and `src/reports`):
166+
```bash
167+
python src/main_integration_tests.py
168+
```
169+
170+
Specify input and output directories:
171+
```bash
172+
python src/main_integration_tests.py --input-dir /path/to/input --output-dir /path/to/output
173+
```
174+
175+
Specify service URL:
176+
```bash
177+
python src/main_integration_tests.py --input-dir /path/to/input --url http://localhost:26466/generate
178+
```
179+
180+
Using environment variables:
181+
```bash
182+
export SERVICE_URL=http://localhost:26466/generate
183+
python src/main_integration_tests.py --input-dir /path/to/input
184+
```
185+
186+
##### Command-line Options
187+
188+
- `--input-dir`: Input directory containing `scan-it.json` (default: `src/input` or `INPUT_DIR` env var)
189+
- `--output-dir`: Output directory for result JSON files (default: `src/reports` or `OUTPUT_DIR` env var)
190+
- `--url`: Service URL endpoint (default: `http://localhost:26466/generate` or `SERVICE_URL` env var)
191+
- `--timeout`: Request timeout in seconds (default: 1800 = 30 minutes or `TIMEOUT` env var)
192+
- `--log-level`: Logging level: `DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL` (default: `DEBUG` or `LOG_LEVEL` env var)
193+
- `--language`: Filter tests by language (e.g., `c`, `go`, `python`). If not specified, runs all languages (default: `None` or `LANGUAGE` env var)
194+
- `--strict_mode`: If `True`, the comparison is strict: both the label and the exploitability category must match exactly. If `False`, the exploitability category must still match exactly,
195+
but the actual label may be any of a small list of closely related labels
196+
(e.g., `code_not_present` in [`code_not_reachable`, `code_not_present`]).
197+
Defaults to `False` if not specified.
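
As a rough illustration of how the options above might fall back to environment variables, here is a minimal `argparse` sketch. The option names, defaults and env var names come from the list above; the parser layout, the `str_to_bool` helper, and the choice to parse `--strict_mode` as a `True`/`False` value are assumptions, not the script's confirmed implementation.

```python
import argparse
import os


def str_to_bool(value: str) -> bool:
    """Hypothetical helper: treat '1', 'true', 'yes' (any case) as True."""
    return value.strip().lower() in {"1", "true", "yes"}


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose defaults come from env vars when they are set."""
    env = os.environ
    parser = argparse.ArgumentParser(description="Integration tests runner (sketch)")
    parser.add_argument("--input-dir", default=env.get("INPUT_DIR", "src/input"))
    parser.add_argument("--output-dir", default=env.get("OUTPUT_DIR", "src/reports"))
    parser.add_argument("--url", default=env.get("SERVICE_URL", "http://localhost:26466/generate"))
    parser.add_argument("--timeout", type=int, default=int(env.get("TIMEOUT", "1800")))
    parser.add_argument("--log-level", default=env.get("LOG_LEVEL", "DEBUG"))
    parser.add_argument("--language", default=env.get("LANGUAGE"))
    # Assumption: strict mode is passed as a value, e.g. --strict_mode True
    parser.add_argument("--strict_mode", type=str_to_bool, default=False)
    return parser
```

An explicit command-line value always wins over the environment variable here, since the environment only supplies the default.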
198+
199+
200+
##### File Structure
201+
202+
The script expects:
203+
- **Input**: `scan-it.json`
204+
- **Output**: Result JSON files named `{scan_id}_{vuln_id}_{iteration}.json` in the output directory
205+
- **Config**: Configuration files in `src/config/` directory:
206+
207+
- `sboms/*.sbom`: SBOM files referenced in the scan configuration, for test entries that require them
208+
- **Templates**: Payload templates in `src/templates/` directory (e.g., `c_payload_template.json`, `go_payload_template.json`, `python_payload_template.json`, `java_payload_template.json`)
209+
210+
Example `scan-it.json` structure:
211+
```json
212+
{
213+
"iterations": 1,
214+
"tests": [
215+
{
216+
"language": "c",
217+
"vuln_id": "CVE-2025-1094",
218+
"image": {
219+
"name": "registry.redhat.io/rhel8/postgresql-13",
220+
"tag": "1-196.1724180180"
221+
},
222+
"git": {
223+
"repo": "https://github.com/postgres/postgres",
224+
"ref": "REL_13_14"
225+
},
226+
"use_sbom": true,
227+
"sbom_file": "sboms/postgresql-13-1-196.1724180180.sbom",
228+
"expected_label": "vulnerable",
229+
"expected_result": "Exploitable",
230+
"allowed_deviation_labels": [
231+
"vulnerable"
232+
],
233+
"skip": false
234+
},
235+
{
236+
"language": "go",
237+
"vuln_id": "CVE-2025-22865",
238+
"git": {
239+
"repo": "https://github.com/kuadrant/authorino",
240+
"ref": "f792cd138891dc1ead99fd089aa757fbca3aace9"
241+
},
242+
"use_sbom": false,
243+
"expected_label": "vulnerable",
244+
"expected_result": "Exploitable",
245+
"allowed_deviation_labels": [
246+
"vulnerable"
247+
],
248+
"skip": false
249+
}
250+
]
251+
}
252+
```
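
To make the role of the `skip` and `language` fields concrete, the following sketch shows one way the test entries could be loaded and filtered before execution. The field names come from the example above; `select_tests` and `load_tests` are hypothetical helpers, and the real script may select entries differently.

```python
import json
from pathlib import Path
from typing import Optional


def select_tests(config: dict, language: Optional[str] = None) -> list:
    """Return the test entries to run, dropping skipped and filtered-out ones."""
    selected = []
    for test in config.get("tests", []):
        if test.get("skip", False):
            continue  # entry explicitly disabled in scan-it.json
        if language and test.get("language") != language:
            continue  # entry excluded by the language filter
        selected.append(test)
    return selected


def load_tests(input_dir: str, language: Optional[str] = None) -> list:
    """Read scan-it.json from input_dir and apply the same selection."""
    path = Path(input_dir) / "scan-it.json"
    config = json.loads(path.read_text(encoding="utf-8"))
    return select_tests(config, language)
```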
253+
150254
### Workflow
151255

152-
The script performs the following steps:
256+
The two scripts perform the following steps:
153257

154-
1. **Input Generation** (if Google Sheets input mode is enabled):
258+
1. **Input Generation** (if Google Sheets input mode is enabled; only performed by the confusion matrix automation):
155259
- Reads test data from Google Sheets tabs (`C_Sheet` and `Go_Sheet`)
156-
- Writes data to CSV files (`prodsec_expected_results_*.csv`) to synchronize with Google Sheets
260+
- Writes data to CSV files (`prodsec_expected_results_*.csv`) to synchronize with Google Sheets
157261
- Generates `scan_generated.json` in the input directory
158262

159263
2. **Execution**:
160-
- Reads scan configuration from `scan.json` (or `scan_generated.json`)
264+
- Reads scan configuration from `scan.json` or `scan_generated.json` (or `scan-it.json` for the integration tests)
161265
- Generates payloads from templates for each test
162266
- Sends POST requests to the vulnerability service
163267
- Saves results as `{scan_id}_{vuln_id}_{iteration}.json` files
164268

165269
3. **Data Extraction**:
166270
- Extracts key metrics from successful result JSON files
167-
- Exports extracted data to `extracted_data.csv`
271+
- Exports extracted data to `extracted_data.csv`
168272

169-
4. **Analysis**:
273+
4. **Analysis** (only performed by the confusion matrix automation):
170274
- Compares extracted data against expected results
171275
- Generates confusion matrices (categorical and binary)
172276
- Calculates performance metrics (Accuracy, Precision, Recall, F1 Score)
173277
- Exports analysis reports to console and Excel (`merged_data.csv`)
174278
- Writes results to Google Sheets (if output mode is enabled)
175279

176-
5. **Cleanup**:
177-
- Archives all result files into timestamped tar files
178-
- Moves archives to `archive/` subdirectory
179-
- Deletes original result files from reports directory
280+
281+
5. **Matching** (only performed by the integration tests automation):
282+
- Matches the actual results of the test cases against their expected results
283+
- Requires an exact match on category (e.g., `Exploitable`, `Not Exploitable`)
284+
- Is more flexible on the label: requires either `expected_label` to equal `actual_label`, or `actual_label` to be in the set of closely related labels defined in the test entry's `allowed_deviation_labels`, for example [`code_not_reachable`, `code_not_present`]
285+
- Prints red text for a test entry that failed and green text for a test entry that succeeded
286+
- Shows statistics on how many tests failed and how many succeeded
287+
- Exits with RC=0 if all tests succeeded; otherwise exits with RC=1
288+
6. **Cleanup**:
289+
- Archives all result files into timestamped tar files
290+
- Moves archives to `archive/` subdirectory
291+
- Deletes original result files from reports directory
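
The matching rules from step 5 can be sketched as follows. This is an illustrative reading of the description, not the repository's code: `matches` and `exit_code` are hypothetical names, and `actual_label` / `actual_result` stand for whatever the script extracts from a result file.

```python
def matches(test: dict, actual_label: str, actual_result: str, strict: bool = False) -> bool:
    """Compare one actual result with a test entry's expectations."""
    # The exploitability category must always match exactly.
    if actual_result != test["expected_result"]:
        return False
    # An exact label match passes in both strict and relaxed mode.
    if actual_label == test["expected_label"]:
        return True
    # Strict mode accepts nothing but the exact label.
    if strict:
        return False
    # Relaxed mode also accepts the closely related labels listed in the entry.
    return actual_label in test.get("allowed_deviation_labels", [])


def exit_code(outcomes: list) -> int:
    """Return 0 when every test passed, 1 otherwise."""
    return 0 if all(outcomes) else 1
```

For example, an entry expecting `code_not_present` with `allowed_deviation_labels` of [`code_not_reachable`, `code_not_present`] would accept an actual label of `code_not_reachable` in relaxed mode but reject it in strict mode.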
292+
293+
180294

181295
## Container Usage
182296

@@ -195,13 +309,15 @@ docker run --rm \
195309

196310
**Note:** The image is automatically built and pushed to Quay.io via GitHub Actions. See the [Automated Build with GitHub Actions](#automated-build-with-github-actions) section for details.
197311

198-
## Google Sheets Integration
312+
## Confusion Matrix automation Integrations
199313

200-
The script supports reading input data from Google Sheets and writing analysis results to Google Sheets.
314+
### Google Sheets Integration
201315

202-
### Setup
316+
The Confusion matrix automation script supports reading input data from Google Sheets and writing analysis results to Google Sheets.
203317

204-
#### Creating a Google Service Account and Sharing Your Sheet
318+
#### Setup
319+
320+
##### Creating a Google Service Account and Sharing Your Sheet
205321

206322
A Service Account is a special "robot" user that allows your application to access Google Sheets without requiring human authentication. Follow these steps to set it up:
207323

@@ -293,7 +409,7 @@ export GSHEETS_TAG="production-run-2025"
293409
- Regularly rotate service account keys if compromised
294410
- Use the principle of least privilege: only grant the minimum permissions needed (Viewer for read-only, Editor for read-write)
295411

296-
### Input Mode (Reading from Google Sheets)
412+
#### Input Mode (Reading from Google Sheets)
297413

298414
When `gsheets-mode` is set to `input` or `both`, the script will:
299415
- Read test data from Google Sheets tabs (`C_Sheet` and `Go_Sheet`)
@@ -324,7 +440,7 @@ python src/vulnerability_main_automation.py \
324440
--url http://localhost:26466/generate
325441
```
326442

327-
### Output Mode (Writing to Google Sheets)
443+
#### Output Mode (Writing to Google Sheets)
328444

329445
When `gsheets-mode` is set to `output` or `both`, the script will:
330446
- Write analysis results (confusion matrix metrics) to the `raw` tab in the output sheet
@@ -348,7 +464,7 @@ python src/vulnerability_main_automation.py \
348464
--gsheets-tag "v1.0.0"
349465
```
350466

351-
### Both Modes
467+
#### Both Modes
352468

353469
To use both input and output modes:
354470
```bash
@@ -362,13 +478,16 @@ python src/vulnerability_main_automation.py \
362478

363479
## Error Handling
364480

365-
The script handles various error conditions:
481+
Both scripts handle various error conditions:
366482
- Missing input files
367483
- Invalid JSON in input or response
368484
- HTTP errors (4xx, 5xx)
369485
- Connection errors
370486
- Timeouts
371487
- File I/O errors
488+
489+
490+
## Confusion Matrix Error Handling
372491
- Google Sheets API errors
373492
- Missing service account credentials
374493
- Token expiration (automatically handled by refreshing credentials)
@@ -378,22 +497,24 @@ All errors are logged with appropriate detail levels.
378497

379498
## Logging
380499

381-
The script uses Python's logging module with INFO level by default. Logs include:
500+
Both scripts use Python's logging module with INFO level by default. Logs include:
382501
- File operations (read/write)
383502
- HTTP request details
384503
- Error messages with context
385504
- Success confirmations
386-
- Google Sheets operations (read/write)
387-
- Token refresh operations
505+
- Google Sheets operations (read/write; confusion matrix script only)
506+
- Token refresh operations (confusion matrix script only)
388507

389508
## Output Files
390509

391-
The script generates several output files:
510+
The two scripts generate several output files:
392511

393512
- **Result JSON files**: `{scan_id}_{vuln_id}_{iteration}.json` - Individual scan results
513+
- **Archived reports**: `archive/report_{timestamp}.tar` - Timestamped archives of result files
514+
515+
Output files generated only by the confusion matrix automation:
394516
- **Extracted data CSV**: `extracted_data.csv` - Extracted metrics from all result files
395517
- **Merged data CSV**: `merged_data.csv` - Analysis results with confusion matrix metrics
396-
- **Archived reports**: `archive/report_{timestamp}.tar` - Timestamped archives of result files
397518

398519
## Base Image
399520

requirements.txt

Lines changed: 1 addition & 0 deletions
@@ -4,3 +4,4 @@ openpyxl>=3.1.0
44
scikit-learn>=1.3.0
55
gspread==6.2.1
66
google-auth-oauthlib==1.2.3
7+
colorama==0.4.6

src/analysis.py

Lines changed: 1 addition & 1 deletion
@@ -985,7 +985,7 @@ class NoiseReducerProcessorAnalysis(AnalysisProcessor):
985985
"""
986986
in cybersecurity, false positive means it was identified wrongly as vulnerable
987987
In our use case, we're aiming on reducing noises/maximizing the identification of dataset items which are not really exploitable as much as possible (despite the existence of the vulnerable package version), hence :
988-
We defining Positive as Not Exploitable
988+
We're defining Positive as Not Exploitable
989989
TP - is not exploitable and was right
990990
FP -is not exploitable and was wrong
991991
TN - exploitable and was right
