RHEcosystemAppEng
diff --git a/README.md b/README.md
Lines changed: 153 additions & 32 deletions
diff --git a/requirements.txt b/requirements.txt
Lines changed: 1 addition & 0 deletions
diff --git a/src/analysis.py b/src/analysis.py
Lines changed: 1 addition & 1 deletion
@@ -1,22 +1,33 @@
-# Vulnerability Automation Test Script
+# Vulnerability Automation Test Scripts
 
-A Python automation script that reads CVE scan requests from `scan.json`, sends them to a vulnerability service, and saves the results to results folder.
+## Confusion Matrix Script
+A Python automation script that reads CVE scan requests from `scan.json`, sends them to a vulnerability-analysis service, and saves the results to the results folder.
 
-## Features
+## Integration Tests Script
+A Python automation script that reads analysis-request test entries from `scan-it.json`, sends them to the vulnerability-analysis service, and matches the results against each entry's expectations.
+
+## Common Features
 
-- Reads `scan.json` from a configurable input directory
 - Generates payloads from templates for different languages/ecosystems
 - Sends POST requests to the vulnerability service endpoint
 - Saves results to `{scan_id}_{vuln_id}_{iteration}.json` in a configurable output directory
-- Extracts data from result files and exports to CSV format
-- Analyzes results against expected results and generates confusion matrices
+- Extracts data from result files
 - Archives reports into timestamped tar files
 - Supports command-line arguments and environment variables
 - Comprehensive error handling and logging
-- Google Sheets integration for reading input data and writing analysis results
 - Containerized for use in Tekton CI/CD pipelines
 - Automated Docker image builds via GitHub Actions with push and manual trigger options
 
+## Confusion Matrix Automation Features
+- Reads `scan.json` from a configurable input directory to generate the confusion matrix
+- Exports data extracted from the analysis results to a CSV file
+- Analyzes results against expected results and generates confusion matrices
+- Google Sheets integration for reading input data and writing analysis results
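
The binary metrics produced by the confusion matrix automation can be illustrated with a small, dependency-free sketch. It follows the convention stated in the `src/analysis.py` docstring, where Positive means Not Exploitable. The function name and the dictionary keys are hypothetical, and the repository itself lists `scikit-learn` for these calculations, so this is only an illustration of the arithmetic.

```python
def binary_metrics(expected, actual, positive="Not Exploitable"):
    """Count TP/FP/TN/FN with `positive` as the positive class and derive metrics.

    `expected` and `actual` are equal-length lists of category strings.
    Hypothetical helper, not taken from the repository.
    """
    tp = fp = tn = fn = 0
    for exp, act in zip(expected, actual):
        if act == positive:
            if exp == positive:
                tp += 1  # predicted not exploitable, and was right
            else:
                fp += 1  # predicted not exploitable, and was wrong
        else:
            if exp != positive:
                tn += 1  # predicted exploitable, and was right
            else:
                fn += 1  # predicted exploitable, and was wrong
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn,
            "accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

Because Positive is Not Exploitable here, a false positive is the costly case: a finding dismissed as noise that was in fact exploitable.
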
+## Integration Tests Automation Features
+- Reads `scan-it.json` from a configurable input directory to run the integration tests
+- Runs test cases concurrently using a thread pool of 3 workers
+- Compares actual results with each test's expectations using semantic matching logic
+- Color-coded logging based on test case success or failure
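
Running test cases on a pool of 3 workers could look roughly like the sketch below. It uses `concurrent.futures.ThreadPoolExecutor` from the standard library. The names `run_tests_concurrently` and `run_one` are hypothetical, and treating a raised exception as a failed test is an assumption rather than documented behavior of `main_integration_tests.py`.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_tests_concurrently(tests, run_one, max_workers=3):
    """Run `run_one(test)` for every test on a small thread pool.

    Returns a list of (test, passed) pairs in the same order as `tests`.
    `run_one` is a caller-supplied callable (hypothetical) returning a truthy
    value on success.
    """
    results = [None] * len(tests)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to the index of the test it belongs to.
        futures = {pool.submit(run_one, t): i for i, t in enumerate(tests)}
        for future in as_completed(futures):
            index = futures[future]
            try:
                passed = bool(future.result())
            except Exception:
                passed = False  # assumption: a crashing test counts as failed
            results[index] = (tests[index], passed)
    return results
```

Threads suit this workload because each test mostly waits on an HTTP response from the service.
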
 ## Requirements
 
 - Python 3.9+
@@ -26,6 +37,7 @@ A Python automation script that reads CVE scan requests from `scan.json`, sends
 - `scikit-learn` library (for confusion matrix calculations)
 - `gspread` library (for Google Sheets integration)
 - `google-auth-oauthlib` library (for Google Sheets authentication)
+- `colorama` library (for colored console output)
 
 ## Installation
 
@@ -78,6 +90,8 @@ The built image is pushed to: `quay.io/ecosystem-appeng/auto-cm-testing:latest`
 
 ### Local Development
 
+
+#### Confusion Matrix Automation
 Basic usage with default directories (`src/input` and `src/reports`):
 ```bash
 python src/vulnerability_main_automation.py
@@ -99,7 +113,7 @@ export SERVICE_URL=http://localhost:26466/generate
 python src/vulnerability_main_automation.py --input-dir /path/to/input
 ```
 
-### Command-line Options
+##### Command-line Options
 
 - `--input-dir`: Input directory containing `scan.json` (default: `src/input` or `INPUT_DIR` env var)
 - `--output-dir`: Output directory for result JSON files (default: `src/reports` or `OUTPUT_DIR` env var)
@@ -113,7 +127,7 @@ python src/vulnerability_main_automation.py --input-dir /path/to/input
 - `--gsheets-service-account-file`: Path to Google service account JSON file (required if `gsheets-mode` is not `none`)
 - `--gsheets-tag`: Tag/label for the run (optional, for future use)
 
-### File Structure
+##### File Structure
 
 The script expects:
 - **Input**: `scan.json` (or `scan_generated.json` if generated from Google Sheets) in the input directory
@@ -147,36 +161,136 @@ Example `scan.json` structure:
 }
 ```
 
+#### Integration Tests Automation
+Basic usage with default directories (`src/input` and `src/reports`):
+```bash
+python src/main_integration_tests.py
+```
+
+Specify input and output directories:
+```bash
+python src/main_integration_tests.py --input-dir /path/to/input --output-dir /path/to/output
+```
+
+Specify service URL:
+```bash
+python src/main_integration_tests.py --input-dir /path/to/input --url http://localhost:26466/generate
+```
+
+Using environment variables:
+```bash
+export SERVICE_URL=http://localhost:26466/generate
+python src/main_integration_tests.py --input-dir /path/to/input
+```
+
+##### Command-line Options
+
+- `--input-dir`: Input directory containing `scan-it.json` (default: `src/input` or `INPUT_DIR` env var)
+- `--output-dir`: Output directory for result JSON files (default: `src/reports` or `OUTPUT_DIR` env var)
+- `--url`: Service URL endpoint (default: `http://localhost:26466/generate` or `SERVICE_URL` env var)
+- `--timeout`: Request timeout in seconds (default: 1800 = 30 minutes or `TIMEOUT` env var)
+- `--log-level`: Logging level: `DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL` (default: `DEBUG` or `LOG_LEVEL` env var)
+- `--language`: Filter tests by language (e.g., `c`, `go`, `python`). If not specified, runs all languages (default: `None` or `LANGUAGE` env var)
+- `--strict_mode`: If `True`, the comparison is strict: both the label and the exploitability category must match exactly. If `False`, the exploitability category must still match exactly, but the actual label may be any of a small list of closely related labels (e.g., `code_not_present` in [`code_not_reachable`, `code_not_present`]). Defaults to `False`.
+  
+
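
The option list above, with environment variables as fallback defaults, can be sketched with `argparse`. This is a minimal illustration, not the parser used by `main_integration_tests.py`: the function name `build_parser` is hypothetical, and modeling `--strict_mode` as a boolean switch is an assumption, since the README does not say how the flag takes its value.

```python
import argparse
import os


def build_parser(env=os.environ):
    """Build a parser whose defaults fall back to environment variables.

    Option names and defaults follow the README list; everything else here
    is an illustrative assumption.
    """
    parser = argparse.ArgumentParser(description="Integration tests automation (sketch)")
    parser.add_argument("--input-dir", default=env.get("INPUT_DIR", "src/input"))
    parser.add_argument("--output-dir", default=env.get("OUTPUT_DIR", "src/reports"))
    parser.add_argument("--url", default=env.get("SERVICE_URL", "http://localhost:26466/generate"))
    parser.add_argument("--timeout", type=int, default=int(env.get("TIMEOUT", "1800")))
    parser.add_argument("--log-level", default=env.get("LOG_LEVEL", "DEBUG"),
                        choices=["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"])
    parser.add_argument("--language", default=env.get("LANGUAGE"))
    # Assumption: modeled as an on/off switch that is off by default.
    parser.add_argument("--strict_mode", action="store_true")
    return parser
```

An explicit command-line value always wins over the environment variable, because the environment only supplies the default.
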
+##### File Structure
+
+The script expects:
+- **Input**: `scan-it.json` in the input directory
+- **Output**: Result JSON files named `{scan_id}_{vuln_id}_{iteration}.json` in the output directory
+- **Config**: Configuration files in `src/config/` directory:
+    - `sboms/*.sbom`: SBOM files referenced in the scan configuration, for test entries that require them
+- **Templates**: Payload templates in `src/templates/` directory (e.g., `c_payload_template.json`, `go_payload_template.json`, `python_payload_template.json`, `java_payload_template.json`)
+
+Example `scan-it.json` structure:
+```json
+{
+  "iterations": 1,
+  "tests": [
+    {
+      "language": "c",
+      "vuln_id": "CVE-2025-1094",
+      "image": {
+        "name": "registry.redhat.io/rhel8/postgresql-13",
+        "tag": "1-196.1724180180"
+      },
+      "git": {
+        "repo": "https://github.com/postgres/postgres",
+        "ref": "REL_13_14"
+      },
+      "use_sbom": true,
+      "sbom_file": "sboms/postgresql-13-1-196.1724180180.sbom",
+      "expected_label": "vulnerable",
+      "expected_result": "Exploitable",
+      "allowed_deviation_labels": [
+        "vulnerable"
+      ],
+      "skip": false
+    },
+    {
+      "language": "go",
+      "vuln_id": "CVE-2025-22865",
+      "git": {
+        "repo": "https://github.com/kuadrant/authorino",
+        "ref": "f792cd138891dc1ead99fd089aa757fbca3aace9"
+      },
+      "use_sbom": false,
+      "expected_label": "vulnerable",
+      "expected_result": "Exploitable",
+      "allowed_deviation_labels": [
+        "vulnerable"
+      ],
+      "skip": false
+    }
+  ]
+}
+```
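
As a rough illustration of how entries in this structure might be selected before a run, the sketch below drops skipped entries and optionally filters by language, mirroring the `--language` option. The helper name `select_tests` is hypothetical, the assumption is that `"skip": true` excludes an entry from the run, and the third entry (`CVE-0000-0000`) is a made-up placeholder used only to exercise the skip path.

```python
import json

# Trimmed-down entries in the shape of the `scan-it.json` example.
# CVE-0000-0000 is a placeholder, not a real test entry.
SCAN_IT = json.loads("""
{
  "iterations": 1,
  "tests": [
    {"language": "c", "vuln_id": "CVE-2025-1094", "skip": false},
    {"language": "go", "vuln_id": "CVE-2025-22865", "skip": false},
    {"language": "go", "vuln_id": "CVE-0000-0000", "skip": true}
  ]
}
""")


def select_tests(scan_config, language=None):
    """Return the entries to run: not skipped and, if given, matching `language`."""
    selected = []
    for test in scan_config.get("tests", []):
        if test.get("skip", False):
            continue  # assumption: skipped entries are never sent to the service
        if language is not None and test.get("language") != language:
            continue
        selected.append(test)
    return selected
```
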
+
 ### Workflow
 
-The script performs the following steps:
+The two scripts perform the following steps:
 
-1. **Input Generation** (if Google Sheets input mode is enabled):
+1. **Input Generation** (confusion matrix automation only, if Google Sheets input mode is enabled):
    - Reads test data from Google Sheets tabs (`C_Sheet` and `Go_Sheet`)
-   - Writes data to CSV files (`prodsec_expected_results_*.csv`) to synchronize with Google Sheets
+   - Writes data to CSV files (`prodsec_expected_results_*.csv`) to synchronize with Google Sheets 
    - Generates `scan_generated.json` in the input directory
 
 2. **Execution**:
-   - Reads scan configuration from `scan.json` (or `scan_generated.json`)
+   - Reads scan configuration from `scan.json` or `scan_generated.json` (confusion matrix automation), or from `scan-it.json` (integration tests automation)
    - Generates payloads from templates for each test
    - Sends POST requests to the vulnerability service
    - Saves results as `{scan_id}_{vuln_id}_{iteration}.json` files
 
 3. **Data Extraction**:
    - Extracts key metrics from successful result JSON files
-   - Exports extracted data to `extracted_data.csv`
+   - Exports extracted data to `extracted_data.csv` 
 
-4. **Analysis**:
+4. **Analysis** (confusion matrix automation only):
    - Compares extracted data against expected results
    - Generates confusion matrices (categorical and binary)
    - Calculates performance metrics (Accuracy, Precision, Recall, F1 Score)
    - Exports analysis reports to console and Excel (`merged_data.csv`)
    - Writes results to Google Sheets (if output mode is enabled)
 
-5. **Cleanup**:
-   - Archives all result files into timestamped tar files
-   - Moves archives to `archive/` subdirectory
-   - Deletes original result files from reports directory
+
+5. **Matching** (integration tests automation only):
+   - Matches the actual result of each test case against its expected result
+   - Requires an exact match on the category (e.g., `Exploitable`, `Not Exploitable`)
+   - Is more flexible on the label when `--strict_mode` is not set: requires either that `actual_label` equals `expected_label`, or that `actual_label` is in the set of related labels defined in the test entry's `allowed_deviation_labels` (for example, [`code_not_reachable`, `code_not_present`])
+   - Prints failed test entries in red and successful ones in green
+   - Shows statistics on how many tests failed and how many succeeded
+   - Exits with return code 0 if all tests succeeded, and 1 otherwise
+6. **Cleanup**:
+   - Archives all result files into timestamped tar files
+   - Moves archives to `archive/` subdirectory
+   - Deletes original result files from reports directory
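
The Matching step in the workflow above can be sketched as a pair of small functions. This is a minimal sketch under stated assumptions: the function names are hypothetical, the field names come from the `scan-it.json` example, and the strict/lenient split follows the description of `--strict_mode`. The actual implementation in the repository may differ.

```python
def result_matches(test, actual_label, actual_result, strict_mode=False):
    """Decide whether one actual result satisfies a test entry's expectations.

    The category (`expected_result`) must always match exactly. The label must
    match exactly in strict mode; otherwise any label listed in
    `allowed_deviation_labels` is also accepted.
    """
    if actual_result != test["expected_result"]:
        return False
    if actual_label == test["expected_label"]:
        return True
    if strict_mode:
        return False
    return actual_label in test.get("allowed_deviation_labels", [])


def summarize(outcomes):
    """Return (succeeded, failed, return_code) for a list of booleans."""
    failed = sum(1 for ok in outcomes if not ok)
    succeeded = len(outcomes) - failed
    return succeeded, failed, 0 if failed == 0 else 1
```

The return code from `summarize` is what a CI pipeline would check: any single failure turns the whole run into a non-zero exit.
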
+
+
 
 ## Container Usage
 
@@ -195,13 +309,15 @@ docker run --rm \
 
 **Note:** The image is automatically built and pushed to Quay.io via GitHub Actions. See the [Automated Build with GitHub Actions](#automated-build-with-github-actions) section for details.
 
-## Google Sheets Integration
+## Confusion Matrix Automation Integrations
 
-The script supports reading input data from Google Sheets and writing analysis results to Google Sheets.
+### Google Sheets Integration
 
-### Setup
+The confusion matrix automation script supports reading input data from Google Sheets and writing analysis results to Google Sheets.
 
-#### Creating a Google Service Account and Sharing Your Sheet
+#### Setup
+
+##### Creating a Google Service Account and Sharing Your Sheet
 
 A Service Account is a special "robot" user that allows your application to access Google Sheets without requiring human authentication. Follow these steps to set it up:
 
@@ -293,7 +409,7 @@ export GSHEETS_TAG="production-run-2025"
 - Regularly rotate service account keys if compromised
 - Use the principle of least privilege: only grant the minimum permissions needed (Viewer for read-only, Editor for read-write)
 
-### Input Mode (Reading from Google Sheets)
+#### Input Mode (Reading from Google Sheets)
 
 When `gsheets-mode` is set to `input` or `both`, the script will:
 - Read test data from Google Sheets tabs (`C_Sheet` and `Go_Sheet`)
@@ -324,7 +440,7 @@ python src/vulnerability_main_automation.py \
   --url http://localhost:26466/generate
 ```
 
-### Output Mode (Writing to Google Sheets)
+#### Output Mode (Writing to Google Sheets)
 
 When `gsheets-mode` is set to `output` or `both`, the script will:
 - Write analysis results (confusion matrix metrics) to the `raw` tab in the output sheet
@@ -348,7 +464,7 @@ python src/vulnerability_main_automation.py \
   --gsheets-tag "v1.0.0"
 ```
 
-### Both Modes
+#### Both Modes
 
 To use both input and output modes:
 ```bash
@@ -362,13 +478,16 @@ python src/vulnerability_main_automation.py \
 
 ## Error Handling
 
-The script handles various error conditions:
+Both scripts handle various error conditions:
 - Missing input files
 - Invalid JSON in input or response
 - HTTP errors (4xx, 5xx)
 - Connection errors
 - Timeouts
 - File I/O errors
+
+
+### Confusion Matrix Error Handling
 - Google Sheets API errors
 - Missing service account credentials
 - Token expiration (automatically handled by refreshing credentials)
@@ -378,22 +497,24 @@ All errors are logged with appropriate detail levels.
 
 ## Logging
 
-The script uses Python's logging module with INFO level by default. Logs include:
+Both scripts use Python's logging module, with the INFO level by default. Logs include:
 - File operations (read/write)
 - HTTP request details
 - Error messages with context
 - Success confirmations
-- Google Sheets operations (read/write)
-- Token refresh operations
+- Google Sheets operations (read/write; confusion matrix script only)
+- Token refresh operations (confusion matrix script only)
 
 ## Output Files
 
-The script generates several output files:
+Both scripts generate the following output files:
 
 - **Result JSON files**: `{scan_id}_{vuln_id}_{iteration}.json` - Individual scan results
+- **Archived reports**: `archive/report_{timestamp}.tar` - Timestamped archives of result files
+
+Output files generated only by the confusion matrix automation:
 - **Extracted data CSV**: `extracted_data.csv` - Extracted metrics from all result files
 - **Merged data CSV**: `merged_data.csv` - Analysis results with confusion matrix metrics
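
The archiving of result files into `archive/report_{timestamp}.tar` could be done along these lines with the standard `tarfile` module. The function name, the timestamp format, and writing the tar directly into `archive/` (rather than creating it elsewhere and moving it) are all assumptions made for the sake of a short sketch.

```python
import tarfile
import time
from pathlib import Path


def archive_reports(reports_dir, timestamp=None):
    """Bundle top-level result JSON files into archive/report_<timestamp>.tar.

    Returns the archive path, or None when there is nothing to archive.
    The originals are deleted only after the tar file has been closed.
    """
    reports = Path(reports_dir)
    results = sorted(reports.glob("*.json"))  # non-recursive: skips archive/
    if not results:
        return None
    # Assumption: the README does not specify the timestamp format.
    timestamp = timestamp or time.strftime("%Y%m%d_%H%M%S")
    archive_dir = reports / "archive"
    archive_dir.mkdir(exist_ok=True)
    archive_path = archive_dir / f"report_{timestamp}.tar"
    with tarfile.open(archive_path, "w") as tar:
        for path in results:
            tar.add(path, arcname=path.name)
    for path in results:
        path.unlink()
    return archive_path
```
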
-- **Archived reports**: `archive/report_{timestamp}.tar` - Timestamped archives of result files
 
 ## Base Image
 
 
@@ -4,3 +4,4 @@ openpyxl>=3.1.0
 scikit-learn>=1.3.0
 gspread==6.2.1
 google-auth-oauthlib==1.2.3
+colorama==0.4.6
@@ -985,7 +985,7 @@ class NoiseReducerProcessorAnalysis(AnalysisProcessor):
     """
       in cybersecurity, false positive means it was identified wrongly as vulnerable
       In our use case, we're aiming on reducing noises/maximizing the identification of dataset items which are not really exploitable as much as possible (despite the existence of the vulnerable package version), hence :
-      We defining Positive as Not Exploitable
+      We're defining Positive as Not Exploitable
       TP - is  not exploitable and was right
       FP  -is not exploitable and was wrong
       TN - exploitable and was right