Commit 9cd18a3

feat: improve mutant quality, reduce syntax error for mutant generation (#10)
* feat: improve mutant quality, reduce syntax error for mutant generation
* chore: update pyproject.toml to version 1.1.6
1 parent 61f2b7f commit 9cd18a3

12 files changed: +432 -385 lines changed

examples/java_maven/readme.md

Lines changed: 7 additions & 0 deletions
@@ -19,6 +19,13 @@ export OPENAI_API_KEY=your-key-goes-here
 mutahunter run --test-command "mvn test" --code-coverage-report-path "target/site/jacoco/jacoco.xml" --coverage-type jacoco --model "gpt-4o-mini"
 ```
 
+### Generate unit tests to increase line and mutation coverage
+
+```bash
+# remove some tests
+mutahunter gen --test-command "mvn test" --code-coverage-report-path "target/site/jacoco/jacoco.xml" --test-file-path "src/test/java/BankAccountTest.java" --source-file-path "src/main/java/com/example/BankAccount.java" --coverage-type jacoco --model "gpt-4o-mini"
+```
+
 ### Surviving Mutant Analysis
 
 [Mutants](./mutants.json)

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ build-backend = "setuptools.build_meta"
 name = 'mutahunter'
 description = "LLM Mutation Testing for any programming language"
 requires-python = ">= 3.11"
-version = "1.1.5"
+version = "1.1.6"
 dependencies = [
     "tree-sitter==0.21.3",
     'tree_sitter_languages==1.10.2',

src/mutahunter/core/mutator.py

Lines changed: 3 additions & 3 deletions
@@ -15,7 +15,7 @@
 from mutahunter.core.error_parser import extract_error_message
 from mutahunter.core.llm_mutation_engine import LLMMutationEngine
 from mutahunter.core.logger import logger
-from mutahunter.core.prompts.user import MUTANT_ANALYSIS
+from mutahunter.core.prompts.mutant_generator import MUTANT_ANALYSIS
 from mutahunter.core.report import MutantReport
 from mutahunter.core.router import LLMRouter
 from mutahunter.core.runner import TestRunner
@@ -309,7 +309,7 @@ def prepare_mutant_file(
         ):
             with open(mutant_path, "wb") as f:
                 f.write(modified_byte_code)
-            self.logger.info(f"Mutant file prepared: {mutant_path}")
+            self.logger.debug(f"Mutant file prepared: {mutant_path}")
             return mutant_path
         else:
             self.logger.error(
@@ -322,7 +322,7 @@ def run_test(self, params: Dict[str, str]) -> Any:
         Runs the test command on the given parameters.
         """
         self.logger.info(
-            f"Running test command: {params['test_command']} for mutant file: {params['replacement_module_path']}"
+            f"'{params['test_command']}' - '{params['replacement_module_path']}'"
         )
         return self.test_runner.run_test(params)
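
The two logging edits above reduce console noise: the "Mutant file prepared" message drops from `info` to `debug`, and the test-run message is shortened. The following is a minimal sketch of the visible effect, assuming the logger is configured at `INFO` level (this diff does not show the project's logger configuration). The logger name, format string, and sample `params` values are made up for illustration.

```python
import io
import logging

# Hypothetical logger set up at INFO level; the real configuration lives in
# mutahunter.core.logger and is not shown in this diff.
stream = io.StringIO()
logger = logging.getLogger("mutahunter_sketch")
logger.setLevel(logging.INFO)
logger.propagate = False
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
logger.addHandler(handler)

# Made-up sample values standing in for the real run parameters.
params = {
    "test_command": "mvn test",
    "replacement_module_path": "/tmp/mutant.java",
}

# After the change this message is emitted at DEBUG, so an INFO-level logger drops it.
logger.debug(f"Mutant file prepared: {params['replacement_module_path']}")

# The shortened message is still emitted at INFO.
logger.info(f"'{params['test_command']}' - '{params['replacement_module_path']}'")

output = stream.getvalue()
print(output, end="")
```

Under that assumption only the shortened test-run line reaches the handler; lowering the logger to `DEBUG` would bring the "Mutant file prepared" line back.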

src/mutahunter/core/prompts/factory.py

Lines changed: 1 addition & 2 deletions
@@ -2,8 +2,7 @@
 Module for generating prompts based on the programming language.
 """
 
-from mutahunter.core.prompts.system import SYSTEM_PROMPT
-from mutahunter.core.prompts.user import USER_PROMPT
+from mutahunter.core.prompts.mutant_generator import SYSTEM_PROMPT, USER_PROMPT
 
 
 class PromptFactory:

Lines changed: 137 additions & 0 deletions
@@ -0,0 +1,137 @@
+SYSTEM_PROMPT = """
+You are an AI Agent part of the Software Quality Assurance Team. Your task is to mutate the {{language}} code provided to you. You will be provided with the Abstract Syntax Tree (AST) of the source code for contextual understanding. This AST will help you understand the entire source code. Make sure to read the AST before proceeding with the mutation.
+
+## Mutation Guidelines
+1. Modify Core Logic:
+   - Conditional Statements: Introduce incorrect conditions (e.g., `if (a < b)` changed to `if (a <= b)`).
+   - Loop Logic: Alter loop conditions to cause infinite loops or early termination.
+   - Avoid "Malicious Mutations" such as introducing non-terminating constructs like `while(true)` which halt the program and prevent further testing.
+   - Avoid mutations that change too much of the application's logic, which can disrupt the system excessively, potentially concealing subtler issues and leading to unproductive testing outcomes.
+   - Calculations: Introduce off-by-one errors or incorrect mathematical operations.
+2. Alter Outputs:
+   - Return Values: Change the expected return type (e.g., returning `null` instead of an object).
+   - Response Formats: Modify response structure (e.g., missing keys in a JSON response).
+   - Data Corruption: Return corrupted or incomplete data.
+3. Change Method Calls:
+   - Parameter Tampering: Pass incorrect or malicious parameters.
+   - Function Replacement: Replace critical functions with no-op or harmful ones.
+   - Dependency Removal: Omit critical method calls that maintain state or security.
+4. Simulate Failures:
+   - Exception Injection: Introduce runtime exceptions (e.g., `NullPointerException`, `IndexOutOfBoundsException`).
+   - Resource Failures: Simulate failures in external resources (e.g., database disconnection, file not found).
+5. Modify Data Handling:
+   - Parsing Errors: Introduce parsing errors for data inputs (e.g., incorrect date formats).
+   - Validation Bypass: Disable or weaken data validation checks.
+   - State Alteration: Incorrectly alter object states, leading to inconsistent data.
+6. Introduce Boundary Conditions:
+   - Array Indices: Use out-of-bounds indices.
+   - Parameter Extremes: Use extreme values for parameters (e.g., maximum integers, very large strings).
+   - Memory Limits: Introduce large inputs to test memory handling.
+7. Timing and Concurrency:
+   - Race Conditions: Alter synchronization to create race conditions.
+   - Deadlocks: Introduce scenarios that can lead to deadlocks.
+   - Timeouts: Simulate timeouts in critical operations.
+8. Replicate Known CVE Bugs:
+   - Buffer Overflow: Introduce buffer overflows by manipulating array sizes.
+   - SQL Injection: Allow unsanitized input to be passed to SQL queries.
+   - Cross-Site Scripting (XSS): Introduce vulnerabilities that allow JavaScript injection in web responses.
+   - Cross-Site Request Forgery (CSRF): Bypass anti-CSRF measures.
+   - Path Traversal: Modify file access logic to allow path traversal attacks.
+   - Insecure Deserialization: Introduce vulnerabilities in deserialization logic.
+   - Privilege Escalation: Modify role-based access controls to allow unauthorized actions.
+"""
+
+USER_PROMPT = """
+## Abstract Syntax Tree (AST) for Context
+```ast
+{{ast}}
+```
+
+## Response
+The output should be formatted as a YAML object that corresponds accurately with the $Mutants type as defined in our system's schema. Below are the Pydantic class definitions that outline the structure and details required for the YAML output.
+```python
+from pydantic import BaseModel, Field
+from typing import List
+
+class SingleMutant(BaseModel):
+    function_name: str = Field(..., description="The name of the function where the mutation was applied.")
+    type: str = Field(..., description="The type of the mutation operator used.")
+    description: str = Field(..., description="A brief description detailing the mutation applied.")
+    original_line: str = Field(..., description="The original line of code before mutation. Exclude any line numbers and ensure proper formatting for YAML literal block scalar.")
+    mutated_line: str = Field(..., description="The mutated line of code, annotated with a comment explaining the mutation. Exclude any line numbers and ensure proper formatting for YAML literal block scalar.")
+
+class Mutants(BaseModel):
+    changes: List[SingleMutant] = Field(..., description="A list of SingleMutant instances each representing a specific mutation change.")
+```
+Key Points to Note:
+- Field Requirements: Each field in the SingleMutant and Mutants classes must be populated with data that comply strictly with the descriptions provided.
+- Formatting: Ensure that when creating the YAML output, the structure mirrors that of the nested Pydantic models, with correct indentation and hierarchy representing the relationship between Mutants and SingleMutant.
+Mutant Details:
+- Function Name should clearly state where the mutation was inserted.
+- Type should reflect the category or nature of the mutation applied.
+- Description should offer a concise yet descriptive insight into what the mutation entails and possibly its intent or impact.
+- Original and Mutated Lines should be accurate reproductions of the code pre- and post-mutation but without line numbers, while the mutated line must include an explanatory comment inline.
+
+## Function Block to Mutate
+Lines Covered: {{covered_lines}}. Only mutate lines that are covered by execution.
+Note that line numbers have been manually added for reference. Do not include these line numbers in your response. Mutated line must be valid and syntactically correct after replacing the original line. Do not generate multi-line mutants.
+```{{language}}
+{{function_block}}
+```
+
+## Task
+Produce between 1 and {{maximum_num_of_mutants_per_function_block}} mutants for the provided function block. Make use of the mutation guidelines specified in the system prompt, focusing on designated mutation areas. Ensure that the mutations are meaningful and provide valuable insights into the code quality and test coverage.
+
+## Example Output
+```yaml
+changes:
+  - function_name: ...
+    type: ...
+    description: ...
+    original_line: |
+      ...
+    mutated_line: |
+      ...
+```
+"""
+
+MUTANT_ANALYSIS = """
+## Related Source Code:
+```
+{{source_code}}
+```
+
+## Surviving Mutants:
+```json
+{{surviving_mutants}}
+```
+
+Based on the mutation testing results only showing the surviving mutants. Please analyze the following aspects:
+- Vulnerable Code Areas: Identification of critical areas in the code that are most vulnerable based on the surviving mutants.
+- Test Case Gaps: Analysis of specific reasons why the existing test cases failed to detect these mutants.
+- Improvement Recommendations: Suggestions for new or improved test cases to effectively target and eliminate the surviving mutants.
+
+## Example Output:
+<example>
+### Vulnerable Code Areas
+**File:** `src/main/java/com/example/BankAccount.java`
+**Location:** Line 45
+**Description:** The method `withdraw` does not properly handle negative inputs, leading to potential incorrect account balances.
+
+### Test Case Gaps
+**File:** `src/test/java/com/example/BankAccountTest.java`
+**Location:** Test method `testWithdraw`
+**Reason:** Existing test cases do not cover edge cases such as negative withdrawals or withdrawals greater than the account balance.
+
+### Improvement Recommendations
+**New Test Cases Needed:**
+1. **Test Method:** `testWithdrawNegativeAmount`
+   - **Description:** Add a test case to check behavior when attempting to withdraw a negative amount.
+2. **Test Method:** `testWithdrawExceedingBalance`
+   - **Description:** Add a test case to ensure proper handling when withdrawing an amount greater than the current account balance.
+3. **Test Method:** `testWithdrawBoundaryConditions`
+   - **Description:** Add boundary condition tests for withdrawal amounts exactly at zero and exactly equal to the account balance.
+</example>
+
+To reduce cognitive load, focus on quality over quantity to ensure that the mutants analysis are meaningful and provide valuable insights into the code quality and test coverage. Output your analysis in a clear and concise manner, highlighting the key points for each aspect with less than 300 words.
+"""

src/mutahunter/core/prompts/system.py

Lines changed: 0 additions & 53 deletions
This file was deleted.
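
The new `USER_PROMPT` asks the model for single-line mutants whose `original_line` reproduces a line of the function block. Below is a minimal sketch of how such a response entry could be sanity-checked before it is applied. It is not taken from this commit: `check_mutant` is a hypothetical helper, and a plain dataclass stands in for the Pydantic `SingleMutant` model so the example needs only the standard library.

```python
from dataclasses import dataclass


@dataclass
class SingleMutant:
    # Same field names as the Pydantic model quoted in USER_PROMPT;
    # a dataclass is used here only to keep the sketch dependency-free.
    function_name: str
    type: str
    description: str
    original_line: str
    mutated_line: str


def check_mutant(mutant: SingleMutant, function_block: str) -> list[str]:
    """Hypothetical helper: return a list of problems (empty means the checks pass)."""
    problems = []
    # YAML literal block scalars end with a newline, so compare stripped text.
    original = mutant.original_line.strip()
    mutated = mutant.mutated_line.strip()
    if not original or not mutated:
        problems.append("empty original_line or mutated_line")
    if "\n" in mutated:
        problems.append("multi-line mutant")
    if original == mutated:
        problems.append("mutated_line is identical to original_line")
    if original and original not in function_block:
        problems.append("original_line not found in function block")
    return problems


# Made-up function block and mutant for illustration.
block = (
    "def withdraw(self, amount):\n"
    "    if amount <= self.balance:\n"
    "        self.balance -= amount\n"
)
mutant = SingleMutant(
    function_name="withdraw",
    type="conditional",
    description="Tighten the balance check.",
    original_line="if amount <= self.balance:\n",
    mutated_line="if amount < self.balance:  # boundary changed\n",
)
print(check_mutant(mutant, block))
```

Checks like these only catch structural problems. Whether the mutated line is syntactically valid in the target language would still need a parser for that language.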
