A comprehensive toolkit for malware analysis that includes two main components:
- malware_analysis.py: A powerful static analysis tool for malware samples
- yara_rules_generator.py: An interactive tool for generating YARA rules based on malware characteristics
This toolkit combines advanced static analysis capabilities with custom YARA rule generation to enhance malware detection and analysis workflows.
A Python-based tool for performing advanced static analysis on malware samples. This tool extracts file information, analyzes PE files, extracts strings, calculates entropy, matches YARA rules, integrates with VirusTotal, disassembles binaries, detects packers, extracts network artifacts, and generates detailed reports.
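As an illustration of two of the steps named above (hashing and entropy calculation), here is a minimal sketch using only the standard library. The function names `file_hashes` and `shannon_entropy` are hypothetical and are not taken from malware_analysis.py; the tool's own implementation may differ.

```python
import hashlib
import math
from collections import Counter
from typing import Dict


def file_hashes(data: bytes) -> Dict[str, str]:
    """Return MD5, SHA1, and SHA256 hex digests for a byte string."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 for constant data, 8.0 for uniform bytes."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # occurrences of each byte value
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

Values close to 8.0 for a section usually suggest compressed or encrypted content, which is why entropy is a common packer heuristic.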
- File Information: Extracts file size, type, and cryptographic hashes (MD5, SHA1, SHA256)
- PE File Analysis: Analyzes PE file structures, sections, imports, exports, and entry points
- Entropy Analysis: Calculates entropy for PE sections to detect obfuscation/encryption
- String Analysis: Extracts and analyzes ASCII/Unicode strings
- YARA Integration: Scans files against custom or predefined YARA rules
- VirusTotal Integration: Checks samples against VirusTotal's database
- Disassembly: Provides assembly code analysis using Capstone
- Network Artifacts: Extracts IPs, domains, and URLs
- Packer Detection: Identifies common packers and obfuscation techniques
- ClamAV Integration: Scans files using ClamAV antivirus
- Metadata Extraction: Analyzes metadata from various file formats
- Comprehensive Reporting: Generates detailed JSON/CSV reports
- Automated String Extraction: Extracts printable strings from executables using the 'strings' command
- AI-Powered Rule Generation: Leverages Google's Gemini API to generate intelligent YARA rules
- Customizable Analysis: Supports custom descriptions to guide rule generation
- Flexible Output: Save generated rules to specified output files
- Environment Variable Support: Configurable API key via command line or environment variable
- Python 3.6 or higher
- ClamAV installation (optional)
- VirusTotal API key (optional)
# Core dependencies
argparse # Command-line argument parsing
hashlib # File hash calculations
magic # File type identification
pefile # PE file analysis
yara-python # YARA rule processing
capstone # Binary disassembly
requests # API interactions
pdfminer # PDF analysis
olefile # OLE file analysis
Pillow # Image file analysis
csv # Report generation
re # Pattern matching
subprocess # External tool integration
google.generativeai # Gemini API integration for YARA rule generation
- 'strings' command installed on the system
- Google Gemini API key (for yara_rules_generator.py)
- Clone the repository:
git clone https://github.com/sharmaniraj009/malware-analysis-automation.git
cd malware-analysis-automation
- Install required Python packages:
pip install -r requirements.txt
- (Optional) Install ClamAV for additional scanning capabilities:
# Ubuntu/Debian
sudo apt-get install clamav
# Update virus definitions
sudo freshclam
Get help on available options:
python malware_analysis.py --help
Basic file analysis:
python malware_analysis.py --file malware.exe
Comprehensive analysis with all features:
python malware_analysis.py --file malware.exe --yara custom_rules.yar --virustotal YOUR_API_KEY --pe --strings --disassembly --network --packer --clamav --metadata --report report.json
The tool automatically generates YARA rules by analyzing executable files:
python yara_rules_generator.py exe_file [options]
Required Arguments:
- exe_file: Path to the executable file to analyze
Optional Arguments:
- --api-key KEY: Your Gemini API key (can also use GOOGLE_API_KEY environment variable)
- -o, --output-file FILE: Output file to save the generated YARA rule
- -d, --description DESC: Custom description to guide the analysis and rule generation
Example Usage:
# Using command line API key
python yara_rules_generator.py sample.exe --api-key YOUR_API_KEY -o rule.yar -d "Generate rule based on executable analysis"
# Using environment variable for API key
export GOOGLE_API_KEY=your_api_key
python yara_rules_generator.py sample.exe -o rule.yar
Notes:
- Ensure the 'strings' command is installed on your system
- Either provide the API key via --api-key or set the GOOGLE_API_KEY environment variable
- The tool will display appropriate error messages if requirements are not met
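The argument handling and requirement checks described in these notes could look roughly like the sketch below. It mirrors the documented arguments and the GOOGLE_API_KEY fallback, but the helper names (`build_parser`, `resolve_api_key`, `check_requirements`) and the error messages are assumptions, not the actual code of yara_rules_generator.py.

```python
import argparse
import os
import shutil


def build_parser() -> argparse.ArgumentParser:
    """Command-line interface matching the documented arguments."""
    parser = argparse.ArgumentParser(description="Generate a YARA rule for an executable")
    parser.add_argument("exe_file", help="Path to the executable file to analyze")
    parser.add_argument("--api-key", help="Gemini API key (falls back to GOOGLE_API_KEY)")
    parser.add_argument("-o", "--output-file", help="File to save the generated YARA rule")
    parser.add_argument("-d", "--description", help="Custom description to guide rule generation")
    return parser


def resolve_api_key(cli_key, env=None):
    """Prefer the command-line key, then the GOOGLE_API_KEY environment variable."""
    env = os.environ if env is None else env
    return cli_key or env.get("GOOGLE_API_KEY")


def check_requirements(api_key, which=shutil.which):
    """Return a list of problems; an empty list means the tool can run."""
    problems = []
    if which("strings") is None:
        problems.append("'strings' command not found on PATH")
    if not api_key:
        problems.append("No API key: pass --api-key or set GOOGLE_API_KEY")
    return problems
```

Passing `env` and `which` as parameters keeps the checks easy to exercise without touching the real environment.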
Option | Description | Required |
---|---|---|
--file | Path to the malware sample | Yes |
--yara | Path to YARA rules file | No |
--virustotal | VirusTotal API key | No |
--pe | Enable PE file analysis | No |
--strings | Enable string extraction and analysis | No |
--disassembly | Enable binary disassembly | No |
--network | Extract network indicators | No |
--packer | Enable packer detection | No |
--clamav | Enable ClamAV scanning | No |
--metadata | Extract file metadata | No |
--compare | Compare with known samples | No |
--report | Generate analysis report | No |
--format | Report format (json/csv) | No |
- Basic Analysis:
python malware_analysis.py --file malware.exe
- With YARA Rules:
python malware_analysis.py --file malware.exe --yara rules.yar
- With VirusTotal:
python malware_analysis.py --file malware.exe --virustotal YOUR_API_KEY
- Full Analysis:
python malware_analysis.py --file malware.exe --yara rules.yar --virustotal YOUR_API_KEY --pe --strings --disassembly --network --packer --report report.json --format json
- Full Analysis with Advanced Features:
python malware_analysis.py --file malware.exe --yara rules.yar --virustotal YOUR_API_KEY --pe --strings --disassembly --network --packer --memory --sandbox SANDBOX_API_KEY --threads 4 --report report.json --format json
Analyzing file: malware.exe
File Information:
file_name: malware.exe
file_size: 102400
file_type: PE32 executable (GUI) Intel 80386, for MS Windows
md5: 5a414e3a7b3b7b7b7b7b7b7b7b7b7b7b
sha1: 7b7b7b7b7b7b7b7b7b7b7b7b7b7b7b7b7b7b7b7b
sha256: 7b7b7b7b7b7b7b7b7b7b7b7b7b7b7b7b7b7b7b7b7b7b7b7b7b7b7b7b7b7b7b
Extracted Strings (first 10):
HelloWorld
GetProcAddress
LoadLibrary
...
PE File Analysis:
Entry Point: 0x1000
Sections:
.text: VA=0x1000, Size=0x1000, Entropy=6.12
.data: VA=0x2000, Size=0x500, Entropy=4.56
Imports (first 10):
GetProcAddress
LoadLibrary
...
Section Entropy:
.text: 6.12
.data: 4.56
YARA Rule Matches:
Rule1
Rule2
Checking VirusTotal...
VirusTotal Results:
Malicious: 45
Suspicious: 3
Undetected: 10
Disassembly (first 10 lines):
0x1000: mov eax, ebx
0x1002: call 0x2000
...
Network Artifacts:
IPs: ['192.168.1.1', '10.0.0.1']
Domains: ['example.com', 'malware.domain']
URLs: ['http://example.com/malware', 'https://malware.domain/payload']
Packer Detection:
Suspicious sections (high entropy): ['.text', '.data']
Report saved to report.json
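The extracted strings and network artifacts in the sample output above can be approximated with regular expressions. The sketch below is an assumption about how such extraction might work, not the tool's actual code; the names `extract_strings` and `extract_network_artifacts` and the patterns themselves are illustrative, and domains are taken only from URL hostnames.

```python
import re
from typing import Dict, List
from urllib.parse import urlparse

# Dotted-quad IPv4 addresses with each octet limited to 0-255.
IPV4 = re.compile(r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b")
# http/https URLs up to the next whitespace, quote, or angle bracket.
URL = re.compile(r"https?://[^\s\"'<>]+")


def extract_strings(data: bytes, min_len: int = 4) -> List[str]:
    """Return runs of printable ASCII at least min_len long, similar in spirit to `strings`."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]


def extract_network_artifacts(strings: List[str]) -> Dict[str, List[str]]:
    """Collect unique IPs, URLs, and URL hostnames that are not bare IPs."""
    text = "\n".join(strings)
    urls = sorted(set(URL.findall(text)))
    ips = sorted(set(IPV4.findall(text)))
    hosts = {urlparse(url).hostname for url in urls}
    domains = sorted(h for h in hosts if h and not IPV4.fullmatch(h))
    return {"ips": ips, "domains": domains, "urls": urls}
```

A usage example: `extract_network_artifacts(extract_strings(open("malware.exe", "rb").read()))`. Regex-based extraction produces false positives (version numbers can look like IPs), so results are leads for further analysis rather than confirmed indicators.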
The tool relies on the following Python libraries:
- pefile: For PE file analysis.
- python-magic: For file type identification.
- yara-python: For YARA rule matching.
- capstone: For disassembly.
- requests: For VirusTotal API integration.
Install them using:
pip install -r requirements.txt
To use the VirusTotal integration:
- Sign up for a VirusTotal account and obtain an API key.
- Pass the API key using the --virustotal flag when running the tool.
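For reference, a hash lookup against the VirusTotal v3 API can be sketched as follows. This is not necessarily how malware_analysis.py performs the check: the sketch uses the standard library's urllib instead of requests to stay dependency-free, and the function names are hypothetical. It relies on the v3 file-report endpoint, the `x-apikey` header, and the `last_analysis_stats` field of the response.

```python
import json
import urllib.request
from typing import Dict

VT_FILE_URL = "https://www.virustotal.com/api/v3/files/{}"


def build_vt_request(sha256: str, api_key: str) -> urllib.request.Request:
    """Build the GET request for a file report, authenticated with the x-apikey header."""
    return urllib.request.Request(VT_FILE_URL.format(sha256), headers={"x-apikey": api_key})


def summarize_vt_report(report: dict) -> Dict[str, int]:
    """Pull the detection counts out of a v3 file report, defaulting to 0 when absent."""
    stats = report.get("data", {}).get("attributes", {}).get("last_analysis_stats", {})
    return {key: stats.get(key, 0) for key in ("malicious", "suspicious", "undetected")}


def lookup_hash(sha256: str, api_key: str, timeout: float = 30) -> Dict[str, int]:
    """Query VirusTotal for a hash; raises urllib.error.HTTPError (e.g. 404) if unknown."""
    with urllib.request.urlopen(build_vt_request(sha256, api_key), timeout=timeout) as resp:
        return summarize_vt_report(json.load(resp))
```

Looking up the hash rather than uploading the file avoids sharing the sample itself with a third party.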
The tool can generate reports in JSON or CSV format. Use the --report flag to specify the output file and --format to choose the format (json or csv).
Example:
python malware_analysis.py --file malware.exe --report report.json --format json
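One way the two report formats could be written is sketched below. The `save_report` function and the two-column CSV layout are assumptions for illustration; the actual report schema produced by the tool is not documented here.

```python
import csv
import json


def save_report(results: dict, path: str, fmt: str = "json") -> None:
    """Write analysis results to path as JSON or as a two-column CSV."""
    if fmt == "json":
        with open(path, "w", encoding="utf-8") as fh:
            json.dump(results, fh, indent=2)
    elif fmt == "csv":
        with open(path, "w", newline="", encoding="utf-8") as fh:
            writer = csv.writer(fh)
            writer.writerow(["field", "value"])
            for key, value in results.items():
                # Nested values are JSON-encoded so each finding stays on one row.
                cell = value if isinstance(value, str) else json.dumps(value)
                writer.writerow([key, cell])
    else:
        raise ValueError("Unsupported report format: {!r}".format(fmt))
```

JSON preserves nested results such as section lists directly, while CSV has to flatten them, which is why the sketch encodes nested values as JSON strings inside a cell.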
Contributions are welcome! If you'd like to add new features or improve the tool, follow these steps:
- Fork the repository.
- Create a new branch for your feature or bugfix.
- Submit a pull request with a detailed description of your changes.
This project is licensed under the MIT License. See the LICENSE file for details.
This tool is intended for educational and research purposes only. Use it responsibly and ensure you have proper authorization to analyze any files. The authors are not responsible for any misuse of this tool.
For questions, issues, or feature requests, please open an issue on the GitHub repository.