Skip to content

Conversation

@GYFX35
Copy link
Owner

@GYFX35 GYFX35 commented Sep 6, 2025

This commit introduces a new tool, the Mobile App Analyzer, which is designed to perform static analysis on Android APK files.

The new tool has the following features:

  • Decompiles APKs using apktool.
  • Extracts permissions from the AndroidManifest.xml using aapt and checks them against a list of high-risk permissions.
  • Reuses the existing sensitive_data_scanner to find hardcoded secrets in the decompiled source code.
  • Reuses the social_media_analyzer's scam detector to find suspicious URLs and other scam-related indicators in the app's files.

A command-line interface is provided in mobile_analyzer_main.py, and documentation has been added in README_mobile_analyzer.md.

Limitations:

  • This implementation only supports Android APK files. Analysis for Apple (iOS) apps is not included.
  • The analysis is purely static and does not involve running the application.

Summary by Sourcery

Add a new Mobile App Analyzer for static security scanning of Android APKs, including decompilation, high-risk permission detection, secret scanning, and scam indicator analysis via a CLI

New Features:

  • Introduce a Mobile App Analyzer tool for static analysis of Android APK files
  • Decompile APKs with apktool and extract permissions using aapt
  • Detect high-risk permissions against a predefined list
  • Scan decompiled code for hardcoded secrets via the existing sensitive_data_scanner
  • Identify scam indicators in app files using the social_media_analyzer's scam detector
  • Expose the analysis workflow through a command-line interface in mobile_analyzer_main.py

Documentation:

  • Add README_mobile_analyzer.md with tool overview, prerequisites, usage instructions, and output interpretation

This commit introduces a new tool, the Mobile App Analyzer, which is designed to perform static analysis on Android APK files.

The new tool has the following features:
- Decompiles APKs using `apktool`.
- Extracts permissions from the `AndroidManifest.xml` using `aapt` and checks them against a list of high-risk permissions.
- Reuses the existing `sensitive_data_scanner` to find hardcoded secrets in the decompiled source code.
- Reuses the `social_media_analyzer`'s scam detector to find suspicious URLs and other scam-related indicators in the app's files.

A command-line interface is provided in `mobile_analyzer_main.py`, and documentation has been added in `README_mobile_analyzer.md`.

Limitations:
- This implementation only supports Android APK files. Analysis for Apple (iOS) apps is not included.
- The analysis is purely static and does not involve running the application.
@sourcery-ai
Copy link

sourcery-ai bot commented Sep 6, 2025

Reviewer's Guide

Implements a new Mobile App Analyzer tool with static analysis capabilities for Android APK security scanning. The PR adds core decompilation and permission analysis logic, integrates existing secret and scam scanners, provides a CLI orchestrator, and includes detailed documentation.

Sequence diagram for Mobile App Analyzer CLI workflow

sequenceDiagram
    actor User
    participant "mobile_analyzer_main.py"
    participant "analyzer.py"
    participant "apktool (external)"
    participant "aapt (external)"
    participant "sensitive_data_scanner"
    participant "social_media_analyzer"
    User->>"mobile_analyzer_main.py": Run analyzer with APK path
    "mobile_analyzer_main.py"->>"analyzer.py": decompile_apk(apk_path, output_dir)
    "analyzer.py"->>"apktool (external)": Decompile APK
    "analyzer.py"-->>"mobile_analyzer_main.py": Success/Failure
    "mobile_analyzer_main.py"->>"analyzer.py": get_permissions(apk_path)
    "analyzer.py"->>"aapt (external)": Extract permissions
    "analyzer.py"-->>"mobile_analyzer_main.py": Permissions list
    "mobile_analyzer_main.py"->>"analyzer.py": check_high_risk_permissions(permissions)
    "analyzer.py"-->>"mobile_analyzer_main.py": High-risk permissions
    "mobile_analyzer_main.py"->>"analyzer.py": scan_for_sensitive_data(output_dir)
    "analyzer.py"->>"sensitive_data_scanner": Scan for secrets
    "analyzer.py"-->>"mobile_analyzer_main.py": Sensitive data findings
    "mobile_analyzer_main.py"->>"analyzer.py": scan_for_scam_indicators(output_dir)
    "analyzer.py"->>"social_media_analyzer": Scan for scam indicators
    "analyzer.py"-->>"mobile_analyzer_main.py": Scam findings
    "mobile_analyzer_main.py"->>User: Print analysis report
    "mobile_analyzer_main.py"->>"analyzer.py": Clean up files (optional)
Loading

File-Level Changes

Change Details Files
Introduce APK analysis core module
  • Implement APK decompilation using apktool and subprocess calls
  • Extract permissions from AndroidManifest.xml via aapt and regex, then classify high-risk ones
  • Integrate sensitive_data_scanner and social_media_analyzer for secret and scam detection
  • Traverse decompiled files with extension and size checks for safe text scanning
mobile_analyzer/analyzer.py
Add command-line interface to orchestrate analysis
  • Parse CLI arguments and validate APK path
  • Create and clean up a temporary decompiled output directory
  • Invoke each analysis step in sequence and format console output
  • Support a --keep-files flag to retain decompiled files
mobile_analyzer_main.py
Provide tool documentation and usage guide
  • Document features, prerequisites, and installation steps
  • Explain CLI usage, options, and result interpretation
  • Outline tool limitations and include a disclaimer
README_mobile_analyzer.md

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes - here's some feedback:

Blocking issues:

  • Detected subprocess function 'run' without a static string. If this data can be controlled by a malicious actor, it may be an instance of command injection. Audit the use of this call to ensure it is not controllable by an external resource. You may consider using 'shlex.escape()'. (link)
  • Detected subprocess function 'run' without a static string. If this data can be controlled by a malicious actor, it may be an instance of command injection. Audit the use of this call to ensure it is not controllable by an external resource. You may consider using 'shlex.escape()'. (link)

General comments:

  • Consider refactoring this tool as a proper Python package (e.g., with setup.py or pyproject.toml) to eliminate the sys.path hack and support standard imports.
  • Allow configuring the output directory via a CLI flag or using a temporary directory instead of always writing to a fixed "decompiled_apk" folder to avoid collisions or residual files.
  • Replace print statements with Python's logging module (with log levels) for better control over verbosity and easier integration with other tools.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider refactoring this tool as a proper Python package (e.g., with setup.py or pyproject.toml) to eliminate the sys.path hack and support standard imports.
- Allow configuring the output directory via a CLI flag or using a temporary directory instead of always writing to a fixed "decompiled_apk" folder to avoid collisions or residual files.
- Replace print statements with Python's logging module (with log levels) for better control over verbosity and easier integration with other tools.

## Individual Comments

### Comment 1
<location> `mobile_analyzer/analyzer.py:69` </location>
<code_context>
+    print(f"Decompiling {apk_path} to {output_dir}...")
+    try:
+        # Using -f to force overwrite the output directory if it exists
+        command = ["apktool", "d", "-f", apk_path, "-o", output_dir]
+        result = subprocess.run(command, capture_output=True, text=True, check=True)
+        print("Decompilation successful.")
</code_context>

<issue_to_address>
No timeout specified for subprocess calls.

Subprocesses may hang; add a timeout to subprocess.run for reliability.
</issue_to_address>

<suggested_fix>
<<<<<<< SEARCH
        result = subprocess.run(command, capture_output=True, text=True, check=True)
        print("Decompilation successful.")
        return True
    except subprocess.CalledProcessError as e:
        print("Error during decompilation:")
        print(e.stderr)
        return False
    except FileNotFoundError:
        print("Error: 'apktool' not found. Make sure it is installed and in your PATH.")
        return False
=======
        # Set a timeout (e.g., 60 seconds) to prevent hanging
        try:
            result = subprocess.run(command, capture_output=True, text=True, check=True, timeout=60)
            print("Decompilation successful.")
            return True
        except subprocess.TimeoutExpired:
            print("Error: Decompilation process timed out.")
            return False
    except subprocess.CalledProcessError as e:
        print("Error during decompilation:")
        print(e.stderr)
        return False
    except FileNotFoundError:
        print("Error: 'apktool' not found. Make sure it is installed and in your PATH.")
        return False
>>>>>>> REPLACE

</suggested_fix>

### Comment 2
<location> `mobile_analyzer/analyzer.py:97` </location>
<code_context>
+
+    print(f"Extracting permissions from {apk_path}...")
+    try:
+        command = ["aapt", "dump", "permissions", apk_path]
+        result = subprocess.run(command, capture_output=True, text=True, check=True)
+
</code_context>

<issue_to_address>
Regex for permission extraction may miss edge cases.

The current regex may fail if the aapt output format changes. Please update the regex to handle variations in whitespace or formatting.

Suggested implementation:

```python
    print(f"Extracting permissions from {apk_path}...")
    try:
        command = ["aapt", "dump", "permissions", apk_path]
        result = subprocess.run(command, capture_output=True, text=True, check=True)
        # Updated regex to handle variations in whitespace and formatting
        import re
        permission_pattern = re.compile(r'^\s*permission(?:\s*:\s*|\s+)([\w\.\-]+)', re.MULTILINE)
        permissions = permission_pattern.findall(result.stdout)

```

If the function is expected to return the list of permissions, ensure you add `return permissions` at the appropriate place after extracting them. If there is existing code that processes the output, update it to use the new `permissions` variable.
</issue_to_address>

### Comment 3
<location> `mobile_analyzer_main.py:19` </location>
<code_context>
+        sys.exit(1)
+
+    # Create a directory for the decompiled code
+    output_dir = "decompiled_apk"
+    if os.path.exists(output_dir):
+        shutil.rmtree(output_dir)
</code_context>

<issue_to_address>
Hardcoded output directory may cause conflicts.

Allow users to specify the output directory or generate a unique temporary directory to avoid conflicts when running multiple analyses or if the directory already exists.

Suggested implementation:

```python
    import tempfile

    parser.add_argument("apk_path", help="The path to the APK file to analyze.")
    parser.add_argument("--keep-files", action="store_true", help="Keep the decompiled files after analysis.")
    parser.add_argument("--output-dir", type=str, default=None, help="Directory to store decompiled files. If not specified, a temporary directory will be used.")
    args = parser.parse_args()

    apk_path = args.apk_path
    if not os.path.exists(apk_path):
        print(f"Error: APK file not found at {apk_path}")
        sys.exit(1)

    # Create a directory for the decompiled code
    if args.output_dir:
        output_dir = args.output_dir
        if os.path.exists(output_dir):
            shutil.rmtree(output_dir)
        os.makedirs(output_dir)
    else:
        output_dir = tempfile.mkdtemp(prefix="decompiled_apk_")

```

```python
    print(f"--- Starting analysis for {os.path.basename(apk_path)} ---")
    print(f"Decompiled code will be stored in: {output_dir}")

    # 1. Decompile the APK
    if not analyzer.decompile_apk(apk_path, output_dir):
        print("Failed to decompile APK. Aborting analysis.")
        sys.exit(1)

```

If there is code later in the file that deletes the output directory (e.g., when --keep-files is not set), ensure it works with both user-specified and temporary directories. You may want to print a message to the user if a temporary directory is used and files are deleted.
</issue_to_address>

### Comment 4
<location> `mobile_analyzer_main.py:20` </location>
<code_context>
+
+    # Create a directory for the decompiled code
+    output_dir = "decompiled_apk"
+    if os.path.exists(output_dir):
+        shutil.rmtree(output_dir)
+    os.makedirs(output_dir)
+
+    print(f"--- Starting analysis for {os.path.basename(apk_path)} ---")
</code_context>

<issue_to_address>
Potential race condition when removing and creating output directory.

Another process could create the directory between its removal and recreation, causing issues. Consider using tempfile.TemporaryDirectory or an atomic approach.

Suggested implementation:

```python
    import tempfile

    # Create a temporary directory for the decompiled code
    with tempfile.TemporaryDirectory(prefix="decompiled_apk_") as output_dir:
        print(f"--- Starting analysis for {os.path.basename(apk_path)} ---")

        # 1. Decompile the APK
        if not analyzer.decompile_apk(apk_path, output_dir):
            print("Failed to decompile APK. Aborting analysis.")
            sys.exit(1)

```

You will need to indent all code that uses `output_dir` so it is inside the `with tempfile.TemporaryDirectory(...) as output_dir:` block.  
If you want to support the `--keep-files` option, you will need to add logic to copy the temporary directory to a permanent location if requested.
</issue_to_address>

### Comment 5
<location> `mobile_analyzer_main.py:70` </location>
<code_context>
+        print("\nNo scam indicators found.")
+
+    # Clean up the decompiled files
+    if not args.keep_files:
+        print("\nCleaning up temporary files...")
+        shutil.rmtree(output_dir)
+
+    print("\n--- Analysis Complete ---")
</code_context>

<issue_to_address>
No error handling for cleanup step.

Wrap shutil.rmtree in a try-except block to handle potential exceptions if the directory is locked or files are in use.
</issue_to_address>

<suggested_fix>
<<<<<<< SEARCH
    # Clean up the decompiled files
    if not args.keep_files:
        print("\nCleaning up temporary files...")
        shutil.rmtree(output_dir)

    print("\n--- Analysis Complete ---")
=======
    # Clean up the decompiled files
    if not args.keep_files:
        print("\nCleaning up temporary files...")
        try:
            shutil.rmtree(output_dir)
        except Exception as e:
            print(f"Warning: Failed to clean up temporary files in '{output_dir}'. Reason: {e}")

    print("\n--- Analysis Complete ---")
>>>>>>> REPLACE

</suggested_fix>

## Security Issues

### Issue 1
<location> `mobile_analyzer/analyzer.py:70` </location>

<issue_to_address>
**security (python.lang.security.audit.dangerous-subprocess-use-audit):** Detected subprocess function 'run' without a static string. If this data can be controlled by a malicious actor, it may be an instance of command injection. Audit the use of this call to ensure it is not controllable by an external resource. You may consider using 'shlex.escape()'.

*Source: opengrep*
</issue_to_address>

### Issue 2
<location> `mobile_analyzer/analyzer.py:98` </location>

<issue_to_address>
**security (python.lang.security.audit.dangerous-subprocess-use-audit):** Detected subprocess function 'run' without a static string. If this data can be controlled by a malicious actor, it may be an instance of command injection. Audit the use of this call to ensure it is not controllable by an external resource. You may consider using 'shlex.escape()'.

*Source: opengrep*
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines +70 to +79
result = subprocess.run(command, capture_output=True, text=True, check=True)
print("Decompilation successful.")
return True
except subprocess.CalledProcessError as e:
print("Error during decompilation:")
print(e.stderr)
return False
except FileNotFoundError:
print("Error: 'apktool' not found. Make sure it is installed and in your PATH.")
return False
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (bug_risk): No timeout specified for subprocess calls.

Subprocesses may hang; add a timeout to subprocess.run for reliability.

Suggested change
result = subprocess.run(command, capture_output=True, text=True, check=True)
print("Decompilation successful.")
return True
except subprocess.CalledProcessError as e:
print("Error during decompilation:")
print(e.stderr)
return False
except FileNotFoundError:
print("Error: 'apktool' not found. Make sure it is installed and in your PATH.")
return False
# Set a timeout (e.g., 60 seconds) to prevent hanging
try:
result = subprocess.run(command, capture_output=True, text=True, check=True, timeout=60)
print("Decompilation successful.")
return True
except subprocess.TimeoutExpired:
print("Error: Decompilation process timed out.")
return False
except subprocess.CalledProcessError as e:
print("Error during decompilation:")
print(e.stderr)
return False
except FileNotFoundError:
print("Error: 'apktool' not found. Make sure it is installed and in your PATH.")
return False


print(f"Extracting permissions from {apk_path}...")
try:
command = ["aapt", "dump", "permissions", apk_path]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Regex for permission extraction may miss edge cases.

The current regex may fail if the aapt output format changes. Please update the regex to handle variations in whitespace or formatting.

Suggested implementation:

    print(f"Extracting permissions from {apk_path}...")
    try:
        command = ["aapt", "dump", "permissions", apk_path]
        result = subprocess.run(command, capture_output=True, text=True, check=True)
        # Updated regex to handle variations in whitespace and formatting
        import re
        permission_pattern = re.compile(r'^\s*permission(?:\s*:\s*|\s+)([\w\.\-]+)', re.MULTILINE)
        permissions = permission_pattern.findall(result.stdout)

If the function is expected to return the list of permissions, ensure you add return permissions at the appropriate place after extracting them. If there is existing code that processes the output, update it to use the new permissions variable.

sys.exit(1)

# Create a directory for the decompiled code
output_dir = "decompiled_apk"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Hardcoded output directory may cause conflicts.

Allow users to specify the output directory or generate a unique temporary directory to avoid conflicts when running multiple analyses or if the directory already exists.

Suggested implementation:

    import tempfile

    parser.add_argument("apk_path", help="The path to the APK file to analyze.")
    parser.add_argument("--keep-files", action="store_true", help="Keep the decompiled files after analysis.")
    parser.add_argument("--output-dir", type=str, default=None, help="Directory to store decompiled files. If not specified, a temporary directory will be used.")
    args = parser.parse_args()

    apk_path = args.apk_path
    if not os.path.exists(apk_path):
        print(f"Error: APK file not found at {apk_path}")
        sys.exit(1)

    # Create a directory for the decompiled code
    if args.output_dir:
        output_dir = args.output_dir
        if os.path.exists(output_dir):
            shutil.rmtree(output_dir)
        os.makedirs(output_dir)
    else:
        output_dir = tempfile.mkdtemp(prefix="decompiled_apk_")
    print(f"--- Starting analysis for {os.path.basename(apk_path)} ---")
    print(f"Decompiled code will be stored in: {output_dir}")

    # 1. Decompile the APK
    if not analyzer.decompile_apk(apk_path, output_dir):
        print("Failed to decompile APK. Aborting analysis.")
        sys.exit(1)

If there is code later in the file that deletes the output directory (e.g., when --keep-files is not set), ensure it works with both user-specified and temporary directories. You may want to print a message to the user if a temporary directory is used and files are deleted.

Comment on lines +20 to +22
if os.path.exists(output_dir):
shutil.rmtree(output_dir)
os.makedirs(output_dir)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (bug_risk): Potential race condition when removing and creating output directory.

Another process could create the directory between its removal and recreation, causing issues. Consider using tempfile.TemporaryDirectory or an atomic approach.

Suggested implementation:

    import tempfile

    # Create a temporary directory for the decompiled code
    with tempfile.TemporaryDirectory(prefix="decompiled_apk_") as output_dir:
        print(f"--- Starting analysis for {os.path.basename(apk_path)} ---")

        # 1. Decompile the APK
        if not analyzer.decompile_apk(apk_path, output_dir):
            print("Failed to decompile APK. Aborting analysis.")
            sys.exit(1)

You will need to indent all code that uses output_dir so it is inside the with tempfile.TemporaryDirectory(...) as output_dir: block.
If you want to support the --keep-files option, you will need to add logic to copy the temporary directory to a permanent location if requested.

Comment on lines +69 to +74
# Clean up the decompiled files
if not args.keep_files:
print("\nCleaning up temporary files...")
shutil.rmtree(output_dir)

print("\n--- Analysis Complete ---")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (bug_risk): No error handling for cleanup step.

Wrap shutil.rmtree in a try-except block to handle potential exceptions if the directory is locked or files are in use.

Suggested change
# Clean up the decompiled files
if not args.keep_files:
print("\nCleaning up temporary files...")
shutil.rmtree(output_dir)
print("\n--- Analysis Complete ---")
# Clean up the decompiled files
if not args.keep_files:
print("\nCleaning up temporary files...")
try:
shutil.rmtree(output_dir)
except Exception as e:
print(f"Warning: Failed to clean up temporary files in '{output_dir}'. Reason: {e}")
print("\n--- Analysis Complete ---")

try:
# Using -f to force overwrite the output directory if it exists
command = ["apktool", "d", "-f", apk_path, "-o", output_dir]
result = subprocess.run(command, capture_output=True, text=True, check=True)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security (python.lang.security.audit.dangerous-subprocess-use-audit): Detected subprocess function 'run' without a static string. If this data can be controlled by a malicious actor, it may be an instance of command injection. Audit the use of this call to ensure it is not controllable by an external resource. You may consider using 'shlex.escape()'.

Source: opengrep

print(f"Extracting permissions from {apk_path}...")
try:
command = ["aapt", "dump", "permissions", apk_path]
result = subprocess.run(command, capture_output=True, text=True, check=True)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security (python.lang.security.audit.dangerous-subprocess-use-audit): Detected subprocess function 'run' without a static string. If this data can be controlled by a malicious actor, it may be an instance of command injection. Audit the use of this call to ensure it is not controllable by an external resource. You may consider using 'shlex.escape()'.

Source: opengrep

Comment on lines +122 to +126
found_high_risk = []
for perm in permissions:
if perm in HIGH_RISK_PERMISSIONS:
found_high_risk.append(perm)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (code-quality): Convert for loop into list comprehension (list-comprehension)

Suggested change
found_high_risk = []
for perm in permissions:
if perm in HIGH_RISK_PERMISSIONS:
found_high_risk.append(perm)
found_high_risk = [
perm for perm in permissions if perm in HIGH_RISK_PERMISSIONS
]

from mobile_analyzer import analyzer

def main():
parser = argparse.ArgumentParser(description="Analyze an Android APK for security vulnerabilities.")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (code-quality): Use named expression to simplify assignment and conditional [×4] (use-named-expression)

@GYFX35 GYFX35 merged commit 0256233 into main Sep 6, 2025
2 of 6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants