Add Container Browser documentation.

bpotchik · bpotchik · commit 6ca568ccbac9 · 2025-11-06T00:20:24.000-05:00
diff --git a/docs/dev/containertransforms.md b/docs/dev/containertransforms.md
@@ -0,0 +1,218 @@
+# Container Transforms
+
+Container Transforms are specialized transforms that enable Binary Ninja to extract and navigate files within container formats such as ZIP archives, disk images, and other multi-file structures. Unlike simple encoding transforms (Base64, Hex, etc.), container transforms can produce multiple output files and interact with the Container Browser UI.
+
+You can list all available container transforms (those with detection support) using:
+
+```python
+>>> [x.name for x in Transform if getattr(x, "supports_detection", False)]
+['Gzip', 'Zlib', 'Zip', 'CaRT', 'IntelHex', 'SRec', 'TiTxt', 'IMG4', 'LZFSE']
+```
+
+## Overview
+
+The Transform API provides the foundation for creating custom container decoders. Container transforms differ from standard transforms in that they:
+
+1. Support **context-aware decoding** via `perform_decode_with_context()`
+2. Can produce **multiple output files** from a single input
+3. Support **password protection** and other interactive parameters
+4. Integrate with the **Container Browser** UI for file selection
+
+## Basic Transform Structure
+
+All transforms, including container transforms, inherit from the `Transform` base class. Here's a minimal example:
+
+```python
+from binaryninja import Transform, TransformType, TransformCapabilities
+
+class MyContainerTransform(Transform):
+    transform_type = TransformType.DecodeTransform
+    capabilities = TransformCapabilities.TransformSupportsContext | TransformCapabilities.TransformSupportsDetection
+    name = "MyContainer"
+    long_name = "My Container Format"
+    group = "Container"
+
+    def can_decode(self, input):
+        """Check if this transform can decode the input"""
+        # Check for magic bytes or other signatures
+        head = input.read(0, 4)
+        return head == b"MYCN"  # Your format's magic bytes
+
+    def perform_decode_with_context(self, context, params):
+        """Context-aware extraction for multi-file containers"""
+        # Implementation details below
+        pass
+
+# Register the transform
+MyContainerTransform.register()
+```
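+
+If a plugin handles several related formats, the signature check can be factored into a small lookup that is easy to test outside Binary Ninja. The sketch below is only an illustration: `SIGNATURES` and `sniff()` are hypothetical names, not part of the Transform API, and `MYCN` is the placeholder magic from the example above. The ZIP local file header and Gzip signatures are the standard ones.
+
+```python
+# Hypothetical signature table; extend it with your own formats.
+SIGNATURES = {
+    b"PK\x03\x04": "Zip",     # ZIP local file header
+    b"\x1f\x8b": "Gzip",      # Gzip member header
+    b"MYCN": "MyContainer",   # placeholder magic from the example above
+}
+
+
+def sniff(head):
+    """Return the name of the first format whose magic prefixes `head`, else None."""
+    for magic, name in SIGNATURES.items():
+        if head.startswith(magic):
+            return name
+    return None
+
+
+detected = sniff(b"MYCN\x00\x01payload")
+```
+
+A `can_decode()` implementation could then read the first few bytes of its input and compare the result of `sniff()` against its own format name.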
+
+## Container Extraction Protocol
+
+Container transforms typically operate in **two phases**:
+
+### Phase 1: Discovery
+
+During discovery, the transform enumerates all available files and populates `context.available_files`:
+
+```python
+def perform_decode_with_context(self, context, params):
+    # Parse the container format
+    container = parse_my_format(context.input)
+
+    # Phase 1: Discovery
+    if not context.has_available_files:
+        file_list = [entry.name for entry in container.entries]
+        context.set_available_files(file_list)
+        return False  # More user interaction needed
+```
+
+Returning `False` indicates that the Container Browser should present these files to the user for selection.
+
+### Phase 2: Extraction
+
+Once the user selects files, the transform extracts them and creates child contexts:
+
+```python
+# Assumes: from binaryninja import DataBuffer, TransformResult
+def perform_decode_with_context(self, context, params):
+    container = parse_my_format(context.input)
+
+    # Phase 1: Discovery (as above)
+    if not context.has_available_files:
+        # ... discovery code ...
+        return False
+
+    # Phase 2: Extraction
+    requested = context.requested_files
+    if not requested:
+        return False  # No files selected yet
+
+    complete = True
+    for filename in requested:
+        try:
+            data = container.extract(filename)
+            context.create_child(DataBuffer(data), filename)
+        except Exception as e:
+            # Create child with error status
+            context.create_child(
+                DataBuffer(b""),
+                filename,
+                result=TransformResult.TransformFailure,
+                message=str(e)
+            )
+            complete = False
+
+    return complete  # True if all files extracted successfully
+```
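+
+The two-phase flow can be exercised without Binary Ninja by driving the same control flow against a stand-in context. This is a sketch under stated assumptions: `FakeContext` is a hypothetical class that mimics only the attributes used in this guide, it is not the real `TransformContext`, and the standard-library `zipfile` module stands in for `parse_my_format()`.
+
+```python
+import io
+import zipfile
+
+
+class FakeContext:
+    """Hypothetical stand-in for TransformContext; not part of Binary Ninja."""
+
+    def __init__(self, data):
+        self.input = data            # raw container bytes
+        self.available_files = []    # filled in during discovery
+        self.requested_files = []    # filled in by the simulated user
+        self.children = {}           # filename -> extracted bytes
+
+    @property
+    def has_available_files(self):
+        return bool(self.available_files)
+
+    def set_available_files(self, names):
+        self.available_files = list(names)
+
+    def create_child(self, data, filename):
+        self.children[filename] = data
+
+
+def decode(context):
+    """Two-phase decode with the same shape as the snippets above."""
+    archive = zipfile.ZipFile(io.BytesIO(context.input))
+    if not context.has_available_files:
+        context.set_available_files(archive.namelist())
+        return False  # Discovery only; wait for a selection
+    if not context.requested_files:
+        return False  # Nothing selected yet
+    for name in context.requested_files:
+        context.create_child(archive.read(name), name)
+    return True
+
+
+# Build a small in-memory ZIP to drive both phases.
+buf = io.BytesIO()
+with zipfile.ZipFile(buf, "w") as zf:
+    zf.writestr("a.txt", b"alpha")
+    zf.writestr("b.txt", b"beta")
+
+ctx = FakeContext(buf.getvalue())
+first_pass = decode(ctx)           # Phase 1: discovery
+ctx.requested_files = ["b.txt"]    # Simulate the user's selection
+second_pass = decode(ctx)          # Phase 2: extraction
+```
+
+The first call only lists entries and returns `False`; the second call extracts the selected entry and returns `True`, which is the sequence the Container Browser is described as driving.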
+
+## Complete Example: ZipPython
+
+Binary Ninja includes a reference implementation of a ZIP container transform, `ZipPython`, in `api/python/transform.py`. It follows the two-phase discovery and extraction protocol described above.
+
+## Transform Results and Error Handling
+
+Use `TransformResult` values to communicate extraction status:
+
+- `TransformResult.TransformSuccess`: Extraction completed successfully
+- `TransformResult.TransformNotAttempted`: Extraction not attempted
+- `TransformResult.TransformFailure`: Generic extraction failure
+- `TransformResult.TransformRequiresPassword`: File is encrypted and needs a password
+
+Set results on individual child contexts:
+
+```python
+context.create_child(
+    data=DataBuffer(extracted_data),
+    filename="file.bin",
+    result=TransformResult.TransformSuccess,
+    message=""  # Optional success message
+)
+```
+
+## Working with Passwords
+
+Container transforms should integrate with Binary Ninja's password management system:
+
+```python
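+# Assumes: from binaryninja import Settings
+# extract_with_password() and PasswordError stand in for your format's own code
+
+# Get default passwords from settings (copied so the list can be modified)
+passwords = list(Settings().get_string_list('files.container.defaultPasswords'))
+
+# A password supplied via transform parameters is tried first
+if "password" in params:
+    p = params["password"]
+    pwd = p.decode("utf-8", "replace") if isinstance(p, (bytes, bytearray)) else str(p)
+    passwords.insert(0, pwd)
+
+# Try each password in order; content stays None if none of them work
+content = None
+for password in passwords:
+    try:
+        content = extract_with_password(container, filename, password)
+        break  # Success!
+    except PasswordError:
+        continue  # Try next password
+```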
+
+When a file requires a password that wasn't provided, use `TransformResult.TransformRequiresPassword` to signal the UI to prompt the user.
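+
+The ordering logic can be isolated in a pure function so it can be unit-tested without Binary Ninja. The following is a minimal sketch, assuming `params` behaves like a dictionary and that the default list comes from the setting named above; `candidate_passwords()` is a hypothetical helper, not an API function.
+
+```python
+def candidate_passwords(params, defaults):
+    """Return passwords to try: explicit parameter first, duplicates removed."""
+    candidates = []
+    if "password" in params:
+        p = params["password"]
+        # Parameters may arrive as bytes; normalize to str before comparing.
+        if isinstance(p, (bytes, bytearray)):
+            p = p.decode("utf-8", "replace")
+        candidates.append(str(p))
+    candidates.extend(defaults)
+    # dict.fromkeys keeps first-seen order while dropping repeats.
+    return list(dict.fromkeys(candidates))
+
+
+ordered = candidate_passwords({"password": b"infected"}, ["password", "infected"])
+```
+
+Removing duplicates avoids retrying a password that has already failed, which matters for formats where each attempt is slow.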
+
+## Metadata and Virtual Paths
+
+Container transforms automatically create metadata that tracks the extraction chain:
+
+```python
+# After opening a file extracted through containers:
+>>> bv.parent_view.auto_metadata['container']
+{
+    'chain': [
+        {'transform': 'Zip'},
+        {'transform': 'Base64'}
+    ],
+    'virtualPath': 'Zip(/path/to/archive.zip)::Base64(encoded_file)::extracted'
+}
+```
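+
+Since the metadata is a plain dictionary, a script can summarize the extraction chain without parsing the virtual path string. The helper below is a sketch that assumes the dictionary layout shown above; `describe_chain()` is a hypothetical name, and missing keys are tolerated rather than treated as errors.
+
+```python
+def describe_chain(container_meta):
+    """Join the transform names in a 'container' metadata dict into one string."""
+    steps = container_meta.get("chain", [])
+    names = [step.get("transform", "?") for step in steps]
+    return " -> ".join(names)
+
+
+# Sample value copied from the output shown above.
+meta = {
+    "chain": [{"transform": "Zip"}, {"transform": "Base64"}],
+    "virtualPath": "Zip(/path/to/archive.zip)::Base64(encoded_file)::extracted",
+}
+summary = describe_chain(meta)
+```
+
+Inside Binary Ninja, the argument would be the value read from `bv.parent_view.auto_metadata['container']`.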
+
+You can also add custom metadata to child contexts:
+
+```python
+child = context.create_child(data, filename)
+if child.metadata_obj:
+    child.metadata_obj["custom_field"] = "value"
+```
+
+## Testing Container Transforms
+
+When testing your container transform, you can use the Python API directly:
+
+```python
+from binaryninja import TransformSession
+
+# Test with a file
+session = TransformSession("test_container.bin")
+
+# Process and check results
+if session.process():
+    print(f"Extraction complete: {session.current_context.filename}")
+else:
+    print("User interaction required")
+    ctx = session.current_context
+    if ctx.parent and ctx.parent.has_available_files:
+        print(f"Available files: {ctx.parent.available_files}")
+```
+
+For interactive testing in the UI:
+
+1. **Full Mode**: Settings → `files.container.mode` → "Full"
+   - Opens your container and shows all extracted files immediately
+2. **Interactive Mode**: Settings → `files.container.mode` → "Interactive"
+   - Requires clicking through each level of the container hierarchy
+
+## API Reference
+
+For complete API documentation, see:
+
+- [`Transform`](https://api.binary.ninja/binaryninja.transform-module.html#binaryninja.transform.Transform) - Base transform class
+- [`TransformContext`](https://api.binary.ninja/binaryninja.transform-module.html#binaryninja.transform.TransformContext) - Container extraction context
+- [`TransformSession`](https://api.binary.ninja/binaryninja.transform-module.html#binaryninja.transform.TransformSession) - Multi-stage extraction workflow
+- [`TransformResult`](https://api.binary.ninja/binaryninja.enums-module.html#binaryninja.enums.TransformResult) - Extraction result codes
+- [`TransformCapabilities`](https://api.binary.ninja/binaryninja.enums-module.html#binaryninja.enums.TransformCapabilities) - Transform capability flags
diff --git a/docs/dev/index.md b/docs/dev/index.md
@@ -11,6 +11,7 @@ The Binary Ninja API is available through a [Core API](#core-api), through the [
 The Python API is the most common third-party API and is used in many [public plugins](https://github.com/vector35/community-plugins). Here's a list of the most important Python API documentation resources:
 
  - [Writing Python Plugins](plugins.md)
+ - [Container Transforms](containertransforms.md) - Creating custom container/archive decoders
  - [Applying Annotations](annotation.md)
  - [Script Cookbook](cookbook.md) with common examples and concepts explained
  - [Python API Reference](https://api.binary.ninja/) (available offline via the Help menu)
diff --git a/docs/guide/index.md b/docs/guide/index.md
@@ -97,6 +97,79 @@ While Binary Ninja defaults to opening most files with sane defaults without pro
 
 Items 1, 3, 5, 6 in the above list all describe methods you can use to override the default settings and request an "Open with Options" dialog.
 
+### Container Browser
+
+![container browser](../img/container-browser.png "Container Browser"){ width="800" }
+
+The Container Browser provides an interactive way to explore and extract files from container formats such as ZIP archives, encrypted containers, and other nested file structures. When opening files that contain nested content, Binary Ninja can automatically detect and decode these containers, presenting a hierarchical tree view of all available files.
+
+Binary Ninja includes built-in support for the following container formats:
+
+- **Zip**: ZIP archives (including password-protected)
+- **Gzip**: Gzip compressed files
+- **Zlib**: Zlib compressed data
+- **CaRT**: Compressed and RC4 Transport format, used to store and share malware samples together with their metadata
+- **IntelHex**: Intel HEX format files
+- **SRec**: Motorola S-record format files
+- **TiTxt**: Texas Instruments TXT format
+- **IMG4**: Apple IMG4 container format
+- **LZFSE**: Apple LZFSE compressed data
+
+#### Container Detection Modes
+
+Binary Ninja offers three container detection modes, configurable via the [`files.container.mode`](settings.md#files.container.mode) setting:
+
+- **Full** (default): Automatically discovers all nested paths and builds a complete context tree before requesting user selection. This mode provides the most complete view of the container structure upfront.
+- **Interactive**: Requires user interaction at each level of the container hierarchy. This mode is useful when working with deeply nested containers or when you want more control over the extraction process.
+- **Disabled**: Opens the file as-is without attempting to unwrap container formats.
+
+#### Working with Containers
+
+When you open a container file in Full or Interactive mode, the Container Browser dialog displays the following for each entry:
+
+- **Name**: The file or entry name within the container
+- **Type**: The detected format
+- **Size**: File size in bytes
+- **Path**: The hierarchical path within the container structure
+
+The browser supports:
+
+- **Filtering**: Use the search box at the top to filter by name, type, or path
+- **Password Protection**: Binary Ninja will attempt common passwords (configurable via [`files.container.defaultPasswords`](settings.md#files.container.defaultPasswords)) before prompting for manual entry
+- **Custom Extraction**: Right-click any entry to access "Extract With" options, allowing you to apply different transforms (Base64, Hex, etc.) to decode content
+- **Metadata Display**: For containers that include embedded metadata, the associated information is displayed in the preview pane on the right
+
+#### Virtual Paths
+
+Files opened through the Container Browser maintain a virtual path that tracks the full extraction chain. For example:
+
+```python
+>>> bv.file.virtual_path
+'Zip(.../papi_b64.zip)::Base64(papi_b64)::extracted'
+```
+
+This virtual path is also stored in the file's metadata for the 'Raw' BinaryView and can be accessed programmatically:
+
+```python
+>>> bv.parent_view.auto_metadata['container']
+{'chain': [{'transform': 'Zip'}, {'transform': 'Base64'}], 'virtualPath': 'Zip(.../papi_b64.zip)::Base64(papi_b64)::extracted'}
+```
+
+???+ Note "Note"
+     The virtual path and associated metadata are not persisted when saving to a database (`.bndb` file). Additionally, the format of the virtual path string may change in future releases.
+
+#### Settings
+
+The following settings control Container Browser behavior:
+
+- [`files.container.mode`](settings.md#files.container.mode): Controls container detection mode (Full/Interactive/Disabled)
+- [`files.container.autoOpen`](settings.md#files.container.autoOpen): Automatically opens files when there is exactly one extraction path with no required input
+- [`files.container.defaultPasswords`](settings.md#files.container.defaultPasswords): List of passwords to attempt for encrypted containers
+
+#### Adding Custom Container Support
+
+Binary Ninja's container system is extensible through the Transform API. Developers can create custom container decoders for proprietary or specialized formats. For information on implementing custom container transforms, see the [Container Transforms developer guide](../dev/containertransforms.md).
+
 ## Saving Files
 
 ![save choices >](../img/save-choices.png "Save Menu Choices"){ width="400" }
diff --git a/docs/img/container-browser.png b/docs/img/container-browser.png
diff --git a/mkdocs.yml b/mkdocs.yml
@@ -129,6 +129,7 @@ nav:
         - 'dev/index.md'
         - Cookbook: 'dev/cookbook.md'
         - Writing Plugins: 'dev/plugins.md'
+        - Container Transforms: 'dev/containertransforms.md'
         - Automation: 'dev/batch.md'
         - BNIL / Architectures:
             - BNIL Guide&#58; Overview: 'dev/bnil-overview.md'