Scripts to extract original documents from .P7M signed files.
These scripts scan a specified directory for .P7M files (digitally signed containers) and extract the embedded PDF documents. They identify the PDF header within the P7M file and save the extracted content as a new PDF file.
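The header-based extraction described above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the function name `extract_pdf_bytes` is hypothetical, and trimming at the last `%%EOF` marker is an assumption (the description only mentions locating the PDF header).

```python
from typing import Optional

PDF_HEADER = b"%PDF-"   # every PDF file starts with this marker
PDF_EOF = b"%%EOF"      # conventional end-of-file marker in a PDF


def extract_pdf_bytes(data: bytes) -> Optional[bytes]:
    """Return the embedded PDF bytes, or None if no PDF header is found."""
    start = data.find(PDF_HEADER)
    if start == -1:
        return None
    # Assumption: cut at the last %%EOF so trailing signature bytes are dropped.
    end = data.rfind(PDF_EOF)
    if end == -1 or end < start:
        # No usable end marker: keep everything from the header onward.
        return data[start:]
    return data[start:end + len(PDF_EOF)]
```

Note that a byte scan like this does not verify the signature. Some P7M files also store the payload in several chunks, in which case a scan may pick up stray bytes between them.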
Available implementations:
- PowerShell (Extract-P7M.ps1) - Windows-native solution
- Python (extract-p7m.py) - Cross-platform solution
- Extracts PDF files from P7M containers
- Supports recursive directory scanning
- Provides progress indication during processing
- Offers detailed mode with tabular output
- Validates input directory paths
- Includes comprehensive error handling with colored output
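As an illustration of the path validation and recursive scanning features listed above, here is a minimal sketch using only the standard library. The function name `find_p7m_files` and the case-insensitive extension match are assumptions, not taken from the scripts themselves.

```python
from pathlib import Path
from typing import List


def find_p7m_files(directory: str, recurse: bool = False) -> List[Path]:
    """List .p7m files in a directory, optionally descending into subdirectories."""
    root = Path(directory)
    # Validate the input path before scanning.
    if not root.is_dir():
        raise NotADirectoryError(f"Not a directory: {directory}")
    candidates = root.rglob("*") if recurse else root.iterdir()
    # Assumption: match the extension case-insensitively (.p7m, .P7M, ...).
    return sorted(p for p in candidates if p.is_file() and p.suffix.lower() == ".p7m")
```

Calling `find_p7m_files(".", recurse=True)` would correspond to running the scripts with the recursive option on the current directory.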
- Windows PowerShell 5.1 or later
- Execution policy allowing script execution (e.g., RemoteSigned)
- Python 3.6 or later
- No additional dependencies required
- Clone the repo
git clone https://github.com/mantamburini/p7m-extractor.git
.\Extract-P7M.ps1 [-Path <String>] [-Recurse] [-Detailed]
- -Path <String>
  The directory path to scan for .P7M files. Defaults to the current working directory. Must be an existing directory (a valid container path).
- -Recurse <Switch>
  If specified, recursively scans subdirectories for .P7M files.
- -Detailed <Switch>
  If specified, provides detailed processing information in tabular format instead of simple console messages.
- Extract PDFs from P7M files in the current directory:
  .\Extract-P7M.ps1
- Extract PDFs from a specific directory:
  .\Extract-P7M.ps1 -Path "C:\MyDocuments"
- Extract PDFs recursively from a directory:
  .\Extract-P7M.ps1 -Path "C:\MyDocuments" -Recurse
- Extract PDFs with detailed tabular output:
  .\Extract-P7M.ps1 -Path "C:\MyDocuments" -Detailed
python extract-p7m.py [path] [-r] [-d]
- path (optional)
  The directory path to scan for .P7M files. Defaults to the current working directory.
- -r, --recurse
  If specified, recursively scans subdirectories for .P7M files.
- -d, --detailed
  If specified, provides detailed processing information in tabular format.
- Extract PDFs from P7M files in the current directory:
  python extract-p7m.py
- Extract PDFs from a specific directory:
  python extract-p7m.py "/path/to/documents"
- Extract PDFs recursively from a directory:
  python extract-p7m.py "/path/to/documents" -r
- Extract PDFs with detailed tabular output:
  python extract-p7m.py "/path/to/documents" -d
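The Python command line documented above (an optional path plus the -r and -d flags) maps naturally onto the standard argparse module. The sketch below shows one way such an interface could be declared. The function name `build_parser` and the help strings are illustrative assumptions, not copied from extract-p7m.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented interface: [path] [-r] [-d]."""
    parser = argparse.ArgumentParser(
        prog="extract-p7m.py",
        description="Extract embedded PDF documents from .P7M files.",
    )
    # Optional positional argument; "." stands for the current working directory.
    parser.add_argument("path", nargs="?", default=".",
                        help="directory to scan for .P7M files")
    parser.add_argument("-r", "--recurse", action="store_true",
                        help="scan subdirectories recursively")
    parser.add_argument("-d", "--detailed", action="store_true",
                        help="print detailed results as a table")
    return parser
```

With this declaration, `build_parser().parse_args(["/path/to/documents", "-r"])` yields `path="/path/to/documents"`, `recurse=True`, and `detailed=False`.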
- Without -Detailed: Displays colored "OK " (green) for successful extractions, warnings for files without PDFs, or red error messages for processing failures.
- With -Detailed: Outputs a table with columns: FileName, Status, OutputFile, Size. Exits early if no .p7m files are found.
- Without -d: Displays "OK " for successful extractions, "NO PDF " for files without PDFs, or error messages for processing failures.
- With -d: Outputs a table with columns: FileName, Status, OutputFile, Size. Exits early if no .p7m files are found.
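To make the detailed mode concrete, the following sketch renders rows under the four documented columns as an aligned plain-text table. Only the column names come from the description above. The function name `format_table`, the separator row, and the row values in the usage note are assumptions for illustration.

```python
from typing import Dict, List

COLUMNS = ["FileName", "Status", "OutputFile", "Size"]


def format_table(rows: List[Dict[str, str]]) -> str:
    """Render rows as a left-aligned table with a header and a separator line."""
    # Each column is as wide as its longest cell (or its header).
    widths = {
        c: max([len(c)] + [len(str(r.get(c, ""))) for r in rows])
        for c in COLUMNS
    }

    def line(values: List[str]) -> str:
        cells = (str(v).ljust(widths[c]) for c, v in zip(COLUMNS, values))
        return "  ".join(cells).rstrip()

    out = [line(COLUMNS), line(["-" * widths[c] for c in COLUMNS])]
    out += [line([r.get(c, "") for c in COLUMNS]) for r in rows]
    return "\n".join(out)
```

For example, `print(format_table([{"FileName": "a.pdf.p7m", "Status": "OK", "OutputFile": "a.pdf", "Size": "12345"}]))` prints a header row, a dashed separator, and one data row.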
See the open issues for a full list of proposed features (and known issues).
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement".
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Distributed under the GPL-3.0 License. See LICENSE for more information.
Marcello Anselmi Tamburini
Project Link: https://github.com/mantamburini/p7m-extractor