A Python script that converts WordPress eXtended RSS (WXR) export files to CSV format for easy data analysis and migration.
- Converts WordPress export files (WXR format) to CSV
- Extracts posts, pages, and custom post types
- Preserves metadata including categories, tags, custom fields
- Handles HTML content and special characters properly
- Command-line interface for easy automation
- No external dependencies (uses Python standard library only)
- Clone or download this repository
- Ensure you have Python 3.6+ installed
- No additional packages needed - uses only Python standard library
Basic usage:
python wxr_to_csv.py your_wordpress_export.xml
Specify output file:
python wxr_to_csv.py your_wordpress_export.xml -o output.csv
Include specific post types:
python wxr_to_csv.py your_wordpress_export.xml -t post page custom_post_type
Run the autorun script:
python autorun.py
from wxr_to_csv import WXRToCSVConverter
converter = WXRToCSVConverter()
converter.convert_to_csv('export.xml', 'output.csv', ['post', 'page'])
The generated CSV file includes the following columns:
- post_id: WordPress post ID
- title: Post/page title
- post_type: Type of content (post, page, etc.)
- status: Publication status (publish, draft, etc.)
- post_date: Publication date
- post_modified: Last modification date
- creator: Author username
- link: Post URL
- post_name: URL slug
- description: Post excerpt/description
- content: Full post content (HTML)
- excerpt: Post excerpt
- categories: Categories (semicolon-separated)
- tags: Tags (semicolon-separated)
- comment_status: Comment settings
- ping_status: Pingback/trackback settings
- post_parent: Parent post ID (for hierarchical content)
- menu_order: Menu order
- is_sticky: Sticky post flag
- post_password: Password protection
- custom_fields: Custom field data
- pub_date: RSS publication date
- post_date_gmt: Publication date (GMT)
- post_modified_gmt: Modification date (GMT)
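As a rough illustration of working with these columns, the sketch below reads the generated CSV back with the standard library and splits the semicolon-separated `categories` and `tags` columns into lists. The helper name `load_rows` is hypothetical and not part of this project, and the exact separator spacing is an assumption, so the code strips whitespace around each value.

```python
import csv

# Columns documented above as semicolon-separated lists.
LIST_COLUMNS = ("categories", "tags")


def load_rows(csv_path):
    """Read the converter's CSV output into a list of dicts.

    Hypothetical helper: list-like columns are split on ';' and trimmed,
    all other columns are left as plain strings.
    """
    rows = []
    # newline="" lets the csv module handle line breaks inside quoted
    # fields, which matters for multi-line HTML in the content column.
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            for column in LIST_COLUMNS:
                raw = row.get(column) or ""
                row[column] = [part.strip() for part in raw.split(";") if part.strip()]
            rows.append(row)
    return rows
```

Calling `load_rows('output.csv')` would then give one dict per post, with `row['categories']` and `row['tags']` as Python lists.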
- Log into your WordPress admin dashboard
- Go to Tools → Export
- Select All content or choose specific content types
- Click Download Export File
- Use the downloaded .xml file with this script
usage: wxr_to_csv.py [-h] [-o OUTPUT] [-t TYPES [TYPES ...]] input_file
Convert WordPress eXtended RSS (WXR) files to CSV format
positional arguments:
  input_file            Path to the WXR file to convert
optional arguments:
  -h, --help            show this help message and exit
  -o OUTPUT, --output OUTPUT
                        Output CSV file path (default: same name as input with .csv extension)
  -t TYPES [TYPES ...], --types TYPES [TYPES ...]
                        Post types to include (default: post page)
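For readers who want to wrap or extend the command line, here is a minimal sketch of an `argparse` parser that accepts the same arguments as the help text above. It is an approximation written from that help output, not the script's actual source; `build_parser` and `resolve_output` are hypothetical names.

```python
import argparse
from pathlib import Path


def build_parser():
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="wxr_to_csv.py",
        description="Convert WordPress eXtended RSS (WXR) files to CSV format",
    )
    parser.add_argument("input_file", help="Path to the WXR file to convert")
    parser.add_argument(
        "-o", "--output",
        help="Output CSV file path (default: same name as input with .csv extension)",
    )
    parser.add_argument(
        "-t", "--types", nargs="+", default=["post", "page"],
        help="Post types to include (default: post page)",
    )
    return parser


def resolve_output(args):
    """Apply the documented default: input file name with a .csv extension."""
    return args.output or str(Path(args.input_file).with_suffix(".csv"))
```

Because `-t` takes one or more values, a parser built this way would read every following bare word as a post type. Placing the input file before `-t`, as the usage examples do, avoids that ambiguity.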
Convert all posts and pages:
python wxr_to_csv.py wordpress_export.xml
Convert only blog posts:
python wxr_to_csv.py wordpress_export.xml -t post
Convert custom post types:
python wxr_to_csv.py wordpress_export.xml -t product testimonial
- "Error parsing WXR file": The XML file may be corrupted or not a valid WXR file
- "No posts found": Check that the post types you specified exist in the export
- Encoding issues: The script handles UTF-8 by default; if characters look garbled, check that the export file is saved as UTF-8
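When the first two messages appear, it can help to inspect the export directly. The sketch below is a standalone diagnostic, not part of this script: it checks that the file is well-formed XML and counts the post types it contains, so you can see which values are valid for `-t`. The function name `count_post_types` is hypothetical. It matches on the local tag name because the version in the WXR namespace URI varies between exports.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_post_types(wxr_path):
    """Return a Counter of post types in the export, or None if the XML is malformed."""
    try:
        tree = ET.parse(wxr_path)
    except ET.ParseError as exc:
        print(f"Not well-formed XML: {exc}")
        return None

    counts = Counter()
    for item in tree.getroot().iter("item"):
        for child in item:
            # Namespaced tags look like "{namespace-uri}post_type";
            # keep only the part after the closing brace.
            if child.tag.rsplit("}", 1)[-1] == "post_type":
                counts[(child.text or "").strip()] += 1
    return counts
```

If the returned counter is empty, the file is valid XML but contains no WordPress items, which suggests it is not a WXR export.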
For very large WordPress exports:
- The script loads the entire XML file into memory
- Consider splitting large exports if you encounter memory issues
- You can filter by post type to reduce the dataset size
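If an export is too large to load at once, one alternative is to pre-scan it with a streaming parser. The sketch below uses `xml.etree.ElementTree.iterparse` from the standard library and is an illustration of that technique, not how this script works. The name `iter_items` is hypothetical. It yields the title and post type of each matching item and clears each item after it has been read.

```python
import xml.etree.ElementTree as ET


def iter_items(wxr_path, wanted_types=("post", "page")):
    """Yield (title, post_type) for matching items without building the full tree."""
    for _event, elem in ET.iterparse(wxr_path, events=("end",)):
        if elem.tag != "item":
            continue
        # Map local tag names (namespace stripped) to text for direct children.
        fields = {child.tag.rsplit("}", 1)[-1]: (child.text or "") for child in elem}
        if fields.get("post_type") in wanted_types:
            yield fields.get("title", ""), fields["post_type"]
        # Release the item's children and text; the emptied <item> shell
        # stays attached to its parent, so memory still grows slightly.
        elem.clear()
```

A loop such as `for title, post_type in iter_items('export.xml'): ...` could be used to estimate how many rows a conversion would produce before running it.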
This project is open source and available under the MIT License.
Feel free to submit issues, fork the repository, and create pull requests for any improvements.