@obenland
Member

Reference: https://wordpress.org/support/topic/import-mastodon-beta/#post-18701387

Proposed changes:

  • Add support for tar.gz and tgz archive formats to the Mastodon importer
  • Implement multiple extraction methods for tar.gz files (PHP's PharData class and the system tar command)
  • Refactor archive extraction logic into dedicated methods for better code organization
  • Add proper error handling with WP_Error for consistency with WordPress conventions
  • Update user documentation to mention both ZIP and TAR.GZ format support

Other information:

  • Have you written new tests for your changes, if applicable?

Testing instructions:

  1. Go to Tools → Import → Mastodon (Beta)
  2. Request and download your Mastodon archive (it may come as a .tar.gz file)
  3. Upload the .tar.gz file using the importer
  4. Verify the import process works correctly

Alternative testing with ZIP:

  • The importer should continue to work normally with .zip files
  • Test with a regular Mastodon .zip archive to ensure backward compatibility

Server compatibility:

  • If PHP's PharData class is available (the Phar extension has been bundled since PHP 5.3), tar.gz extraction will work automatically
  • If PharData is not available but exec() is enabled, the system tar command will be used
  • If neither is available, a helpful error message will guide users to use ZIP format instead
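
The fallback order above could be probed roughly as follows. This is a minimal sketch, not code from the PR: `pick_tar_gz_method()` is a hypothetical helper name, and the zlib check is an added assumption based on gzip-compressed archives needing that extension.

```php
<?php
/**
 * Pick a tar.gz extraction strategy based on what the server offers.
 * Hypothetical helper for illustration; not taken from the PR.
 *
 * @return string 'phardata', 'exec', or 'none'.
 */
function pick_tar_gz_method(): string {
	// PharData ships with the Phar extension; gzip-compressed archives also need zlib.
	if ( class_exists( 'PharData' ) && extension_loaded( 'zlib' ) ) {
		return 'phardata';
	}

	// exec() can be blocked through the disable_functions ini setting,
	// so check both the function and the ini value to be safe across PHP versions.
	$disabled = array_map( 'trim', explode( ',', (string) ini_get( 'disable_functions' ) ) );
	if ( function_exists( 'exec' ) && ! in_array( 'exec', $disabled, true ) ) {
		return 'exec';
	}

	// Neither method is usable; the caller should suggest uploading a ZIP instead.
	return 'none';
}

echo pick_tar_gz_method(), PHP_EOL;
```

Checking capabilities up front keeps the "use ZIP instead" message in one place rather than scattered across the individual extraction paths.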

Changelog entry

  • Automatically create a changelog entry from the details below.
Changelog Entry Details

Significance

  • Minor

Type

  • Added - for new features

Message

Add support for tar.gz archives in Mastodon importer.

Extends the Mastodon importer to accept tar.gz and tgz archives in addition to zip files.

Since WordPress doesn't have built-in tar.gz support, the implementation tries multiple extraction methods:
1. PHP's PharData class (most reliable when available)
2. System tar command via exec (fallback for servers that allow it)
3. Graceful failure with helpful error messages

The extract_tar_gz() method follows WordPress conventions by returning WP_Error on failure, matching the behavior of unzip_file().
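
For readers unfamiliar with PharData, the first extraction method might look something like the sketch below. It is a standalone illustration under stated assumptions, not the PR's `extract_tar_gz()`: the function name `extract_tar_gz_sketch()` is hypothetical, and it returns an error string where the importer is described as returning a WP_Error.

```php
<?php
/**
 * Extract a .tar.gz or .tgz archive with PharData.
 * Hypothetical standalone helper for illustration only.
 *
 * @param string $archive     Path to the archive (needs a recognizable extension).
 * @param string $destination Directory to extract into.
 * @return true|string True on success, an error message on failure.
 */
function extract_tar_gz_sketch( string $archive, string $destination ) {
	if ( ! class_exists( 'PharData' ) ) {
		return 'PharData is not available on this server.';
	}

	try {
		$phar = new PharData( $archive );
		// Second argument null extracts everything; third argument true overwrites existing files.
		$phar->extractTo( $destination, null, true );
	} catch ( Exception $e ) {
		// PharData signals unreadable or corrupt archives by throwing.
		return $e->getMessage();
	}

	return true;
}
```

Inside WordPress, the string branch would presumably become `new WP_Error( ... )` so callers can treat the result the same way as the return value of unzip_file().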
Extract the archive type detection and extraction logic from the import() method into a new extract_archive() method for better code organization and maintainability.

This makes the import() method cleaner and the archive handling logic more reusable.
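
The archive type detection mentioned here could be as simple as a suffix check on the uploaded file name. The sketch below is an assumption about the approach, not the PR's code; `detect_archive_type()` and its return values are hypothetical.

```php
<?php
/**
 * Classify an uploaded archive by its file name suffix.
 * Hypothetical helper for illustration; names and return values are assumptions.
 *
 * @param string $filename Original file name of the upload.
 * @return string 'tar.gz', 'zip', or 'unknown'.
 */
function detect_archive_type( string $filename ): string {
	$name = strtolower( $filename );

	// Treat .tgz as shorthand for .tar.gz.
	if ( '.tar.gz' === substr( $name, -7 ) || '.tgz' === substr( $name, -4 ) ) {
		return 'tar.gz';
	}

	if ( '.zip' === substr( $name, -4 ) ) {
		return 'zip';
	}

	return 'unknown';
}

echo detect_archive_type( 'archive-20251112.tar.gz' ), PHP_EOL;
```

A dispatcher such as extract_archive() could then switch on this value and hand off to unzip_file() or the tar.gz path.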
Copilot AI review requested due to automatic review settings November 12, 2025 19:04
@obenland obenland self-assigned this Nov 12, 2025
@obenland obenland requested a review from pfefferle November 12, 2025 19:04
Copilot AI left a comment

Pull Request Overview

This PR adds support for tar.gz and tgz archive formats to the Mastodon importer, which previously only supported ZIP files. The implementation includes multiple fallback extraction methods (PHP's PharData class and the system tar command) with proper error handling.

Key Changes:

  • Extended file type validation to accept tar.gz and tgz formats alongside ZIP
  • Refactored archive extraction into dedicated methods with multiple fallback strategies
  • Updated user-facing documentation and error messages to reflect new format support

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File | Description
includes/wp-admin/import/class-mastodon.php | Implements tar.gz extraction with PharData/exec fallbacks, updates file type validation, and refactors extraction logic
.github/changelog/2453-from-description | Adds changelog entry documenting the new feature

}

$command = sprintf(
'tar -xzf %s -C %s 2>&1',
Copilot AI Nov 12, 2025

The tar command execution could be vulnerable to command injection if the file path contains shell metacharacters. While escapeshellarg() is used, consider validating that self::$archive path doesn't contain unexpected characters or use a safer extraction method.

);
$output = array();
$return_var = 0;
exec( $command, $output, $return_var );
Copilot AI Nov 12, 2025

Using exec() with file paths poses a security risk even with escapeshellarg(). Consider checking that the paths are within expected directories (e.g., by resolving them with realpath() and verifying they sit under WP_CONTENT_DIR) before executing the command.
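
One way to act on this suggestion is sketched below. Both helpers are hypothetical and not part of the PR: `path_is_within()` resolves the path and the base with realpath() and compares prefixes, and `build_tar_command()` shows the escapeshellarg() quoting the review refers to. The base directory to check against (for example WP_CONTENT_DIR) is left to the caller.

```php
<?php
/**
 * Check that a path resolves to a location inside a base directory.
 * Hypothetical helper for illustration.
 */
function path_is_within( string $path, string $base_dir ): bool {
	$real_path = realpath( $path );
	$real_base = realpath( $base_dir );

	// realpath() returns false for paths that do not exist.
	if ( false === $real_path || false === $real_base ) {
		return false;
	}

	// Compare with trailing separators so "/tmp/foobar" is not treated as inside "/tmp/foo".
	$real_base = rtrim( $real_base, DIRECTORY_SEPARATOR ) . DIRECTORY_SEPARATOR;
	return 0 === strncmp( $real_path . DIRECTORY_SEPARATOR, $real_base, strlen( $real_base ) );
}

/**
 * Build the tar command with both paths shell-quoted.
 * Hypothetical helper; the command template matches the one shown in the diff.
 */
function build_tar_command( string $archive, string $destination ): string {
	return sprintf(
		'tar -xzf %s -C %s 2>&1',
		escapeshellarg( $archive ),
		escapeshellarg( $destination )
	);
}
```

The containment check would run before the command is built, so that exec() is only reached when both the archive and the destination resolve to locations under the expected directory.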

@obenland obenland marked this pull request as draft November 13, 2025 02:55

3 participants