A command-line tool for converting between Parquet and CSV file formats using pandas.
- Automatic format detection: Detects whether the input file is Parquet or CSV
- Bidirectional conversion: Convert Parquet to CSV or CSV to Parquet
- Flexible output naming: Auto-generates output filenames or allows custom naming
- Error handling: Comprehensive error handling with informative messages
- Force conversion: Option to force conversion even with uncertain file formats
pip install parquetconv
After installation, you can use the parquetconv command directly from anywhere in your terminal.
Clone the repository and install:
git clone https://github.com/ToyokoLabs/parquetconv.git
cd parquetconv
pip install -e .
The project uses uv for dependency management. Install dependencies with:
uv sync
Convert a Parquet file to CSV:
parquetconv input.parquet
Convert a CSV file to Parquet:
parquetconv input.csv
Alternatively, run the tool as a Python module:
python -m parquetconv.cli input.parquet
python -m parquetconv.cli input.csv
Specify a custom output filename:
parquetconv input.parquet -o custom_output.csv
parquetconv input.csv -o custom_output.parquet
Force conversion (useful when file format detection is uncertain):
parquetconv input_file --force
- input_file: Path to the input file (required)
- -o, --output: Custom output file path (optional)
- --force: Force conversion even if file format detection is uncertain
- -h, --help: Show help message
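The arguments listed above could be declared with Python's standard argparse module roughly as follows. This is a minimal sketch, not the project's actual parser: the function name build_parser and the help strings are assumptions made for illustration, and only the option names come from this README.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented options (hypothetical layout)."""
    parser = argparse.ArgumentParser(
        prog="parquetconv",
        description="Convert between Parquet and CSV files.",
    )
    # Required positional argument: the file to convert
    parser.add_argument("input_file", help="Path to the input file")
    # Optional output path; None means the name is generated automatically
    parser.add_argument("-o", "--output", default=None,
                        help="Custom output file path")
    # Boolean switch, False unless given on the command line
    parser.add_argument("--force", action="store_true",
                        help="Force conversion even if format detection is uncertain")
    # -h/--help is added by argparse automatically
    return parser


if __name__ == "__main__":
    # Parse an explicit argument list instead of sys.argv for demonstration
    args = build_parser().parse_args(["data.csv", "-o", "processed_data.parquet"])
    print(args.input_file, args.output, args.force)
```

Leaving the output default as None lets the caller tell "no name given" apart from any real path, which is what makes auto-generated filenames possible.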
# Convert Parquet to CSV with auto-generated filename
parquetconv data.parquet
# Output: data.csv
# Convert CSV to Parquet with custom filename
parquetconv data.csv -o processed_data.parquet
# Convert with force flag
parquetconv unknown_file --force
# Get help
parquetconv --help
- Python 3.9+
- pandas >= 2.3.2
- pyarrow >= 21.0.0
- File Detection: The tool first checks the file extension, then attempts to read the file to determine its format
- Format Conversion: Uses pandas to read the input file and convert it to the opposite format
- Output Generation: Creates the output file with an appropriate extension if not specified
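The three steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the tool's real code: the function names are hypothetical, the content check relies on the fact that Parquet files start with the magic bytes PAR1 (the tool itself may inspect files differently), and treating a forced file of unknown format as CSV is a guess.

```python
from pathlib import Path

PARQUET_MAGIC = b"PAR1"  # Parquet files begin (and end) with these four bytes


def detect_format(path):
    """Return "parquet", "csv", or None when the format is uncertain."""
    path = Path(path)
    suffix = path.suffix.lower()
    # Step 1: trust a recognized extension
    if suffix == ".parquet":
        return "parquet"
    if suffix == ".csv":
        return "csv"
    # Step 2: extension is inconclusive, so peek at the first bytes
    with path.open("rb") as fh:
        if fh.read(4) == PARQUET_MAGIC:
            return "parquet"
    return None


def default_output_path(path, fmt):
    """Swap the extension for the opposite format's extension."""
    return Path(path).with_suffix(".csv" if fmt == "parquet" else ".parquet")


def convert(path, output=None, force=False):
    """Convert path to the opposite format and return the output path."""
    # Imported here so the helpers above work without pandas installed
    import pandas as pd

    fmt = detect_format(path)
    if fmt is None:
        if not force:
            raise ValueError(f"Cannot determine the format of {path}")
        fmt = "csv"  # assumption: treat unknown input as CSV when forced
    out = Path(output) if output else default_output_path(path, fmt)
    if fmt == "parquet":
        pd.read_parquet(path).to_csv(out, index=False)
    else:
        pd.read_csv(path).to_parquet(out, index=False)
    return out
```

Passing index=False keeps the pandas row index out of the output file, so a round trip does not add an extra column.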
The tool provides clear error messages for:
- Missing input files
- Unsupported file formats
- Read/write errors during conversion
- Invalid file content
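One way a command-line tool can turn these failure cases into messages is to wrap the conversion call and map exceptions to an exit code. The sketch below is hypothetical: the run function, its message wording, and the exit codes are assumptions for illustration, and the conversion step is passed in as a callable so the example stays independent of pandas.

```python
import sys
from pathlib import Path


def run(convert, input_file):
    """Call convert(input_file); print a message and return an exit code."""
    # Missing input file: report it before attempting any conversion
    if not Path(input_file).is_file():
        print(f"Error: input file not found: {input_file}", file=sys.stderr)
        return 1
    try:
        output = convert(input_file)
    except ValueError as exc:
        # Unsupported format or invalid content; pandas' ParserError
        # is a ValueError subclass, so it is caught here as well
        print(f"Error: {exc}", file=sys.stderr)
        return 1
    except OSError as exc:
        # Read/write failures such as permission errors or a full disk
        print(f"Error: could not read or write file: {exc}", file=sys.stderr)
        return 1
    print(f"Converted {input_file} -> {output}")
    return 0
```

Returning the exit code instead of calling sys.exit inside the function keeps it easy to test; a real entry point could end with sys.exit(run(...)).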
To contribute to the project:
- Fork the repository
- Create a feature branch
- Make your changes
- Run tests (if available)
- Submit a pull request
This project is open source and available under the GNU General Public License v3.0.
Sebastian Bassi - sebastian@toyoko.io