All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Tweaked PostgreSQL adapter for CrateDB compatibility
- **Source and Target Options in YAML Configuration**: YAML configs can now specify connector-specific options under `source.options` and `target.options` sections
  - Allows passing protocol-specific configuration without cluttering the main options section
  - Options are passed to the respective protocol handlers for specialized processing
  - Particularly useful for HTTP sources with custom headers and authentication
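As a sketch, a config using these sections might look like the following; only `source.options` and `target.options` come from the feature description, while the surrounding keys (`uri`, the specific option names) are illustrative assumptions:

```yaml
# Hypothetical TinyETL config sketch; keys other than
# source.options / target.options may differ from the real format.
source:
  uri: "https://api.example.com/export.csv"
  options:
    header.Accept: "text/csv"   # protocol-specific option for the HTTP handler
target:
  uri: "postgres://localhost/analytics"
  options: {}                   # connector-specific target options go here
```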
- **HTTP Protocol Options Support**: HTTP/HTTPS sources now support comprehensive options
  - **Custom Headers**: Add custom HTTP headers with `header.HeaderName` syntax
    - Example: `header.User-Agent: "TinyETL/0.10.0"`
    - Example: `header.Accept: "text/csv"`
    - Example: `header.X-API-Version: "v2"`
  - **Basic Authentication**: Configure using `auth.basic.username` and `auth.basic.password` options
  - **Bearer Token Authentication**: Configure using the `auth.bearer` option
  - Supports environment variable substitution for sensitive values (e.g., `auth.bearer: "${BEARER_TOKEN}"`)
  - Headers and authentication are automatically logged (credentials masked) for debugging
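Putting the documented option keys together, an HTTP source section could be sketched as follows; the `header.*` and `auth.*` keys are taken from the list above, while the surrounding structure is an assumption:

```yaml
source:
  uri: "https://api.example.com/data.csv"
  options:
    header.User-Agent: "TinyETL/0.10.0"
    header.X-API-Version: "v2"
    # Bearer token substituted from the environment so the
    # secret never lands in the config file:
    auth.bearer: "${BEARER_TOKEN}"
```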
- **Protocol Abstraction**: All protocols now support the options parameter throughout the API
  - `Protocol::create_source()` accepts an options HashMap
  - `Protocol::create_target()` accepts an options HashMap
  - File protocol: Options parameter available for future extensions
  - SSH protocol: Options parameter available for future extensions
  - Snowflake protocol: Options parameter available for future extensions
- **Default Configuration Template**: Updated `generate-default-config` output to include examples of source and target options
  - Shows commented examples of HTTP authentication methods
  - Demonstrates custom header configuration
  - Includes environment variable usage examples
- Added Example 18: HTTP with Authentication demonstrating various authentication methods
  - `config.yaml` - Basic authentication example
  - `bearer_config.yaml` - Bearer token authentication
  - `custom_headers_config.yaml` - Custom HTTP headers
  - `public_config.yaml` - Public endpoint with optional headers
  - Includes test HTTP server setup via Docker Compose
- Updated YAML configuration format to show source/target options structure
- Added inline documentation for HTTP authentication options
- Enhanced default config template with practical examples
- **Transform Configuration Format**: YAML transform configuration now requires explicit `type` and `value` fields
  - Old format: `transform: "expression"` or `transform_file: "file.lua"`
  - New format: `transform: { type: inline, value: "expression" }` or `transform: { type: file, value: "file.lua" }`
  - Migration required for existing YAML config files using transformations
  - CLI arguments (`--transform`, `--transform-file`) remain unchanged and fully compatible
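In config terms, the migration looks like this sketch, using the field names and placeholder values from the bullets above:

```yaml
# Old format (no longer accepted):
#   transform: "expression"
#   transform_file: "file.lua"

# New format, inline expression:
transform:
  type: inline
  value: "expression"

# New format, referencing a Lua file:
# transform:
#   type: file
#   value: "file.lua"
```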
- **Config Generation Commands**: New `generate-config` and `generate-default-config` CLI subcommands for easier YAML configuration creation
  - `tinyetl generate-config [OPTIONS] <SOURCE> <TARGET>` - Generate YAML config from CLI arguments
  - `tinyetl generate-default-config` - Output example configuration with comments
- **GitHub Actions PR Workflow**: Automated CI pipeline for pull requests with formatting, linting, and testing
- **Enhanced Transform Config**: Improved YAML serialization for transformation configurations with tagged union format
  - Supports `type: file`, `type: inline`, `type: script`, and `type: none` formats
  - Better type safety and clarity in YAML configuration
  - Explicit configuration prevents ambiguity
- **Refactored Config Module**: Split YAML-specific configuration into a separate `yaml_config.rs` module for better code organization
- **Transform Config Structure**: Updated from string-based to enum-based with proper serde support for type-safe YAML serialization
- **README Updates**:
  - Updated version badge to 0.9.0
  - Updated binary size to 15MB
  - Enhanced configuration documentation with inline comments and new transform format examples
- **Code Quality**: Fixed numerous clippy warnings across the codebase for better maintainability
  - Replaced `.last()` with `.next_back()` for iterator optimizations
  - Added missing `#[allow(dead_code)]` attributes for test utilities
  - Improved error handling and type conversions
- Compiler warnings and clippy lints throughout the codebase
- Inconsistent formatting in various modules
- Missing or incomplete error messages in configuration parsing
- Added comprehensive inline comments to default configuration example
- Improved YAML configuration format documentation with all options explained
- Added examples for environment variable usage in configurations
- Updated command-line help text with new subcommands
- Enhanced README with new transform configuration format
- Extracted YAML configuration logic into dedicated module for better separation of concerns
- Improved test coverage for configuration serialization/deserialization
- Enhanced type safety in transform configuration handling
- **JSON as a Tier-1 Datatype**: Full support for JSON columns across all connectors
  - Added `Json` variant to the `DataType` and `Value` enums in the schema system
  - JSON values stored as `serde_json::Value` internally
  - Schema files now accept `type: json` for column definitions
  - JSON default values supported in schema files
  - Type inference automatically detects JSON values
  - Comprehensive test coverage for JSON operations
- PostgreSQL: Maps JSON to native `JSONB` type for optimal performance
- MySQL: Maps JSON to native `JSON` type
- SQLite: Stores JSON as `TEXT` (SQLite's standard approach)
- DuckDB: Maps JSON to native `JSON` type
- MSSQL: Stores JSON as `NVARCHAR(MAX)`
- Snowflake: Maps JSON to `VARIANT` (Snowflake's semi-structured data type)
- CSV: Serializes JSON to compact string representation
- JSON: Preserves JSON objects natively
- Parquet: Stores JSON as UTF8 strings via Arrow
- Avro: Stores JSON as string type
- ODBC: Stores JSON as `NVARCHAR(MAX)`
- Transformer (Lua): JSON values converted to strings for Lua script processing
- Arrow integration: JSON mapped to `ArrowDataType::Utf8` for compatibility
- Schema validation: JSON type fully integrated with the validation pipeline
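For example, a schema file declaring a JSON column might look like this sketch; `type: json` and JSON defaults come from the notes above, while the overall schema-file layout is an assumption:

```yaml
columns:
  - name: id
    type: integer
  - name: payload
    type: json
    default: "{}"   # JSON default values are supported in schema files
```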
- Added Example 17: SQLite JSON to Parquet demonstrating JSON column handling
- Includes preview mode demonstration and round-trip validation
- Fixed JSON serialization in Parquet writer to output proper JSON strings instead of Rust debug format
- Fixed `RUST_LOG=debug` handling: the environment variable now properly enables debug logging
- ODBC connector support for broader database compatibility
- Fixed MySQL JSON columns being read as NULL values - now properly extracted as strings
- Resolved cargo install issues
- Various minor bug fixes
## 0.5.0 - 2025-11-12
- **BREAKING**: Schema inference now defaults all columns to `nullable: true` for safety
  - Previous behavior could incorrectly infer `NOT NULL` constraints based on limited sample data
  - This prevents constraint violations when appending data with different null patterns
  - Users requiring strict `NOT NULL` constraints must now use explicit schema files
  - Affects all database connectors (DuckDB, SQLite, MySQL, PostgreSQL, MSSQL)
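Users who need strict constraints can pin them in an explicit schema file; a sketch (the schema-file layout is assumed for illustration):

```yaml
columns:
  - name: id
    type: integer
    nullable: false   # enforced only via an explicit schema file
  - name: note
    type: string      # nullable: true is now the inferred default
```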
- Fixed NOT NULL constraint violations when appending to existing DuckDB tables
- Resolved issue where schema inferred from first batch caused failures on subsequent batches with NULL values
- Added prominent notes about nullable default behavior in README
- Clarified that explicit schema files are required for strict validation
## 0.4.0 - 2025-11-11
- DuckDB connector (source and destination)
- Internal schema types migrated to Arrow datatypes
## 0.3.1 - 2025-11-11
- MySQL source support - can now read data from MySQL databases
- Additional test coverage for improved reliability
## 0.3.0 - 2025
- Initial release with CSV, JSON, Parquet, SQLite, MySQL (target), PostgreSQL, MSSQL, and Avro support
- File, HTTP, SSH, and Snowflake protocol support
- YAML configuration files
- Schema validation
- Environment variable support for secrets
- Data transformation capabilities