target-adbc/
├── target_adbc/ # Main package
│ ├── __init__.py # Package initialization and version
│ ├── target.py # TargetADBC class (main entry point + config schema)
│ └── sinks.py # ADBCSink class (data processing)
│
├── tests/ # Test suite
│ ├── __init__.py
│ ├── conftest.py # Pytest fixtures
│ ├── test_target.py # Unit tests
│ └── e2e/ # End-to-end tests
│ ├── test_duckdb.py # DuckDB integration tests
│ └── test_sqlite.py # SQLite integration tests
│
├── examples/ # Example configurations and usage
│ ├── duckdb_config.json.example
│ ├── postgresql_config.json.example
│ ├── sqlite_config.json.example
│ ├── sample_input.jsonl
│ └── quickstart.sh # Quick start demonstration script
│
├── .github/
│ └── workflows/
│ └── test.yml # CI workflow
│
├── pyproject.toml # Project metadata and dependencies
├── uv.lock # Dependency lock file
├── meltano.yml # Meltano project configuration
├── .pre-commit-config.yaml # Pre-commit hooks configuration
├── README.md # Main documentation
├── CONTRIBUTING.md # Contribution guidelines
├── CHANGELOG.md # Version history
├── LICENSE # Apache 2.0 license
└── .gitignore # Git ignore rules
`target.py` is the main entry point. It:
- Defines the configuration schema using Singer SDK's typing system
- Initializes the Singer target
- Validates configuration
- Manages the lifecycle of sinks
- Provides CLI interface
Key class: TargetADBC
`sinks.py` handles data processing. It:
- Receives batches of records
- Converts Singer schemas to PyArrow schemas
- Manages ADBC connections
- Performs bulk inserts using `adbc_ingest`
- Handles table creation and management
Key class: ADBCSink
Singer Messages (stdin)
↓
TargetADBC.cli()
↓
Message Parser (SDK)
↓
ADBCSink.process_batch()
↓
PyArrow Table Conversion
↓
ADBC Connection
↓
Database (DuckDB, PostgreSQL, SQLite, etc.)
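As a rough illustration of the "PyArrow Table Conversion" step above, row-oriented Singer records are pivoted into columns before hand-off to the driver. This is a standalone sketch, not the target's actual code: pyarrow is deliberately omitted so the example is self-contained, but in the real pipeline the resulting dict would be passed to `pyarrow.table()` and then bulk-ingested over the ADBC connection.

```python
def records_to_columns(records, field_names):
    """Pivot row-oriented record dicts into column-oriented lists.

    Columnar layout is what Arrow (and hence ADBC ingestion) expects;
    missing fields become None, matching nullable columns.
    """
    return {name: [rec.get(name) for rec in records] for name in field_names}

rows = [{"id": 1, "name": "ada"}, {"id": 2, "name": "bob"}]
cols = records_to_columns(rows, ["id", "name"])
# cols == {"id": [1, 2], "name": ["ada", "bob"]}
```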
Singer JSON Schema → PyArrow DataType → Database Type
---------------------------------------------------
integer → int64() → BIGINT
number → float64() → DOUBLE
string → string() → VARCHAR
boolean → bool_() → BOOLEAN
date-time → timestamp() → TIMESTAMP
date → date32() → DATE
time → time64() → TIME
object → string() → VARCHAR (JSON)
array → list_() → ARRAY
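The mapping table above can be sketched as a lookup function. This is a hedged, standalone version: the real `ADBCSink._json_type_to_arrow()` returns pyarrow `DataType` objects, while this sketch returns the type constructor names so it runs without pyarrow installed.

```python
def json_type_to_arrow(json_type, fmt=None):
    """Map a Singer JSON Schema type (plus optional "format") to an
    Arrow type name, following the table above."""
    if json_type == "string":
        # date-time/date/time arrive as type "string" with a JSON Schema
        # "format" qualifier
        by_format = {"date-time": "timestamp", "date": "date32", "time": "time64"}
        return by_format.get(fmt or "", "string")
    mapping = {
        "integer": "int64",
        "number": "float64",
        "boolean": "bool_",
        "object": "string",  # objects are serialized to JSON text
        "array": "list_",
    }
    return mapping.get(json_type, "string")
```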
config.json
↓
TargetADBC.config_jsonschema (validation)
↓
Target.__init__(config)
↓
Sink receives config
↓
ADBC connection created
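A minimal config.json might look like the following. The exact property names are defined by `TargetADBC.config_jsonschema`, so treat these keys as illustrative assumptions rather than the actual schema:

```json
{
  "driver": "adbc_driver_duckdb",
  "uri": "warehouse.duckdb",
  "batch_size": 10000
}
```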
To add a new data type:
- Update `ADBCSink._json_type_to_arrow()` - add the mapping
- Update `ADBCSink._convert_value()` - add the conversion logic
- Add a test case

To add a new configuration option:
- Add the property to `TargetADBC.config_jsonschema` in `target.py`
- Use it in `ADBCSink` or `TargetADBC`
- Document it in README.md
- Add a test case
The target is designed to work with any ADBC driver without code changes!
Simply:
- Install the ADBC driver: `pip install adbc-driver-{database}`
- Configure the driver name in config.json
- Provide appropriate connection parameters
Unit tests cover:
- Configuration handling
- Type conversion logic

End-to-end tests cover:
- DuckDB: Full data loading pipeline
- SQLite: Full data loading pipeline

Shared fixtures provide:
- Sample Singer messages
- Test database configurations
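For reference, a Singer input stream is newline-delimited JSON messages. The exact contents of `examples/sample_input.jsonl` are not shown here, but a minimal stream follows this shape (per the Singer specification):

```json
{"type": "SCHEMA", "stream": "users", "schema": {"type": "object", "properties": {"id": {"type": "integer"}, "name": {"type": "string"}}}, "key_properties": ["id"]}
{"type": "RECORD", "stream": "users", "record": {"id": 1, "name": "Alice"}}
{"type": "STATE", "value": {"bookmarks": {}}}
```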
# Setup (using uv)
uv sync
# Development cycle
1. Make changes
2. Run tests: uv run pytest
3. Check types: uv run mypy target_adbc
4. Check style: uv run ruff check target_adbc
5. Format: uv run ruff format target_adbc
# Pre-commit hooks
pre-commit install
pre-commit run --all-files
# Testing
uv run pytest # All tests
uv run pytest tests/test_target.py # Unit tests only
uv run pytest tests/e2e/ # E2E tests only
uv run pytest --cov=target_adbc # With coverage
# Running
cat examples/sample_input.jsonl | uv run target-adbc --config examples/duckdb_config.json.example

Runtime dependencies:
- singer-sdk: Singer specification implementation and base classes
- adbc-driver-manager: ADBC connection management
- pyarrow: Arrow data format for efficient data transfer
- duckdb: DuckDB support (included by default)
Development dependencies:
- pytest: Testing framework
- coverage: Code coverage
- mypy: Static type checking
- ruff: Linting and formatting
- pyarrow-stubs: Type stubs for PyArrow
- Processes data in batches (default: 10,000 rows)
- Uses columnar format (PyArrow), which is more memory efficient than row-based processing
- Connection pooling handled by ADBC driver
- Bulk insert using ADBC's `adbc_ingest` (much faster than row-by-row)
- Direct Arrow format (no serialization overhead)
- Batch processing reduces network round-trips
- Configurable batch size for memory/speed tradeoff
- Supports streaming (doesn't load all data into memory)
- Driver-specific optimizations (e.g., DuckDB's parallel loading)
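The batching behavior described above can be sketched as a small accumulator. This is an illustrative helper, not the target's actual code (the Singer SDK manages batch boundaries internally): records stream in one at a time and are flushed every `batch_size` rows, bounding memory while amortizing per-insert overhead.

```python
def batched(records, batch_size=10_000):
    """Yield lists of at most batch_size records from a stream."""
    batch = []
    for record in records:
        batch.append(record)
        if len(batch) >= batch_size:
            yield batch
            batch = []
    if batch:  # flush the final partial batch
        yield batch

sizes = [len(b) for b in batched(range(25), batch_size=10)]
# sizes == [10, 10, 5]
```

A larger `batch_size` means fewer round-trips at the cost of more memory per flush, which is the tradeoff the configurable batch size exposes.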
- Avoid storing passwords in config files (use environment variables)
- Use secure connection methods (SSL/TLS) when available
- Follow database-specific security best practices
- No data is logged or persisted outside the target database
- Connection credentials are not exposed in logs
- Type conversion prevents SQL injection (parameterized inserts)