Commit 057163a

add molt-fetch skill
1 parent 718f70e commit 057163a

1 file changed: +140 -0
  • skills/onboarding-and-migrations/molt-fetch

@@ -0,0 +1,140 @@
---
name: molt-fetch
description: Guide for using molt fetch to migrate data from PostgreSQL, MySQL, Oracle, or MSSQL to CockroachDB. Use when running molt fetch commands, configuring storage backends, handling fetch failures/resumption, or chaining fetch with verify.
compatibility: Requires molt binary. Source DB must be accessible. For Oracle, CGO and Oracle Instant Client required.
metadata:
  author: cockroachdb
  version: "1.0"
---

# molt fetch

Bulk data migration from source databases (PostgreSQL, MySQL, Oracle, MSSQL) to CockroachDB.

## Basic Structure

```bash
# Choose one storage backend: --bucket-path, --direct-copy, or --local-path
molt fetch \
  --source "<source-conn>" \
  --target "<crdb-conn>" \
  --bucket-path "s3://bucket/prefix" \
  [options]
```

## Storage Backends (pick one)

| Option | When to use |
|--------|-------------|
| `--bucket-path "s3://..."` | AWS S3 (also `gs://` for GCS, `azure://` for Azure) |
| `--direct-copy` | No intermediate storage; fastest when the source and target can reach each other directly |
| `--local-path "/tmp/molt"` + `--local-path-listen-addr "0.0.0.0:9005"` | Local intermediate files; CockroachDB must be able to reach the listen address |

Cloud auth: pass `--use-implicit-auth` for IAM/ADC/managed identity, or set `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` (AWS) or `GOOGLE_APPLICATION_CREDENTIALS` (GCS) as environment variables.

## Table Handling (`--table-handling`)

| Value | Behavior |
|-------|----------|
| `none` (default) | Append to existing tables |
| `drop-on-target-and-recreate` | Drop + recreate from source schema; enables auto schema creation |
| `truncate-if-exists` | Truncate before loading; errors if the table is missing |

## Import Mode

**IMPORT INTO** (default): Table goes OFFLINE during load. Highest throughput.

**COPY FROM** (`--use-copy`): Table stays ONLINE. Use with `--direct-copy`. Cannot use compression.

```bash
# Zero-downtime load
molt fetch --source "..." --target "..." --direct-copy --use-copy
```

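Because COPY FROM cannot be combined with gzip compression, a wrapper script can reject that combination before anything is invoked. The following is a minimal sketch, not part of molt: `check_copy_flags` is a hypothetical helper that only inspects the argument list for `--use-copy` together with `--compression gzip` (or `--compression=gzip`, assuming the usual `--flag=value` spelling is accepted).

```bash
# Preflight sketch: reject --use-copy combined with --compression gzip.
# check_copy_flags is a hypothetical helper, not a molt feature.
check_copy_flags() {
  local use_copy=0 gzip=0 prev="" arg
  for arg in "$@"; do
    if [ "$arg" = "--use-copy" ]; then
      use_copy=1
    fi
    # Accept both "--compression gzip" and "--compression=gzip" spellings.
    if [ "$arg" = "--compression=gzip" ] || { [ "$prev" = "--compression" ] && [ "$arg" = "gzip" ]; }; then
      gzip=1
    fi
    prev=$arg
  done
  if [ "$use_copy" -eq 1 ] && [ "$gzip" -eq 1 ]; then
    echo "error: --use-copy cannot be combined with --compression gzip" >&2
    return 1
  fi
  return 0
}
```

A script could keep its flags in an array and call `check_copy_flags "${args[@]}"` before running `molt fetch "${args[@]}"`.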
## Key Flags

```bash
# Filtering
--table-filter "customers|orders"          # POSIX regex for tables to include
--table-exclusion-filter "temp_.*"         # exclude pattern
--schema-filter "public"                   # PostgreSQL only

# Performance
--table-concurrency 4                      # parallel tables (default: 4)
--export-concurrency 4                     # export threads (default: 4)
--row-batch-size 100000                    # rows per SELECT (default: 100k)

# Schema
--type-map-file "types.json"               # custom type mappings
--transformations-file "transforms.json"   # column exclusions, table aliases

# Logging
--log-file "migration.log"                 # or "stdout"
--logging debug                            # info (default), debug, trace
--metrics-listen-addr "0.0.0.0:3030"       # Prometheus scrape endpoint
```

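Filter patterns are easy to get subtly wrong, so it can help to preview them against a list of table names before running a fetch. The sketch below uses `grep -E` as a stand-in for POSIX extended regex matching; the table names are hypothetical, and whether molt anchors the pattern to the full table name is not stated here, so the patterns are anchored explicitly with `^` and `$` to remove that ambiguity.

```bash
# Hypothetical table names; in practice, list them from the source database.
tables=$(printf '%s\n' customers orders orders_archive temp_load payments)

include='^(customers|orders)$'   # candidate value for --table-filter
exclude='^temp_.*$'              # candidate value for --table-exclusion-filter

# Keep names matching the include pattern, then drop names matching the exclude pattern.
selected=$(printf '%s\n' "$tables" | grep -E "$include" | grep -Ev "$exclude")
printf '%s\n' "$selected"
```

Without the anchors, `customers|orders` would also match `orders_archive` under `grep -E`, which is the kind of surprise this preview is meant to surface.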
## Source-Specific Prerequisites

**MySQL**: GTID mode is required (`gtid_mode=ON`, `enforce_gtid_consistency=ON`) unless you pass `--ignore-replication-check`. `ONLY_FULL_GROUP_BY` must be off.

**Oracle**: Binary must be built with `CGO_ENABLED=1 -tags="cgo source_all"`. Oracle Instant Client must be on `LD_LIBRARY_PATH`.

**PostgreSQL**: Replication privileges are needed, or pass `--ignore-replication-check`.

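The MySQL GTID requirement can be checked ahead of time. This is a minimal sketch under stated assumptions: `check_gtid` is a hypothetical helper that reads tab-separated `name<TAB>value` lines on stdin (the shape produced by the `mysql` client in batch mode) and succeeds only if both `gtid_mode` and `enforce_gtid_consistency` are reported as `ON`.

```bash
# Reads "name<TAB>value" lines on stdin; exits 0 only if both GTID settings are ON.
# check_gtid is a hypothetical helper, not part of molt.
check_gtid() {
  awk -F'\t' '
    $1 == "gtid_mode" || $1 == "enforce_gtid_consistency" {
      seen++
      if ($2 != "ON") bad = 1
    }
    END {
      if (seen == 2 && !bad) exit 0
      exit 1
    }
  '
}
```

One way to feed it is `mysql -N -B -e "SHOW VARIABLES WHERE Variable_name IN ('gtid_mode','enforce_gtid_consistency')" | check_gtid`, with connection options added as needed for your environment.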
## Common Workflows

### 1. Validate before migrating
```bash
molt fetch --dry-run --source "..." --target "..." --bucket-path "s3://..."
# Exports 1 row, imports, verifies, cleans up. Returns immediately.
```

### 2. Full migration with schema creation
```bash
molt fetch \
  --source "postgresql://user:pass@pg:5432/db" \
  --target "postgresql://root@crdb:26257/db" \
  --bucket-path "s3://mybucket/migration" \
  --table-handling drop-on-target-and-recreate \
  --table-filter "customers|orders|payments" \
  --log-file migration.log
```

### 3. Resume after failure
```bash
# List available continuation tokens
molt fetch tokens --fetch-id "abc-123" --target "postgresql://root@crdb:26257/db"

# Resume all failed tables
molt fetch \
  --source "..." --target "..." \
  --bucket-path "s3://mybucket/migration" \
  --fetch-id "abc-123" \
  --non-interactive
```

### 4. Validate flag syntax without connecting
```bash
molt fetch --compile-only --source "..." --target "..." --bucket-path "..."
# Returns JSON: {"status":"ok","message":"arguments parsed successfully"}
```

## Error Recovery

| Error | Cause | Fix |
|-------|-------|-----|
| "GTID-based replication not enabled" | MySQL missing GTID | Enable `gtid_mode=ON` or add `--ignore-replication-check` |
| "Column mismatch" | Schema diverged | Fix target schema manually or use `--type-map-file` |
| No output during IMPORT INTO | CockroachDB import job is still running | Run `SHOW JOBS` on CockroachDB to check progress |
| "timestamp in the future" | Clock drift between hosts (e.g. Docker on macOS) | Sync clocks between hosts |

## Gotchas

- COPY mode: cannot use `--compression gzip`; must use `--compression none` (or omit it, since the default is none with copy)
- Table is **offline** during IMPORT INTO; use `--use-copy` for zero downtime
- Schema changes between runs require starting from scratch
- `--fetch-id` continuation tokens live in the target's exceptions table
- For MySQL, `--ignore-replication-check` skips GTID validation, but replication-dependent features won't work
- After fetch, run `molt verify` to confirm data integrity

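The last point above can be scripted so that verification only runs when the fetch succeeds. This is a sketch, not a prescribed workflow: `migrate_and_verify`, `SRC`, `TGT`, and `MOLT_BIN` are illustrative names, the storage flags are just one combination from this guide, and it assumes molt returns a non-zero exit status on failure.

```bash
# Sketch: run verify only when fetch exits successfully.
# migrate_and_verify, SRC, TGT, and MOLT_BIN are illustrative names, not molt features.
migrate_and_verify() {
  # MOLT_BIN defaults to "molt"; set MOLT_BIN=echo to preview the commands instead.
  local molt="${MOLT_BIN:-molt}"
  "$molt" fetch --source "$SRC" --target "$TGT" --direct-copy --use-copy &&
    "$molt" verify --source "$SRC" --target "$TGT"
}
```

Set `SRC` and `TGT` to the source and target connection strings, then call `migrate_and_verify`; running it once with `MOLT_BIN=echo` prints both commands without touching either database.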
See [flags reference](references/flags.md) for the full flag list.
