Skip to content

Commit b07241f

Browse files
committed
Update Kafka loader guide with new options and Docker example
- Add --reorg-topic, --start-block=latest, --state-dir, --auth options - Update reorg message format with last_valid_hash field - Add working Docker example with auth and volume mounts
1 parent cefd9ef commit b07241f

File tree

1 file changed

+36
-16
lines changed

1 file changed

+36
-16
lines changed

apps/kafka_streaming_loader_guide.md

Lines changed: 36 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -55,8 +55,12 @@ uv run python apps/kafka_streaming_loader.py \
5555
| `--amp-server URL` | `grpc://127.0.0.1:1602` | AMP server URL (use `grpc+tls://gateway.amp.staging.thegraph.com:443` for staging) |
5656
| `--kafka-brokers` | `localhost:9092` | Kafka broker addresses |
5757
| `--network NAME` | `anvil` | Network identifier (e.g., `ethereum-mainnet`, `anvil`) |
58-
| `--start-block N` | Latest block | Start streaming from this block |
58+
| `--start-block N` | Resume from state | Block number or `latest` to start from |
59+
| `--reorg-topic NAME` | Same as `--topic` | Separate topic for reorg messages |
5960
| `--label-csv PATH` | - | CSV file for data enrichment |
61+
| `--state-dir PATH` | `.amp_state` | Directory for LMDB state storage |
62+
| `--auth` | - | Enable auth using `~/.amp/cache` or `AMP_AUTH_TOKEN` env var |
63+
| `--auth-token TOKEN` | - | Explicit auth token |
6064

6165
## Message Format
6266

@@ -81,13 +85,14 @@ On blockchain reorganizations, reorg events are sent:
8185
```json
8286
{
8387
"_type": "reorg",
84-
"network": "ethereum",
88+
"network": "ethereum-mainnet",
8589
"start_block": 19000100,
86-
"end_block": 19000110
90+
"end_block": 19000110,
91+
"last_valid_hash": "0xabc123..."
8792
}
8893
```
8994

90-
Consumers should invalidate data in the specified block range.
95+
Consumers should invalidate data in the specified block range. Use `--reorg-topic` to send these to a separate topic (useful for Snowflake Kafka connector which requires strict schema per topic).
9196

9297
## Examples
9398

@@ -142,7 +147,7 @@ token_address,symbol,name,decimals
142147

143148
Without the CSV file, `token_symbol`, `token_name`, and `token_decimals` will be `null` in the output.
144149

145-
### Stream from Latest Blocks
150+
### Stream from Latest Block
146151

147152
```bash
148153
uv run python apps/kafka_streaming_loader.py \
@@ -151,7 +156,9 @@ uv run python apps/kafka_streaming_loader.py \
151156
--topic eth_live_logs \
152157
--query-file apps/queries/all_logs.sql \
153158
--raw-dataset 'edgeandnode/ethereum_mainnet' \
154-
--network ethereum-mainnet
159+
--network ethereum-mainnet \
160+
--start-block latest \
161+
--auth
155162
```
156163

157164
### Local Development (Anvil)
@@ -183,19 +190,32 @@ uv run python apps/kafka_consumer.py anvil_logs localhost:9092 my-group
183190

184191
```bash
185192
# Build image
186-
docker build -f Dockerfile.kafka -t amp-kafka-loader .
193+
docker build -f Dockerfile.kafka -t amp-kafka .
194+
195+
# Run loader (with auth via env var)
196+
docker run -d \
197+
--name amp-kafka-loader \
198+
--network kafka-net \
199+
-e AMP_AUTH_TOKEN \
200+
-v $(pwd)/apps/queries:/data/queries \
201+
-v $(pwd)/.amp_state:/data/state \
202+
amp-kafka \
203+
--amp-server 'grpc+tls://gateway.amp.staging.thegraph.com:443' \
204+
--kafka-brokers kafka:9092 \
205+
--topic erc20_transfers \
206+
--query-file /data/queries/erc20_transfers_activity.sql \
207+
--raw-dataset 'edgeandnode/ethereum_mainnet' \
208+
--network ethereum-mainnet \
209+
--state-dir /data/state \
210+
--start-block latest \
211+
--auth
187212

188-
# Run loader
189-
docker run --rm \
190-
--network host \
191-
amp-kafka-loader \
192-
--kafka-brokers localhost:9092 \
193-
--topic my_topic \
194-
--query-file apps/queries/my_query.sql \
195-
--raw-dataset anvil \
196-
--start-block 0
213+
# Check logs
214+
docker logs -f amp-kafka-loader
197215
```
198216

217+
Note: Ensure your Kafka container is on the same Docker network (`kafka-net`) and advertises the correct listener (`kafka:9092`).
218+
199219
## Getting Help
200220

201221
```bash

0 commit comments

Comments
 (0)