| 1 | +# Register schema and publish messages |
| 2 | + |
| 3 | +Scripts to register schemas on Pulsar topics and publish test messages for schema validation testing. All connection settings come from a `.env` file. |
| 4 | + |
| 5 | +## Prerequisites |
| 6 | + |
| 7 | +- **pulsarctl** – for schema registration (e.g. `brew install pulsarctl`) |
| 8 | +- **Python 3** – for the generic Avro encoder used by `publish-messages.sh` |
| 9 | +- **curl** – used by the publish script to POST messages |
| 10 | +- **avro** (Python package) – required by `encode_avro_generic.py` for encoding (e.g. `pip install avro`) |
| 11 | + |
| 12 | +## Setup |
| 13 | + |
| 14 | +1. Copy the example env file and fill in your Pulsar details: |
| 15 | + |
| 16 | + ```bash |
| 17 | + cp .env.example .env |
| 18 | + ``` |
| 19 | + |
| 20 | +2. Edit `.env` and set the variables below (use `KEY=value` with no spaces around `=`): |
| 21 | + |
| 22 | + | Variable | Required | Description | |
| 23 | + |----------|----------|-------------| |
| 24 | + | `PULSAR_ADMIN_URL` | Yes | Pulsar admin service URL (e.g. StreamNative Cloud or your cluster). | |
| 25 | + | `PULSAR_TOPIC` | Yes | Full topic name: `persistent://tenant/namespace/topic-name`. | |
| 26 | + | `PULSAR_AUTH_TOKEN` | Yes | JWT for admin and REST publish. | |
| 27 | + | `NUM_MESSAGES` | No | Default number of messages for `publish-messages.sh` (default: 15). | |
| 28 | + |
| 29 | +--- |
| 30 | + |
| 31 | +## Register a schema |
| 32 | + |
| 33 | +**Script:** `register-schema.sh` |
| 34 | + |
| 35 | +Uploads a schema to a topic and turns on schema validation for that topic’s namespace. |
| 36 | + |
| 37 | +**Usage:** |
| 38 | + |
| 39 | +```bash |
| 40 | +./register-schema.sh <schema-file> [topic] |
| 41 | +``` |
| 42 | + |
| 43 | +| Argument | Required | Description | |
| 44 | +|----------|----------|-------------| |
| 45 | +| `schema-file` | Yes | Path to the schema JSON file (relative to this directory or absolute). | |
| 46 | +| `topic` | No | Full topic name. If omitted, uses `PULSAR_TOPIC` from `.env`. | |
| 47 | + |
| 48 | +**Examples:** |
| 49 | + |
| 50 | +```bash |
| 51 | +# Use topic from .env |
| 52 | +./register-schema.sh schema-test-topic.json |
| 53 | +./register-schema.sh schema-test-topic-avro.json |
| 54 | + |
| 55 | +# Override topic |
| 56 | +./register-schema.sh schema-test-topic-avro.json persistent://other-tenant/ns/my-topic |
| 57 | +``` |
| 58 | + |
| 59 | +**Included schema files:** |
| 60 | + |
| 61 | +- `schema-test-topic.json` – JSON schema (TestMessage: name, topic) |
| 62 | +- `schema-test-topic-avro.json` – AVRO schema (TestMessage: name, topic) |
| 63 | +- `schema-test-topic-avro-name-only.json` – AVRO schema (single field: name) |
| 64 | + |
| 65 | +--- |
| 66 | + |
| 67 | +## Publish messages |
| 68 | + |
| 69 | +**Script:** `publish-messages.sh` |
| 70 | + |
| 71 | +Sends messages to a topic by running a **generic Avro encoder** once per message (using the schema file you pass) and POSTing the binary output to the Pulsar admin REST API. |
| 72 | + |
| 73 | +**Why encoding?** When a topic has an Avro schema, the broker expects the message body to be **Avro binary** (the same format a real Avro producer would send). You cannot POST raw JSON or plain text and have it accepted as a valid Avro message. The generic encoder produces that binary from your schema (and optional record data) so the curl request body is in the correct format. |
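
To make the difference concrete, here is a minimal sketch of what Avro binary looks like for a record with two string fields, using only the Python standard library. It is not the code in `encode_avro_generic.py`; the helper names `encode_long`, `encode_string`, and `encode_record` are illustrative assumptions, and the sample values are made up. The field names follow the TestMessage schema (name, topic).

```python
import json


def encode_long(n: int) -> bytes:
    """Encode an integer as an Avro long: zigzag, then base-128 varint."""
    z = (n << 1) ^ (n >> 63)  # zigzag maps small negatives to small positives
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)  # low 7 bits with the continuation bit set
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_string(s: str) -> bytes:
    """An Avro string is its UTF-8 byte length (as a long) followed by the bytes."""
    data = s.encode("utf-8")
    return encode_long(len(data)) + data


def encode_record(record: dict, field_names: list) -> bytes:
    """Hypothetical helper: a record is its fields concatenated in schema order.

    Only string fields are handled in this sketch.
    """
    return b"".join(encode_string(record[name]) for name in field_names)


record = {"name": "alice", "topic": "demo"}  # made-up sample values
avro_body = encode_record(record, ["name", "topic"])
json_body = json.dumps(record).encode("utf-8")

print(avro_body.hex())  # 0a616c6963650864656d6f
print(json_body)
```

The Avro body carries no field names or quotes, only length-prefixed values in schema order, which is why a JSON body cannot stand in for it.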
| 74 | + |
| 75 | +**Usage:** |
| 76 | + |
| 77 | +```bash |
| 78 | +./publish-messages.sh <schema-file> [topic] [num-messages] [data-file] |
| 79 | +``` |
| 80 | + |
| 81 | +| Argument | Required | Description | |
| 82 | +|----------|----------|-------------| |
| 83 | +| `schema-file` | Yes | Path to the Avro schema (JSON or Pulsar format). Same files you use with `register-schema.sh`. Relative to this directory or absolute. | |
| 84 | +| `topic` | No | Full topic name. If omitted, uses `PULSAR_TOPIC` from `.env`. | |
| 85 | +| `num-messages` | No | Number of messages to send. Default: `NUM_MESSAGES` from `.env` if set, otherwise 15. | |
| 86 | +| `data-file` | No | JSON file with one record matching the schema. If omitted, the encoder generates default values from the schema (e.g. empty strings, zeros). | |
| 87 | + |
| 88 | +**Examples:** |
| 89 | + |
| 90 | +```bash |
| 91 | +# Default topic and 15 messages (encoder uses default record from schema) |
| 92 | +./publish-messages.sh schema-test-topic-avro.json |
| 93 | +./publish-messages.sh schema-test-topic-avro-name-only.json |
| 94 | + |
| 95 | +# Override topic and count |
| 96 | +./publish-messages.sh schema-test-topic-avro-name-only.json persistent://tenant/ns/my-topic 10 |
| 97 | + |
| 98 | +# Use a custom JSON record for each message |
| 99 | +./publish-messages.sh schema-test-topic-avro.json persistent://tenant/ns/my-topic 5 my-record.json |
| 100 | +``` |
| 101 | + |
| 102 | +**Generic encoder:** `encode_avro_generic.py` |
| 103 | + |
| 104 | +- Works with **any** Avro record schema. You pass the schema file path (and optionally a JSON record). |
| 105 | +- Accepts raw Avro schema JSON or Pulsar format (`{"type":"AVRO","schema":"..."}`). |
| 106 | +- Without a data file: builds a default record from the schema (strings → `""`, numbers → `0`, etc.) so you can publish without writing record JSON. |
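
The two behaviours listed above (accepting either schema format, and building a default record) could be sketched as follows. This is an illustration under stated assumptions, not the contents of `encode_avro_generic.py`: the function names and the `PRIMITIVE_DEFAULTS` table are hypothetical, the real encoder may choose different defaults, and references to previously defined named types are not handled.

```python
import json

# Assumed default per Avro primitive type; the real encoder may differ.
PRIMITIVE_DEFAULTS = {
    "string": "", "bytes": b"", "int": 0, "long": 0,
    "float": 0.0, "double": 0.0, "boolean": False, "null": None,
}


def load_avro_schema(text: str) -> dict:
    """Return the raw Avro schema, unwrapping the Pulsar format if present."""
    doc = json.loads(text)
    if isinstance(doc, dict) and isinstance(doc.get("schema"), str):
        return json.loads(doc["schema"])  # Pulsar stores the schema as a string
    return doc


def default_value(field_type):
    """Pick a placeholder value for one Avro type."""
    if isinstance(field_type, list):  # union: use the first branch
        return default_value(field_type[0])
    if isinstance(field_type, dict):
        kind = field_type["type"]
        if kind == "record":
            return default_record(field_type)
        if kind == "array":
            return []
        if kind == "map":
            return {}
        if kind == "enum":
            return field_type["symbols"][0]
        if kind == "fixed":
            return b"\x00" * field_type["size"]
        return default_value(kind)  # e.g. {"type": "string"}
    return PRIMITIVE_DEFAULTS[field_type]


def default_record(schema: dict) -> dict:
    """Build a record with a placeholder value for every field."""
    return {f["name"]: default_value(f["type"]) for f in schema["fields"]}


# Usage: a Pulsar-format wrapper around a two-field record schema.
pulsar_doc = json.dumps({
    "type": "AVRO",
    "schema": json.dumps({
        "type": "record", "name": "TestMessage",
        "fields": [{"name": "name", "type": "string"},
                   {"name": "topic", "type": "string"}],
    }),
})
print(default_record(load_avro_schema(pulsar_doc)))  # {'name': '', 'topic': ''}
```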
| 107 | + |
| 108 | +--- |
| 109 | + |
| 110 | +## Example workflows |
| 111 | + |
| 112 | +**Register full Avro schema and publish valid messages:** |
| 113 | + |
| 114 | +```bash |
| 115 | +./register-schema.sh schema-test-topic-avro.json |
| 116 | +./publish-messages.sh schema-test-topic-avro.json |
| 117 | +``` |
| 118 | + |
| 119 | +**Register name-only schema and publish matching messages:** |
| 120 | + |
| 121 | +```bash |
| 122 | +./register-schema.sh schema-test-topic-avro-name-only.json |
| 123 | +./publish-messages.sh schema-test-topic-avro-name-only.json |
| 124 | +``` |
| 125 | + |
| 126 | +**Test schema validation:** |
| 127 | + |
| 128 | +Register the full schema, then publish messages encoded with the name-only schema (the wrong schema). Consuming from that topic should then fail with a Java I/O exception, because the stored messages do not match the registered schema. |
| 129 | +```bash |
| 130 | +./register-schema.sh schema-test-topic-avro.json |
| 131 | +./publish-messages.sh schema-test-topic-avro-name-only.json |
| 132 | +# Expect "rejected" for each message |
| 133 | +``` |
| 134 | + |
| 135 | +--- |
| 136 | + |
| 137 | +## Adding your own schema |
| 138 | + |
| 139 | +- Create a schema file (see existing `schema-*.json` for format) and pass it to `register-schema.sh`. |
| 140 | +- Use the **same schema file** with `publish-messages.sh` to publish messages. The generic encoder works with any Avro record schema. Optionally pass a JSON data file so each message uses your record instead of default values. |