Contributor

@DL1231 commented Sep 30, 2025

This patch adds documentation for nullable struct encoding in the protocol specification.

For nullable structs, the encoding is defined as follows:

  • The first byte is an INT8 null indicator (-1 for null, 1 for non-null)
  • If non-null, the remaining bytes contain the serialization of each field in order
  • If null, no additional bytes follow the null indicator
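
The three rules above can be sketched in a few lines of Java. This is only an illustration of the byte layout, not code from the patch: `NullableStructSketch` and its two-field `Point` struct are hypothetical names, and rejecting indicator values other than -1 and 1 is an assumption made here for clarity.

```java
import java.nio.ByteBuffer;

/** Hypothetical illustration only; these are not Kafka's Schema/Struct classes. */
public class NullableStructSketch {

    /** A made-up struct with two fields: an INT32 followed by an INT16. */
    public static final class Point {
        public final int x;
        public final short y;

        public Point(int x, short y) {
            this.x = x;
            this.y = y;
        }
    }

    /** Number of bytes write() produces: 1 for null, 1 + 4 + 2 otherwise. */
    public static int sizeOf(Point p) {
        return p == null ? 1 : 1 + Integer.BYTES + Short.BYTES;
    }

    public static void write(ByteBuffer buf, Point p) {
        if (p == null) {
            buf.put((byte) -1);   // null indicator; nothing else follows
            return;
        }
        buf.put((byte) 1);        // non-null indicator
        buf.putInt(p.x);          // then each field, in declaration order
        buf.putShort(p.y);
    }

    public static Point read(ByteBuffer buf) {
        byte indicator = buf.get();
        if (indicator == -1) {
            return null;
        }
        if (indicator != 1) {
            // Assumption of this sketch: any other indicator value is an error.
            throw new IllegalArgumentException("Unexpected null indicator: " + indicator);
        }
        return new Point(buf.getInt(), buf.getShort());
    }
}
```

A null value therefore occupies exactly one byte on the wire, while a non-null value occupies one byte plus the size of its fields.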

@github-actions bot added the triage, clients, and small labels Sep 30, 2025
  BYTES, COMPACT_BYTES, NULLABLE_BYTES, COMPACT_NULLABLE_BYTES,
- RECORDS, COMPACT_RECORDS, new ArrayOf(STRING), new CompactArrayOf(COMPACT_STRING)};
+ RECORDS, COMPACT_RECORDS, new ArrayOf(STRING), new CompactArrayOf(COMPACT_STRING),
+ new Schema()};
Member

Can you generate the website with this change?

Contributor

@junrao left a comment

@DL1231: Thanks for the PR. This is a bit more complicated than I originally thought. Left a few comments.


@Override
public String documentation() {
return "Represents a composite object or null. " +
Contributor

We should first say something like "A struct is named by a string with a capitalized first letter and consists of one or more fields".

- public final class Schema extends Type {
+ public final class Schema extends DocumentedType {

private static final String STRUCT_TYPE_NAME = "NULLABLE_STRUCT";
Contributor

The current Schema class is written for a struct that's not nullable since the write() method just writes all fields without the null indicator. We should name this STRUCT and create a new NullableStruct version.
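
One way to picture the suggested split is a nullable wrapper that delegates to a non-nullable struct type. The sketch below is hypothetical and much simpler than Kafka's real `Type` and `Schema` classes: `SimpleType`, `IntStruct`, and `NullableStruct` are names invented here, and the struct has a single INT32 field to keep the example short.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch; these are not Kafka's real Type/Schema classes. */
public class StructTypesSketch {

    /** Minimal stand-in for a protocol type. */
    public interface SimpleType<T> {
        String typeName();
        void write(ByteBuffer buf, T value);
        T read(ByteBuffer buf);
    }

    /** Non-nullable struct with one INT32 field: writes the field and nothing else. */
    public static final class IntStruct implements SimpleType<Integer> {
        @Override
        public String typeName() {
            return "STRUCT";
        }

        @Override
        public void write(ByteBuffer buf, Integer value) {
            if (value == null) {
                throw new IllegalArgumentException("STRUCT cannot be null");
            }
            buf.putInt(value);
        }

        @Override
        public Integer read(ByteBuffer buf) {
            return buf.getInt();
        }
    }

    /** Nullable variant: prefixes the delegate's encoding with the INT8 null indicator. */
    public static final class NullableStruct<T> implements SimpleType<T> {
        private final SimpleType<T> delegate;

        public NullableStruct(SimpleType<T> delegate) {
            this.delegate = delegate;
        }

        @Override
        public String typeName() {
            return "NULLABLE_STRUCT";
        }

        @Override
        public void write(ByteBuffer buf, T value) {
            if (value == null) {
                buf.put((byte) -1);
                return;
            }
            buf.put((byte) 1);
            delegate.write(buf, value);
        }

        @Override
        public T read(ByteBuffer buf) {
            return buf.get() == (byte) -1 ? null : delegate.read(buf);
        }
    }
}
```

With this shape, the non-nullable type never emits an indicator byte, and the documentation for each type name can describe exactly one wire format.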

Member

> without the null indicator

The documentation for other nullable types also needs to be updated to mention the null indicator.

  STRING, COMPACT_STRING, NULLABLE_STRING, COMPACT_NULLABLE_STRING,
  BYTES, COMPACT_BYTES, NULLABLE_BYTES, COMPACT_NULLABLE_BYTES,
- RECORDS, COMPACT_RECORDS, new ArrayOf(STRING), new CompactArrayOf(COMPACT_STRING)};
+ RECORDS, COMPACT_RECORDS, new ArrayOf(STRING), new CompactArrayOf(COMPACT_STRING),
Contributor

There are multiple existing issues with the generated schemas in https://kafka.apache.org/protocol#protocol_api_keys.

  1. The current RECORDS class is actually implemented as a nullable type, so we should rename it NULLABLE_RECORDS. We also need to create a new RECORDS class that's not nullable. In SchemaGenerator, we should distinguish between RECORDS and NULLABLE_RECORDS. Currently, the generated schema only has RECORDS for both nullable and non-nullable fields, which is misleading. Ditto for COMPACT_RECORDS.

  2. ArrayOf takes nullable as an input, but its name is always ARRAY. We need to add the nullable name. Ditto for CompactArrayOf. Currently, the generated schema only uses [] to represent an array. We need a way to denote whether it's nullable and compact.

  3. When generating the schema for a struct, we just fold in the fields in the struct like the following without indicating whether that struct is nullable or not. We need a way to make that clear.

  assignment => [topic_partitions] _tagged_fields 
    topic_partitions => topic_id [partitions] _tagged_fields 
      topic_id => UUID
      partitions => INT32
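
For the naming problems in points 1 and 2, a small helper can derive a distinct name for each variant. The sketch below is hypothetical (`TypeNameSketch` and `typeName` are invented for illustration); it assumes the COMPACT_ and NULLABLE_ prefix order already used by names such as COMPACT_NULLABLE_STRING, and it does not propose a concrete notation for the generated schema text.

```java
/** Hypothetical helper; not part of Kafka's protocol types or SchemaGenerator. */
public class TypeNameSketch {

    /**
     * Builds a type name from a base name such as "ARRAY" or "RECORDS",
     * following the COMPACT_ then NULLABLE_ prefix order of existing names.
     */
    public static String typeName(String base, boolean compact, boolean nullable) {
        StringBuilder name = new StringBuilder();
        if (compact) {
            name.append("COMPACT_");
        }
        if (nullable) {
            name.append("NULLABLE_");
        }
        return name.append(base).toString();
    }

    public static void main(String[] args) {
        // Print all four variants for the two base names discussed above.
        for (String base : new String[] {"ARRAY", "RECORDS"}) {
            for (boolean compact : new boolean[] {false, true}) {
                for (boolean nullable : new boolean[] {false, true}) {
                    System.out.println(typeName(base, compact, nullable));
                }
            }
        }
    }
}
```

Deriving the name from the same flags that control serialization keeps the documented name and the wire format from drifting apart.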

*/
package org.apache.kafka.common.protocol.types;

import org.apache.kafka.common.protocol.types.Type.DocumentedType;
Contributor

There are a few things that we need to fix in clients/src/main/resources/common/message/README.md.

  1. We need to add type struct in Field Types.
  2. In Nullable Fields, it says uuid is nullable. This is incorrect (the generator throws an exception if a uuid field is nullable) and needs to be removed.
  3. We need to add struct as a nullable type.

@github-actions bot removed the triage label Oct 1, 2025