[Go] Add fory struct tags for field metadata #3005

@chaokunyang

Feature Request

Add support for fory struct tags to provide field-level metadata for performance and space optimization during xlang serialization.

Is your feature request related to a problem? Please describe

Currently, Fory's Go xlang serialization treats all struct fields uniformly:

  1. Null checks are always performed - Even for fields that are never nil, Fory writes a null/ref flag (1 byte per field)
  2. Reference tracking is always applied (when enabled globally) - Even for fields that won't be shared/cyclic, objects are tracked with hash lookup cost
  3. Field names use meta string encoding - In schema evolution mode, field names are compressed with meta string encoding, but a long name still costs several bytes per field

These defaults ensure correctness but introduce unnecessary overhead when the developer has more specific knowledge about their data model.

Describe the solution you'd like

Add support for fory struct tags with field metadata:

type Foo struct {
    // Field F1: non-nullable (default), no ref tracking (default)
    // Tag ID 0 provides compact encoding in schema evolution mode
    F1 string `fory:"id=0"`
    
    // Field F2: non-nullable (default), no ref tracking (default)
    F2 Bar `fory:"id=1"`
    
    // Field F3: nullable field that may contain nil values
    F3 *string `fory:"id=2,nullable"`
    
    // Field Parent: shared reference that needs tracking (e.g., for circular refs)
    Parent *Node `fory:"id=3,ref,nullable"`
    
    // Field with long name: tag ID provides significant space savings
    VeryLongFieldNameThatWouldTakeManyBytes string `fory:"id=4"`
    
    // Explicit opt-out: keep field name encoding (id=-1) while still declaring the field nullable
    OptionalField *string `fory:"id=-1,nullable"`
}

Tag Syntax

`fory:"id=<int>[,nullable][,ref]"`

| Tag Component | Description |
|---------------|-------------|
| id=N | Required. Tag ID for field encoding. N >= 0 uses tag ID; N = -1 uses field name |
| nullable | Optional. Field can be nil. Required for pointer types |
| ref | Optional. Enable reference tracking for this field |

Design Decision: Required id

The id component is required whenever a field carries a fory struct tag:

  • id=0 to id=N: Use tag ID encoding (compact)
  • id=-1: Explicit opt-out, use field name encoding

Rationale:

  1. Explicit control: Using fory tag means opting into explicit control
  2. Runtime validation: Can check for duplicate IDs during registration
  3. Proven pattern: Similar to protobuf field numbers, JSON tags

Optimization Details

1. Non-nullable (Default) Optimization

When nullable is NOT specified:

  • Skip writing the null flag entirely (1 byte saved per field)
  • Directly serialize the field value
  • Panic if pointer field is nil without nullable tag

2. No Ref Tracking (Default) Optimization

When ref is NOT specified:

  • Skip reference tracking map operations
  • Skip ref flag when combined with non-nullable
  • For pointer types with potential circular refs, add ref tag

3. Tag ID Optimization

When id=N where N >= 0:

  • Field is identified by a varint-encoded tag ID instead of a meta-string-encoded name
  • Significant space savings for long field names

Space savings:

| Field Name | Meta String (approx) | Tag ID |
|------------|----------------------|--------|
| F1 | ~2 bytes | 1 byte |
| UserName | ~6 bytes | 1 byte |
| TransactionID | ~10 bytes | 1 byte |
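
The "1 byte" column holds only while the tag ID fits in a single varint byte. The sketch below assumes tag IDs are written as a standard unsigned 7-bit-per-byte varint (the same layout Go's `encoding/binary` produces), which the issue does not spell out. `tagIDSize` is a hypothetical helper, not part of Fory.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// tagIDSize reports how many bytes an unsigned varint needs for id.
// Assumption: Fory writes tag IDs with a 7-bit-per-byte varint layout.
func tagIDSize(id uint64) int {
	buf := make([]byte, binary.MaxVarintLen64)
	return binary.PutUvarint(buf, id)
}

func main() {
	// IDs 0..127 fit in one byte; larger IDs need more.
	for _, id := range []uint64{0, 4, 127, 128, 16384} {
		fmt.Printf("id=%d -> %d byte(s)\n", id, tagIDSize(id))
	}
}
```

Under that assumption, a struct keeps the one-byte cost per field as long as its IDs stay below 128.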

Implementation Notes

  1. Tag Parsing:

    // In fory/type.go or new file fory/field.go
    type FieldMeta struct {
        ID       int
        Nullable bool
        Ref      bool
    }
    
    func parseForyTag(tag string) (*FieldMeta, error) {
        // Parse "id=0,nullable,ref" format
    }
  2. Registration Integration:

    // During type registration, parse fory tags
    func (f *Fory) RegisterType(t reflect.Type) error {
        for i := 0; i < t.NumField(); i++ {
            field := t.Field(i)
            if tag, ok := field.Tag.Lookup("fory"); ok {
                meta, err := parseForyTag(tag)
                if err != nil {
                    return fmt.Errorf("field %s: %w", field.Name, err)
                }
                _ = meta // Store metadata for serialization
            }
        }
        return nil
    }
  3. Codegen Integration:

    //go:generate fory gen -type Foo
    
    // Generated code respects fory tags
    func (f *Foo) ForyEncode(writer *fory.Writer) error {
        // F1: id=0, non-nullable, no ref
        writer.WriteTagID(0)
        writer.WriteString(f.F1)
        
        // F3: id=2, nullable, no ref
        writer.WriteTagID(2)
        if f.F3 == nil {
            writer.WriteNullFlag()
        } else {
            writer.WriteNotNullFlag()
            writer.WriteString(*f.F3)
        }
        // ...
        return nil
    }
  4. Validation:

    • Panic if duplicate tag IDs (>= 0) in same struct
    • Panic if id < -1
    • Panic if pointer field is nil without nullable tag
    • Warning (or panic in strict mode) if pointer type without nullable

Compatibility with Existing Tags

The fory tag can coexist with other tags:

type User struct {
    Name  string `json:"name" fory:"id=0"`
    Email string `json:"email" fory:"id=1"`
    Age   *int   `json:"age,omitempty" fory:"id=2,nullable"`
}

Performance Impact

For a struct with 10 fields using default settings (non-nullable, no ref tracking):

  • Space savings: ~20 bytes per object (null + ref flags)
  • CPU savings: 10 fewer hash map operations per serialization

For reflection-based serialization:

  • Tag parsing overhead is one-time during registration
  • Runtime serialization uses cached metadata

For codegen-based serialization:

  • Zero runtime overhead for metadata
  • All optimizations applied at code generation time

Additional context

This is the Go equivalent of Java's @ForyField annotation. See Java issue #3000 for the original design discussion.

Protocol spec: https://fory.apache.org/docs/specification/fory_xlang_serialization_spec
