Description
Feature Request
Extend the #[derive(ForyObject)] macro to support #[fory()] field attributes for performance and space optimization during xlang serialization.
Is your feature request related to a problem? Please describe
Currently, Fory's Rust xlang serialization treats all struct fields uniformly:
- Null checks are always performed - Even for fields that are never null, Fory writes a null/ref flag (1 byte per field)
- Reference tracking is always applied (when enabled globally) - Even for fields that won't be shared/cyclic, objects are tracked with hash lookup cost
- Field names use meta string encoding - In schema evolution mode, field names are encoded using meta string compression, but for fields with long names, this still takes space
These defaults ensure correctness but introduce unnecessary overhead when the developer has more specific knowledge about their data model.
Describe the solution you'd like
Extend the #[fory()] attribute to support field-level metadata:
use fory::ForyObject;
#[derive(ForyObject)]
struct Foo {
// Field f1: non-nullable (default), no ref tracking (default)
// Tag ID 0 provides compact encoding in schema evolution mode
#[fory(id = 0)]
f1: String,
// Field f2: non-nullable (default), no ref tracking (default)
#[fory(id = 1)]
f2: Bar,
// Field f3: nullable field that may contain null values
#[fory(id = 2, nullable)]
f3: Option<String>,
// Field parent: shared reference that needs tracking (e.g., for circular refs)
#[fory(id = 3, ref = true, nullable)]
parent: Option<Rc<Node>>,
// Field with long name: tag ID provides significant space savings
#[fory(id = 4)]
very_long_field_name_that_would_take_many_bytes: String,
// Explicit opt-out: use field name encoding but get nullable optimization
#[fory(id = -1, nullable)]
optional_field: Option<String>,
}
Attribute Syntax
#[fory(
id = <i32>, // REQUIRED: Tag ID for field encoding, validated by the Rust ForyObject macro
// >= 0: Use tag ID encoding
// -1: Use field name encoding (opt-out)
nullable, // Optional: Field can be None (default: false)
// Required for Option<T> types
ref, // Optional: Track references (default: false)
// Useful for Rc<T>, Arc<T>, circular references
)]
Design Decision: Required id
The id attribute is required when using #[fory()] on a field:
- id = 0 to id = N: Use tag ID encoding (compact)
- id = -1: Explicit opt-out, use field name encoding
- When no id is configured, use field name encoding
Rationale:
- Explicit control: Using #[fory()] means opting into explicit control
- Compile-time validation: Proc macro can check for duplicate IDs
- Proven pattern: Similar to protobuf field numbers
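The id rules above can be sketched as a small resolution function. This is only an illustration of the proposed semantics, not the macro's actual code: `FieldKey` and `resolve_field_key` are hypothetical names, and the rejection of ids below -1 follows the compile-time validation rules listed under Implementation Notes.

```rust
/// Hypothetical model of how a field is identified on the wire.
#[derive(Debug, PartialEq)]
enum FieldKey {
    /// `id >= 0`: compact tag ID encoding.
    TagId(u32),
    /// `id = -1`, or no `id` configured: field name encoding.
    Name(String),
}

/// Resolves the `id` value of a `#[fory(...)]` attribute (`None` when absent).
/// Ids below -1 are rejected, mirroring the proposed compile-time check.
fn resolve_field_key(id: Option<i32>, field_name: &str) -> Result<FieldKey, String> {
    match id {
        None | Some(-1) => Ok(FieldKey::Name(field_name.to_string())),
        Some(n) if n >= 0 => Ok(FieldKey::TagId(n as u32)),
        Some(n) => Err(format!(
            "invalid id {} on field `{}`: must be >= -1",
            n, field_name
        )),
    }
}
```

For example, `resolve_field_key(Some(0), "f1")` yields a tag ID key, while `resolve_field_key(Some(-1), "optional_field")` falls back to the field name.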
Optimization Details
1. Non-nullable (Default) Optimization
When nullable is NOT specified:
- Skip writing the null flag entirely (1 byte saved per field)
- Directly serialize the field value
- Compile error if field type is Option<T> with nullable=false
- Only Option<T> is nullable by default; other fields must use the nullable macro attribute to be marked as nullable.
2. No Ref Tracking (Default) Optimization
When ref is NOT specified:
- Skip reference tracking map operations
- Skip ref flag when combined with non-nullable
- For Rc<T>/Arc<T>, ref is true by default; consider adding ref=false if no shared refs are possible
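To make the interaction of the two flags concrete, here is a minimal sketch of the decision the generated code would make per field. It assumes, as stated in the problem description, that the null flag and the ref flag share a single byte per field. `FieldHeader`, `field_header`, and `header_bytes` are hypothetical names used only for illustration.

```rust
/// What a field writes before its payload (hypothetical model).
#[derive(Debug, PartialEq)]
enum FieldHeader {
    /// Non-nullable and untracked: the value is written directly, no flag byte.
    Direct,
    /// Nullable but untracked: one null/not-null flag byte.
    NullFlag,
    /// Reference-tracked: one ref flag byte plus a lookup in the tracking map.
    RefFlag,
}

/// Picks the header for a field from its `nullable` and `ref` settings.
fn field_header(nullable: bool, track_ref: bool) -> FieldHeader {
    if track_ref {
        FieldHeader::RefFlag
    } else if nullable {
        FieldHeader::NullFlag
    } else {
        FieldHeader::Direct
    }
}

/// Counts the flag bytes spent across all fields of a struct.
/// Each tuple is `(nullable, track_ref)` for one field.
fn header_bytes(fields: &[(bool, bool)]) -> usize {
    fields
        .iter()
        .filter(|&&(nullable, track_ref)| field_header(nullable, track_ref) != FieldHeader::Direct)
        .count()
}
```

Under this model a struct whose fields all use the defaults spends no flag bytes at all, and each field that opts into `nullable` or `ref` adds one byte back.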
3. Tag ID Optimization
When id = N where N >= 0:
- Field name encoded as varint instead of meta string
- Significant space savings for long field names
Space savings:
| Field Name | Meta String (approx) | Tag ID |
|---|---|---|
| f1 | ~2 bytes | 1 byte |
| user_name | ~6 bytes | 1 byte |
| transaction_id | ~10 bytes | 1 byte |
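The 1 byte figure in the Tag ID column holds for small IDs under a conventional varint scheme. The sketch below uses an unsigned LEB128-style varint (7 payload bits per byte) as an assumption. The exact varint layout and how the tag ID is packed into the field header are defined by the Fory spec and may differ. `encode_varint` and `varint_len` are hypothetical helper names.

```rust
/// Encodes `value` as an unsigned LEB128-style varint: 7 payload bits per
/// byte, with the high bit set on every byte except the last.
fn encode_varint(mut value: u32, out: &mut Vec<u8>) {
    while value >= 0x80 {
        out.push((value as u8 & 0x7F) | 0x80);
        value >>= 7;
    }
    out.push(value as u8);
}

/// Returns how many bytes the varint encoding of `value` occupies.
fn varint_len(value: u32) -> usize {
    let mut buf = Vec::new();
    encode_varint(value, &mut buf);
    buf.len()
}
```

With this encoding, tag IDs 0 through 127 fit in one byte and IDs 128 through 16383 take two, so the savings in the table apply to any struct with fewer than 128 tagged fields.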
Implementation Notes
- Proc Macro Enhancement:
// In fory-derive/src/object.rs
#[proc_macro_derive(ForyObject, attributes(fory))]
pub fn derive_fory_object(input: TokenStream) -> TokenStream {
    // Parse #[fory(id = N, nullable, ref)] attributes
    // Generate optimized serialization code based on attributes
}
- Code Generation:
// Generated code for #[fory(id = 0)] (non-nullable, no ref)
fn serialize_field_f1(&self, writer: &mut Writer) {
    // No null check, no ref tracking
    writer.write_string(&self.f1);
}
// Generated code for #[fory(id = 2, nullable)]
fn serialize_field_f3(&self, writer: &mut Writer) {
    match &self.f3 {
        Some(v) => {
            writer.write_not_null();
            writer.write_string(v);
        }
        None => writer.write_null(),
    }
}
- Compile-time Validation:
  - Error if duplicate tag IDs (>= 0) in same struct
  - Error if id < -1
  - Error if Option<T> field without nullable
  - Warning if Rc<T>/Arc<T> without ref (potential circular ref issues)
- Runtime Validation:
  - Panic if non-nullable field serialized with None value (shouldn't happen in Rust)
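The duplicate-ID and range checks from the compile-time validation list could look roughly like the following. This is a plain-Rust sketch of the logic only: a real proc macro would run it over parsed attributes and report failures as compile errors with spans. `FieldAttr` and `validate_ids` are hypothetical names.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified view of one parsed field: its name and `id`.
struct FieldAttr<'a> {
    name: &'a str,
    id: i32,
}

/// Rejects ids below -1 and duplicate tag IDs (>= 0) within one struct.
/// Fields with `id = -1` use field name encoding and may repeat freely.
fn validate_ids(fields: &[FieldAttr<'_>]) -> Result<(), String> {
    let mut seen: HashMap<i32, &str> = HashMap::new();
    for field in fields {
        if field.id < -1 {
            return Err(format!(
                "field `{}`: id must be >= -1, got {}",
                field.name, field.id
            ));
        }
        if field.id >= 0 {
            // `insert` returns the previous owner of this id, if any.
            if let Some(previous) = seen.insert(field.id, field.name) {
                return Err(format!(
                    "duplicate id {} on fields `{}` and `{}`",
                    field.id, previous, field.name
                ));
            }
        }
    }
    Ok(())
}
```

Because all inputs are known at macro expansion time, this check adds no cost to serialization itself.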
Example: Generated Code
#[derive(ForyObject)]
struct Foo {
#[fory(id = 0)]
name: String,
#[fory(id = 1, nullable)]
nickname: Option<String>,
}
// Generates approximately:
impl ForySerialize for Foo {
fn serialize(&self, writer: &mut Writer) -> Result<()> {
// Field: name (id=0, non-nullable, no ref)
writer.write_tag_id(0);
writer.write_string(&self.name)?;
// Field: nickname (id=1, nullable, no ref)
writer.write_tag_id(1);
match &self.nickname {
Some(v) => {
writer.write_byte(NOT_NULL_FLAG);
writer.write_string(v)?;
}
None => writer.write_byte(NULL_FLAG),
}
Ok(())
}
}
Performance Impact
For a struct with 10 fields using default settings (non-nullable, no ref tracking):
- Space savings: ~20 bytes per object (null + ref flags)
- CPU savings: 10 fewer hash map operations per serialization
- Zero runtime overhead for metadata (all compile-time via proc macro)
Additional context
This is the Rust equivalent of Java's @ForyField annotation. See Java issue #3000 for the original design discussion.
Protocol spec: https://fory.apache.org/docs/specification/fory_xlang_serialization_spec