feat: add type description support and refactor key expression protocol #91

Merged
YuanYuYuan merged 20 commits into main from dev/type-description
Feb 13, 2026

@YuanYuYuan YuanYuYuan commented Jan 29, 2026

Summary

This PR adds comprehensive type description support to ros-z and refactors the key expression backend system for improved maintainability.

Type Description Support

Implements ROS 2 type description discovery, enabling dynamic message handling without compile-time schema knowledge.

New Features

ros-z-schema crate:

  • RIHS01 type hash computation (matches ROS 2 reference implementation)
  • Type description data structures and serialization
  • Message schema building and validation
  • Support for nested types, arrays, and bounded sequences

Type Description Service:

  • GetTypeDescription service for schema discovery
  • Automatic schema registration for publishers
  • Client for querying remote node type descriptions
  • Integration with ZNode via .with_type_description_service()

Dynamic Message API:

  • create_dyn_pub() - publish with runtime schemas
  • create_dyn_sub_auto() - subscribe with automatic schema discovery
  • create_dyn_sub() - subscribe with known schemas
  • CDR serialization/deserialization for dynamic messages
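
To make the CDR bullet concrete, here is a minimal sketch of how a single string field (such as the payload of a std_msgs/String) is laid out in little-endian CDR. It only illustrates the wire format and is not the ros-z implementation; the function names `encode_cdr_string` and `decode_cdr_string` are hypothetical.

```rust
/// Encode one string field as a little-endian CDR payload:
/// 4-byte encapsulation header, u32 length (counting the trailing NUL),
/// the UTF-8 bytes, then the NUL terminator.
fn encode_cdr_string(value: &str) -> Vec<u8> {
    // 0x00 0x01 = CDR little-endian representation id, followed by two option bytes.
    let mut buf = vec![0x00, 0x01, 0x00, 0x00];
    let len = (value.len() + 1) as u32;
    buf.extend_from_slice(&len.to_le_bytes());
    buf.extend_from_slice(value.as_bytes());
    buf.push(0);
    buf
}

/// Decode a payload produced by `encode_cdr_string`.
/// Returns `None` on a truncated or malformed buffer.
fn decode_cdr_string(payload: &[u8]) -> Option<String> {
    let header = payload.get(0..4)?;
    if header[0] != 0x00 || header[1] != 0x01 {
        return None; // this sketch only handles little-endian CDR
    }
    let l = payload.get(4..8)?;
    let len = u32::from_le_bytes([l[0], l[1], l[2], l[3]]) as usize;
    if len == 0 {
        return None; // a CDR string always carries at least the NUL byte
    }
    let end = 8usize.checked_add(len)?;
    let body = payload.get(8..end)?;
    let (text, nul) = body.split_at(len - 1);
    if nul[0] != 0 {
        return None;
    }
    String::from_utf8(text.to_vec()).ok()
}
```

A dynamic message decoder would walk the discovered schema field by field and apply a rule like this for each string, with the matching rules for numeric fields and sequences.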

Console Enhancements:

  • Dynamic subscriber with schema auto-discovery
  • Message formatter for human-readable output
  • Real-time topic monitoring with type information

Testing

  • Integration tests: type_description_integration.rs
  • Interop tests: type_description_interop.rs (with ros2cli)
  • Comprehensive coverage of nested types, services, and edge cases

Protocol Refactoring

Simplifies the key expression system by removing the trait-based backend abstraction.

Changes

Crate Reorganization:

  • Renamed ros-z-keyexpr → ros-z-protocol
  • Moved backend implementations to ros-z-protocol::format
  • Consolidated entity types and QoS definitions

API Simplification:

  • Removed KeyExprBackend trait
  • Direct use of KeyExprFormat enum (RmwZenoh, Ros2Dds)
  • Simplified builder APIs (removed .with_backend::<B>())
  • Cleaner key expression generation
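
The shift from a backend trait to a plain enum can be sketched as follows. Only the names `KeyExprFormat`, `RmwZenoh`, and `Ros2Dds` come from this PR; the `TopicInfo` struct, the `topic_key_expr` method, and the simplified key layouts are assumptions made for illustration.

```rust
/// Runtime-selectable key expression format (variant names from the PR).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum KeyExprFormat {
    RmwZenoh,
    Ros2Dds,
}

/// Hypothetical bundle of the fields a topic key expression is built from.
struct TopicInfo<'a> {
    domain_id: u32,
    topic: &'a str,
    type_name: &'a str,
    type_hash: &'a str,
}

impl KeyExprFormat {
    /// Build a topic key expression. The layouts are simplified assumptions:
    /// domain/topic/type/hash for RmwZenoh, the bare topic for Ros2Dds.
    fn topic_key_expr(self, info: &TopicInfo) -> String {
        // Leading and trailing slashes are dropped; internal ones are kept.
        let topic = info.topic.trim_matches('/');
        match self {
            KeyExprFormat::RmwZenoh => format!(
                "{}/{}/{}/{}",
                info.domain_id, topic, info.type_name, info.type_hash
            ),
            KeyExprFormat::Ros2Dds => topic.to_string(),
        }
    }
}
```

Because the format is an ordinary value rather than a type parameter, it can be stored once in the context and matched at runtime, which is what lets the builders drop `.with_backend::<B>()`.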

Documentation:

  • Renamed "backends" → "key expression formats"
  • New mdBook chapter: keyexpr_formats.md
  • Updated all examples to use new API

Migration

Before:

let publisher = node.create_pub::<String>("topic")
    .with_backend::<Ros2DdsBackend>()
    .build()?;

After:

// Key expression format set at context level
let ctx = ZContextBuilder::default()
    .with_keyexpr_format(KeyExprFormat::Ros2Dds)
    .build()?;

let publisher = node.create_pub::<String>("topic").build()?;

Breaking Changes

  • Removed ros-z-keyexpr crate (use ros-z-protocol)
  • Removed KeyExprBackend trait and .with_backend() methods
  • Removed crate::backend module from ros-z
  • Python message module regeneration (no functional impact)

Compatibility

  • ✅ Works with ROS 2 Jazzy, Humble, Rolling
  • ✅ Interoperates with rmw_zenoh_cpp
  • ✅ All existing tests pass

Files Changed

New Crates:

  • crates/ros-z-schema/ - Type description and hashing
  • crates/ros-z-protocol/ - Key expression formats (replaces ros-z-keyexpr)

Major Additions:

  • crates/ros-z/src/dynamic/type_description*.rs - Type description service/client
  • crates/ros-z-tests/tests/type_description_*.rs - Integration tests
  • book/src/chapters/keyexpr_formats.md - Documentation

Deleted:

  • crates/ros-z-keyexpr/ - Replaced by ros-z-protocol
  • crates/ros-z/src/backend/ - Functionality moved to ros-z-protocol

@YuanYuYuan YuanYuYuan force-pushed the dev/type-description branch from d69cf4b to 7afdd17 on February 9, 2026 07:22
@YuanYuYuan YuanYuYuan force-pushed the dev/type-description branch 2 times, most recently from 1578a08 to 6198730 on February 9, 2026 12:31
@YuanYuYuan YuanYuYuan changed the title from "feat: add message schema and support type description" to "feat: add type description support and refactor key expression protocol" on Feb 9, 2026
@YuanYuYuan YuanYuYuan force-pushed the dev/type-description branch 3 times, most recently from 107e62b to 8b437c5 on February 13, 2026 02:14

Create independent ros-z-keyexpr crate for key expression handling:
- no_std compatible (uses core + alloc)
- Clean API with KeyExprFormat enum (rmw_zenoh/ros2dds)
- 33 comprehensive unit tests covering edge cases
- KeyExprFormatter trait for format implementations

Fix critical key expression bug:
- ALL topic key expressions now use strip_slashes() behavior
- Preserves internal slashes for publishers, subscriptions, services, clients
- Previously only services preserved slashes, breaking pub/sub with multi-segment names
- Mangling (/ → %) is ONLY used in liveliness tokens, NOT topic keys
- Matches rmw_zenoh_cpp TopicInfo behavior exactly

Key expression format distinction:
- Topic keys: strip_slashes() for all entity types (human-readable, hierarchical)
- Liveliness: mangle_name() for all fields (unambiguous parsing)
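
The distinction between the two transformations can be sketched in a few lines. The function names follow the commit message; the bodies are an assumption about their behavior based on the description above (strip only the outer slashes for topic keys, replace every slash with `%` for liveliness fields).

```rust
/// Topic keys: drop leading and trailing slashes, keep internal ones so the
/// key stays hierarchical and human-readable.
fn strip_slashes(name: &str) -> &str {
    name.trim_matches('/')
}

/// Liveliness tokens: replace every '/' with '%' so a name occupies exactly
/// one key expression segment and can be parsed back unambiguously.
fn mangle_name(name: &str) -> String {
    name.replace('/', "%")
}

/// Inverse of `mangle_name`, used when parsing a liveliness token.
fn unmangle_name(mangled: &str) -> String {
    mangled.replace('%', "/")
}
```

With these definitions, `/ns/topic/name` becomes `ns/topic/name` as a topic key and `%ns%topic%name` inside a liveliness token. A publisher and subscriber only match if both sides apply the same rule to the topic key, which is the mismatch this commit fixes.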

Additional changes:
- Update .gitignore for crates/ reorganization (e781dce follow-up)
- Remove auto-generated Python types from git tracking
- Add ros-z-keyexpr to workspace members
- Update Cargo.lock

Tests verify:
- Multi-segment topic names (/ns/topic/name)
- Service names with slashes (/talker/get_type_description)
- Round-trip parse/generate correctness
- Edge cases and error handling

Fixes issue where subscriptions with multi-segment topic names
don't receive messages due to key expression mismatch.

Update mdBook documentation to reflect the ros-z-keyexpr crate and new API:

- Rename backends.md → keyexpr_formats.md with updated content
- Update all API examples: .with_backend() → .keyexpr_format()
- Explain key expression behavior (strip_slashes vs mangle_name)
- Add troubleshooting section for multi-segment topics
- Document the ros-z-keyexpr crate and its purpose
- Update SUMMARY.md, introduction.md, and troubleshooting.md

This prepares the documentation for the upcoming API migration where
ros-z will use ros-z-keyexpr instead of the internal backend module.

Rename ros-z-keyexpr to ros-z-protocol to better reflect its purpose
as the language-agnostic protocol layer. Remove redundant KeyExprBackend
trait from ros-z and migrate everything to use ros-z-protocol's
KeyExprFormat enum for cleaner runtime format selection.

This simplifies the architecture by having a single source of truth
for key expression formats and entity types, making the protocol layer
more suitable for FFI bindings while keeping ros-z as the Rust-specific
implementation layer.

Update key expression formats chapter to reference ros-z-protocol
instead of ros-z-keyexpr. Update all API examples to use the new
ZContextBuilder API and KeyExprFormat from ros-z-protocol.

Resolve all conflicts and API changes from backend refactoring.
Updates ros-z-console integration, mdbook tests, and scripts to work
with new KeyExprFormatter trait system.

Update rmw-zenoh-rs and ros-z-protocol for backend refactoring:
- Add ros-z-protocol dependency to workspace and rmw-zenoh-rs
- Convert QoS types between ros_z_protocol and ros_z where needed
- Replace entity methods with free functions (endpoint_gid, entity_get_endpoint)
- Fix clippy errors (needless_update, explicit_auto_deref)
- Disable obsolete service_backend test (backend system removed)
- Fix LivelinessKE Display usage in example

Tests now compile and run (24/28 passing, investigating remaining failures).

The type_description_interfaces package was introduced in ROS 2 Iron/Jazzy
and doesn't exist in Humble. The test already had #![cfg(feature = "ros-interop")]
but was missing the Humble exclusion guard, causing CI failures on Humble builds.

This adds #![cfg(not(ros_humble))] to skip the entire test file when building
for Humble, matching the conditional package generation in ros-z-msgs/build.rs.

ROS 2's rmw_zenoh uses conditional encoding - QoS parameters that match
defaults are left empty in liveliness tokens. The decoder was failing to
parse these empty fields, causing InvalidReliability/InvalidDurability errors
and preventing ros-z from discovering ROS 2 publishers/subscribers.

Changes:
- Update QosProfile::decode() to treat empty fields as default values
- reliability: empty → BestEffort (default)
- durability: empty → Volatile (default)
- Add documentation explaining ROS 2's conditional encoding

This fixes discovery errors like:
  Failed to parse liveliness token: QosDecodeError(InvalidReliability)
  Failed to parse liveliness token: QosDecodeError(InvalidDurability)

The default QoS reliability was incorrectly set to BestEffort, breaking
ROS 2 interop tests. ROS 2's default QoS profile uses Reliable reliability,
and changing this caused publishers/subscribers to have incompatible QoS
settings, preventing message delivery.

Changes:
- QosReliability: Change default from BestEffort to Reliable
- Update comments to reflect ROS 2's actual defaults
- Matches rmw_zenoh_cpp default QoS profile behavior

This fixes interop test failures where ros-z subscribers with BestEffort QoS
couldn't receive messages from ROS 2 publishers using Reliable QoS.

ROS 2 Humble doesn't support type hashing (introduced in Iron).
When the `no-type-hash` feature is enabled (via `humble` feature),
TypeHash now returns "TypeHashNotSupported" instead of RIHS01 format.

Changes:
- Add `no-type-hash` feature to ros-z-protocol
- Update TypeHash::to_rihs_string() to return "TypeHashNotSupported" with cfg
- Update TypeHash::from_rihs_string() to parse "TypeHashNotSupported"
- Enable `ros-z-protocol/no-type-hash` when ros-z's `no-type-hash` is active
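
The string forms described here can be sketched as a round trip. The method names `to_rihs_string` and `from_rihs_string` and the `TypeHashNotSupported` sentinel come from the commit message. Modeling the Humble case as an enum variant instead of a `cfg` feature is an assumption made so the sketch runs on its own, as is the internal representation of the hash.

```rust
const NOT_SUPPORTED: &str = "TypeHashNotSupported";
const RIHS01_PREFIX: &str = "RIHS01_";

/// Either a 32-byte RIHS01 hash or the placeholder used when hashing is unavailable.
#[derive(Debug, Clone, PartialEq, Eq)]
enum TypeHash {
    Rihs01([u8; 32]),
    NotSupported,
}

impl TypeHash {
    /// Format as "RIHS01_" followed by 64 lowercase hex digits, or the sentinel.
    fn to_rihs_string(&self) -> String {
        match self {
            TypeHash::NotSupported => NOT_SUPPORTED.to_string(),
            TypeHash::Rihs01(bytes) => {
                let mut out = String::from(RIHS01_PREFIX);
                for b in bytes.iter() {
                    out.push_str(&format!("{:02x}", b));
                }
                out
            }
        }
    }

    /// Parse either form back; returns `None` for anything else.
    fn from_rihs_string(s: &str) -> Option<TypeHash> {
        if s == NOT_SUPPORTED {
            return Some(TypeHash::NotSupported);
        }
        if !s.starts_with(RIHS01_PREFIX) {
            return None;
        }
        let hex = &s[RIHS01_PREFIX.len()..];
        if hex.len() != 64 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
            return None;
        }
        let mut bytes = [0u8; 32];
        for i in 0..32 {
            // Safe to slice by byte offsets: the string is all ASCII hex digits.
            bytes[i] = u8::from_str_radix(&hex[2 * i..2 * i + 2], 16).ok()?;
        }
        Some(TypeHash::Rihs01(bytes))
    }
}
```

Accepting the sentinel in the parser matters for discovery: a liveliness token from a Humble peer carries `TypeHashNotSupported` in the hash position, and rejecting it would drop the whole token.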

This fixes the type hash format for Humble, though Humble interop tests
still have service discovery issues unrelated to type hashing.

The QoS encoding format was incompatible with rmw_zenoh_cpp, causing
"Received liveliness token with invalid qos keyexpr" errors in ROS 2 Humble
interop tests.

Root cause:
- ros-z was using incorrect QoS value encoding (0/1 instead of RMW's 1/2)
- ros-z format was wrong: :[reliability]:,[depth]:[durability]:,:,,
- rmw_zenoh format: [reliability]:[durability]:[history],[depth]:[deadline]:[lifespan]:[liveliness]

Changes:
- Rewrite QosProfile::encode() to match rmw_zenoh_cpp format exactly
- Use RMW value encoding: Reliable=1, BestEffort=2, TransientLocal=1, Volatile=2
- Implement conditional encoding (empty strings for default values)
- Update QosProfile::decode() to parse the correct format
- Change field order and delimiters to match rmw_zenoh
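
The value mapping and the conditional encoding can be sketched for the first two fields of the QoS string. The numeric values (Reliable=1, BestEffort=2, TransientLocal=1, Volatile=2) and the defaults (Reliable, Volatile) are taken from the commit messages in this PR. The helper names are hypothetical, and the remaining fields (history, depth, deadline, lifespan, liveliness) are deliberately left out.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Reliability {
    Reliable,
    BestEffort,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Durability {
    TransientLocal,
    Volatile,
}

/// Encode "[reliability]:[durability]". A value equal to the default is
/// written as an empty string (conditional encoding).
fn encode_prefix(r: Reliability, d: Durability) -> String {
    let r = match r {
        Reliability::Reliable => "", // default, omitted
        Reliability::BestEffort => "2",
    };
    let d = match d {
        Durability::Volatile => "", // default, omitted
        Durability::TransientLocal => "1",
    };
    format!("{}:{}", r, d)
}

/// Decode the first two ':'-separated fields, mapping empty fields to defaults.
fn decode_prefix(qos: &str) -> Result<(Reliability, Durability), String> {
    let mut fields = qos.split(':');
    let r = match fields.next().unwrap_or("") {
        "" | "1" => Reliability::Reliable,
        "2" => Reliability::BestEffort,
        other => return Err(format!("invalid reliability: {}", other)),
    };
    let d = match fields.next().unwrap_or("") {
        "" | "2" => Durability::Volatile,
        "1" => Durability::TransientLocal,
        other => return Err(format!("invalid durability: {}", other)),
    };
    Ok((r, d))
}
```

The decoder accepts both the explicit value and the empty field for a default, so it tolerates peers that always write the number as well as peers that omit it.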

This fixes all ROS 2 Humble interop test failures by ensuring ros-z generates
liveliness tokens that rmw_zenoh_cpp can parse correctly.

Tested: test_ros2_pub_to_ros_z_sub now passes (receives messages from ROS 2)

Add #![cfg(not(ros_humble))] attribute to type_description_integration.rs
to properly skip these tests on Humble, which doesn't support type
description interfaces.

The build.rs already detects Humble and emits cfg(ros_humble), this change
ensures the tests are actually excluded from compilation on Humble.

Test results:
- All 19 ROS 2 Humble interop tests pass when run with --distro humble flag
- Type description tests properly skipped (3 tests skipped total)
- No more type hash mismatch errors on Humble

The test_dynamic_pub_dynamic_sub_with_type_discovery test was flaky in CI
due to a race condition where the publisher would finish and drop before the
subscriber could discover its schema via the type description service.

**Problem:**
- Publisher published 10 messages over ~1 second then exited
- Subscriber started at 500ms but took time to create node
- By ~1000ms when subscriber tried schema discovery, publisher was gone
- This caused "timed out waiting on a channel" errors in CI

**Root Cause:**
Publisher task completed and dropped the publisher object, removing its
liveliness tokens before the subscriber could query the type description service.

**Solution (Deterministic):**
- Publish 100 messages (enough to cover test duration) instead of 10
- Abort publisher task immediately after receiving messages
- No arbitrary delays - publisher runs exactly as long as needed
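
The lifetime pattern behind this fix can be sketched with standard-library threads: the producer is given far more work than the test needs and is stopped explicitly once the consumer has what it wants. This is only an analogy. The real test uses ros-z publishers and aborts a task, whereas the sketch uses a channel and a stop flag, and the function name `receive_then_stop` is hypothetical.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{mpsc, Arc};
use std::thread;
use std::time::Duration;

/// Run a producer that would send up to `max_messages`, collect `needed`
/// of them, then stop the producer instead of waiting for it to finish.
fn receive_then_stop(needed: usize, max_messages: usize) -> Vec<usize> {
    let (tx, rx) = mpsc::channel();
    let stop = Arc::new(AtomicBool::new(false));
    let stop_flag = Arc::clone(&stop);

    let publisher = thread::spawn(move || {
        for i in 0..max_messages {
            // Exit early once the consumer signals it is done.
            if stop_flag.load(Ordering::Relaxed) || tx.send(i).is_err() {
                break;
            }
            thread::sleep(Duration::from_millis(1));
        }
    });

    // Simulate a consumer that starts late (node creation, schema discovery).
    thread::sleep(Duration::from_millis(5));

    // Blocks until `needed` messages arrive or the producer has gone away.
    let received: Vec<usize> = rx.iter().take(needed).collect();

    // The consumer, not a timer, decides when the producer ends.
    stop.store(true, Ordering::Relaxed);
    publisher.join().expect("publisher thread panicked");
    received
}
```

The producer's lifetime is bounded by the consumer's progress rather than by a fixed message count, so a slow start on the consumer side no longer races with the producer shutting down.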

**Testing:**
- Tested 10 runs with -j 4: 10/10 passes (~2s per run)
- Jazzy: 28/28 tests pass with -j 4
- Humble: 19/19 tests pass with -j 4
- Fixes CI failure scenario

… description

Added comprehensive integration tests for ros-z-console to verify dynamic
subscriber functionality with automatic type description discovery.

Test coverage:
- Dynamic subscription to std_msgs/String with schema discovery
- Complex message types (sensor_msgs/LaserScan) with nested structures
- Multiple simultaneous topic subscriptions
- Type hash verification (RIHS01 format)
- Message decoding with JSON output validation

Test infrastructure:
- TestRouter for isolated Zenoh router per test (unique ports)
- ProcessGuard for automatic cleanup of background processes
- Helper to spawn ros2 topic pub with rmw_zenoh_cpp
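
One way to give every test its own router port is a process-wide counter. The commit uses once_cell for port allocation; the sketch below swaps in a standard-library atomic to stay dependency-free, and the base port and function names are hypothetical.

```rust
use std::sync::atomic::{AtomicU16, Ordering};

/// Hypothetical base port, chosen to stay clear of Zenoh's default 7447.
static NEXT_PORT: AtomicU16 = AtomicU16::new(17447);

/// Hand out a port that no other test in this process has been given.
fn next_test_port() -> u16 {
    NEXT_PORT.fetch_add(1, Ordering::SeqCst)
}

/// Build a Zenoh-style TCP endpoint string for a locally bound router.
fn router_endpoint(port: u16) -> String {
    format!("tcp/127.0.0.1:{}", port)
}
```

A counter like this only guarantees uniqueness within one test binary. Runners that execute each test in its own process would need a different scheme, such as deriving the port from the process ID or binding to port 0.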

Tests verify that ros-z-console correctly:
- Uses node.create_dyn_sub_auto() for schema discovery
- Requests type descriptions from publishers
- Decodes messages using discovered schemas
- Outputs topic_subscribed and message_received events

Dependencies added:
- once_cell for port allocation
- nix for process signal handling
- serial_test for sequential execution

Applied feature gating pattern from ros-z-tests to ros-z-console:
- Added ros-interop feature flag to Cargo.toml
- Added #![cfg(feature = "ros-interop")] to test file
- Updated CI workflow to run ros-z-console tests with feature flag
- Fixed type name assertions (std_msgs/msg/String not ::)

Tests now properly gated behind ros-interop feature:
- Without feature: 0 tests run (properly excluded)
- With feature: 3 tests run and pass

CI integration:
- ros-z-console tests run alongside ros-z-tests in test.yml
- Runs on both Humble and Jazzy distros
- Uses cargo nextest with --features ros-interop

Tests were interfering when run in parallel by nextest in CI:
- All tests used the same topic names (/chatter, /scan)
- Same ROS 2 domain ID (0) caused cross-test message contamination
- test_dynamic_subscriber_std_msgs_string received "multi-topic test"
  instead of "hello from ros2" from the multi-topic test

Solution:
- Use unique topic names per test to prevent interference
- test_dynamic_subscriber_std_msgs_string: /chatter_string_test
- test_dynamic_subscriber_sensor_msgs_laser_scan: /scan_laserscan_test
- test_dynamic_subscriber_multiple_topics: /chatter_multi_test, /scan_multi_test

All tests now pass when run in parallel with nextest.

Fixes CI failure in Interop Tests workflow on Jazzy.

Type description service is not supported in ROS 2 Humble.
Tests were timing out after 60s in CI on Humble distro.

Changes:
- Add humble feature to ros-z-console Cargo.toml
- Add build.rs to set ros_humble cfg when humble feature enabled
- Add #![cfg(not(ros_humble))] to dynamic_subscriber_test.rs
- Update CI workflow to pass humble feature when testing on Humble
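
The build script step can be sketched as below. It relies on two documented Cargo behaviors: build scripts see an enabled feature `humble` as the environment variable `CARGO_FEATURE_HUMBLE`, and printing `cargo:rustc-cfg=NAME` turns on `cfg(NAME)` for the crate. The helper names are hypothetical, and the actual build.rs in the PR may detect Humble differently.

```rust
use std::env;

/// Directives the build script should print, given whether the
/// `humble` feature is enabled.
fn humble_directives(humble_enabled: bool) -> Vec<String> {
    if humble_enabled {
        // Makes `#[cfg(ros_humble)]` and `#![cfg(not(ros_humble))]` usable.
        vec!["cargo:rustc-cfg=ros_humble".to_string()]
    } else {
        Vec::new()
    }
}

/// What a build.rs `main` would call: read the feature flag that Cargo
/// exports to build scripts and print the resulting directives.
fn emit_humble_cfg() {
    let enabled = env::var_os("CARGO_FEATURE_HUMBLE").is_some();
    for line in humble_directives(enabled) {
        println!("{}", line);
    }
}
```

With the cfg in place, putting `#![cfg(not(ros_humble))]` at the top of a test file removes it from compilation entirely on Humble, so the tests are skipped rather than left to time out.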

Tests now properly excluded on Humble:
- test_dynamic_subscriber_std_msgs_string
- test_dynamic_subscriber_sensor_msgs_laser_scan
- test_dynamic_subscriber_multiple_topics

Fixes timeout failures in Interop Tests (Humble) workflow.

@YuanYuYuan YuanYuYuan merged commit a39b145 into main Feb 13, 2026
17 checks passed
@YuanYuYuan YuanYuYuan deleted the dev/type-description branch February 13, 2026 05:22