fix(relayer): improve merkle tree sync error logging#401
Closed
…yz#7018) Co-authored-by: Danil Nemirovsky <4614623+ameten@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Danil Nemirovsky <4614623+ameten@users.noreply.github.com>
Co-authored-by: Danil Nemirovsky <4614623+ameten@users.noreply.github.com>
Co-authored-by: Danil Nemirovsky <4614623+ameten@users.noreply.github.com>
Signed-off-by: pbio <10051819+paulbalaji@users.noreply.github.com>
…os (hyperlane-xyz#7047) Co-authored-by: Danil Nemirovsky <4614623+ameten@users.noreply.github.com>
…xyz#7055) Co-authored-by: Danil Nemirovsky <4614623+ameten@users.noreply.github.com>
Fix linter issues in fork-specific Rust code:

- Use std::io::Error::other() for simpler error construction
- Use unwrap_or_default() instead of unwrap_or(Type::zero())
- Replace unwrap() with expect() with descriptive messages
- Add missing documentation for public modules and fields
- Remove redundant closures in error mapping
- Use From::from for infallible conversions
- Add tokio "tracing" feature to Cargo.toml

Note: There is a remaining compilation issue with tokio::task::Builder API that prevents full compilation. This appears to be a pre-existing issue that needs separate investigation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
…372)

* claude: feat(kaspa): verify validator signatures before counting toward threshold

Add signature verification for deposit signatures to prevent misconfigured validators from causing bridge transaction failures.

Changes:

- Add validator_ism_addresses field to RelayerStuff config
- Refactor collect_with_threshold to accept optional validation function
- Add signature verification in get_deposit_sigs that checks recovered signer against expected ISM address from config
- Only count verified signatures toward threshold

This ensures that even if a validator returns a signature signed with the wrong key, the relayer will reject it and continue waiting for valid signatures.

Fixes #129

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* claude: refactor: remove verbose debug log for successful signature verification

* claude: feat(kaspa): parse validatorIsmAddresses from config

* format and remove debug

* claude: feat(relayer): make validator signature verification optional

Make signature verification conditional on validatorIsmAddresses being populated. If the list is empty, skip verification entirely. This allows gradual rollout and testing before enforcement.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
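To make the threshold behaviour described above concrete, here is a minimal sketch of a collector that takes an optional validation function. It is not the relayer's actual code: the `SigResponse` type, its fields, and this signature of `collect_with_threshold` are assumptions for illustration, and signer recovery is assumed to have happened already.

```rust
/// Hypothetical stand-in for one validator's signature response.
/// `signer` is assumed to be the address already recovered from the signature.
#[derive(Debug, Clone, PartialEq)]
pub struct SigResponse {
    pub validator: String,
    pub signer: String,
}

/// Collect responses until `threshold` of them have been accepted.
/// When `validate` is `None`, verification is skipped and every response counts.
/// Returns `None` if the responses run out before the threshold is reached.
pub fn collect_with_threshold<T, F>(
    responses: impl IntoIterator<Item = T>,
    threshold: usize,
    validate: Option<F>,
) -> Option<Vec<T>>
where
    F: Fn(&T) -> bool,
{
    if threshold == 0 {
        return Some(Vec::new());
    }
    let mut accepted = Vec::with_capacity(threshold);
    for response in responses {
        // A missing validator function means "accept everything".
        let ok = validate.as_ref().map_or(true, |check| check(&response));
        if ok {
            accepted.push(response);
            if accepted.len() >= threshold {
                return Some(accepted);
            }
        }
    }
    None
}
```

With a validator function supplied, a response signed by an unexpected key is simply skipped and collection continues; passing `None` models the case where `validatorIsmAddresses` is empty and verification is skipped.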
#374)

Downgrade repetitive per-validator logs from info to debug level to prevent log spam in production:

- base_builder.rs:228: validator storage locations (logs per validator)
- multisig.rs:68: successful validator index returns (logs per validator)

Also removes verbose checkpoint_syncers field from success log since the count provides sufficient information.

These logs fire for every validator on every message, causing excessive verbosity at info level. Debug level is more appropriate for this detailed diagnostic information.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
… log size (#375)

The Transaction struct's derived Debug implementation printed all tx_hashes, which can contain hundreds of 64-char hashes. This caused single log lines to exceed 40KB.

Replace derived Debug with a custom implementation that shows only essential fields (uuid, tx_hashes_count, status, submission_attempts).

Closes dymensionxyz/hyperlane-deployments#134

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
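A minimal sketch of the kind of custom Debug implementation this commit describes. The struct below is a simplified stand-in: the field types are assumptions, and the real Transaction struct likely carries more fields than the four named in the commit message.

```rust
use std::fmt;

/// Simplified, hypothetical version of the Transaction struct.
pub struct Transaction {
    pub uuid: String,
    pub tx_hashes: Vec<String>,
    pub status: String,
    pub submission_attempts: u32,
}

impl fmt::Debug for Transaction {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Print only the number of hashes instead of every 64-char hash.
        f.debug_struct("Transaction")
            .field("uuid", &self.uuid)
            .field("tx_hashes_count", &self.tx_hashes.len())
            .field("status", &self.status)
            .field("submission_attempts", &self.submission_attempts)
            .finish()
    }
}
```

Formatting a value with `{:?}` then yields a line whose length no longer grows with the number of hashes.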
Retries are expected behavior in RPC clients, not warnings. Change the log level from warn to debug to reduce log noise in production while maintaining debuggability via RUST_LOG configuration. Closes dymensionxyz/hyperlane-deployments#134 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude <noreply@anthropic.com>
…debug (#377) Rate limiting is expected when using public RPC providers. Change log level from info to debug for both JsonRpcError and SerdeJson rate limit detection paths to reduce log noise while maintaining debuggability. Closes dymensionxyz/hyperlane-deployments#134 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude <noreply@anthropic.com>
#378)

* progress checkpoint: code might be broken

* claude: feat(relayer): improve logging for message submission and confirmation

Add structured logging to help diagnose Base and BSC transaction issues:

- debug log before submitting process transaction (message_id, gas_limit, destination)
- info log after successful tx submission (message_id, tx_id, gas_used)
- error log with full context on submission failure
- debug log when checking delivery status
- warn log with has_tx_outcome field when reverted/reorged to identify missing broadcasts

These logs help diagnose:

- Whether transactions are being broadcast at all
- What gas limits are being used
- Whether the issue is at submission or confirmation stage

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
Cap the maximum backoff time for message retries to 15 minutes. This ensures stuck messages are retried promptly rather than waiting hours between attempts. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude <noreply@anthropic.com>
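As an illustration of the cap, a sketch of an exponential backoff clamped at 15 minutes. Only the 15-minute ceiling comes from the commit message; the 5-second base, the doubling schedule, and the name `retry_backoff` are assumptions, and the relayer's real schedule may differ.

```rust
use std::time::Duration;

/// Ceiling taken from the commit message: 15 minutes.
pub const MAX_BACKOFF: Duration = Duration::from_secs(15 * 60);

/// Hypothetical backoff: 5s doubled per attempt, never exceeding MAX_BACKOFF.
pub fn retry_backoff(attempt: u32) -> Duration {
    const BASE_SECS: u64 = 5; // assumed base delay, not from the source
    // checked_pow / saturating_mul keep large attempt counts from overflowing.
    let factor = 2u64.checked_pow(attempt).unwrap_or(u64::MAX);
    let uncapped = Duration::from_secs(BASE_SECS.saturating_mul(factor));
    uncapped.min(MAX_BACKOFF)
}
```

Under these assumed parameters the delay reaches the ceiling at the eighth retry and stays there, so a stuck message is retried at least every 15 minutes.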
…nd_threshold (#383)

The CosmosNativeIsm's validators_and_threshold() function only handled MerkleRootMultisigIsm but not MessageIdMultisigIsm, even though module_type() correctly recognizes both ISM types.

This caused relayer to fail with "ISM not a multi sig ism" error when processing messages on Cosmos chains configured with MessageIdMultisigIsm.

Add handling for MessageIdMultisigIsm in validators_and_threshold() to match the existing MerkleRootMultisigIsm case.

Fixes: https://github.com/dymensionxyz/hyperlane-deployments/issues/141

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
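The shape of the fix can be sketched as follows. The `Ism` enum and its fields are hypothetical simplifications, not the actual Cosmos types; only the two ISM names and the error string come from the commit message.

```rust
/// Hypothetical, simplified ISM representation.
#[derive(Debug, Clone, PartialEq)]
pub enum Ism {
    MerkleRootMultisig { validators: Vec<String>, threshold: u8 },
    MessageIdMultisig { validators: Vec<String>, threshold: u8 },
    /// Any non-multisig ISM type.
    Other,
}

/// Return the validator set and threshold for either multisig variant.
pub fn validators_and_threshold(ism: &Ism) -> Result<(Vec<String>, u8), String> {
    match ism {
        // Both multisig variants carry the same data, so one arm covers them.
        Ism::MerkleRootMultisig { validators, threshold }
        | Ism::MessageIdMultisig { validators, threshold } => {
            Ok((validators.clone(), *threshold))
        }
        Ism::Other => Err("ISM not a multi sig ism".to_string()),
    }
}
```

Before the fix, the equivalent of the `MessageIdMultisig` arm was missing, so such ISMs fell through to the error case.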
…381) The hardcode crate has hyperlane-core listed as a dependency but doesn't actually use it anywhere. This change removes the dependency to reduce coupling between libs/kaspa and rust/main. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude <noreply@anthropic.com>
Move Hyperlane domain ID constants from libs/kaspa/lib/hardcode/src/hl.rs to a new dedicated crate rust/main/chains/dymension-kaspa-hl-constants. This refactoring: - Makes the hardcode crate purely Kaspa-focused - Places HL integration constants in the rust/main workspace - Breaks circular dependencies by creating a minimal constants crate - Updates all imports across both workspaces The new dymension-kaspa-hl-constants crate has no dependencies and is imported by both dymension-kaspa and the libs/kaspa crates (core, validator). Related to issue #140 in hyperlane-deployments repo. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude <noreply@anthropic.com>
Deduplicates kaspa-* dependencies by moving them from dymension-kaspa crate to rust/main workspace dependencies. This eliminates duplication and reduces maintenance burden while keeping the same git rev (9ff5d0f). Note: The libs/kaspa workspace keeps its own definitions as Cargo doesn't support cross-workspace dependency inheritance. Related to dymensionxyz/hyperlane-deployments#140 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude <noreply@anthropic.com>
…sion-kaspa (#386) Re-export dym_kas_core::message as dymension_kaspa::hl_message to create semantic separation between pure Kaspa libs and Hyperlane integration layer. - Add pub use dym_kas_core::message as hl_message in dymension-kaspa lib.rs - Update imports in rust/main to use dymension_kaspa::hl_message instead of dym_kas_core::message - Files in dymension/libs/kaspa continue using corelib::message (no circular deps) - Both workspaces build successfully This provides clear naming that indicates Hyperlane-specific functionality while avoiding circular dependencies between workspaces. Related to dymensionxyz/hyperlane-deployments#140 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude <noreply@anthropic.com>
… bridge crate (#389)

This PR splits the core module into two parts:

- Pure Kaspa functionality stays in libs/kaspa/lib/core
- Bridge/HL-dependent logic moves to new libs/kaspa/lib/bridge crate

The new bridge crate contains:

- message.rs - HL message parsing
- deposit.rs - DepositFXG struct
- withdraw.rs - WithdrawFXG struct
- payload.rs - MessageIDs encoding
- confirmation.rs - ConfirmationFXG struct
- util.rs - Address<->H256 conversion
- user/ - User deposit/payload utilities

After this change:

- libs/kaspa/lib/core has ZERO hyperlane dependencies
- libs/kaspa/lib/bridge depends on both core (pure Kaspa) and HL
- relayer/validator/tooling updated to use bridge crate
- dymension-kaspa re-exports bridge as kas_bridge

This completes Phase 2 of dependency inversion.

Refs: #140

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
PR #389 moved confirmation and deposit modules from core to the new bridge crate. This fix updates hyperlane-base to import from the correct location and adds the bridge dependency. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude <noreply@anthropic.com>
…main (#387) - Move relayer source files to rust/main/chains/dymension-kaspa/src/kas_relayer/ - Update imports to use dym_kas_core, dym_kas_bridge, dym_kas_hardcode - Remove relayer from libs/kaspa workspace - Update hyperlane-base to use dymension_kaspa::kas_relayer This reduces the HL dependencies in libs/kaspa, working towards the goal of keeping only pure Kaspa code there. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude <noreply@anthropic.com>
…t/main (#388) - Move validator module from libs/kaspa/lib/validator to rust/main/chains/dymension-kaspa/src/kas_validator - Update imports: crate::error -> crate::kas_validator::error, corelib -> dym_kas_core, bridge -> dym_kas_bridge, api_rs -> dym_kas_api, secp256k1 -> kaspa_bip32::secp256k1 - Update dymension-kaspa lib.rs to expose kas_validator module - Update validator_server.rs to use local kas_validator imports - Remove dym-kas-validator dep from dymension-kaspa and hyperlane-base Cargo.toml - Update libs/kaspa kms crate: remove validator dep, define KaspaSecpKeypair locally - Update libs/kaspa tooling: depend on dymension-kaspa for signer module, add ethers feature to hyperlane-core - Remove validator from libs/kaspa workspace members 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude <noreply@anthropic.com>
…ain/dymension-kaspa (#391) Move the bridge crate from libs/kaspa/lib/bridge to rust/main/chains/dymension-kaspa/src/kas_bridge. This continues the refactoring effort to separate pure Kaspa code (libs/kaspa) from Hyperlane integration code (rust/main). The bridge module contains HL-dependent types for deposit/withdraw/confirmation payloads, message parsing, and user-facing deposit functionality. Changes: - Move bridge module files to dymension-kaspa as kas_bridge - Update imports from prost to hyperlane_cosmos_rs::prost - Update dym_kas_bridge:: references to crate::kas_bridge:: - Update secp256k1 imports to kaspa_bip32::secp256k1 - Add workflow-core to workspace dependencies - Remove bridge from libs/kaspa workspace members - Update tooling to use dymension_kaspa::kas_bridge:: 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude <noreply@anthropic.com>
* update docs * restore `"accountAddressType": "Bitcoin"`
* claude: fix(dymension-kaspa): add ethers feature for signature recovery Enable the ethers feature on hyperlane-core dependency to make the `recover()` method available on SignedType<T>. Also clean up unused imports and variables in validators.rs, and fix import ordering in libs/kaspa/tooling. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * claude: refactor(kaspa): extract PSKT utilities to dym_kas_core Phase 2A of Kaspa refactoring - move pure Kaspa PSKT utilities from rust/main to libs/kaspa/lib/core: - input_sighash_type() and is_valid_sighash_type() for standard sighash - PopulatedInput type alias and PopulatedInputBuilder struct - utxo_reference_from_populated_input() for mass calculation - estimate_mass() for transaction mass estimation These are pure Kaspa utilities with no Hyperlane dependencies, making libs/kaspa/lib/core more self-contained for PSKT operations. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * claude: chore(kaspa): remove dead code and unnecessary re-exports Clean up after PSKT utilities extraction: - Remove dead populated_input.rs module (only contained re-exports) - Remove unnecessary re-exports in sweep.rs and hub_to_kaspa.rs - Update imports to use dym_kas_core::pskt directly - Rename test_foo to test_recipient_address_roundtrip 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
* claude: refactor(kaspa): move tooling to rust/main/utils/kaspa-tools Move the kaspa-tools crate from libs/kaspa/tooling to rust/main/utils/kaspa-tools since it has Hyperlane dependencies and doesn't belong in the pure Kaspa library. - Add kaspa-tools to rust/main workspace - Update import paths (corelib -> dym_kas_core) - Add secp256k1 to workspace dependencies 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * claude: refactor(kaspa): remove tooling and HL deps from libs/kaspa Remove tooling crate from libs/kaspa workspace and remove all Hyperlane dependencies to keep libs/kaspa as a pure Kaspa library. - Remove tooling from workspace members - Remove hyperlane-cosmos-rs dependency - Remove cometbft workspace dependencies - Clean up comment artifacts in client.rs - Regenerate Cargo.lock without HL packages After this change, libs/kaspa has only one hyperlane mention remaining: a cross-reference URL in CONTRIBUTING.md (acceptable). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Update documentation references to use the new kaspa-tools location: - libs/kaspa/tooling -> rust/main (cargo run -p kaspa-tools --) Affected files: - dymension/validators/bridge/README_bridge_kaspa.md - dymension/tests/kaspa_hub_test_kas/commands.sh 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
…395) * claude: refactor(kaspa): collapse hl-constants into dymension-kaspa Remove the separate dymension-kaspa-hl-constants crate and move all constants into dymension-kaspa/src/consts.rs. - Merge domain ID constants into consts.rs - Update all internal imports to use crate::consts - Maintain backward compatibility via `pub use consts as hl_domains` - Remove dymension-kaspa-hl-constants from workspace 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * claude: refactor(kaspa): move validator_server into kas_validator module Move validator_server.rs to kas_validator/server.rs for better organization. - Rename validator_server.rs -> kas_validator/server.rs - Update imports to use super:: for sibling modules - Re-export from kas_validator module 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * claude: refactor(kaspa): rename kas_bridge to bridge Rename kas_bridge module to simply 'bridge' - clearer semantic naming that describes the module's purpose (bridge operation data types: DepositFXG, WithdrawFXG, ConfirmationFXG, message parsing). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * claude: refactor(kaspa): rename bridge to ops The name 'bridge' was too generic since the entire project is about a bridge. The module contains operation types (DepositFXG, WithdrawFXG, ConfirmationFXG) - the wire formats exchanged between validators and relayers. 'ops' (operations) better describes its purpose. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
- Delete prometheus/ module - was copy-pasted from hyperlane-cosmos but never used (dead code)
- Remove unused Cargo.toml dependencies: ripemd, once_cell, itertools, protobuf, pin-project
- Remove dangling mod withdraw_test reference in kas_validator/mod.rs
- Remove unused KEY_MESSAGE_IDS constant from libs/kaspa/core

The prometheus module contained MetricsChannel/MetricsChannelFuture for instrumenting gRPC clients, but these were never wired up. The identical code already exists in hyperlane-cosmos if needed.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…#399) - Remove unused imports from libs/kaspa/lib/core/src/balance.rs (kaspa_wallet_core::prelude::*, std::sync::Arc) - Prefix unused parameter with underscore in client.rs (_domain_kas) - Remove #![allow(unused)] from ops/user/deposit.rs and clean up all unused imports in that file 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Improve logging clarity when relayer fails to build metadata due to merkle tree sync issues. This makes it much more obvious when the relayer cannot relay messages because the origin chain merkle tree is not synced (typically due to RPC connectivity issues).

Changes:

- Upgrade log level from info to error when merkle tree is empty
- Add detailed error messages explaining common causes
- Include origin/destination domain context in error messages
- Add debug logging when tree count is zero

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Author
Recreating with correct base branch (main-dym)
Summary
- Upgrade log level from `info` to `error` when merkle tree is empty/not synced, making RPC connectivity issues immediately obvious
- … `highest_known_leaf_index()`

Context
When the relayer cannot reach the origin chain RPC, the local merkle tree sync fails silently, resulting in cryptic "Unable to reach quorum" errors. This PR makes it immediately clear that the root cause is merkle tree sync failure.
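A rough sketch of the diagnostic this PR describes, using only the standard library. The function name, the `LogLevel` enum, and the message wording are assumptions; the actual change presumably emits the message through the relayer's logging macros rather than returning it.

```rust
/// Hypothetical log severity, standing in for the logging framework's levels.
#[derive(Debug, PartialEq)]
pub enum LogLevel {
    Error,
}

/// Build an error-level diagnostic when the local merkle tree has no leaves.
/// Returns `None` when the tree has at least one leaf, i.e. nothing to report.
pub fn diagnose_tree_sync(
    tree_count: u32,
    origin: &str,
    destination: &str,
) -> Option<(LogLevel, String)> {
    if tree_count > 0 {
        return None;
    }
    // Name the root cause and both domains so the failure is not mistaken
    // for a generic quorum problem.
    let message = format!(
        "Merkle tree for origin '{origin}' is empty or not synced; cannot build \
         metadata for destination '{destination}'. Common cause: origin chain RPC \
         is unreachable."
    );
    Some((LogLevel::Error, message))
}
```

The point of the design is that the log line names the root cause (an unsynced tree, typically from RPC connectivity) together with the affected domains, instead of surfacing only as "Unable to reach quorum".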
Test plan
🤖 Generated with Claude Code