
Multiple horcrux signers can result in double proposing #6331

@evan-forbes


A validator on mocha was observed double proposing, with one of its nodes constantly falling behind. It's unclear why this occurred, and afaiu it did not occur on all horcrux signers on mocha.

here are some interesting log snippets:

2-18T10:16:37+00:00 host-173-201-36-182.example.com celestia-appd[1663341]: 10:16AM ERR failed to process message err="error invalid proposal signature" height=9269939 module=consensus msg_type=*consensus.ProposalMessage peer= round=0
2025-12-18T10:16:37+00:00 host-173-201-36-182.example.com celestia-appd[1663341]: 10:16AM INF received proposal module=consensus proposal={"Type":32,"block_id":{"hash":"2545B23F528815902906FAE0C5D4AEA6647144D25A331C07D18E1489257222B9","parts":{"hash":"94FCCBCD0FE3C97F453850B9E2747567D060CDEDE181423A55F25CC9A6259B30","total":118}},"height":9269939,"pol_round":-1,"round":0,"signature":"LKeEeKVmADxIP16f3R3BUgR1mtzPZnX3zNOVp5JZ39Pfsmn/qVwMhB+e6cYGPYs7e21Hlz94ceDQUobnzfB8DQ==","timestamp":"2025-12-18T10:16:37.299188041Z"} proposer=C822706DA92B375B6793FEC5FCAD04BB5AFE142A

notice that peer is empty (`peer=`) and the proposer is that validator's own address, which means the node rejected its own proposal signature as invalid
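A rough sketch of how one could scan logs for this pattern. It assumes, as the note above does, that an empty `peer=` field marks a message the node produced itself. The helper names (`field`, `is_self_rejected_proposal`) are hypothetical, and the second sample line is trimmed from the snippet above with the proposal JSON replaced by `{...}`.

```python
import re

def field(line, key):
    """Return the value of a key=value log field, or None if the key is absent.

    Handles both quoted values (err="...") and bare values (peer=, height=123).
    """
    m = re.search(r'(?:^|\s)' + re.escape(key) + r'=("([^"]*)"|\S*)', line)
    if m is None:
        return None
    return m.group(2) if m.group(2) is not None else m.group(1)

def is_self_rejected_proposal(line):
    """True if the line reports an invalid proposal signature with an empty peer."""
    err = field(line, "err") or ""
    return (
        "invalid proposal signature" in err
        and field(line, "msg_type") == "*consensus.ProposalMessage"
        and field(line, "peer") == ""  # assumption: empty peer means internal message
    )

# Sample lines taken from the log snippets above (second one trimmed).
ERR_LINE = (
    'celestia-appd[1663341]: 10:16AM ERR failed to process message '
    'err="error invalid proposal signature" height=9269939 module=consensus '
    'msg_type=*consensus.ProposalMessage peer= round=0'
)
INFO_LINE = (
    'celestia-appd[1663341]: 10:16AM INF received proposal module=consensus '
    'proposal={...} proposer=C822706DA92B375B6793FEC5FCAD04BB5AFE142A'
)

print(is_self_rejected_proposal(ERR_LINE))
print(field(INFO_LINE, "proposer"))
```

Comparing the `proposer` value against the validator's own address then confirms that the rejected proposal was the node's own.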

we also see invalid part proofs: the part proofs verify against a different root than the expected one. This likely means the node is attempting to add parts from the second proposal.
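To illustrate why a part from a different proposal would fail verification, here is a minimal Merkle proof sketch. The hashing scheme (SHA-256 with RFC 6962-style leaf and inner prefixes) and all function names are assumptions for illustration; this is not celestia-core's actual part set code, and the part contents are made up.

```python
import hashlib

def leaf_hash(data):
    return hashlib.sha256(b"\x00" + data).digest()

def inner_hash(left, right):
    return hashlib.sha256(b"\x01" + left + right).digest()

def split_point(n):
    """Largest power of two strictly less than n (for n >= 2)."""
    k = 1
    while k * 2 < n:
        k *= 2
    return k

def merkle_root(leaves):
    if len(leaves) == 1:
        return leaf_hash(leaves[0])
    k = split_point(len(leaves))
    return inner_hash(merkle_root(leaves[:k]), merkle_root(leaves[k:]))

def merkle_proof(leaves, index):
    """Sibling hashes for leaves[index], ordered from the leaf up to the root."""
    if len(leaves) == 1:
        return []
    k = split_point(len(leaves))
    if index < k:
        return merkle_proof(leaves[:k], index) + [merkle_root(leaves[k:])]
    return merkle_proof(leaves[k:], index - k) + [merkle_root(leaves[:k])]

def verify(root, leaf, index, total, proof):
    """Recompute the root from the leaf and its proof, then compare."""
    def compute(index, total, proof):
        if total == 1:
            return leaf_hash(leaf) if not proof else None
        if not proof:
            return None
        k = split_point(total)
        if index < k:
            left = compute(index, k, proof[:-1])
            return None if left is None else inner_hash(left, proof[-1])
        right = compute(index - k, total - k, proof[:-1])
        return None if right is None else inner_hash(proof[-1], right)
    return compute(index, total, proof) == root

# Two proposals at the same height and round, with different (made-up) parts.
parts_a = [b"proposal-A part %d" % i for i in range(5)]
parts_b = [b"proposal-B part %d" % i for i in range(5)]
root_a, root_b = merkle_root(parts_a), merkle_root(parts_b)

proof_b1 = merkle_proof(parts_b, 1)
print(verify(root_b, parts_b[1], 1, len(parts_b), proof_b1))  # checked against its own root
print(verify(root_a, parts_b[1], 1, len(parts_a), proof_b1))  # checked against the other root
```

A part that is perfectly valid for the second proposal still fails against the first proposal's parts root, which would match the invalid part proof errors if parts from both proposals are in flight.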

after the issue starts we normally see:

2025-12-18T10:21:24+00:00 host-173-201-36-182.example.com celestia-appd[1671334]: 10:21AM INF commit is for a block we do not know about; set [1671334]: 10:21AM INF added commitment height=9269984 module=propagation round=0
2025-12-18T10:21:24+00:00 host-173-201-36-182.example.com celestia-appd[1671334]: 10:21AM INF received complete proposal block hash=8C1D806564C43386E255E10ED8539AFE3E6B9F824D2E25CC574065947E7CCFB4 height=9269984 module=consensus
2025-12-18T10:21:24+00:00 host-173-201-36-182.example.com celestia-appd[1671334]: 10:21AM INF finalizing commit of block hash={} height=9269984 ...
2025-12-18T10:21:24+00:00 host-173-201-36-182.example.com celestia-appd[1671334]: 10:21AM INF finalized block ... height=9269984

followed by this 5-6s later

2025-12-18T10:21:30+00:00 host-173-201-36-182.example.com celestia-appd[1671334]: 10:21AM ERR failed signing vote err="signerEndpoint returned error #0: error saving last sign state initiated: [mocha-4] Progress already started on block 9269987.0.2, skipping 9269985.0.3" height=9269985 module=consensus ... "validator_address":"C822706DA92B375B6793FEC5FCAD04BB5AFE142A","validator_index":5}

this is a bit interesting since the timeout for endpoint signers is hardcoded to 5s, which lines up with the 5-6s gap
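The "Progress already started on block ..., skipping ..." error above reads like a high-water-mark check on the signer's last sign state. Below is a minimal sketch of such a check, assuming the `9269987.0.2` notation means height.round.step and that a request at or below the last recorded position is refused. The `HRS` and `SignState` names are hypothetical and this is not horcrux's actual implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True, order=True)
class HRS:
    """Height/round/step position; ordering compares fields in that order."""
    height: int
    round: int
    step: int

    @classmethod
    def parse(cls, text):
        h, r, s = (int(x) for x in text.split("."))
        return cls(h, r, s)

    def __str__(self):
        return f"{self.height}.{self.round}.{self.step}"

class SignState:
    """Tracks the highest position for which signing has already started."""

    def __init__(self):
        self.last = None

    def begin(self, req):
        # Refuse anything at or behind the watermark so the same
        # height/round/step is never signed twice.
        if self.last is not None and req <= self.last:
            raise ValueError(
                f"Progress already started on block {self.last}, skipping {req}"
            )
        self.last = req

state = SignState()
state.begin(HRS.parse("9269987.0.2"))      # signer has already moved on to 9269987
try:
    state.begin(HRS.parse("9269985.0.3"))  # lagging node asks for a vote at 9269985
except ValueError as e:
    print(e)
```

Under that reading, the signer's state was already two heights ahead of the node making the request, which would be consistent with this node falling behind while another node drives the signer forward.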

Labels: WS: Maintenance, WS: V6, bug

Status: Done