Contributor

@nfrisby nfrisby commented Jul 28, 2025

Add a new Markdown document explaining and motivating opcert issue numbers.

@nfrisby
Contributor Author

nfrisby commented Jul 28, 2025

Member

@amesgen amesgen left a comment

Looks great!

- The protocol state maintains a mapping from X to Z, the OCIN of the youngest header issued by X on this chain.
The protocol rules allow exactly two identities to issue a header extending some chain when X is elected, either Y=Z or Y=Z+1.
- The Cardano tiebreaker rules favor the greater OCIN when comparing two headers from X in the same slot.
Without this preference in the tiebreaker, whether X is able to increment their OCIN on the honest chain is merely a network race between the adversary's header/block and the SPO's header/block.
Member

It might be worth mentioning (maybe in a footnote) that the VRF tiebreaker can't help here because it is determined by {ColdKey} and the slot, both of which are identical here.
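
The two rules quoted in this thread can be sketched in a few lines of Python. This is only an illustration, not the node's implementation: the `Header` type, the function names, and the choice to treat a ColdKey with no entry in the map as Z=0 are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Header:
    cold_key: str  # X: hypothetical string stand-in for the issuer's ColdKey
    ocin: int      # Y: the opcert issue number carried by the header
    slot: int


def ocin_is_valid(z_map: Dict[str, int], header: Header) -> bool:
    """Header validity: Y must be Z or Z+1, where Z is the OCIN of the
    youngest header issued by X on this chain (assumed 0 if X is unseen)."""
    z = z_map.get(header.cold_key, 0)
    return header.ocin in (z, z + 1)


def prefer_in_tiebreak(a: Header, b: Header) -> Optional[Header]:
    """Tiebreaker: between two headers from the same X in the same slot,
    favor the greater OCIN. Returns None when this rule does not decide."""
    if a.cold_key != b.cold_key or a.slot != b.slot or a.ocin == b.ocin:
        return None
    return a if a.ocin > b.ocin else b
```

With Z=1 recorded for X, headers carrying Y=1 or Y=2 pass `ocin_is_valid`, while Y=0 and Y=3 do not. Given an adversary's header with Y=1 and the SPO's header with Y=2 in the same slot, `prefer_in_tiebreak` returns the SPO's header, so the outcome no longer depends on which one propagates faster.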

Comment on lines +52 to +53
If Cardano did not involve OCINs, then an adversary would benefit from an acquired HotKey until its KesPeriod expired, which could be up to 90 days.
On the other hand, an SPO---especially one with a lot of delegated stake---can increment the OCIN much sooner than 90 days after an attack (assuming they notice it).
Member

@amesgen amesgen Jul 28, 2025

Maybe worth mentioning that there are potential alternatives, like submitting the OCIN update via a tx, or (conceptually similar) via a stake pool re-registration certificate (which would need to be enriched). This is also how eg Peras voting (and I presume Leios voting) will handle this scenario.

These approaches generally have longer latency to come into effect, but work even if the adversary is relatively strong and can orphan blocks that the corrupted pool is producing (they just need one nearby election), preventing the on-chain counter from increasing.

Contributor Author

Ah, that was a bad omission on my part, thanks for catching it! It severely trims the potential 90 day period down to 36 hours + tx latency for the cost of a small tx fee---right?

would need to be enriched

I haven't thought it through, but it seems like a minor change. Have you already considered some details? I'll ask Andre.

Member

@amesgen amesgen Jul 29, 2025

It severely trims the potential 90 day period down to 36 hours + tx latency for the cost of a small tx fee---right?

I assume you mean for the purpose of the existing hot keys, so not Peras etc? Not without enriching stake pool registration certificates in some way; currently, there is no way to directly invalidate compromised KES keys with the status quo (which makes sense given that there is the mechanism you describe in this document). Eg you can't change your cold key, that one is permanent (see PoolParams for what is contained in a RegPool :: PoolCert, ppId is only used to identify the pool to be (re-)registered; but you can't "replace" it).

Also, it takes time for stake pool registration certificates to take effect, namely at least two epochs. (Re-registrations are delayed by one more epoch than new registrations, to give a forewarning to delegators; but this seems more relevant for reward-related parameters, so could maybe be changed.) Additionally, there could be problems when throughput is very high (or chain quality is low) and the SPO has difficulties getting the tx included in a block (but at least it doesn't have to be their own block).

One thing one can of course even do today is to register a new VRF key, so the leadership schedule is private again. But the adversary can still extract the VRF proof from a published block and use it for creating their own blocks (and potentially diffuse it faster, assuming good network infrastructure). On the flip side, it means that the adversary can't create blocks if the honest operator doesn't publish any blocks, so they can at least stop the adversary from doing harm this way.

would need to be enriched

I haven't thought it through, but it seems like a minor change. Have you already considered some details? I'll ask Andre.

I guess one could track the issue number in the PoolParams (or a Maybe Hash to blacklist a particular KES key).

For eg Peras, PoolParams would track the public Peras voting key, and the operator would just register a different one.

Contributor Author

Ah, OK. I was mentally thinking that the old cold key signing the new cold key would let us circumvent some of these delays. But that's skipping some practical considerations---eg "opcerts" would be chains of signatures, which is no good. I suppose I was just loosely envisioning a Frankenstein of the existing opcert issue number mechanism.

I'm now thinking that I should have interpreted "enriched" in your original text as "Letting a tx (monoidally) update the opcert issue number map after a forecast window of delay".

Contributor Author

Ah... but that tx would still need to be invalid if it tries to increase (the stable value, regardless of pending txs) by more than one.


The unusual scenario might be visualized as follows, where the chain on the left is the losing fork and the chain on the right is the winning fork.
The unfortunate property here is that X has issued a header with Y=1 _after_ having issued a header with Y=2.
If X had good reason to increment Y to 2, then that means X does not necessarily control the payload of their first subsequent block on the winning fork: the adversary can also issue a block with Y=1 and the header validity rules prevent X from skipping ahead from Z=0 to Y=2 on this chain.
Member

If X had good reason to increment Y to 2

What would be a good reason here? Is it "being corrupted twice within 36h"? (Is that a good reason?)

More generally, is "Incrementing your OCIN twice in a short timeframe is not something you should (need to) do, but if you do, non-obvious stuff can happen?" your intended takeaway from this section?

Contributor Author

@nfrisby nfrisby Jul 28, 2025

Ah, yeah. That's somewhat-colloquial English for "there was a true need for X to increment Y twice". I'll rewrite. Thanks!

Contributor Author

your intended takeaway from this section?

Yeah, that seems like a fair summary.

Member

@dnadales dnadales left a comment

I left some questions. Do you think this section could be added to the "Consensus Protocol" section (soon to be chapter :D) of the reorg PR?

In addition, the names X, Y, Z make the text a bit hard to follow for me. I'd be happy to propose an alternative and push it to this branch 👍


## Syntax

- {FooBar} is the definition of FooBar, useful for navigation via CTRL+F.
Member

We should probably define this in a single place.


## Definition

In the most granular sense, a party producing Cardano blocks is identified by a pair X and Y.
Member

Why not something like:

Suggested change
In the most granular sense, a party producing Cardano blocks is identified by a pair X and Y.
In the most granular sense, a party producing Cardano blocks is identified by a pair (`cold`, `opcert`)

The benefit of identities including an OCIN Y rather than merely being determined by a ColdKey X is that an SPO can increment their OCIN whenever they suspect the adversary could have acquired their current HotKey (eg they discover evidence of unauthorized software running on their block-producing node).
This response will be effective because the header validity and tiebreaker rules involve the OCIN.

- The protocol state maintains a mapping from X to Z, the OCIN of the youngest header issued by X on this chain.
Member

Z is introduced out of the blue. I also wonder if we should use more representative names.

Contributor

X, Y, Z does make it seem like they're a sequence of variables of the same type rather than unrelated

X2 --> A1["(X,0) invalid!"]
```

The fact that Z can only increase along a chain by at most one per header bounds the rate at which X can increment their Z as a chain grows.
Member

Suggested change
The fact that Z can only increase along a chain by at most one per header bounds the rate at which X can increment their Z as a chain grows.
The fact that Z can only increase along a chain by at most one per header, bounds the rate at which X can increment their Z as a chain grows.

Otherwise it might read as "header bounds"

```

The fact that Z can only increase along a chain by at most one per header bounds the rate at which X can increment their Z as a chain grows.
If that rate weren't bounded, an adversary X could generate infinite identities Y whenever they wanted, and the Cardano tiebreaker would favor each new one.
Member

Are we assuming X controls the cold as well? If not, how could it increment the opcert number?

Contributor Author

Typo: should have been "adversarial X".
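
To make the rate bound discussed in this thread concrete, here is a minimal sketch that replays a chain of headers and tracks Z per ColdKey. The `(cold_key, ocin)` tuple encoding, the `replay` name, and the default of Z=0 for an unseen ColdKey are assumptions for illustration only.

```python
def replay(headers, initial=None):
    """Fold (cold_key, ocin) headers from oldest to youngest and return the
    final map from ColdKey to Z. Raises ValueError on the first header whose
    OCIN is neither Z nor Z+1 for its issuer."""
    z_map = dict(initial or {})
    for cold_key, ocin in headers:
        z = z_map.get(cold_key, 0)  # assumption: unseen ColdKey starts at 0
        if ocin not in (z, z + 1):
            raise ValueError(
                f"({cold_key},{ocin}) invalid: expected {z} or {z + 1}"
            )
        z_map[cold_key] = ocin
    return z_map
```

Because each accepted header raises Z by at most one, a chain containing n headers from X can leave X's Z at most n above its starting value. For example, `replay([("X", 0), ("X", 1), ("X", 1), ("X", 2)])` ends with Z=2 for X, whereas a chain that jumps straight to `("X", 2)` or that falls back from 1 to 0 is rejected.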

If X issues a header with an incremented Y+1 that doesn't end up on the honest chain (ie is orphaned), they'll simply issue their next header with the same Y+1 on the winning fork.

On the other hand, if unusual circumstances cause a party X to increment its OCIN multiple times within the same stability window (12 to 36 hr on Cardano, shorter when Peras succeeds), then it's possible that a short fork might discard N>1 of those increments.
In that case, X's next header on the winning fork cannot be its greatest ever OCIN (ie Z on the losing short fork), but rather N-1 less than that.
Member

Suggested change
In that case, X's next header on the winning fork cannot be its greatest ever OCIN (ie Z on the losing short fork), but rather N-1 less than that.
In that case, X's next header on the winning fork cannot be its greatest ever OCIN (ie Z on the losing short fork), but rather N-1 less than that.

Sorry but I don't get this bit, in particular the "N-1 less than that" bit.

Thus there will be an extended interval during which the adversary can abuse X's elections, if X actually did need to increment their OCIN N times.

The unusual scenario might be visualized as follows, where the chain on the left is the losing fork and the chain on the right is the winning fork.
The unfortunate property here is that X has issued a header with Y=1 _after_ having issued a header with Y=2.
Member

Suggested change
The unfortunate property here is that X has issued a header with Y=1 _after_ having issued a header with Y=2.
The unfortunate property here is that X has issued a header with Y=1 _after_ having issued a header with Y=2.

I don't get this bit. Are we talking about issuing a header on different forks? But in any case, how can the same party issue Y=1 after increasing the opcert number?

Contributor Author

They only increased their OCIN on the losing chain, not on the winning chain. Now that they've seen the outcome of the short fork, they're now contributing to the winning chain. They cannot use their latest OCIN, since that's only valid on the losing fork. So they awkwardly have to use one of their previous OCINs that wasn't increased in order to contribute to the best chain they can (as Praos requires), since Z was never incremented (or was incremented fewer times) on the fork that won the contest.
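
A small worked example may help with the "N-1 less than that" wording. It assumes the scenario from the quoted text: Z=0 on the common prefix, two increments by X on the losing fork, and none on the winning fork. The chain encoding, the `current_z` helper, and the other issuer `W` are hypothetical.

```python
def current_z(chain, cold_key):
    """Z: the OCIN of the youngest header issued by cold_key on this chain
    (assumed 0 if cold_key has issued none)."""
    z = 0
    for issuer, ocin in chain:
        if issuer == cold_key:
            z = ocin
    return z


prefix = [("X", 0)]
losing_fork = prefix + [("X", 1), ("X", 2)]              # X incremented twice
winning_fork = prefix + [("W", 0), ("W", 0), ("W", 0)]   # no headers from X

greatest_ever = current_z(losing_fork, "X")                # 2
discarded = greatest_ever - current_z(winning_fork, "X")   # N = 2
next_allowed = current_z(winning_fork, "X") + 1            # at most Z+1 = 1

print(greatest_ever, discarded, next_allowed)
```

Here X's greatest OCIN ever issued is 2, the short fork discarded N=2 increments, and the best X can do on the winning fork is Y=1, which is N-1=1 less than 2. An adversary holding the HotKey for Y=1 can issue a competing header with that same identity, so X needs one more of its own headers on the winning fork before Y=2 becomes valid there.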

In the most common circumstance, this makes no difference to the SPO.
If X issues a header with an incremented Y+1 that doesn't end up on the honest chain (ie is orphaned), they'll simply issue their next header with the same Y+1 on the winning fork.

On the other hand, if unusual circumstances cause a party X to increment its OCIN multiple times within the same stability window (12 to 36 hr on Cardano, shorter when Peras succeeds), then it's possible that a short fork might discard N>1 of those increments.
Contributor

I think this paragraph (or this section more widely?) could do with a little more explanation, or maybe just some rearranging, because it's not immediately obvious why this is true


Different forks might disagree on some ColdKeys' Z.
In the most common circumstance, this makes no difference to the SPO.
If X issues a header with an incremented Y+1 that doesn't end up on the honest chain (ie is orphaned), they'll simply issue their next header with the same Y+1 on the winning fork.
Contributor

"i.e."

Contributor Author

There was an IOHK style guide at some point that specified ie, eg, etc. 🤷 I personally took a liking to it and I've never seen a more recent style guide override it.
