@@ -0,0 +1,81 @@
## Introduction

This document explains and motivates operational certificate issue numbers (aka opcert issue numbers), their relationship with header validity, block tiebreakers, and short forks.

## Syntax

- {FooBar} is the definition of FooBar, useful for navigation via CTRL+F.
Member

We should probably define this in a single place.

- X, Y, and Z are variable names whose scope does not extend beyond this document.

## Definition

In the most granular sense, a party producing Cardano blocks is identified by a pair X and Y.
Member

Why not something like:

Suggested change
In the most granular sense, a party producing Cardano blocks is identified by a pair X and Y.
In the most granular sense, a party producing Cardano blocks is identified by a pair (`cold`, `opcert`)


- {X} is a {ColdKey}; the ledger state includes the public half via the stake pool's registration.
- {Y} is an {OperationalCertificateIssueNumber}; the Cardano header includes one, signed by X.
(The initialism {OCIN} abbreviates OperationalCertificateIssueNumber in this document but is not broadly used outside of it.)

An {OperationalCertificate} (aka opcert) grants a {HotKey} the right to issue blocks on behalf of some X.
An OperationalCertificate is present in each Cardano header and contains the OCIN Y.
The ColdKey-HotKey indirection enables the stake pool operator (aka {SPO}) to store their ColdKey offline, so that it's plausible ColdKeys will never be acquired by the adversary except through physical access or coercion (eg burglary/bribery).

## Consequences and Motivation

The benefit of identities including an OCIN Y rather than merely being determined by a ColdKey X is that an SPO can increment their OCIN whenever they suspect the adversary could have acquired their current HotKey (eg they discover evidence of unauthorized software running on their block-producing node).
This response will be effective because the header validity and tiebreaker rules involve the OCIN.

- The protocol state maintains a mapping from X to Z, the OCIN of the youngest header issued by X on this chain.
Member

Z is introduced out of the blue. I also wonder if we should use more representative names.

Contributor

X, Y, Z does make it seem like they're a sequence of variables of the same type rather than unrelated

The protocol rules allow exactly two identities to issue a header extending some chain when X is elected, either Y=Z or Y=Z+1.
- The Cardano tiebreaker rules favor the greater OCIN when comparing two headers from X in the same slot.
Without this preference in the tiebreaker, whether X is able to increment their OCIN on the honest chain is merely a network race between the adversary's header/block and the SPO's header/block.
Member

It might be worth mentioning (maybe in a footnote) that the VRF tiebreaker can't help here because it is determined by {ColdKey} and the slot, both of which are identical here.

Conservatively, the adversary is assumed to have better network connectivity, and so theoretically could prevent an honest party from incrementing the OCIN until the leaked HotKey expires up to 90 days later (see next section).
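
The tiebreaker bullet above can be illustrated with a rough Python sketch. The `Header` record and the function `prefer_by_ocin` are hypothetical names used only for this illustration; the actual Cardano tiebreaker involves further criteria that are not modeled here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Header:
    issuer: str  # stands in for the ColdKey X (hypothetical representation)
    slot: int
    ocin: int    # the OCIN Y carried by the header's opcert


def prefer_by_ocin(a: Header, b: Header) -> Optional[Header]:
    """Return the header favored by the OCIN rule, or None if the rule does not apply."""
    # The OCIN comparison only concerns two headers from the same X in the same slot.
    if a.issuer != b.issuer or a.slot != b.slot:
        return None
    if a.ocin == b.ocin:
        return None
    return a if a.ocin > b.ocin else b


# The adversary still holds the HotKey for OCIN 0; the SPO has incremented to 1.
adversarial = Header(issuer="X", slot=42, ocin=0)
honest = Header(issuer="X", slot=42, ocin=1)
favored = prefer_by_ocin(adversarial, honest)
```

In this sketch the SPO's header is favored regardless of which header arrives first, which is why the increment is not merely a network race.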

The fact that Z is not allowed to decrease along a chain prevents an adversary that acquired a HotKey with OCIN Y from extending alternative chains whenever X is elected, unless that chain branches off from the honest chain _before_ X incremented their OCIN to Y+1 on the honest chain---see the visualization below.
An SPO is thus able to effectively revoke the rights of a HotKey that the adversary might have acquired, thereby re-establishing control of their blocks' predecessor and payload on the honest chain.

```mermaid
flowchart TD
I["Z(X)=0"] --> X1[...] --> X2["Z(X)=1"] --> X3[...]
I --> Y1["(X,0) ok"]
X1 --> Z1["(X,0) ok"]
X2 --> A1["(X,0) invalid!"]
```
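
The scenario in the diagram can be replayed with a minimal Python sketch of the validity rule (Y=Z or Y=Z+1). The names `ocin_is_valid` and `apply_header`, and the use of a plain dictionary for the mapping from X to Z, are assumptions made for illustration and are not taken from the Cardano code base.

```python
from typing import Dict


def ocin_is_valid(z: int, y: int) -> bool:
    """A header with OCIN y is acceptable on a chain whose current value is z."""
    return y == z or y == z + 1


def apply_header(state: Dict[str, int], issuer: str, y: int) -> Dict[str, int]:
    """Check the header's OCIN against the chain's Z and return the updated mapping."""
    z = state[issuer]
    if not ocin_is_valid(z, y):
        raise ValueError(f"({issuer},{y}) invalid: this chain has Z({issuer})={z}")
    return {**state, issuer: y}


root = {"X": 0}                        # the node labeled Z(X)=0
branch = apply_header(root, "X", 0)    # (X,0) ok: branches off before the increment
honest = apply_header(root, "X", 1)    # X increments on the honest chain: Z(X)=1

try:
    apply_header(honest, "X", 0)       # the leaked HotKey's opcert still has OCIN 0
    revoked = False
except ValueError:
    revoked = True                     # (X,0) invalid! as in the diagram
```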

The fact that Z can only increase along a chain by at most one per header bounds the rate at which X can increment their Z as a chain grows.
Member

Suggested change
The fact that Z can only increase along a chain by at most one per header bounds the rate at which X can increment their Z as a chain grows.
The fact that Z can only increase along a chain by at most one per header, bounds the rate at which X can increment their Z as a chain grows.

Otherwise it might read as "header bounds"

If that rate weren't bounded, an adversarial X could generate infinite identities Y whenever they wanted, and the Cardano tiebreaker would favor each new one.
Member

Are we assuming X controls the cold as well? If not, how could it increment the opcert number?

Contributor Author

Typo: should have been "adversarial X".


## Advantage over Mere ColdKey-HotKey Indirection

Completely orthogonally to OCINs, each HotKey is confined to one interval of slots, the {KesPeriod}, which is ~90 days on Cardano.
This achieves _forward-security_ ([Wikipedia](https://en.wikipedia.org/wiki/Forward_secrecy)) with respect to long-term time scales without requiring SPOs to access their ColdKey unbearably often (recall that it's physically offline).

If Cardano did not involve OCINs, then an adversary would benefit from an acquired HotKey until its KesPeriod expired, which could be up to 90 days.
On the other hand, an SPO---especially one with a lot of delegated stake---can increment the OCIN much sooner than 90 days after an attack (assuming they notice it).
Comment on lines +52 to +53
Member

@amesgen Jul 28, 2025

It may be worth mentioning that there are potential alternatives, like submitting the OCIN update via a tx, or (conceptually similar) via a stake pool re-registration certificate (would need to be enriched). This is also how eg Peras voting (and I presume Leios voting) will handle this scenario.

These approaches generally have longer latency to come into effect, but work even if the adversary is relatively strong and can orphan blocks that the corrupted pool is producing (they just need one nearby election), preventing the on-chain counter from increasing.

Contributor Author

Ah, that was a bad omission on my part, thanks for catching it! It severely trims the potential 90 day period down to 36 hours + tx latency for the cost of a small tx fee---right?

> would need to be enriched

I haven't thought through it, but it seems like a minor change. Have you already considered some details? I'll ask Andre.

Member

@amesgen Jul 29, 2025

> It severely trims the potential 90 day period down to 36 hours + tx latency for the cost of a small tx fee---right?

I assume you mean for the purpose of the existing hot keys, so not Peras etc? Not without enriching stake pool registration certificates in some way; currently, there is no way to directly invalidate compromised KES keys with the status quo (which makes sense given that there is the mechanism you describe in this document). Eg you can't change your cold key, that one is permanent (see PoolParams for what is contained in a RegPool :: PoolCert, ppId is only used to identify the pool to be (re-)registered; but you can't "replace" it).

Also, it takes time for stake pool registration certificates to take effect, namely at least two epochs. (Re-registrations are delayed by one epoch compared to new registrations, to give a forewarning to delegators; but this seems more relevant for reward-related parameters, so could maybe be changed.) Additionally, there could be problems when throughput is very high (or chain quality is low) and the SPO has difficulties getting the tx included in a block (but at least it doesn't have to be their own block).

One thing one can of course even do today is to register a new VRF key, so the leadership schedule is private again. But the adversary can still extract the VRF proof from a published block and use it for creating their own blocks (and potentially diffuse it faster, assuming good network infrastructure). On the flip side, it means that the adversary can't create blocks if the honest operator doesn't publish any blocks, so they can at least stop the adversary from doing harm this way.

> > would need to be enriched
>
> I haven't thought through it, but it seems like a minor change. Have you already considered some details? I'll ask Andre.

I guess one could track the issue number in the PoolParams (or a Maybe Hash to blacklist a particular KES key).

For eg Peras, PoolParams would track the public Peras voting key, and the operator would just register a different one.

Contributor Author

Ah, OK. I was mentally thinking that the old cold key signing the new cold key would let us circumvent some of these delays. But that's skipping some practical considerations---eg "opcerts" would be chains of signatures, which is no good. I suppose I was just loosely envisioning a Frankenstein of the existing opcert issue number mechanism.

I'm now thinking that I should have interpreted "enriched" in your original text as "Letting a tx (monoidally) update the opcert issue number map after a forecast window of delay".

Contributor Author

Ah... but that tx would still need to be invalid if it tries to increase (the stable value, regardless of pending txs) by more than one.


When issuing a new HotKey merely because the current one expired, an SPO does not need to increment their OCIN.
Generally, there's no downside if they do so, other than increasing the likelihood of some awkward situations described in the next section---even then it's a risk that arises only once per 90 days per stake pool.

*Remark*.
[The current Cardano documentation](https://developers.cardano.org/docs/operate-a-stake-pool/cardano-key-pairs/#stake-pool-cold-keys) disagrees, stating it's crucial to increment when rotating.
Moreover, the relevant CLI tool always increments ([here](https://github.com/IntersectMBO/cardano-api/blob/4a6ce60b0028e3062d666980574aebf6acfee9b3/cardano-api/src/Cardano/Api/Certificate/Internal/OperationalCertificate.hs#L134) and [here](https://github.com/IntersectMBO/cardano-cli/blob/2124f5ab210ef57a6ed25c8cd383b36f927b6415/cardano-cli/src/Cardano/CLI/EraIndependent/Node/Run.hs#L292-L304)).
We suspect either a past miscommunication or else those authors considered the nuance to not be worth the risk, favoring the conservative/robust approach of always incrementing.

## Interaction with Short Forks

Different forks might disagree on some ColdKeys' Z.
In the most common circumstance, this makes no difference to the SPO.
If X issues a header with an incremented Y+1 that doesn't end up on the honest chain (ie is orphaned), they'll simply issue their next header with the same Y+1 on the winning fork.
Contributor

"i.e."

Contributor Author

There was an IOHK style guide at some point that specified ie, eg, etc. 🤷 I personally took a liking to it and I've never seen a more recent style guide override it.


On the other hand, if unusual circumstances cause a party X to increment its OCIN multiple times within the same stability window (12 to 36 hr on Cardano, shorter when Peras succeeds), then it's possible that a short fork might discard N>1 of those increments.
Contributor

I think this paragraph (or this section more widely?) could do with a little more explanation, or maybe just some rearranging, because it's not immediately obvious why this is true

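In that case, Z on the winning fork is N less than X's greatest-ever OCIN (ie Z on the losing short fork). Since a valid header can raise Z by at most one, X's next header on the winning fork cannot carry that greatest-ever OCIN, but rather one that is N-1 less than it.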
Member

Suggested change
In that case, X's next header on the winning fork cannot be its greatest ever OCIN (ie Z on the losing short fork), but rather N-1 less than that.
In that case, X's next header on the winning fork cannot be its greatest ever OCIN (ie Z on the losing short fork), but rather N-1 less than that.

Sorry but I don't get this bit, in particular the "N-1 less than that" bit.

Thus there will be an extended interval during which the adversary can abuse X's elections, if X actually did need to increment their OCIN N times.

The unusual scenario might be visualized as follows, where the chain on the left is the losing fork and the chain on the right is the winning fork.
The unfortunate property here is that X has issued a header with Y=1 _after_ having issued a header with Y=2.
Member

Suggested change
The unfortunate property here is that X has issued a header with Y=1 _after_ having issued a header with Y=2.
The unfortunate property here is that X has issued a header with Y=1 _after_ having issued a header with Y=2.

I don't get this bit. Are we talking about issuing a header on different forks? But in any case, how can the same party issue Y=1 after increasing the opcert number?

Contributor Author

They only increased their OCIN on the losing chain, not on the winning chain. Now that they've seen the outcome of the short fork, they're contributing to the winning chain. They cannot use their latest OCIN, since that's only valid on the losing fork. So they awkwardly have to use one of their previous OCINs in order to contribute to the best chain they can (as Praos requires), since Z was never incremented (or was incremented fewer times) on the fork that won the contest.

If X truly needed to increment Y twice (ie up to 2), then X does not necessarily control the payload of their first subsequent block on the winning fork: the adversary can also issue a block with Y=1, and the header validity rules prevent X from skipping ahead from Z=0 to Y=2 on this chain.
Member

> If X had good reason to increment Y to 2

What would be a good reason here? Is it "being corrupted twice within 36h"? (Is that a good reason?)

More generally, is "Incrementing your OCIN twice in a short timeframe is not something you should (need to) do, but if you do, non-obvious stuff can happen?" your intended takeaway from this section?

Contributor Author

@nfrisby Jul 28, 2025

Ah, yeah. That's somewhat-colloquial English for "there was a true need for X to increment Y twice". I'll rewrite. Thanks!

Contributor Author

> your intended takeaway from this section?

Yeah, that seems like a fair summary.


```mermaid
flowchart TD
I["Z(X)=0"] --> X1["Z(X)=0"] --> X2["Z(X)=1"] --> X3["Z(X)=1"] --> X4["Z(X)=2"]
I --> Y1["Z(X)=0"] --> Y2["Z(X)=0"] --> Y3["Z(X)=0"] --> Y4["Z(X)=0"] --> Y5["Z(X)=1"]
```
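
The "N-1 less than that" arithmetic for the two forks in the diagram can be checked with a small Python sketch. The helper `ocins_to_catch_up` and the list encoding of each fork's Z(X) values are hypothetical, introduced only for this illustration.

```python
from typing import List


def ocins_to_catch_up(z_on_chain: int, target: int) -> List[int]:
    """OCINs X must issue, one header per election, to raise Z from z_on_chain to target.

    Each valid header can raise Z by at most one, so no step may be skipped.
    """
    return list(range(z_on_chain + 1, target + 1))


# Z(X) after each block, read off the diagram (left: losing fork, right: winning fork,
# up to but excluding X's next header on the winning fork).
losing_fork_z = [0, 0, 1, 1, 2]
winning_fork_z = [0, 0, 0, 0, 0]

greatest_ever = max(losing_fork_z)         # X's greatest-ever OCIN
z_win = winning_fork_z[-1]                 # Z on the winning fork
n_discarded = greatest_ever - z_win        # N, the number of discarded increments

steps = ocins_to_catch_up(z_win, greatest_ever)  # [1, 2]
next_ocin = steps[0]                             # X's next header carries OCIN 1
```

Here `next_ocin` equals `greatest_ever - (n_discarded - 1)`, and X needs `len(steps)` elections on the winning fork before its Z is back at the greatest-ever value. Until then, an adversary holding a HotKey certified with one of the intermediate OCINs can still issue valid headers for X.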
1 change: 1 addition & 0 deletions docs/website/sidebars.js
@@ -36,6 +36,7 @@ const sidebars = {
'for-developers/Glossary',
'for-developers/ComponentDiagram',
'for-developers/BlockBlockDiagram',
'for-developers/OperationalCertificateIssueNumber',
'for-developers/CardanoPraosBasics',
'for-developers/Ticking',
'for-developers/CivicTime',