3 changes: 3 additions & 0 deletions .idea/.gitignore

1 change: 1 addition & 0 deletions .idea/.name

19 changes: 19 additions & 0 deletions .idea/inspectionProfiles/Project_Default.xml

6 changes: 6 additions & 0 deletions .idea/inspectionProfiles/profiles_settings.xml

4 changes: 4 additions & 0 deletions .idea/misc.xml

8 changes: 8 additions & 0 deletions .idea/modules.xml

8 changes: 8 additions & 0 deletions .idea/solana-improvement-documents.iml

6 changes: 6 additions & 0 deletions .idea/vcs.xml

44 changes: 44 additions & 0 deletions .idea/workspace.xml

198 changes: 198 additions & 0 deletions proposals/0363-simple-alpenglow-clock.md
---
simd: '0363'
title: Simple Alpenglow Clock
authors:
- Roger Wattenhofer
- ksn6
category: Standard
type: Core
status: Review
created: 2025-09-19
---

## Summary

Since the clock sysvar computation is incompatible with Alpenglow, we need a new
design. In this document we suggest a simple replacement solution, which should
be accurate enough for all use cases.
Comment on lines +15 to +17

@topointon-jump (Dec 1, 2025):

Just throwing this out there - should we consider deprecating the unix_timestamp and epoch_start_timestamp clock sysvar fields entirely? We could mark them as deprecated in the next release, and set them to zero in a feature gate a little while after that.

If we are considering changing the semantics of the timestamps, we should also consider getting rid of them entirely.

Is there a good reason not to do this? Will this break a lot of use-cases today? Genuinely asking. If removing these fields would break a lot of important use-cases then we should probably keep them, if not then it's worth considering.

Comment:

unix_timestamp is used very broadly within defi, it's probably not feasible to ever remove that. would there be another trustless way to access the current time on chain? a notion of time is required for things like interest calculations.

Contributor Author:

Unfortunately I don't know how many applications are using the clock (and how). I asked people about this before I wrote this SIMD, but nobody was able to tell me. The consensus was then to have some reasonably good solution no matter what. This is basically what this SIMD is about.

@topointon-jump (Dec 1, 2025):

@OliverNChalk if the timestamp fields are removed, then slot number could be used as a rough proxy for time. I'm not sure if this is sufficient.

@OliverNChalk (Dec 1, 2025):

This would require us to expose the configured slot time target and would likely drift fairly significantly over time? Not to mention this still breaks marginfi, kamino, and all other money markets, not to mention meme coin launchpads/AMMs that use time as a trigger for pool unlock.

In an ideal world we have decent accuracy in all cases. In an okay world we have this proposal, which gives us decent accuracy in the average case. I don't think it's possible to remove the concept of time from DEFI as time is very fundamental to finance (e.g., funding/interest rates). If we remove time then all of defi will have to ask Pyth for the current time, which introduces another trust assumption, making all of solana finance dependent on a 3rd party chain.

Contributor:

the intended uses of today's block timestamp are these

  1. enforcing paper contract dates on chain (eg. stake account lockups)
  2. real-world (tax) accounting

all other uses are a misuse of this timestamp as it is not what the consumer believes it to be. that said, its misuse is prevalent, so we must retain its availability as its removal would be extremely disruptive to the ecosystem today

attempts to use slot_height * target_block_time will fall apart since they lack corrective inputs. each time we drive down the block time target would need treatment. we've already accumulated several hours worth of slot-time/wall-time skew due to outages

that the clock sysvar timestamp is unsuitable for alpenglow is fine, so long as the solution to alpenglow clock is capable of emitting a timestamp suitable for populating the clock sysvar timestamp


## Motivation

Until now, individual votes have included a timestamp, and the clock sysvar was
updated every slot by computing the stake-weighted median of the
validator-provided vote times. With Alpenglow, individual votes no longer go on
chain. Consequently, we need a new clock design.


## New Terminology

No new terms, but the following definition is updated:

- Clock Variable in Solana Sysvar Cluster Data (`unix_timestamp`) is computed
differently. The variable is still accessed via `Clock::get()` or by passing
the sysvar account to a program as an account parameter.


## Detailed Design

In Alpenglow, the current block leader includes an updated integer clock value
(Rust system time in nanoseconds) in its block *b* in slot *s* in the block
footer. This value is bounded by the clock value in the parent block.

Contributor:

At what point should we get the system clock time? When it is the time to create our leader slot? When we actually send out the block footer? Or just refer to what the current clock sysvar says?

Contributor Author:

You insert your system time when you produce the slice with the clock info in the block marker.

@ksn6 (Sep 22, 2025):

I mean - what precisely do you mean by "system time when you produce the slice?"

E.g. is this:

  • The beginning of block production for a slot?
  • The end of block production for a slot, post block building?
  • When we actually create the block footer, post block building, and pre-shredding for turbine dissemination?

The specific choice isn't really enforceable, although, for default implementation purposes / consistency, it's probably worth opining on the specifics in the SIMD.

Contributor Author:

All of these are possible (and enforceable). This depends on how we organize meta-data in a block (i.e., our discussion in Chicago). If we may change the parent, then the header (the first slice) is the wrong place, but the footer (the last slice) should be alright.

Contributor:

For now, it looks like nothing prevents a validator from reporting a timestamp that is off by +/- 50 ms, equating to an honest validator reporting a timestamp that is at the beginning of block production vs. maybe even somewhere in the middle of block production or even the end of block production.

To somewhat make the notion of a clock uniform across validator implementations, we should probably specify roughly at what point in a leader's production of a block the timestamp should be captured.

If this can somehow be enforced, that's even better imho.

Contributor Author:

For now, it looks like nothing prevents a validator from reporting a timestamp that is off by +/- 50 ms...

Well, I would say it can even be up to 800 ms wrongly reported. Nothing we can do here.

But we established that the clock should be in the block footer, and we established that it should be captured before putting it in there, so what's still unclear?

@ksn6 (Sep 24, 2025):

All of these are possible (and enforceable).

Well, I would say it can even be up to 800 ms wrongly reported. Nothing we can do here.

This part is a bit unclear - these two statements seem a bit contradictory. I agree with the second statement you're making re. being off by up to 800 ms, though.

But we established that the clock should be in the block footer, and we established that it should be captured before putting it in there, so what's still unclear?

The part that's unclear - at what exact part of the leader block production phase should we have leaders set the block timestamp?

Should the timestamp be associated with when the leader starts producing their block? When the leader conclusively knows what the block parent is? When the leader actually constructs the footer itself?

I'm aware that it's impossible to enforce the particular point in time, as you pointed out re. the 800 ms piece. This being said, it's worth having a sane default.

Contributor Author:

"We assume that a correct leader inserts its correct local time..."

Maybe this is not clear enough? When you insert the time value in the footer, you insert your local time at that point. This way we have the least (additional) skew. Note that we will still have a systematic lag since the slice then has to be encoded, sent through Rotor/Turbine, and decoded on the receiver side. We might consider taking that systematic lag into account.

Specifically, let the parent block be in slot *p* < *s*, and let the clock value
of slot *p* be *c* (in nanoseconds). For the clock value in slot *s* to be
correct, the clock value of block *b* must be strictly greater than *c*, and at
most *c* + (*s* − *p*) × 2*T*, where 2*T* is twice the target block time;
currently *T* = 400 ms. We assume that a correct leader inserts its correct
local time as long as it is within the allowed bounds. If the correct local
time is out of bounds, the leader inserts the minimum or maximum allowed time.

If the clock value for slot *s* is not within the interval
(*c*, *c* + (*s* − *p*) × 2*T*], the proposed block *b* is invalid, and
validators vote skip. Currently 2*T* = 800 ms = 8 × 10⁸ ns.
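The bound check and the honest-leader clamping described above can be sketched as follows. The helper names and the exact clamping behavior are illustrative assumptions, not part of any implementation:

```rust
// Hypothetical helpers illustrating the clock bound check described above.
const BLOCK_TIME_NANOS: u64 = 400_000_000; // T = 400 ms

/// True iff `proposed` lies in the half-open interval
/// (parent_clock, parent_clock + (s - p) * 2T].
pub fn clock_in_bounds(parent_clock: u64, parent_slot: u64, slot: u64, proposed: u64) -> bool {
    let max = parent_clock + (slot - parent_slot) * 2 * BLOCK_TIME_NANOS;
    proposed > parent_clock && proposed <= max
}

/// What a correct leader inserts: its local time, clamped into the bounds.
/// The lower bound is exclusive, so a leader whose local clock has not
/// advanced must still tick the value by at least 1 ns.
pub fn leader_clock(parent_clock: u64, parent_slot: u64, slot: u64, local_time_nanos: u64) -> u64 {
    let max = parent_clock + (slot - parent_slot) * 2 * BLOCK_TIME_NANOS;
    local_time_nanos.clamp(parent_clock + 1, max)
}
```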

## Analysis

The design of this clock is deliberately simple to keep the overhead low. To
the best of our knowledge, no program needs a highly accurate clock; an
accuracy on the order of a few seconds is usually sufficient.

The standard Alpenglow assumption is that we have less than 20% byzantine stake.
With at most 20% crashed stake in addition, we have at least 60% stake which is
correct. The 60% correct stake can correct any clock inaccuracies introduced by
the 20% byzantine stake. Slots are supposed to take 400 ms, and in reality slot
times are close to that value.

For the analysis, let us assume that in each leader window, a leader is chosen
randomly according to our worst-case distribution: 20% of the leaders are
byzantine (assuming worst-case behavior, i.e., corrupt leaders either always
halt the clock or push it maximally), 20% are skipped (the whole leader window
takes only 400 ms instead of 1,600 ms), and 60% of the leaders are correct
(they fix the clock as much as the bounds allow).

We simulate the above algorithm under these worst-case assumptions. In the
simulation, the average clock skew we observe is about 1 second. For a 1-hour
window, the worst clock skew (the average largest skew in a 1-hour window) is
about 10 seconds. Such a high clock skew can happen if we are unlucky and
experience several consecutive byzantine leader windows whose leaders either
halt the clocks or advance them maximally.

In practice, we will probably see much lower levels of byzantine and crashed
leaders, which brings the average clock skew to around 20 ms.
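As a rough illustration of the worst-case model above, the following self-contained sketch runs a simulation in the same spirit. The window length, PRNG, and byzantine strategy are assumptions for illustration only; the numbers reported in this SIMD come from the authors' own simulation, not this sketch:

```rust
// Illustrative worst-case clock simulation (all parameters are assumptions).
const T: u64 = 400_000_000; // 400 ms slot time, in nanoseconds
const WINDOW: u64 = 4; // slots per leader window

/// Tiny deterministic LCG so the sketch has no dependencies.
fn next_rand(state: &mut u64) -> u64 {
    *state = state
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407);
    *state >> 33
}

/// Simulate `windows` leader windows; returns (average skew, max skew) in seconds.
pub fn simulate(windows: u64, seed: u64) -> (f64, f64) {
    let mut rng = seed;
    let (mut slot, mut parent_slot, mut clock) = (0u64, 0u64, 0u64);
    let (mut sum_skew, mut max_skew) = (0.0f64, 0.0f64);
    let mut samples = 0u64;
    for _ in 0..windows {
        let r = next_rand(&mut rng) % 100;
        if r < 20 {
            // Skipped window: time passes, no blocks appended.
            slot += WINDOW;
            continue;
        }
        for _ in 0..WINDOW {
            slot += 1;
            let true_time = slot * T; // wall clock if every slot took exactly T
            let max_clock = clock + (slot - parent_slot) * 2 * T;
            clock = if r < 40 {
                // Byzantine: halt (advance by 1 ns) or push maximally.
                if r % 2 == 0 { clock + 1 } else { max_clock }
            } else {
                // Honest: local time clamped into the allowed bounds.
                true_time.clamp(clock + 1, max_clock)
            };
            parent_slot = slot;
            let skew = (true_time as f64 - clock as f64).abs() / 1e9;
            sum_skew += skew;
            max_skew = max_skew.max(skew);
            samples += 1;
        }
    }
    (sum_skew / samples as f64, max_skew)
}
```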

## Implementation

We next articulate implementation details for the Alpenglow clock.

### Block Marker Modifications

In proposing a block, a leader will include a special marker called a "block
footer," which stores a UNIX timestamp (in nanoseconds). As of this writing,
the `block_producer_time_nanos` field of `BlockFooterV1` stores the clock:

```rust
/// Version 1 block footer containing production metadata.
///
/// The user agent bytes are capped at 255 bytes during serialization to prevent
/// unbounded growth while maintaining reasonable metadata storage.
///
/// # Serialization Format
/// ```text
/// ┌─────────────────────────────────────────┐
/// │ Producer Time Nanos (8 bytes)           │
/// ├─────────────────────────────────────────┤
/// │ User Agent Length (1 byte)              │
/// ├─────────────────────────────────────────┤
/// │ User Agent Bytes (0-255 bytes)          │
/// └─────────────────────────────────────────┘
/// ```
#[derive(Clone, PartialEq, Eq, Debug)]
pub struct BlockFooterV1 {
    pub block_producer_time_nanos: u64,
    pub block_user_agent: Vec<u8>,
}
```
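A minimal serialization sketch matching the documented layout (8-byte producer time, 1-byte length, up to 255 user-agent bytes). The little-endian byte order and the exact truncation behavior are assumptions here, not confirmed by the SIMD:

```rust
// Sketch only: byte order and truncation behavior are assumed.
pub struct BlockFooterV1 {
    pub block_producer_time_nanos: u64,
    pub block_user_agent: Vec<u8>,
}

impl BlockFooterV1 {
    pub fn serialize(&self) -> Vec<u8> {
        // Cap the user agent at 255 bytes, as the doc comment specifies.
        let agent = &self.block_user_agent[..self.block_user_agent.len().min(255)];
        let mut out = Vec::with_capacity(8 + 1 + agent.len());
        out.extend_from_slice(&self.block_producer_time_nanos.to_le_bytes());
        out.push(agent.len() as u8);
        out.extend_from_slice(agent);
        out
    }
}
```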

Upon receiving a block footer from a leader, non-leader validators will:

- Locally update the clock sysvar associated with the bank
- During replay, validate the bounds of this clock sysvar with respect to the
parent bank's clock sysvar and proceed as outlined in "Detailed Design"
Comment on lines +116 to +120

Comment:

Quick question about this - how does replay work? Do we update the clock sysvar from the block footer and then start replaying? If so, doesn't that mean we can't start replaying until we've received the block footer?

Contributor:

no, it happens when we replay the block footer. So effectively we are setting the clock sysvar at the end of the block for use in the child block

Contributor Author:

"during replay" should be more precise?

Contributor:

Yeah, useful to clarify this.

@rogerANZA - I updated the doc. The relevant part:

Non-leader validators must apply the following block footer logic after transactions in a block have been replayed:

  • Locally update the clock sysvar associated with the bank
  • Validate the bounds of this clock sysvar with respect to the parent bank's clock sysvar as outlined in "Detailed Design"

For avoidance of doubt - while replaying transactions in a block with slot S, the clock sysvar has the value specified in S.parent’s block footer.


### Alpenglow Clock Variable Storage

The clock sysvar, as currently implemented, has second resolution. This SIMD
proposes a clock with greater resolution (nanoseconds). Accordingly, we
introduce an off-curve account where we store the clock value upon processing
the block footer (at the end of a block).

In particular, we calculate the off-curve account as follows:

```rust
use std::sync::LazyLock;

/// The off-curve account where we store the Alpenglow clock. The clock sysvar
/// has seconds resolution while the Alpenglow clock has nanosecond resolution.
static NANOSECOND_CLOCK_ACCOUNT: LazyLock<Pubkey> = LazyLock::new(|| {
    let (pubkey, _) = Pubkey::find_program_address(
        &[b"alpenclock"],
        &agave_feature_set::alpenglow::id(),
    );
    pubkey
});
```

where `Pubkey::find_program_address` is a part of `solana-address-1.0.0`. In
practice, the `Pubkey` ends up having value
`BKRDmw2hTDSxQK4mitpK7eCWkNUvCvnaWqm1NZmGDTUm`.

The clock sysvar is updated at the end of each block by simply dividing the
nanosecond timestamp by `1_000_000_000` and rounding down (i.e., integer
division). For now, we do not expose the nanosecond clock to SPL programs via a
sysvar; we may choose to do so in the future.
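The update step is a single integer division; a minimal sketch (the function name is hypothetical):

```rust
/// Convert the nanosecond Alpenglow clock to the second-resolution clock
/// sysvar value by integer division (rounding down).
pub fn to_unix_timestamp(clock_nanos: u64) -> i64 {
    (clock_nanos / 1_000_000_000) as i64
}
```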


## Alternatives Considered

We discussed several designs for more complicated and potentially more accurate
clocks. In particular, we considered clock constructions where nodes would only
consider a block correct if their internal clock (driven by the Network Time
Protocol, NTP) was “close enough” to the clock value proposed by the leader.
However, this design has several problems; for example, one can possibly even
disadvantage the next leader by pushing the clock to the allowed limit.

Alternatively, for a highly accurate clock, the nodes would need to solve an
approximate agreement problem. This is similar to the current clock design:
all nodes (or a selected random sample of nodes) publish their local times, and
then we (repeatedly) take the median of the reported values. In principle, such
an approach gives us a very accurate clock. However, the cost of such a clock
would be high. We believe it is better if programs work with just reasonably
accurate clocks.

## Impact

Developers and validators must not rely on a highly precise system clock. If an
application relies on a highly accurate clock, it should consider alternative
time sources.


The semantics of the clock sysvar have changed slightly: the clock value no
longer represents the start time of the current block. It now represents the
time at which the last slice of the parent block was produced.

Also, while the new clock has nanosecond resolution, we compute a
second-resolution clock by dividing by 1_000_000_000 and rounding down.

## Security Considerations

Comment on lines +184 to +185

Comment:

Assuming my understanding is correct that:

  • If no blocks are produced then the next leader can set the time to be the same as last (no change) or up to 2 times time since last block;
  • OR; If no slots are produced (no skips) that the block will simply just halt with no chance for the next leader to place the correct time.

Then I think the following risk scenario is worth considering:

  • Money market (MM), let say marginfi, uses pyth pull oracles where a user is able to update the oracle and then use the MM.
  • Chain halts for 1 hour, now the clock is 1 hour behind, imagine the chain halt coincided with market volatility.
  • Attacker uses 1 hour old pyth payload to open insolvent positions on MM.

This is an example where the clock is not off by a few seconds but can be off by hours. In this scenario most current MMs would be vulnerable as they use the following code:

        let price = price_feed_account
            .get_price_no_older_than_with_custom_verification_level(
                clock,
                max_age,
                feed_id,
                MIN_PYTH_PUSH_VERIFICATION_LEVEL,
            )
            .map_err(|e| {
                debug!("Pyth push oracle error: {:?}", e);
                let error: MarginfiError = e.into();
                error
            })?;

Which in turn relies on the following check:

        check!(
            price
                .publish_time
                .saturating_add(maximum_age.try_into().unwrap())
                >= clock.unix_timestamp,
            GetPriceError::PriceTooOld
        );

This check will be completely broken until the clock catches up, allowing stale prices to be pushed.

Caveat

The same issue exists in the current clock which I believe will have the price for block N use the votes from N-1 which of course will be pre halt and thus stale. That said, it will correct within a few slots as opposed to this clock which will have a much longer vulnerability window.

Contributor Author:

Assuming my understanding is correct that:

  • If no blocks are produced then the next leader can set the time to be the same as last (no change) or up to 2 times time since last block;

Basically, yes. However, the new value has to be strictly higher (just by 1 tick though).

... Chain halts for 1 hour ...

Wait, what? So you're saying that the whole chain is down for a full hour? None of the expected 9,000 blocks in that hour appended? I would say that in this case we have much bigger problems, don't we?

What we could do of course is to narrow the time the leader can choose in such a case, in the most extreme case even narrow it down to basically 1 hour +/- 400 ms. (This is an oversimplification because if the chain was really down for 1 hour, our emergency protocol would kick in, and slots would eventually get exponentially longer.)

But, essentially, we could change the formula, and give the leader a narrower window of choice if the chain was down for a very long time.

Would that make it better?

Comment:

I think the core protocol is sound and desirable in 99.99% of operation. In the 0.01% where we have a chain halt it would be ideal if:

  • The protocol restarted with the correct time
  • OR; There was a way for onchain applications to detect that a halt/emergency mode had occurred

The former of the two is desirable as it protects applications that rely on semi accurate time without those applications needing to change their logic.

So you're saying that the whole chain is down for a full hour? None of expected 9,000 blocks in that hour appended? I would say that in this case we have much bigger problems, don't we?

It's entirely feasible that the Solana blockchain halts during a liquidation cascade that promptly reverts. Now if money markets come back online after the recovery they should have minimal bad debt. However, if the clock lags then these money markets will observe all of those prices as the clock catches up. In this case an attacker will be able to open positions at the maximally depressed prices resulting in much more bad debt and potentially blowing up all on chain lending protocols in the worst case.

The reasons chain halts are bad:

  1. Continuous services stop (CLOBs, payments). not scary
  2. Protocol assumptions break down (liquidations, oracle update timeliness). very scary

Unfortunately, the only way to address the liquidations part of 2 is to have 100% uptime. However, we can address oracle update timeliness by providing an accurate post-restart clock.

But, essentially, we could change the formula, and give the leader a narrower window of choice if the chain was down for a very long time.

Would that make it better?

This sounds perfect, how easy is this to achieve?

Contributor Author:

This sounds perfect, how easy is this to achieve?

Not difficult. The question is how exactly we would be doing it.

  1. The simplest way is narrowing the window (as suggested in the previous answer).

There might be more elaborate ways (also easy to implement):

  1. For instance, after a long outage without any blocks, have the next 3 or 5 (or your favorite odd number) leaders propose a time and we take the median as a new starting time. But there are downsides to this solution, in particular we would have 3 (or 5) blocks without a good time.

Contributor Author:

RE solution 1: For a 1 hour outage, you say that choosing a new time in [epsilon,2 hours] is too much freedom. What makes most sense? [1 hour - delta, 1 hour + delta] for what delta? What's the maximum delta that still makes sense? (A larger delta is better because it increases the chances that the actual time is in the interval.)

Comment:

Hmm, I guess the question might boil down to, is it worse to:

  • Take a dependency on NTP (but only during a restart/major slot skip).
  • OR; Accept it's possible for the clocking to be 30m+ out of sync while its catching up.

If we take a dependency on NTP conditional on some critical failure having occurred prior, what do you see as the increased risk to the protocol? I would assume we now have a risk that a % of our validators have poisoned NTP and thus we cannot restart the protocol until the NTP issues are manually corrected/overridden?

Contributor Author:

A dependency on NTP is a big NO from my side. It directly affects consensus, and I cannot prove correctness of consensus anymore. And the 30 minute skew you only get after a 30 minute outage (which we will never have, fingers crossed).

BUT: It's good that you raised the question, and that we are aware of it. I will (eventually) think more seriously about it. At this point, I would suggest that we eventually have a special program which can be used to forward the blockchain time. Anybody can propose jumping to a future time; all the stakers then suggest a time, and we take the median of those. If enough stake participates (sends its time vote to the program within the allowed time frame), we jump there. I don't think we need to implement this immediately, but it's good to have the thought ready if we need it.

Comment:

Okay, that sounds fair. Only thing to flag is that it would be ideal if the clock time was fixed before any defi transactions were processed. For instance if it took 32 slots (random number i chose) to fix the clock then ideally for the first 32 slots no user txs are processed to prevent the looting of vulnerable defi protocols.

Though agreed, ideally we don't have a 30 minute outage ever again... or at least it doesn't coincide with large market volatility

Contributor:

We could try to expose this information to the protocols and have the vulnerable protocols do this logic of dismissing all transactions for some time. But I would definitely not want to halt all user transactions just because some protocols have a bug.

Comment:

If there's a way to expose that the clock is/may be out of sync to the application layer then I agree that would solve this. My fear was that doing so would be roughly equivalent to reaching consensus on the clock time itself but if that's not the case then perfect.

A byzantine player could bribe a long series of consecutive leaders to either
halt the clock (or advance it 1s per slot) during a longer interval. If a
program accessing the clock is poorly designed and relies on a very accurate
clock, such an extended bribe could be profitable. If at the end of the bribe
period the clock was Delta (seconds) off, then it will take about Delta seconds
for the clock to go back to normal.

## Backwards Compatibility

This new clock is not as accurate as the current clock. So those users that run
programs that access the clock might need to adapt their code if they need a
high accuracy. We got the impression that this is not an issue in practice.
