
Conversation

@m30m (Contributor) commented May 2, 2025

Summary

A set of APIs to expose the recent history of processed events in fortuna.

Rationale

How has this been tested?

  • Current tests cover my changes
  • Added new tests
  • Manually tested the code


```rust
.route("/metrics", get(metrics))
.route("/ready", get(ready))
.route("/v1/chains", get(chain_ids))
.route("/v1/explorer", get(explorer))
```
Contributor:

The name of this endpoint should be a noun indicating the kind of thing being returned.

(We follow REST conventions for HTTP endpoints: https://restfulapi.net/resource-naming/ )

Contributor:

In this case, something like /v1/logs would be appropriate, and ? query parameters are fine.
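To illustrate the suggestion, here is a minimal std-only sketch of parsing the kind of `?chain_id=...&sequence=...` query string a `/v1/logs` endpoint would accept. In the real handler this would more likely be an axum `Query<T>` extractor with a serde struct; the function name and shape here are purely illustrative.

```rust
use std::collections::HashMap;

/// Hypothetical sketch: split a raw query string into key/value filters.
/// Duplicate keys keep the last value; missing '=' yields an empty value.
fn parse_query(query: &str) -> HashMap<String, String> {
    query
        .split('&')
        .filter(|pair| !pair.is_empty())
        .filter_map(|pair| {
            let mut parts = pair.splitn(2, '=');
            Some((
                parts.next()?.to_string(),
                parts.next().unwrap_or("").to_string(),
            ))
        })
        .collect()
}
```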

```rust
self.by_time
    .insert(current_time, (chain_id.clone(), sequence));

if self.by_time.len() > Self::MAX_HISTORY {
```
Contributor:

At this point you're going to kick out the earliest one from all of the different indexes?

Contributor (Author):

Yes, but it's not a priority TBH. As long as we restart every few months (which we definitely do), this will not become an issue.
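For reference, evicting the oldest entry from all indexes at once is fairly mechanical. The following is a simplified std-only sketch: field names follow the PR, but value types are simplified and `by_hash` is omitted (it maps one hash to many keys, so it needs slightly more bookkeeping).

```rust
use std::collections::{BTreeMap, HashMap};
use std::time::{Duration, SystemTime};

type ChainId = String;
type RequestKey = (ChainId, u64);

/// Simplified stand-in for the PR's history struct.
struct History {
    by_time: BTreeMap<SystemTime, RequestKey>,
    by_chain_and_time: BTreeMap<(ChainId, SystemTime), RequestKey>,
    by_request_key: HashMap<RequestKey, String>,
}

impl History {
    const MAX_HISTORY: usize = 4;

    /// Pop the earliest entry from the time index and remove it from
    /// every other index so they all stay consistent.
    fn evict_oldest(&mut self) {
        while self.by_time.len() > Self::MAX_HISTORY {
            if let Some((time, key)) = self.by_time.pop_first() {
                self.by_chain_and_time.remove(&(key.0.clone(), time));
                self.by_request_key.remove(&key);
            }
        }
    }
}
```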

```rust
    ChainAndSequence,
    ChainAndTimestamp,
    Timestamp,
}
```
Contributor:

I don't think you should take this mode parameter. Instead, take all the filters and intersect them; it's much more natural (and powerful).

Contributor:

(Either of the indexing ideas above will make this functionality much easier to support.)

Contributor (Author):

With great power comes great responsibility :D
I wanted to avoid footguns in the UI. For example, if we know the transaction hash but the chain id is set to something else by mistake, I'd rather return an error than an empty set.

```rust
}

#[derive(Clone)]
pub struct ProcessParams {
```
Contributor:

👍

```rust
Observed { tx_hash: TxHash },
FailedToReveal { reason: String },
Revealed { tx_hash: TxHash },
Landed { block_number: BlockNumber },
```
Contributor:

I don't understand why we need so many different types of log entries. I think at the end of the day (after we upgrade the contracts), a request can be in literally two states: pending (meaning we've seen it but haven't sent the callback) or complete. If it's complete, the result may or may not be an error.

The representation you are using for those two states (a vector of these log entries) has a much larger state space. I don't see why that's necessary.

Contributor:

Also, I presume you're going to add a bunch of additional fields to these log entries? E.g., all the stuff emitted in the request event, and all the stuff emitted in the callback.

Contributor (Author):

Yes, more fields will be added. I think more verbosity is helpful for internal debugging, plus calculating more metrics such as our landing latency, observation latency, etc. Many things can go wrong between each of these steps, and knowing the latest state is very powerful.

```rust
pub by_hash: HashMap<TxHash, Vec<RequestKey>>,
pub by_chain_and_time: BTreeMap<(ChainId, DateTime<chrono::Utc>), RequestKey>,
pub by_time: BTreeMap<DateTime<chrono::Utc>, RequestKey>,
pub by_request_key: HashMap<RequestKey, RequestJournal>,
```
Contributor:

Doing all of this indexing manually kind of sucks. I think there are two other reasonable approaches:

  1. Use a free-text search library like indicium (https://docs.rs/indicium/0.6.5/indicium/index.html) and throw all the searchable attributes in as strings, with a prefix to indicate type, like "chain_id:blast sequence:7 tx:01249e". You still have to do the time ordering and eviction manually, but I think that's pretty easy: make by_time a BTreeMap<RequestKey, RequestJournal> and implement Ord for RequestKey. indicium also gives you fuzzy matching and autocomplete, which are potentially useful.

  2. Use sqlite and make an empty database on server start.
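The `BTreeMap<RequestKey, RequestJournal>` idea above could look like the following std-only sketch. The field names on `RequestKey` are illustrative; the point is that derived `Ord` compares fields in declaration order, so putting a creation timestamp first makes the map time-ordered and eviction a single `pop_first()`.

```rust
use std::collections::BTreeMap;

/// Illustrative key type: ordering on created_at first keeps the map
/// sorted by time; chain_id and sequence break ties deterministically.
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct RequestKey {
    created_at: u64, // e.g. unix seconds; compared first
    chain_id: String,
    sequence: u64,
}

/// Shrink the history to `max` entries by dropping the oldest keys.
fn evict_to(history: &mut BTreeMap<RequestKey, String>, max: usize) {
    while history.len() > max {
        history.pop_first(); // smallest created_at goes first
    }
}
```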

```rust
TimedJournalLog::with_current_time(JournalLog::Observed {
    tx_hash: event.tx_hash,
}),
);
```
Contributor:

Maybe we should use a channel here to asynchronously send the events over to a separate thread that adds them to history. If we lock here, then someone could potentially take down the keeper thread by spamming the /v1/explorer endpoint. If that happens, I'd rather drop events from the history than fail callbacks.
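The channel idea can be sketched with a bounded channel and a dedicated writer thread: `try_send` never blocks, so when the history side is slow the keeper drops the event instead of stalling. This sketch uses `std::sync::mpsc` with `String` events for simplicity; fortuna being async, the real code would more likely use tokio's bounded mpsc, and the event type would be the PR's journal-log struct.

```rust
use std::sync::mpsc;
use std::thread;

/// Spawn a thread that owns the history and drains events from a
/// bounded channel. Returns the sender handle for the keeper side.
fn spawn_history_writer(capacity: usize) -> mpsc::SyncSender<String> {
    let (tx, rx) = mpsc::sync_channel::<String>(capacity);
    thread::spawn(move || {
        for event in rx {
            // ...insert `event` into the history indexes here...
            let _ = event;
        }
    });
    tx
}
```

On the keeper side, `if tx.try_send(event).is_err() { /* channel full: drop the event, never block the callback path */ }`.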

Contributor:

One more thing to watch out for: this code will execute multiple times for every single sequence number, and in some cases the event information will differ (specifically, if there's a chain reorg, the same sequence number can represent two different events).
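One way to stay correct under that re-execution is to make recording idempotent: upsert by `(chain_id, sequence)` so a re-observed event (e.g. after a reorg) overwrites the stale entry rather than appending a duplicate. A minimal illustrative sketch, with a plain `String` standing in for the journal entry:

```rust
use std::collections::HashMap;

/// Upsert: the same (chain_id, sequence) key observed again simply
/// replaces the previous observation.
fn record_observation(
    history: &mut HashMap<(String, u64), String>,
    chain_id: &str,
    sequence: u64,
    tx_hash: &str,
) {
    history.insert((chain_id.to_string(), sequence), tx_hash.to_string());
}
```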

@jayantk (Contributor) left a review:

Nice! I suspect there will still be issues with the event logging bit in process_event, but the best thing to do is probably to put this out there and see what happens.

## Build & Test

We use sqlx query macros to check the SQL queries at compile time. This requires
a database to be available at build time. Create a `.env` file in the root of the project with the following content:
Contributor:

Can we do this via the config file? It's confusing to have multiple different sources of configuration.

Contributor:

I looked at the sqlx docs, and it doesn't seem that straightforward to use the config :(

I guess the main thing here is that it would be nice if people can show up and the project builds as normal. Can we put in a sane default for where the database goes and commit that to the repo?

Contributor (Author):

I can commit this .env file here I guess 🤔

```
@@ -0,0 +1,3 @@
#!/bin/bash
```
Contributor:

Do we really need a script for this? It's used one time in CI, and the command it runs seems standard.

Contributor (Author):

pre-commit didn't have any good config option for changing the directory.

```rust
    .await,
));
}
return Err(RestError::InvalidQueryString);
```
Contributor:

The queries are going to come from a text box, users may type into it incrementally, and you could imagine we'd want to live-update results as they type.

```sql
sequence INTEGER NOT NULL,
created_at DATETIME NOT NULL,
last_updated_at DATETIME NOT NULL,
state VARCHAR(10) NOT NULL,
```
Contributor:

I think using an integer type and a Rust enum would be better than a string here.

Contributor:

(I don't feel strongly about this)

Contributor (Author):

I like this because we can just open up the sqlite db and it's readable
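For the record, the two positions aren't mutually exclusive: the column can stay a human-readable string while the code uses an enum, via a `Display`/`FromStr` pair (or, if memory serves, sqlx's `#[derive(sqlx::Type)]` with `rename_all`). The variant names below are illustrative, not the PR's actual states; all stored strings fit the `VARCHAR(10)` column.

```rust
use std::fmt;
use std::str::FromStr;

/// Illustrative request state: stored as lowercase text in sqlite so
/// the db stays readable, but handled as an enum in Rust.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RequestState {
    Pending,
    Completed,
    Failed,
}

impl fmt::Display for RequestState {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(match self {
            RequestState::Pending => "pending",
            RequestState::Completed => "completed",
            RequestState::Failed => "failed",
        })
    }
}

impl FromStr for RequestState {
    type Err = String;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "pending" => Ok(RequestState::Pending),
            "completed" => Ok(RequestState::Completed),
            "failed" => Ok(RequestState::Failed),
            other => Err(format!("unknown state: {other}")),
        }
    }
}
```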

```rust
request_tx_hash)
.execute(pool)
.await
}
```
Contributor:

Hmm, we may also need a "recovered" state that's the same as completed but triggers if there's a failure that is then manually recovered. No need to handle that in this PR though.

```rust
    .await,
));
}
return Err(RestError::InvalidQueryString);
```
Contributor:

Anyway, I can see the argument the other way too here, so no need to change anything for this PR.
