Persistent to Hasql #1938

Open
Cmdv wants to merge 21 commits into master from Persistent-to-Hasql

@Cmdv Cmdv commented Jan 30, 2025

Hasql Migration

Summary

The migration from Persistent to Hasql cut epoch processing time by roughly 3.5x, with some gains in cache efficiency.

Processing Time Comparison

Before (Persistent)

  • Epoch 432: 29 minutes
  • Epoch 433: 32 minutes
  • Average: ~30.5 minutes per epoch

After (Hasql)

  • Epoch 441: 8m 14s
  • Epoch 442: 9m 19s
  • Average: ~8.8 minutes per epoch

Improvement: ~71% reduction in processing time (about 3.5x faster)
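
The averages and the headline ratio follow directly from the four epoch timings listed above. A minimal Haskell sketch of that arithmetic, for anyone who wants to re-derive the figures (all helper names are illustrative and not part of this PR):

```haskell
-- Convert a duration given in minutes and seconds to fractional minutes.
toMinutes :: Int -> Int -> Double
toMinutes m s = fromIntegral m + fromIntegral s / 60

-- Arithmetic mean of a non-empty list.
mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

-- Average epoch time before the migration (epochs 432 and 433).
beforeAvg :: Double
beforeAvg = mean [toMinutes 29 0, toMinutes 32 0]

-- Average epoch time after the migration (epochs 441 and 442).
afterAvg :: Double
afterAvg = mean [toMinutes 8 14, toMinutes 9 19]

-- How many times faster the new code is.
speedup :: Double
speedup = beforeAvg / afterAvg

-- Fraction of processing time saved.
reduction :: Double
reduction = (beforeAvg - afterAvg) / beforeAvg

main :: IO ()
main = do
  putStrLn ("average before (min): " <> show beforeAvg)
  putStrLn ("average after  (min): " <> show afterAvg)
  putStrLn ("speedup:              " <> show speedup)
  putStrLn ("reduction:            " <> show reduction)
```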

Resource Utilisation

Database Connections

  • Before: ORM connection pooling with abstraction overhead
  • After: Direct PostgreSQL connection management
  • Benefit: Reduced connection overhead and more efficient resource usage

Technical Factors Contributing to Performance

1. Eliminated ORM Overhead

  • No runtime type reflection
  • No query generation overhead
  • Direct SQL statement preparation

2. Optimised SQL Generation

  • Hand-crafted SQL statements
  • Better query plans
  • Reduced parsing overhead

3. Type-Safe Operations

  • Compile-time validation reduces runtime checks
  • Efficient encoder/decoder patterns
  • Zero-copy data transformations where possible

4. Connection Management

  • Direct PostgreSQL protocol usage
  • Reduced middleware layers
  • Better connection reuse

Epoch Context

Based on Cardano network data, each epoch processes approximately 21,600 blocks over a 5-day period. The performance improvement scales linearly with blockchain growth, making this migration critical for long-term sustainability.

Conclusion

The Hasql migration is a significant infrastructure improvement, providing both immediate performance gains and better long-term scalability. The roughly 71% reduction in processing time positions the system to handle increased blockchain throughput.

@Cmdv Cmdv self-assigned this Jan 30, 2025
@Cmdv Cmdv force-pushed the Persistent-to-Hasql branch from f80c67b to 96182d8 Compare March 14, 2025 09:16
@Cmdv Cmdv force-pushed the Persistent-to-Hasql branch 2 times, most recently from 89285ee to 2db7312 Compare April 3, 2025 12:16
@Cmdv Cmdv force-pushed the Persistent-to-Hasql branch from 7c34694 to f434016 Compare May 8, 2025 15:09
@Cmdv Cmdv force-pushed the Persistent-to-Hasql branch 5 times, most recently from a67562c to 94f48af Compare June 30, 2025 22:20
@Cmdv Cmdv marked this pull request as ready for review June 30, 2025 22:21
@Cmdv Cmdv requested a review from a team as a code owner June 30, 2025 22:21
@Cmdv Cmdv force-pushed the Persistent-to-Hasql branch 5 times, most recently from cdab159 to 8ce46c6 Compare July 11, 2025 12:41
@Cmdv Cmdv requested a review from a team as a code owner July 11, 2025 12:41
@Cmdv Cmdv force-pushed the Persistent-to-Hasql branch 5 times, most recently from 08d56ec to f237b25 Compare July 11, 2025 18:15
@Cmdv Cmdv mentioned this pull request Jul 16, 2025
9 tasks
Cmdv added a commit that referenced this pull request Jul 17, 2025
@Cmdv Cmdv force-pushed the Persistent-to-Hasql branch 2 times, most recently from ec1082f to fe197d6 Compare July 17, 2025 11:11

@kderme kderme left a comment

I did a pass of the cardano-db-sync diffs

- then logWarning trce $ Db.renderMigrationValidateError unknown
- else logError trce $ Db.renderMigrationValidateError unknown
+ then logWarning trce $ DB.renderMigrationValidateError unknown
+ else logError trce $ DB.renderMigrationValidateError unknown

logInfo trce "Schema migration files validated"

let runMigration mode = do

Avoid duplication between runMigration and runIndexesMigration. This can be done by defining a single function in runDbSyncNode with type RunMigration and passing it to both functions.
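
One way to read this suggestion, as a rough sketch only: build the migration runner once and hand the same value to every caller. `MigrationMode`, `mkRunMigration`, `syncStartup` and `syncNearTip` are hypothetical stand-ins, and the `RunMigration` alias here is a guess at the shape rather than the definition in cardano-db-sync.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for the real migration mode type.
data MigrationMode = Initial | Indexes
  deriving (Eq, Show)

-- One shared shape for "run the migrations in this mode" (assumed).
type RunMigration = MigrationMode -> IO ()

-- Built once, in the sketch's equivalent of runDbSyncNode.
-- The IORef stands in for real logging and database work.
mkRunMigration :: IORef [String] -> RunMigration
mkRunMigration logRef mode =
  modifyIORef' logRef (++ ["migrating: " <> show mode])

-- Both callers receive the same function instead of defining their own.
syncStartup :: RunMigration -> IO ()
syncStartup run = run Initial

syncNearTip :: RunMigration -> IO ()
syncNearTip run = run Indexes

main :: IO ()
main = do
  logRef <- newIORef []
  let run = mkRunMigration logRef
  syncStartup run
  syncNearTip run
  readIORef logRef >>= mapM_ putStrLn
```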

isFetchError :: OffChainVoteResult -> Bool
isFetchError = \case
  OffChainVoteResultMetadata {} -> False
  OffChainVoteResultError {} -> True

processResultsBatched :: MonadIO m => [OffChainVoteResult] -> DB.DbAction m ()

Good idea to use pipeline here. Personal note to review it again.

pure False
mBlockNo <- lift $ DB.queryBlockHashBlockNo (SBS.fromShort . getOneEraHash $ blockPointHash blk)
case mBlockNo of
  Nothing -> throwError $ SNErrRollback "Rollback.prepareRollback: queryBlockHashBlockNo: Block hash not found"

In general the whole purpose of ExceptT is to have better "flow" and to avoid case patterns for errors. I think we lose this in many places.
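
As an illustration of the "flow" point, a small helper can turn a `Maybe`-returning query into an `ExceptT` step, so the happy path reads straight down with no `case`. This is a sketch using `Control.Monad.Trans.Except` from transformers; `SyncError`, `liftLookup` and `queryBlockNo` are hypothetical stand-ins, not names from this PR.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Except (ExceptT, runExceptT, throwE)

-- Hypothetical error type standing in for the sync node's error type.
newtype SyncError = SyncError String
  deriving (Eq, Show)

-- Run a Maybe-returning action; fail with the given error on Nothing.
liftLookup :: Monad m => e -> m (Maybe a) -> ExceptT e m a
liftLookup err action = lift action >>= maybe (throwE err) pure

-- Hypothetical stand-in for a database lookup by block hash.
queryBlockNo :: String -> IO (Maybe Int)
queryBlockNo hash = pure (lookup hash [("abc", 42)])

-- No case expression: a missing block short-circuits via ExceptT.
prepareRollback :: String -> ExceptT SyncError IO Int
prepareRollback hash =
  liftLookup (SyncError "Block hash not found") (queryBlockNo hash)

main :: IO ()
main = do
  runExceptT (prepareRollback "abc") >>= print
  runExceptT (prepareRollback "missing") >>= print
```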

@Cmdv Cmdv force-pushed the Persistent-to-Hasql branch from 6ad7d4d to 87c8789 Compare July 30, 2025 21:46

-- Parallel preparation of independent data
(preparedTxIn, preparedMetadata, preparedMint, txOutChunks) <- liftIO $ do
  a1 <- async $ pure $ prepareTxInProcessing syncEnv grouped

This spawns threads, but they do so little work that I think it's not worth it. Due to laziness, they may even do no work at all.
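
To make the laziness concern concrete, here is a sketch using `forkIO` and `MVar` from base as a stand-in for `async` (an assumption made to keep the example dependency-free), with `force` from deepseq. Handing a pure value to a worker thread only passes along an unevaluated thunk; the worker does real work only if it forces the value before returning it. `spawnLazy` and `spawnForced` are illustrative names.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.DeepSeq (NFData, force)
import Control.Exception (evaluate)

-- Lazy: the worker thread only stores an unevaluated thunk in the MVar.
-- The real work happens later, in whichever thread demands the result.
spawnLazy :: a -> IO (MVar a)
spawnLazy x = do
  var <- newEmptyMVar
  _ <- forkIO (putMVar var x)
  pure var

-- Forced: the worker evaluates the value to normal form before handing
-- it over, so the computation actually runs off the calling thread.
-- (If evaluation throws, the MVar is never filled; real code should
-- catch the exception and report it.)
spawnForced :: NFData a => a -> IO (MVar a)
spawnForced x = do
  var <- newEmptyMVar
  _ <- forkIO (evaluate (force x) >>= putMVar var)
  pure var

main :: IO ()
main = do
  lazyVar <- spawnLazy (sum [1 .. 1000000 :: Int])
  forcedVar <- spawnForced (sum [1 .. 1000000 :: Int])
  a <- takeMVar lazyVar   -- returns a thunk; the sum runs when `a` is demanded
  b <- takeMVar forcedVar -- the worker thread already computed this sum
  print (a == b)
```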

@Cmdv Cmdv force-pushed the Persistent-to-Hasql branch 3 times, most recently from b2df816 to 80cc82b Compare August 7, 2025 13:56
@Cmdv Cmdv force-pushed the Persistent-to-Hasql branch 3 times, most recently from 779de8b to 54e8f11 Compare August 15, 2025 22:20
@Cmdv Cmdv force-pushed the Persistent-to-Hasql branch from d1362d8 to 1e41b5a Compare August 19, 2025 12:36
@Cmdv Cmdv force-pushed the Persistent-to-Hasql branch from 1e41b5a to 1143664 Compare August 19, 2025 13:19
@Cmdv Cmdv force-pushed the Persistent-to-Hasql branch from f3b215c to 1fb3c0e Compare August 20, 2025 13:56
maTxOutIdAddressToText _ = "" -- Skip non-variant IDs

--------------------------------------------------------------------------------
textToMinIds :: TxOutVariantType -> Text -> Maybe MinIdsWrapper

This doesn't match minIdsCoreToText. We should revert to the old parser in Cardano.Db.Operations.Other.MinId, except for toSqlKey.
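
A round-trip check is a cheap way to catch this kind of renderer/parser mismatch. The sketch below uses a simplified, hypothetical `MinIds` record and an assumed colon-separated text format; neither is taken from the actual cardano-db-sync encoding. The property it checks is that parsing a rendered value returns the original.

```haskell
import Data.List (intercalate)
import Text.Read (readMaybe)

-- Hypothetical, simplified stand-in for the real MinIds type.
data MinIds = MinIds
  { minTxInId :: Maybe Int
  , minTxOutId :: Maybe Int
  , minMaTxOutId :: Maybe Int
  }
  deriving (Eq, Show)

-- Render as three colon-separated fields; a missing id is an empty field.
minIdsToText :: MinIds -> String
minIdsToText (MinIds a b c) = intercalate ":" (map (maybe "" show) [a, b, c])

-- Split a string on a separator character, keeping empty fields.
splitOn :: Char -> String -> [String]
splitOn sep s = case break (== sep) s of
  (chunk, []) -> [chunk]
  (chunk, _ : rest) -> chunk : splitOn sep rest

-- Parse the format produced by minIdsToText; Nothing on malformed input.
textToMinIds :: String -> Maybe MinIds
textToMinIds txt = case splitOn ':' txt of
  [a, b, c] -> MinIds <$> field a <*> field b <*> field c
  _ -> Nothing
  where
    field "" = Just Nothing
    field s = Just <$> readMaybe s

-- The property the parser must satisfy for every value.
roundTrips :: MinIds -> Bool
roundTrips m = textToMinIds (minIdsToText m) == Just m

main :: IO ()
main = print (all roundTrips samples)
  where
    samples =
      [ MinIds Nothing Nothing Nothing
      , MinIds (Just 1) Nothing (Just 3)
      , MinIds (Just 10) (Just 20) (Just 30)
      ]
```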

updateBlockMetrics :: IO ()
updateBlockMetrics = do
  let metricsSetters = envMetricSetters syncEnv
  void $ async $ DB.runDbDirectLogged (fromMaybe mempty $ DB.dbTracer $ envDbEnv syncEnv) (envDbEnv syncEnv) $ do

Avoid using async here, or run it on a different connection taken from the connection pool.
