[epic] Create user-request load testing framework #2

@mollykarcher

What problem does your feature solve?

We've built load testing infrastructure to validate that RPC can handle increased network load (i.e., TPS). However, we don't have any tools that do something similar for user-request volume. Even setting load concerns aside, our release process for RPC lacks any validation against real-world-like traffic (as we have for Horizon): we don't run a pubnet RPC internally, so we have little or no traffic to mirror in order to gain confidence.

What would you like to see?

This issue should probably become an epic, since whichever approach we choose will likely involve a lot of work. The work would include:

  • A client load generation/simulation tool that can be run against RPC
  • An assessment of the existing RPC observability/metrics/dashboards, augmenting them with any additional metrics needed to accurately assess functional or non-functional performance degradations under different client request loads
  • Integrating this tool into our release process in some way. More automation is preferred here: manually running it during a release would probably be acceptable, but it would be nicer to have it running constantly against the prod dev environment, so that we can catch changes across deployments
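To make the first item above more concrete, here is a minimal sketch of what a client load generator could look like. It is not a proposed design: the names `send_jsonrpc`, `run_load`, and `summarize` are hypothetical, and it assumes only that RPC accepts JSON-RPC 2.0 requests over HTTP POST. The request function is injected so the load loop can be exercised without a live endpoint.

```python
import json
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Result:
    ok: bool          # True if the request completed without an error
    latency_s: float  # wall-clock time spent on the request, in seconds


def send_jsonrpc(url: str, method: str, timeout_s: float = 5.0) -> bool:
    """POST one JSON-RPC 2.0 request; True if the reply has no "error" field.

    Assumption: the target speaks JSON-RPC 2.0 over HTTP POST and the
    method takes no parameters.
    """
    body = json.dumps({"jsonrpc": "2.0", "id": 1, "method": method}).encode()
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        return "error" not in json.load(resp)


def run_load(send: Callable[[], bool], total: int, concurrency: int) -> List[Result]:
    """Issue `total` requests using up to `concurrency` worker threads."""

    def one(_: int) -> Result:
        start = time.perf_counter()
        try:
            ok = send()
        except Exception:
            # Timeouts, refused connections, etc. count as failed requests.
            ok = False
        return Result(ok, time.perf_counter() - start)

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(one, range(total)))


def summarize(results: List[Result]) -> dict:
    """Reduce raw results to an error rate and a nearest-rank p95 latency."""
    if not results:
        raise ValueError("no results to summarize")
    latencies = sorted(r.latency_s for r in results)
    rank = (95 * len(latencies) + 99) // 100  # ceil(0.95 * n) in integer math
    errors = sum(1 for r in results if not r.ok)
    return {
        "requests": len(results),
        "error_rate": errors / len(results),
        "p95_s": latencies[rank - 1],
    }
```

A run against a local instance might look like `summarize(run_load(lambda: send_jsonrpc("http://localhost:8000", "getHealth"), total=1000, concurrency=20))`, where the URL is a placeholder and `getHealth` stands in for whatever request mix we decide is representative. A real tool would also need a weighted mix of methods, pacing to a target request rate, and export of results to our metrics stack.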

What alternatives are there?

Not necessarily alternatives, but some other niceties/add-ons that we may want to consider as part of this:

  • Migrating to true continuous deployment for RPC (potentially using ArgoCD Rollouts), and setting up automated rollbacks/promotions based on changes in key metric values (assuming constant traffic from the load testing tool hitting the deployments)
  • Open-sourcing/productionalizing the tool, such that RPC providers or operators could use it to test their own deployments and know when they might need to horizontally scale their setups
  • Deploying a separate/dedicated instance of RPC specifically for load-testing, independent of our current dev cluster. Perhaps this is just repurposing/renaming the new io2 instance we just spun up
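As an illustration of the first item above (automated rollbacks/promotions driven by key metrics), the decision could be as simple as comparing a candidate deployment against a baseline while both receive constant traffic from the load tool. The sketch below is hypothetical: the `Metrics` fields, the `decide` function, and both thresholds are assumptions chosen for illustration, and it does not use or mirror any rollout tool's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    error_rate: float     # fraction of failed requests, 0.0 to 1.0
    p95_latency_s: float  # 95th percentile request latency, in seconds


def decide(
    baseline: Metrics,
    candidate: Metrics,
    max_error_rate_increase: float = 0.01,  # hypothetical tolerance
    max_latency_ratio: float = 1.2,         # hypothetical tolerance
) -> str:
    """Return "rollback" if the candidate regressed past either tolerance,
    otherwise "promote"."""
    if candidate.error_rate - baseline.error_rate > max_error_rate_increase:
        return "rollback"
    # Skip the ratio check when the baseline latency is zero to avoid
    # dividing by zero.
    if (
        baseline.p95_latency_s > 0
        and candidate.p95_latency_s / baseline.p95_latency_s > max_latency_ratio
    ):
        return "rollback"
    return "promote"
```

Which metrics count as "key" and what tolerances are acceptable would come out of the observability assessment described earlier in this issue.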


Status: In Progress
