Commit 1cd6858

Solana. Outbound traffic throttling feature
1 parent 57be4ca commit 1cd6858

17 files changed

+69
-42
lines changed

lib/solana/README.md

Lines changed: 43 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -26,6 +26,16 @@ Solana nodes on AWS can be deployed in 2 different configurations: base RPC and
2626
3. The Solana nodes use all required secrets locally, but can optionally store a copy in [AWS Secrets Manager](https://docs.aws.amazon.com/secretsmanager/latest/userguide/intro.html) as a secure backup.
2727
4. The Solana nodes send various monitoring metrics for both EC2 and Solana nodes to Amazon CloudWatch.
2828

29+
### Optimizing Data Transfer Costs
30+
31+
Solana Agave clients generate significant outbound traffic, ranging from 80 to more than 200 TiB per month in recent years. To manage the associated costs, the blueprint includes an outbound traffic optimization feature that automatically monitors the node and adjusts its outbound bandwidth limit.
32+
33+
The system tracks the node's "Slots Behind" metric once the initial sync is complete. When the metric reaches zero, meaning the node is fully synced, the system applies the user-defined bandwidth limit set in the `SOLANA_LIMIT_OUT_TRAFFIC_MBPS` variable of your `.env` file. If "Slots Behind" exceeds 100, the limit is temporarily removed until the node catches up. The default outbound limit is 20 Mbit/s (~6.5 TB/month), although testing has shown that nodes can stay synchronized at speeds as low as 15 Mbit/s. Inbound bandwidth remains unrestricted.
34+
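The decision rule described above can be sketched as a small shell function. This is an illustration only: the function name, the `keep` outcome for values between 1 and 100, and the handling of unreadable values are assumptions, not taken from the blueprint's scripts.

```bash
#!/usr/bin/env bash
# Hypothetical helper: decide what to do with the outbound limit based on
# the node's "Slots Behind" value. Names and structure are illustrative.

CATCH_UP_THRESHOLD=100   # above this value the limit is lifted (from the README text)

throttle_action() {
  local slots_behind="$1"
  if [[ ! "$slots_behind" =~ ^[0-9]+$ ]]; then
    echo "keep"      # unknown or non-numeric reading: change nothing (assumption)
  elif (( 10#$slots_behind == 0 )); then
    echo "apply"     # fully synced: enforce the configured limit
  elif (( 10#$slots_behind > CATCH_UP_THRESHOLD )); then
    echo "remove"    # falling behind: lift the limit until the node catches up
  else
    echo "keep"      # 1 to 100 slots behind: leave the current state (assumption)
  fi
}

throttle_action 0     # apply
throttle_action 250   # remove
throttle_action 42    # keep
```

Leaving the state unchanged between the two thresholds avoids toggling the limit on every check while the node hovers a few slots behind.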
35+
To keep internal services working, the system exempts internal network traffic from the limit. Traffic to the private and link-local ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16, 169.254.0.0/16) remains unrestricted, so AWS applications that use internal IPs function normally. This optimization can reduce data transfer costs by over 90%.
36+
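One way such an exemption could be built with `iptables` and `tc` is sketched below as a dry run that only prints the commands. The chain name `MARKING` matches the cleanup commands in this README; the firewall mark value, the `htb` layout, the class IDs, and the function name are assumptions and may differ from what `net-rules-start.sh` actually does.

```bash
#!/usr/bin/env bash
# Dry-run sketch: prints the commands instead of running them.
# Mark value, tc class IDs, and the htb layout are assumptions.

INTERNAL_RANGES=(10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 169.254.0.0/16)

print_throttle_rules() {
  local limit_mbps="$1" iface="$2" range
  echo "iptables -t mangle -N MARKING"
  echo "iptables -t mangle -A OUTPUT -j MARKING"
  for range in "${INTERNAL_RANGES[@]}"; do
    # Internal destinations leave the chain unmarked, so they are never shaped
    echo "iptables -t mangle -A MARKING -d ${range} -j RETURN"
  done
  # Everything else is treated as public traffic and gets firewall mark 1
  echo "iptables -t mangle -A MARKING -j MARK --set-mark 1"
  # No default class is set, so unmarked packets bypass shaping
  echo "tc qdisc add dev ${iface} root handle 1: htb"
  echo "tc class add dev ${iface} parent 1: classid 1:1 htb rate ${limit_mbps}mbit"
  # Send packets carrying mark 1 to the rate-limited class
  echo "tc filter add dev ${iface} parent 1: protocol ip handle 1 fw flowid 1:1"
}

print_throttle_rules 20 eth0
```

Printing the commands first makes it possible to review the rule set before applying it as root on a running node.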
37+
This feature is highly effective for RPC nodes but should not be enabled on consensus nodes. Restricting outbound traffic on a consensus node can compromise its performance and its participation in the network.
38+
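The monthly volume that corresponds to a given limit can be checked with a short calculation. The sketch below assumes decimal units (1 Mbit = 10^6 bits, 1 TB = 10^12 bytes), a 30-day month, and a link that runs at the limit the whole time, so the result is an upper bound. The function name is illustrative.

```bash
#!/usr/bin/env bash
# Convert a sustained rate in Mbit/s to terabytes per 30-day month.
# Assumes decimal units and full utilization of the limit (upper bound).

mbps_to_tb_per_month() {
  awk -v mbps="$1" 'BEGIN {
    seconds = 30 * 24 * 3600             # 2,592,000 s in a 30-day month
    bytes   = mbps * 1e6 / 8 * seconds   # bits/s -> bytes/s -> bytes/month
    printf "%.2f\n", bytes / 1e12
  }'
}

mbps_to_tb_per_month 20   # 6.48
mbps_to_tb_per_month 15   # 4.86
```

At 20 Mbit/s the ceiling is about 6.48 TB (roughly 5.9 TiB) per month. Against an unthrottled node sending 100 TB per month, that is a reduction of about 93%, in line with the "over 90%" figure above.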
2939
## Additional materials
3040

3141
<details>
@@ -84,8 +94,8 @@ This is the Well-Architected checklist for Solana nodes implementation of the AW
8494

8595
| Usage pattern | Ideal configuration | Primary option on AWS | Data Transfer Estimates | Config reference |
8696
|---|---|---|---|---|
87-
| 1/ Base RPC node (no secondary indexes) | 48 vCPU, 384 GiB RAM, Accounts volume: EBS gp3, 500GiB, 7K IOPS, 700 MB/s throughput, Data volume: EBS gp3, 2TB, 9K IOPS, 700 MB/s throughput | r7a.12xlarge, Accounts volume: EBS gp3, 500GiB, 7K IOPS, 700 MB/s throughput, Data volume: EBS gp3, 2TB, 9K IOPS, 700 MB/s throughput | 13-15TB/month (no staking) | [.env-sample-baserpc-x86](./sample-configs/.env-sample-baserpc-x86) |
88-
| 2/ Extended RPC node (with all secondary indexes) | 96 vCPU, 768 GiB RAM, Accounts volume: 500GiB, 7K IOPS, 700 MB/s throughput, Data volume: 2TB, 9K IOPS, 700 MB/s throughput | I8g.18xlarge, Accounts volume: Instance Store, Data volume: Instance Store | 20-38TB/month (no staking) | [.env-sample-extendedrpc-arm](./sample-configs/.env-sample-extendedrpc-arm) |
97+
| 1/ Base RPC node (no secondary indexes) | 48 vCPU, 384 GiB RAM, Accounts volume: EBS gp3, 500GiB, 7K IOPS, 700 MB/s throughput, Data volume: EBS gp3, 2TB, 9K IOPS, 700 MB/s throughput | r7a.12xlarge, Accounts volume: EBS gp3, 500GiB, 7K IOPS, 700 MB/s throughput, Data volume: EBS gp3, 2TB, 9K IOPS, 700 MB/s throughput | 100-200TB/month (no staking) | [.env-sample-baserpc-x86](./sample-configs/.env-sample-baserpc-x86) |
98+
| 2/ Extended RPC node (with all secondary indexes) | 96 vCPU, 768 GiB RAM, Accounts volume: 500GiB, 7K IOPS, 700 MB/s throughput, Data volume: 2TB, 9K IOPS, 700 MB/s throughput | r7a.24xlarge, Accounts volume: EBS io2, 500GiB, 10K IOPS, Data volume: EBS io2, 2000GiB, 30K IOPS | 100-200TB/month (no staking) | [.env-sample-extendedrpc-arm](./sample-configs/.env-sample-extendedrpc-arm) |
8999
</details>
90100

91101
## Setup Instructions
@@ -305,6 +315,37 @@ free -g
305315
sudo sysctl vm.swappiness=10
306316
```
307317

318+
6. How can I check the network throttling configuration currently applied to the instance?
319+
320+
```bash
321+
# Check the iptables mangle table
322+
iptables -t mangle -L -n -v
323+
324+
# Detect the name of the first non-loopback network interface
325+
INTERFACE=$(ip -br addr show | grep -v '^lo' | awk '{print $1}' | head -n1)
326+
327+
# Check traffic control (tc) tool configuration
328+
tc qdisc show
329+
330+
# Show traffic statistics for the interface
331+
tc -s qdisc ls dev $INTERFACE
332+
333+
# Monitor bandwidth in real-time
334+
iftop -i $INTERFACE
335+
```
336+
337+
7. How can I manually remove all iptables and tc rules?
338+
339+
```bash
340+
# Remove tc rules (set INTERFACE first, as in the previous answer)
341+
tc qdisc del dev $INTERFACE root
342+
343+
# Remove iptables rules
344+
iptables -t mangle -D OUTPUT -j MARKING
345+
iptables -t mangle -F MARKING
346+
iptables -t mangle -X MARKING
347+
```
348+
308349
## Upgrades
309350

310351
When nodes need to be upgraded or downgraded, [use the blue/green pattern](https://aws.amazon.com/blogs/devops/performing-bluegreen-deployments-with-aws-codedeploy-and-auto-scaling-groups/). This is not yet automated and contributions are welcome!

lib/solana/app.ts

Lines changed: 3 additions & 22 deletions
@@ -21,34 +21,15 @@ new SolanaSingleNodeStack(app, "solana-single-node", {
2121
stackName: `solana-single-node-${config.baseNodeConfig.nodeConfiguration}`,
2222
env: { account: config.baseConfig.accountId, region: config.baseConfig.region },
2323

24-
instanceType: config.baseNodeConfig.instanceType,
25-
instanceCpuType: config.baseNodeConfig.instanceCpuType,
26-
solanaCluster: config.baseNodeConfig.solanaCluster,
27-
solanaVersion: config.baseNodeConfig.solanaVersion,
28-
nodeConfiguration: config.baseNodeConfig.nodeConfiguration,
29-
dataVolume: config.baseNodeConfig.dataVolume,
30-
accountsVolume: config.baseNodeConfig.accountsVolume,
31-
solanaNodeIdentitySecretARN: config.baseNodeConfig.solanaNodeIdentitySecretARN,
32-
voteAccountSecretARN: config.baseNodeConfig.voteAccountSecretARN,
33-
authorizedWithdrawerAccountSecretARN: config.baseNodeConfig.authorizedWithdrawerAccountSecretARN,
34-
registrationTransactionFundingAccountSecretARN: config.baseNodeConfig.registrationTransactionFundingAccountSecretARN,
24+
...config.baseNodeConfig
3525
});
3626

3727
new SolanaHANodesStack(app, "solana-ha-nodes", {
3828
stackName: `solana-ha-nodes-${config.baseNodeConfig.nodeConfiguration}`,
3929
env: { account: config.baseConfig.accountId, region: config.baseConfig.region },
4030

41-
instanceType: config.baseNodeConfig.instanceType,
42-
instanceCpuType: config.baseNodeConfig.instanceCpuType,
43-
solanaCluster: config.baseNodeConfig.solanaCluster,
44-
solanaVersion: config.baseNodeConfig.solanaVersion,
45-
nodeConfiguration: config.baseNodeConfig.nodeConfiguration,
46-
dataVolume: config.baseNodeConfig.dataVolume,
47-
accountsVolume: config.baseNodeConfig.accountsVolume,
48-
49-
albHealthCheckGracePeriodMin: config.haNodeConfig.albHealthCheckGracePeriodMin,
50-
heartBeatDelayMin: config.haNodeConfig.heartBeatDelayMin,
51-
numberOfNodes: config.haNodeConfig.numberOfNodes,
31+
...config.baseNodeConfig,
32+
...config.haNodeConfig,
5233
});
5334

5435

lib/solana/lib/assets/instance/network/net-rules-start.sh

Lines changed: 1 addition & 2 deletions
@@ -6,8 +6,7 @@ if [ -n "$1" ]; then
66
else
77
echo "Warning: Specify max value for outbound data traffic in Mbps."
88
echo "Usage: net-rules.sh <max_bandwidth_mbps>"
9-
echo "Default is 26"
10-
LIMIT_OUT_TRAFFIC_MBPS=26
9+
exit 1;
1110
fi
1211

1312
# Step 1: Create an iptables rule to mark packets going to public IPs

lib/solana/lib/assets/instance/network/net-rules-stop.sh

Lines changed: 3 additions & 1 deletion
@@ -1,7 +1,9 @@
11
#!/bin/bash
22

3+
INTERFACE=$(ip -br addr show | grep -v '^lo' | awk '{print $1}' | head -n1)
4+
35
# Remove tc rules
4-
/usr/sbin/tc qdisc del dev eth0 root
6+
/usr/sbin/tc qdisc del dev $INTERFACE root
57

68
# Remove iptables rules
79
/usr/sbin/iptables -t mangle -D OUTPUT -j MARKING
Lines changed: 2 additions & 3 deletions
@@ -1,5 +1,5 @@
11
[Unit]
2-
Description=iptables and Traffic Control Rules
2+
Description="iptables and Traffic Control Rules"
33
After=network.target
44

55
[Service]
@@ -9,5 +9,4 @@ ExecStart=/opt/instance/network/net-rules-start.sh _LIMIT_OUT_TRAFFIC_MBPS_
99
ExecStop=/opt/instance/network/net-rules-stop.sh
1010

1111
[Install]
12-
WantedBy=multi-user.target
13-
EOF
12+
WantedBy=multi-user.target

lib/solana/lib/assets/instance/network/net-sync-checker.timer

Lines changed: 2 additions & 2 deletions
@@ -1,8 +1,8 @@
11
[Unit]
2-
Description="Run Network Sync checker service every 5 min"
2+
Description="Run Network Sync checker service every 1 min"
33

44
[Timer]
5-
OnCalendar=*:*:0/5
5+
OnCalendar=*:0/1
66
Unit=net-sync-checker.service
77

88
[Install]

lib/solana/lib/assets/instance/network/net-syncchecker.sh

Lines changed: 1 addition & 0 deletions
@@ -5,6 +5,7 @@ INIT_COMPLETED_FILE=/data/data/init-completed
55
TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
66
EC2_INTERNAL_IP=$(curl -H "X-aws-ec2-metadata-token: $TOKEN" -s http://169.254.169.254/latest/meta-data/local-ipv4)
77

8+
# Start checking the node's sync status only after the initial sync has finished
89
if [ -f "$INIT_COMPLETED_FILE" ]; then
910
SOLANA_SLOTS_BEHIND_DATA=$(curl -s -X POST -H "Content-Type: application/json" -d ' {"jsonrpc":"2.0","id":1, "method":"getHealth"}' http://$EC2_INTERNAL_IP:8899 | jq .error.data)
1011
SOLANA_SLOTS_BEHIND=$(echo $SOLANA_SLOTS_BEHIND_DATA | jq .numSlotsBehind -r)

lib/solana/lib/assets/instance/network/setup.sh

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ if [ -n "$1" ]; then
66
else
77
echo "Warning: Specify max value for outbound data traffic in Mbps."
88
echo "Usage: instance/network/setup.sh <max_bandwidth_mbps>"
9-
LIMIT_OUT_TRAFFIC_MBPS=26
9+
exit 1;
1010
fi
1111

1212
INTERFACE=$(ip -br addr show | grep -v '^lo' | awk '{print $1}' | head -n1)

lib/solana/lib/assets/user-data-ubuntu.sh

Lines changed: 1 addition & 1 deletion
@@ -128,7 +128,7 @@ systemctl restart amazon-cloudwatch-agent
128128

129129
systemctl daemon-reload
130130

131-
if [[ "$LIMIT_OUT_TRAFFIC_MBPS" == "true" ]]; then
131+
if [[ "$LIMIT_OUT_TRAFFIC_MBPS" -gt 0 ]]; then
132132
echo "Limiting out traffic"
133133
/opt/instance/network/setup.sh "$LIMIT_OUT_TRAFFIC_MBPS"
134134
fi
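
The `-gt 0` test above evaluates its operand as an arithmetic expression, so a non-numeric value such as `true` is read as an unset variable name (0), and a value such as `15.5` produces a syntax error. A stricter guard could look like the sketch below. The helper name is hypothetical and the function is not part of the blueprint's scripts.

```bash
#!/usr/bin/env bash
# Hypothetical guard: treat the limit as enabled only when it is a
# positive whole number. Not part of the blueprint's scripts.

is_positive_int() {
  # Digits only, then compare in base 10 so a leading zero is not read as octal
  [[ "$1" =~ ^[0-9]+$ ]] && (( 10#$1 > 0 ))
}

for value in 20 0 "" true 15.5; do
  if is_positive_int "$value"; then
    echo "'$value' -> throttle at ${value} Mbit/s"
  else
    echo "'$value' -> throttling disabled"
  fi
done
```

With this guard, an empty, zero, or malformed value disables throttling quietly instead of relying on how the shell evaluates the string.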

lib/solana/lib/config/node-config.ts

Lines changed: 1 addition & 1 deletion
@@ -52,7 +52,7 @@ export const baseNodeConfig: configTypes.SolanaBaseNodeConfig = {
5252
voteAccountSecretARN: process.env.SOLANA_VOTE_ACCOUNT_SECRET_ARN || "none",
5353
authorizedWithdrawerAccountSecretARN: process.env.SOLANA_AUTHORIZED_WITHDRAWER_ACCOUNT_SECRET_ARN || "none",
5454
registrationTransactionFundingAccountSecretARN: process.env.SOLANA_REGISTRATION_TRANSACTION_FUNDING_ACCOUNT_SECRET_ARN || "none",
55-
limitOutTrafficMbps: process.env.SOLANA_LIMIT_OUT_TRAFFIC_MBPS ? parseInt(process.env.SOLANA_LIMIT_OUT_TRAFFIC_MBPS) : 25,
55+
limitOutTrafficMbps: process.env.SOLANA_LIMIT_OUT_TRAFFIC_MBPS ? parseInt(process.env.SOLANA_LIMIT_OUT_TRAFFIC_MBPS) : 20,
5656
};
5757

5858
export const haNodeConfig: configTypes.SolanaHAConfig = {
