|
| 1 | +# Sample AWS Blockchain Node Runner app for Tezos Nodes |
| 2 | + |
| 3 | +| Contributed by | |
| 4 | +|:--------------------:| |
| 5 | +| [@AhGhanima](https://github.com/AhGhanima), [@chrisdotn](https://github.com/chrisdotn) | |
| 6 | + |
| 7 | +## Architecture Overview |
| 8 | + |
| 9 | +This blueprint offers two options for running nodes: a single JSON RPC node, or multiple nodes in a highly available setup. Details for each follow. |
| 10 | + |
| 11 | +### Single RPC node setup |
| 12 | + |
| 13 | + |
| 14 | +This setup is for small-scale PoC or development environments. It deploys a single EC2 instance with the Tezos client. The RPC port is exposed only to the internal IP range of the VPC, while the P2P ports allow external access to keep the client synced. |
| 15 | + |
| 16 | +### Highly available setup |
| 17 | + |
| 18 | + |
| 19 | +1. A sync node and the RPC nodes continuously synchronize data with nodes in the Tezos network. |
| 20 | +2. The sync node is used to create a copy of the node's state data in an Amazon S3 bucket. |
| 21 | +3. When new RPC nodes are provisioned, they copy the state data from the Amazon S3 bucket to speed up the initial sync process. |
| 22 | +4. Applications and smart contract development tools access highly available RPC nodes behind the Application Load Balancer. |
| 23 | + |
| 24 | + |
| 25 | +## Solution Walkthrough |
| 26 | + |
| 27 | +### Set up Cloud9 |
| 28 | + |
| 29 | +We will use AWS Cloud9 to execute the subsequent commands. Follow the instructions in [Cloud9 Setup](../../docs/setup-cloud9.md). |
| 30 | + |
| 31 | +### Clone this repository and install dependencies |
| 32 | + |
| 33 | +```bash |
| 34 | + git clone https://github.com/aws-samples/aws-blockchain-node-runners.git |
| 35 | + cd aws-blockchain-node-runners |
| 36 | + npm install |
| 37 | +``` |
| 38 | + |
| 39 | +**NOTE:** In this tutorial we set all major configuration through environment variables, but you can also modify parameters in `config/config.ts`. |
| 40 | + |
| 41 | +### Prepare to deploy nodes |
| 42 | + |
| 43 | +1. Make sure you are in the root directory of the cloned repository |
| 44 | + |
| 45 | +2. If you have deleted the default VPC or never had one, create it: |
| 46 | + |
| 47 | +```bash |
| 48 | + aws ec2 create-default-vpc |
| 49 | + ``` |
| 50 | + |
| 51 | + **NOTE:** You may see the following error if the default VPC already exists: `An error occurred (DefaultVpcAlreadyExists) when calling the CreateDefaultVpc operation: A Default VPC already exists for this account in this region.`. That means you can just continue with the following steps. |
| 52 | + |
| 53 | + **NOTE:** The default VPC must have at least two public subnets in different Availability Zones, and each public subnet must have `Auto-assign public IPv4 address` set to `YES`. |
| 54 | + |
| 55 | +3. Configure your setup |
| 56 | + |
| 57 | +Create your own copy of the `.env` file and edit it: |
| 58 | +```bash |
| 59 | + # Make sure you are in aws-blockchain-node-runners/lib/tezos |
| 60 | + cd lib/tezos |
| 61 | + pwd |
| 62 | + cp ./sample-configs/.env-sample-full .env |
| 63 | + nano .env |
| 64 | +``` |
| 65 | + **NOTE:** You can find more examples inside the `sample-configs` directory. |
| 66 | + |
| 67 | + |
| 68 | +4. Deploy common components, such as the IAM role and the Amazon S3 bucket that stores data snapshots |
| 69 | + |
| 70 | +```bash |
| 71 | + pwd |
| 72 | + # Make sure you are in aws-blockchain-node-runners/lib/tezos |
| 73 | + npx cdk deploy tz-common |
| 74 | +``` |
| 75 | + |
| 76 | +### Option 1: Single RPC Node |
| 77 | + |
| 78 | +1. Deploy Single RPC Node |
| 79 | + |
| 80 | +```bash |
| 81 | + pwd |
| 82 | + # Make sure you are in aws-blockchain-node-runners/lib/tezos |
| 83 | + npx cdk deploy tz-single-node --json --outputs-file single-node-deploy.json |
| 84 | +``` |
| 85 | + **NOTE:** The default VPC must have at least two public subnets in different Availability Zones, and each public subnet must have `Auto-assign public IPv4 address` set to `YES`. |
| 86 | + |
| 87 | + The EC2 instance will deploy, initialize the node, and start the first sync. In CloudFormation, the instance shows as successful once the node is running, but it still takes a while for the node to sync with the blockchain. You can check the sync status with the REST call shown later in this section. If `curl` cannot connect to the node on port 8732, the node is still importing; once the import is done, the `curl` command works. |
| 88 | + |
| 89 | +2. After starting the node, wait for the initial synchronization process to finish. It may take from an hour to half a day, depending on the state of the network. You can use Amazon CloudWatch to track the progress. To view the metrics: |
| 90 | + |
| 91 | + - Navigate to [CloudWatch service](https://console.aws.amazon.com/cloudwatch/) (make sure you are in the region you have specified for `AWS_REGION`) |
| 92 | + - Open `Dashboards` and select `tz-single-node-<type>-<network>` from the list of dashboards. |
| 93 | + |
| 94 | +3. Once the initial synchronization is done, you should be able to access the RPC API of that node from within the same VPC. The RPC port is not exposed to the Internet. Run the following query against the private IP of the single RPC node you deployed: |
| 95 | + |
| 96 | +```bash |
| 97 | + INSTANCE_ID=$(cat single-node-deploy.json | jq -r '..|.singleinstanceid? | select(. != null)') |
| 98 | + NODE_INTERNAL_IP=$(aws ec2 describe-instances --instance-ids $INSTANCE_ID --query 'Reservations[*].Instances[*].PrivateIpAddress' --output text) |
| 99 | + |
| 100 | + # We query if the node is synced to main |
| 101 | + curl http://$NODE_INTERNAL_IP:8732/chains/main/is_bootstrapped |
| 102 | +``` |
| 103 | + |
| 104 | +The result should look like this: |
| 105 | + |
| 106 | +```javascript |
| 107 | + {"bootstrapped":true,"sync_state":"synced"} |
| 108 | +``` |
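Because the initial sync can take hours, it can be convenient to poll the endpoint instead of rerunning `curl` by hand. The following is a minimal sketch and not part of the blueprint: `is_synced` and `wait_for_sync` are hypothetical helper names, and the 60-second default interval is an arbitrary choice. It assumes the node's private IP has been looked up as in the query above.

```bash
# Hypothetical helper: succeeds (exit code 0) only when the JSON body
# from /chains/main/is_bootstrapped reports a bootstrapped, synced node.
is_synced() {
    printf '%s' "$1" | grep -q '"bootstrapped":[[:space:]]*true' &&
    printf '%s' "$1" | grep -q '"sync_state":[[:space:]]*"synced"'
}

# Hypothetical helper: poll the node until is_synced succeeds.
# $1 = private IP of the node, $2 = seconds between attempts (default 60).
wait_for_sync() {
    local url="http://$1:8732/chains/main/is_bootstrapped"
    local interval="${2:-60}"
    # An unreachable node yields an empty body, which counts as "not synced".
    until is_synced "$(curl -s --max-time 10 "$url")"; do
        echo "Node not synced yet, retrying in ${interval}s..."
        sleep "$interval"
    done
    echo "Node is synced."
}

# Usage (runs until the node reports synced):
#   wait_for_sync "$NODE_INTERNAL_IP" 60
```

The check uses `grep` rather than `jq` only to keep the sketch dependency-free; a `jq` filter would be a stricter way to parse the response.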
| 109 | + |
| 110 | +### Option 2: Highly Available RPC Nodes |
| 111 | + |
| 112 | +1. Deploy Snapshot Node |
| 113 | + |
| 114 | +```bash |
| 115 | + pwd |
| 116 | + # Make sure you are in aws-blockchain-node-runners/lib/tezos |
| 117 | + npx cdk deploy tz-snapshot-node --json --outputs-file sync-node-deploy.json |
| 118 | +``` |
| 119 | + **NOTE:** The default VPC must have at least two public subnets in different Availability Zones, and each public subnet must have `Auto-assign public IPv4 address` set to `YES`. |
| 120 | + |
| 121 | +2. After starting the node, wait for the initial synchronization process to finish. It may take from an hour to half a day, depending on the state of the network. You can use Amazon CloudWatch to track the progress. To view the metrics: |
| 122 | + |
| 123 | + - Navigate to [CloudWatch service](https://console.aws.amazon.com/cloudwatch/) (make sure you are in the region you have specified for `AWS_REGION`) |
| 124 | + - Open `Dashboards` and select `tz-snapshot-node-<type>-<network>` from the list of dashboards. |
| 125 | + |
| 126 | +Once the synchronization process is over, the script automatically stops the client and copies all the contents of the `/data` directory to your snapshot S3 bucket. That may take from 30 minutes to about 2 hours. During this process, the dashboard shows lower CPU and RAM utilization but high data disk throughput and outbound network traffic. The script automatically restarts the client after the copy is done. |
| 127 | + |
| 128 | +Note: the snapshot backup process runs automatically every day at midnight in the time zone where the sync node runs. To change the schedule, modify the `crontab` of the root user on the node's EC2 instance. |
| 129 | + |
| 130 | +3. Configure and deploy 2 RPC Nodes |
| 131 | + |
| 132 | +```bash |
| 133 | + pwd |
| 134 | + # Make sure you are in aws-blockchain-node-runners/lib/tezos |
| 135 | + npx cdk deploy tz-ha-nodes --json --outputs-file rpc-node-deploy.json |
| 136 | +``` |
| 137 | + |
| 138 | +4. Give the new RPC nodes about an hour to initialize, then run the following query against the load balancer in front of the RPC nodes you created |
| 139 | + |
| 140 | +```bash |
| 141 | + export RPC_ALB_URL=$(cat rpc-node-deploy.json | jq -r '..|.alburl? | select(. != null)') |
| 142 | + echo $RPC_ALB_URL |
| 143 | + |
| 144 | + curl http://$RPC_ALB_URL:8732/chains/main/is_bootstrapped |
| 145 | +``` |
| 146 | + |
| 147 | +The result should be like this: |
| 148 | + |
| 149 | +```javascript |
| 150 | + {"bootstrapped":true,"sync_state":"synced"} |
| 151 | +``` |
| 152 | + |
| 153 | + If the nodes are still starting and catching up with the chain, you will see the following response: |
| 154 | + |
| 155 | +```HTML |
| 156 | + <html> |
| 157 | + <head><title>503 Service Temporarily Unavailable</title></head> |
| 158 | + <body> |
| 159 | + <center><h1>503 Service Temporarily Unavailable</h1></center> |
| 160 | + </body> |
| 161 | +``` |
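To tell the two cases apart without reading the response body, you can look at the HTTP status code alone. This is a minimal sketch under stated assumptions: `describe_status` and `alb_status` are hypothetical helper names, and the mapping covers only the two responses described above. A `200` still needs its body checked to confirm that the node reports `synced`.

```bash
# Hypothetical helper: map an HTTP status code from the load balancer
# to a short human-readable state.
describe_status() {
    case "$1" in
        200) echo "at least one RPC node is serving requests" ;;
        503) echo "nodes are still starting or syncing" ;;
        *)   echo "unexpected status: $1" ;;
    esac
}

# Hypothetical helper: print only the HTTP status code for the endpoint.
# $1 = DNS name of the load balancer. curl prints 000 when no HTTP
# response was received at all (for example, on a connection timeout).
alb_status() {
    curl -s -o /dev/null -w '%{http_code}' --max-time 10 \
        "http://$1:8732/chains/main/is_bootstrapped"
}

# Usage, with the load balancer DNS name from the deployment output:
#   describe_status "$(alb_status "<load-balancer-dns-name>")"
```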
| 162 | + |
| 163 | +**NOTE:** By default and for security reasons the load balancer is available only from within the default VPC in the region where it is deployed. It is not available from the Internet and is not open for external connections. Before opening it up please make sure you protect your RPC APIs. |
| 164 | + |
| 165 | +### Cleaning up and undeploying everything |
| 166 | + |
| 167 | +1. Undeploy RPC Nodes, Sync Nodes and Common components |
| 168 | + |
| 169 | +```bash |
| 170 | + # Setting the AWS account id and region in case local .env file is lost |
| 171 | + export AWS_ACCOUNT_ID=<your_target_AWS_account_id> |
| 172 | + export AWS_REGION=<your_target_AWS_region> |
| 173 | + |
| 174 | + pwd |
| 175 | + # Make sure you are in aws-blockchain-node-runners/lib/tezos |
| 176 | + |
| 177 | + # Undeploy Single RPC Node |
| 178 | + cdk destroy tz-single-node |
| 179 | + |
| 180 | + # Undeploy RPC Nodes |
| 181 | + cdk destroy tz-ha-nodes |
| 182 | + |
| 183 | + # Undeploy Sync Node |
| 184 | + cdk destroy tz-snapshot-node |
| 185 | + |
| 186 | + # Before the next step, manually delete the S3 bucket with a name similar to 'tz-snapshots-$accountid-tz-nodes-common' in the console: first empty the bucket, then delete it. |
| 187 | + # Delete all common components like IAM role and Security Group |
| 188 | + cdk destroy tz-common |
| 189 | +``` |
| 190 | + |
| 191 | +2. Follow the steps to delete the Cloud9 instance in [Cloud9 Setup](../../docs/setup-cloud9.md) |
| 192 | + |
| 193 | +### FAQ |
| 194 | + |
| 195 | +1. How to check the logs from the EC2 user-data script? |
| 196 | + |
| 197 | + **Note:** In this tutorial we chose not to use SSH and use Session Manager instead. That allows you to log all sessions in AWS CloudTrail to see who logged into the server and when. If you receive an error similar to `SessionManagerPlugin is not found`, [install Session Manager plugin for AWS CLI](https://docs.aws.amazon.com/systems-manager/latest/userguide/session-manager-working-with-install-plugin.html) |
| 198 | + |
| 199 | +```bash |
| 200 | + pwd |
| 201 | + # Make sure you are in aws-blockchain-node-runners/lib/tezos |
| 202 | + |
| 203 | + export INSTANCE_ID=$(cat single-node-deploy.json | jq -r '..|.singleinstanceid? | select(. != null)') |
| 204 | + echo "INSTANCE_ID=$INSTANCE_ID" |
| 205 | + aws ssm start-session --target $INSTANCE_ID --region $AWS_REGION |
| 206 | + sudo cat /var/log/cloud-init-output.log |
| 207 | +``` |