Commit 541a11e

Updated cloud documentation; persistent storage for a single (non-SC) instance. Explanation of terminated vs stopped instances.
1 parent d7eca8d commit 541a11e

2 files changed

+72
-10
lines changed

docs/user/_sources/cloud.txt

Lines changed: 43 additions & 5 deletions
@@ -12,7 +12,13 @@ An Amazon Marketplace AMI for C-PAC has been released, making it easier for rese
1212

1313
* Instance Type - The hardware specification for a given instance. A list of the instance types made available by Amazon may be found `here <http://aws.amazon.com/ec2/instance-types>`_.
1414

15-
* Elastic Block Storage (EBS) - A form of persistent storage offered by Amazon for use with instances.
15+
* Terminated Instance - An instance is considered terminated when its resources have been completely freed up for use by others in the Amazon cloud. Any data on a terminated instance that is not relocated to persistent storage such as EBS (see below) will be completely discarded. Instance termination is the virtual equivalent of throwing out a physical server. When you have terminated an instance, you are no longer paying for it. If your data and results are in persistent storage, you should terminate any instances you are using when you are done. Note that by default, instances do not have persistent storage attached to them; you will need to configure persistent storage when you set up the instance.
16+
17+
* Stopped Instance - An instance is considered stopped when it is not active, but its resources are still available for future use whenever you choose to reactivate it. Stopping an instance is the virtual equivalent of turning a computer off or putting it into hibernation. When you stop an instance, you are no longer billed for instance usage, but you continue to pay for any EBS storage attached to it. You should stop an instance when the analyses you are working on are not fully done and you would like to preserve the current state of a running instance.
18+
19+
* Simple Storage Service (S3) - A form of storage offered by Amazon. S3 is object storage rather than a filesystem, so it is not attached to instances as a disk, but it can be used to archive large datasets. It is less costly than EBS.
20+
21+
* Elastic Block Store (EBS) - A form of persistent storage offered by Amazon for use with instances. After you terminate an instance, data stored in a persistent EBS volume can be accessed by any future instance that you attach the volume to.
1622

1723
* Head Node - The primary node of an HPC cluster, which all other nodes are connected to. The head node will run a job scheduler (such as Sun Grid Engine) to allocate jobs to the other nodes. Jobs may also be run on the head node.
1824

@@ -48,7 +54,7 @@ Before you can create a single C-PAC machine or a C-PAC HPC cluster, you must fi
4854
Starting a Single C-PAC Instance via the AWS Console
4955
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5056

51-
Now that you have generated the access keys and a pem file, you may launch a single instance via Amazon's web interface by following the steps below. If you are planning on processing many subjects or obtaining computationally-intensive derivatives (such as network centrality), you should consider using Starcluster instead.
57+
Now that you have generated the access keys and a pem file, you may launch a single instance via Amazon's web interface by following the steps below. If you are planning on processing many subjects or obtaining computationally-intensive derivatives (such as network centrality), you should use Starcluster instead.
5258

5359
#. In the left-hand column under the `INSTANCES` header in the AWS console, click `Instances`. This is a dashboard of all instances you currently have running in the AWS cloud. Click the blue `Launch Instance` button.
5460

@@ -60,19 +66,39 @@ Now that you have generated the access keys and a pem file, you may launch a sin
6066

6167
#. The details page can be used to request spot instances, as well as other functionality (including VPN, VPC options). For a basic run you do not need to change anything, although you can tailor it according to your future needs. Hovering over the 'i' icons on this page will give you more insight into the options available. When done, click `Next: Add Storage.`
6268

63-
#. On the storage page, you can allocate storage for your dataset. Note that the amount of space you allocate will have to encompass raw data, preprocessed data, and derivatives. Click `Next: Tag Instance`.
69+
#. On the storage page, you can allocate space for the workstation, such as user and system directories. If you want the files stored in these directories to be kept after the instance is terminated, uncheck the box below the `Delete on Termination` column. Note that persistent storage for the datasets will be allocated and attached in later steps below. Click `Next: Tag Instance`.
6470

6571
#. On this page you can tag the instance with metadata (e.g., details related to the specific purpose for the instance). Tags are key-value pairs, so any contextual data that can be encapsulated in this format can be saved. Click `Next: Configure Security Group`.
6672

67-
#. On this page, you can modify who has access to the instance. The AMI defaults allow remote access from anywhere. If you would like to customize security to allow only a certain set of IP addresses and users access to the instance, you can do so here. Click `Review and Launch` when you are done.
73+
#. On this page, you can modify who has access to the instance. The AMI defaults allow remote access from anywhere. If you would like to customize security to allow only a certain set of IP addresses and users access to the instance, you can do so here. If you find that custom settings, such as using the `My IP` setting or specifying a range of IP addresses, do not work, consult with your institution's network administrator to make sure that you are entering settings correctly. Click `Review and Launch` when you are done.
6874

6975
#. This final page summarizes the instance details you are about to launch. You might receive some warnings as a result of security or the instance type not being in the free tier. These warnings can be ignored.
7076

7177
#. Click the `Launch` button. A dialogue box will ask you to choose a key pair for the instance. Every instance requires a key pair in order for you to securely log in and use it. Change the top drop down menu bar to `Choose an existing key pair` and select the key pair you created in the `Creating AWS Access and Network Keys` section in the other drop down menu. Check the acknowledgement check box and click the blue `Launch Instances` button.
7278

7379
#. Afterward, you can click the blue `View Instances` button on the lower right of the page to watch your new instance start up in the instance console.
7480

75-
#. When the `Instance State` column reads `running` and the `Status Checks` column reads `2/2` you can access and use the instance. Click on the instance's row. In the bottom pane, find the `Public DNS` field under the `Description` tab and save the field value to your clipboard.
81+
#. When the `Instance State` column reads `running` and the `Status Checks` column reads `2/2`, the instance should be active. Click on the row for the new instance. In the bottom pane, take note of the values for the `Instance ID`, `Public DNS`, and `Availability zone` fields under the `Description` tab.
82+
83+
#. Now, create a persistent storage volume for your data and results. In the left-hand column under the `ELASTIC BLOCK STORE` header in the AWS console, click `Volumes`. This is a dashboard of all volumes that you currently have stored in EBS. Click the blue `Create Volume` button.
84+
85+
#. In the dialogue that follows, change the size field so that the volume has enough space for your raw data, preprocessed data, and derivatives. A single volume can be as small as 1 GB or as large as 16 TB. Change the availability zone to match the zone from your instance's `Description` tab.
86+
87+
#. Click the checkbox next to the newly created volume. Click `Actions` followed by `Attach Volumes`. Enter the `Instance ID` from the instance's `Description` tab in the `Instance` field. The `Device` field should fill itself in automatically and should be of the form `/dev/sdb` or similar. Note the letter that follows `sd`. Click the blue `Attach` button.
88+
89+
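#. Run the following command from a terminal on your local computer to format the volume and mount it on your instance (replace the letter `b` at the end of `/dev/xvdb` with the letter from the previous step). Because `mkfs` erases any existing data on the volume, run this command only on a new, empty volume.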
90+
91+
.. code-block:: bash
92+
93+
ssh -i /path/to/pem/file ubuntu@<public_dns> 'sudo mkfs -t ext4 /dev/xvdb && sudo mount /dev/xvdb /mnt && sudo mkdir /mnt/sgeadmin && sudo chmod -R 777 /mnt'
94+
95+
To use this volume with future instances, attach it to the new instance using the AWS console and then run this command, which mounts the volume without reformatting it:
96+
97+
.. code-block:: bash
98+
99+
ssh -i /path/to/pem/file ubuntu@<public_dns> 'sudo mount /dev/xvdb /mnt && sudo chmod -R 777 /mnt'
100+
101+
Note that the creation of a persistent volume is heavily automated in Starcluster, so if you will be creating many different persistent volumes you should use Starcluster instead.
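
The letter substitution described in the steps above can be sketched as a small shell helper. This is only an illustration: ``console_to_instance_device`` is a hypothetical function name, not part of C-PAC, Starcluster, or the AWS tools, and it assumes the console reports a name of the form ``/dev/sdX`` that appears on the instance as ``/dev/xvdX``. On some instance types the volume may appear under a different name, so running ``lsblk`` on the instance is a reasonable way to confirm it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption for illustration, not an AWS or C-PAC tool):
# convert the device name shown in the console's Attach dialogue (e.g. /dev/sdb)
# into the name this guide uses on the instance (e.g. /dev/xvdb).
console_to_instance_device() {
    local console_dev="$1"
    case "$console_dev" in
        /dev/sd[a-z])
            # Keep the trailing letter and swap the "sd" prefix for "xvd".
            printf '/dev/xvd%s\n' "${console_dev#/dev/sd}"
            ;;
        *)
            # Anything else is outside what this sketch handles.
            echo "unexpected device name: $console_dev" >&2
            return 1
            ;;
    esac
}

# Example: print (but do not run) the mount command for a volume attached as /dev/sdf.
device="$(console_to_instance_device /dev/sdf)"
echo "sudo mount $device /mnt"
```

The printed command can then be substituted into the ``ssh`` commands shown above in place of the ``/dev/xvdb`` example.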
76102

77103
There are now two different means of accessing the instance: through X2Go (a desktop GUI-based session) or through ssh (a command line session).
78104

@@ -104,6 +130,18 @@ When you are done, your session configuration should look similar to the followi
104130

105131
.. figure:: /_images/cloud_x2go.png
106132

133+
Note: If X2Go does not work on your computer, you can also access the C-PAC GUI by adding the ``-X`` flag to the ssh command to enable X11 forwarding (i.e., the ssh command would be ``ssh -X -i /path/to/pem/file ubuntu@<public_dns>``). X11 forwarding is much slower than X2Go, however, so it is recommended that you troubleshoot X2Go further before turning to this option.
134+
135+
Uploading Data to Your Instance
136+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
137+
138+
To upload data to your newly-created AWS instance, you can run the following command on the computer containing your data:
139+
140+
.. code-block:: bash
141+
142+
rsync -avu -e "ssh -i /path/to/pem/file" /path/to/data ubuntu@<public_dns>:/path/to/server/directory
143+
144+
If you have configured persistent storage, make sure that `/path/to/server/directory` points to the mount point for the persistent storage. If you followed the instructions above or the instructions in the Starcluster section below, the mount point should be `/mnt`.
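
As a rough sketch of how the upload can be pointed at the persistent mount point, the helper below prints an ``rsync`` command that defaults its destination to ``/mnt``. The function name ``build_upload_cmd`` and the host ``ec2-host.example.com`` are hypothetical placeholders. The sketch also assumes that the pem file is passed to ssh through rsync's ``-e`` option (needed unless ssh is already configured to use that key) and that none of the paths contain spaces.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption for illustration): print the rsync command
# for uploading a local directory to the instance. The destination defaults to
# /mnt, the mount point used for persistent storage in this guide.
# Assumes paths contain no spaces.
build_upload_cmd() {
    local pem="$1" src="$2" host="$3" dest="${4:-/mnt}"
    printf 'rsync -avu -e "ssh -i %s" %s ubuntu@%s:%s\n' \
        "$pem" "$src" "$host" "$dest"
}

# Example with placeholder values; replace the host with your instance's Public DNS.
build_upload_cmd /path/to/pem/file /path/to/data ec2-host.example.com
```

Printing the command rather than running it lets you check the destination before any data is transferred. Adding rsync's ``-n`` (dry run) flag to the printed command is another way to preview what would be copied.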
107145

108146
Starting a C-PAC HPC Cluster via Starcluster
109147
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
