Commit ad50974

default to not assign ip address
1 parent e2fda31 commit ad50974

File tree

9 files changed

+41
-15
lines changed

config.py

Lines changed: 1 addition & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -24,6 +24,7 @@
2424
MACHINE_PRICE = 0.20
2525
EBS_VOL_SIZE = 30 # In GB. Minimum allowed is 22.
2626
DOWNLOAD_FILES = 'False'
27+
ASSIGN_IP = 'False' # If false, will overwrite setting in Fleet file
2728

2829
# DOCKER INSTANCE RUNNING ENVIRONMENT:
2930
DOCKER_CORES = 4 # Number of CellProfiler processes to run inside a docker container

documentation/DCP-documentation/config_examples.md

Lines changed: 1 addition & 0 deletions
@@ -46,6 +46,7 @@ Our internal configurations for each pipeline are as follows:
4646
| EBS_VOL_SIZE (if using S3 mounted as a file system) | 22 | 22 | 22 | 22 | 22 | Files are read directly off of S3, mounted as a file system when `DOWNLOAD_FILES = False`. |
4747
| EBS_VOL_SIZE (if downloading files) | 22 | 200 | 22 | 22 | 40 | Files are downloaded to the EBS volume when `DOWNLOAD_FILES = True`. |
4848
| DOWNLOAD_FILES | 'False' | 'False' | 'False' | 'False' | 'False' | |
49+
| ASSIGN_IP | 'False' | 'False' | 'False' | 'False' | 'False' | |
4950
| DOCKER_CORES | 4 | 4 | 4 | 4 | 3 | If using c class machines and large images (2k + pixels) then you might need to reduce this number. |
5051
| CPU_SHARES | DOCKER_CORES * 1024 | DOCKER_CORES * 1024 | DOCKER_CORES * 1024 | DOCKER_CORES * 1024 | DOCKER_CORES * 1024 | We never change this. |
5152
| MEMORY | 7500 | 7500 | 7500 | 7500 | 7500 | This must match your machine type. m class use 15000, c class use 7500. |

documentation/DCP-documentation/costs.md

Lines changed: 14 additions & 6 deletions
@@ -2,18 +2,24 @@
22

33
Distributed-CellProfiler is run by a series of three commands, only one of which incurs costs at typical scale of usage:
44

5-
[`setup`](step_1_configuration.md) creates a queue in SQS and a cluster, service, and task definition in ECS.
6-
ECS is entirely free.
5+
[`setup`](step_1_configuration.md) creates a queue in SQS and a cluster, service, and task definition in ECS.
6+
ECS is entirely free.
77
SQS queues are free to create and use up to 1 million requests/month.
88

99
[`submitJobs`](step_2_submit_jobs.md) places messages in the SQS queue which is free (under 1 million requests/month).
1010

11-
[`startCluster`](step_3_start_cluster.md) is the only command that incurs costs with initiation of your spot fleet request, creating machine alarms, and optionally creating a run dashboard.
11+
[`startCluster`](step_3_start_cluster.md) is the only command that incurs costs with initiation of your spot fleet request, creating machine alarms, and optionally creating a run dashboard.
1212

13-
The spot fleet is the major cost of running Distributed-CellProfiler, exact pricing of which depends on the number of machines, type of machines, and duration of use.
13+
The spot fleet is the major cost of running Distributed-CellProfiler, exact pricing of which depends on the number of machines, type of machines, and duration of use.
1414
Your bid is configured in the [config file](step_1_configuration.md).
15+
The cost of a spot fleet can be reduced through simple configuration choices:
16+
17+
1) Optimize `MACHINE_TYPE` and `EBS_VOL_SIZE` based on the actual memory and hard drive needs of your run.
18+
2) When possible, mount your S3 bucket using S3FS so that you can set `DOWNLOAD_FILES = 'False'` to not incur file egress costs.
19+
3) Set `ASSIGN_IP = 'False'` so that you don't pay for IPv4 addresses per EC2 instance in your spot fleet.
1520

1621
Spot fleet costs can be minimized/stopped in multiple ways:
22+
1723
1) We encourage the use of [`monitor`](step_4_monitor.md) during your job to help minimize the spot fleet cost as it automatically scales down your spot fleet request as your job queue empties and cancels your spot fleet request when you have no more jobs in the queue.
1824
Note that you can also perform a more aggressive downscaling of your fleet by monitor by engaging Cheapest mode (see [`more information here`](step_4_monitor.md)).
1925
2) If your job is finished, you can still initiate [`monitor`](step_4_monitor.md) to perform the same cleanup (without the automatic scaling).
@@ -23,14 +29,16 @@ Note that you can also perform a more aggressive downscaling of your fleet by mo
2329
After the spot fleet has started, a Cloudwatch instance alarm is automatically placed on each instance in the fleet.
2430
Cloudwatch instance alarms [are currently $0.10/alarm/month](https://aws.amazon.com/cloudwatch/pricing/).
2531
Cloudwatch instance alarm costs can be minimized/stopped in multiple ways:
32+
2633
1) If you run monitor during your job, it will automatically delete Cloudwatch alarms for any instance that is no longer in use once an hour while running and at the end of a run.
2734
2) If your job is finished, you can still initiate [`monitor`](step_4_monitor.md) to delete Cloudwatch alarms for any instance that is no longer in use.
2835
3) In [AWS Cloudwatch console](https://console.aws.amazon.com/cloudwatch/) you can select unused alarms by going to Alarms => All alarms. Change Any State to Insufficient Data, select all alarms, and then Actions => Delete.
2936
4) We provide a [hygiene script](hygiene.md) that will clean up old alarms for you.
3037

31-
Cloudwatch Dashboards [are currently free](https://aws.amazon.com/cloudwatch/pricing/) for 3 Dashboards with up to 50 metrics per month and are $3 per dashboard per month after that.
38+
Cloudwatch Dashboards [are currently free](https://aws.amazon.com/cloudwatch/pricing/) for 3 Dashboards with up to 50 metrics per month and are $3 per dashboard per month after that.
3239
Cloudwatch Dashboard costs can be minimized/prevented in multiple ways:
40+
3341
1) You can choose not to have Distributed-CellProfiler create a Dashboard by setting `CREATE_DASHBOARD = 'False'` in your [config file](step_1_configuration.md).
3442
2) We encourage the use of [`monitor`](step_4_monitor.md) during your job as if you have set `CLEAN_DASHBOARD = 'True'` in your [config file](step_1_configuration.md) it will automatically delete your Dashboard when your job is done.
3543
3) If your job is finished, you can still initiate [`monitor`](step_4_monitor.md) to perform the same cleanup (without the automatic scaling).
36-
4) You can manually delete Dashboards in the [Cloudwatch Console]((https://console.aws.amazon.com/cloudwatch/)) by going to Dashboards, selecting your Dashboard, and selecting Delete.
44+
4) You can manually delete Dashboards in the [Cloudwatch Console](https://console.aws.amazon.com/cloudwatch/) by going to Dashboards, selecting your Dashboard, and selecting Delete.
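
To make the `ASSIGN_IP` saving mentioned in this file concrete, the per-instance IPv4 charge can be estimated with simple arithmetic. This is a minimal sketch, not part of Distributed-CellProfiler: the function name `public_ipv4_cost` is hypothetical, and the hourly rate is an assumed list price that should be checked against current AWS pricing before relying on it.

```python
# Hypothetical helper for illustration; not part of Distributed-CellProfiler.
# ASSUMPTION: public IPv4 addresses are billed at a flat hourly rate.
# Verify the rate against current AWS pricing before using the numbers.
PUBLIC_IPV4_USD_PER_HOUR = 0.005


def public_ipv4_cost(num_instances, hours, rate=PUBLIC_IPV4_USD_PER_HOUR):
    """Estimate the USD cost of one public IPv4 address per instance."""
    if num_instances < 0 or hours < 0:
        raise ValueError("num_instances and hours must be non-negative")
    return num_instances * hours * rate


# Example: a 100-machine fleet that runs for 24 hours.
estimate = public_ipv4_cost(num_instances=100, hours=24)
print(f"Estimated public IPv4 cost: ${estimate:.2f}")
```

For short runs the amount is small, but it scales linearly with both fleet size and run time, which is why leaving `ASSIGN_IP = 'False'` is a reasonable default.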

documentation/DCP-documentation/step_1_configuration.md

Lines changed: 15 additions & 6 deletions
@@ -17,6 +17,7 @@ It need not be unique, but it should be descriptive enough that you can tell job
1717
***
1818

1919
### AWS GENERAL SETTINGS
20+
2021
These are settings that will allow your instances to be configured correctly and access the resources they need- see [Step 0: Prep](step_0_prep.md) for more information.
2122

2223
Bucket configurations allow you to read/write from/to different bucket in different accounts from where you are running DCP.
@@ -53,10 +54,15 @@ The minimum allowed is 22.
5354
If you have multiple Dockers running per machine, each Docker will have access to (EBS_VOL_SIZE/TASKS_PER_MACHINE)- 2 GB of space.
5455
* **DOWNLOAD_FILES:** Whether or not to download the image files to the EBS volume before processing, as opposed to accessing them all from S3FS.
5556
This typically requires a larger EBS volume (depending on the size of your image sets, and how many sets are processed per group), but avoids occasional issues with S3FS that can crop up on longer runs.
57+
* **ASSIGN_IP:** Whether or not to assign a public IPv4 address to each instance in the spot fleet.
58+
If set to 'False', it will overwrite whatever is in the Fleet file.
59+
If set to 'True', it will respect whatever is in the Fleet file.
60+
Distributed-CellProfiler originally defaulted to assigning an IP address to each instance so that one could connect to the instance for troubleshooting, but that need has been obviated by the level of logging currently in DCP.
5661

5762
***
5863

5964
### DOCKER INSTANCE RUNNING ENVIRONMENT
65+
6066
* **DOCKER_CORES:** How many copies of your script to run in each Docker container.
6167
* **CPU_SHARES:** How many CPUs each Docker container may have.
6268
* **MEMORY:** How much memory each Docker container may have.
@@ -83,8 +89,9 @@ See [Step 0: Prep](step_0_prep.med) for more information.
8389

8490
***
8591

86-
### MONITORING
87-
* **AUTO_MONITOR:** Whether or not to have Auto-Monitor automatically monitor your jobs.
92+
### MONITORING
93+
94+
* **AUTO_MONITOR:** Whether or not to have Auto-Monitor automatically monitor your jobs.
8895

8996
***
9097

@@ -111,6 +118,7 @@ Useful when trying to detect jobs that may have exported smaller corrupted files
111118
***
112119

113120
### CELLPROFILER SETTINGS
121+
114122
* **ALWAYS CONTINUE:** Whether or not to run CellProfiler with the --always-continue flag, which will keep CellProfiler from crashing if it errors.
115123
Use with caution.
116124
This can be particularly helpful in jobs where a large number of files are loaded in a single run (such as during illumination correction) so that a corrupted or missing file doesn't prevent the whole job completing.
@@ -120,6 +128,7 @@ We suggest using this setting in conjunction with a small number of JOB_RETRIES.
120128
***
121129

122130
### PLUGINS
131+
123132
* **USE_PLUGINS:** Whether or not you will be using external plugins from the CellProfiler-plugins repository.
124133
When True, passes the `--plugins-directory` flag to CellProfiler.
125134
Defaults to the current v1.0 `CellProfiler-plugins/active_plugins` location for plugins but will revert to the historical location of plugins in the `CellProfiler-plugins` root directory if the `active_plugins` folder is not present.
@@ -147,7 +156,7 @@ If you need to use deprecated plugin organization you can access previous commit
147156

148157
### EXAMPLE CONFIGURATIONS
149158

150-
!(Sample_Distributed-CellProfiler_Configuration_1)[images/sample_DCP_config_1.png]
159+
![Sample_Distributed-CellProfiler_Configuration_1](images/sample_DCP_config_1.png)
151160

152161
This is an example of one possible configuration.
153162
It's a fairly large machine that is able to process 64 jobs at the same time.
@@ -159,9 +168,9 @@ The Config settings for this example are:
159168

160169
**DOCKER_CORES** = 4 (copies of CellProfiler to run inside a docker)
161170
**CPU_SHARES** = 4096 (number of cores for each Docker * 1024)
162-
**MEMORY** = 15000 (MB for each Docker)
171+
**MEMORY** = 15000 (MB for each Docker)
163172

164-
!(Sample_Distributed-CellProfiler_Configuration_2)[images/sample_DCP_config_2.png]
173+
![Sample_Distributed-CellProfiler_Configuration_2](images/sample_DCP_config_2.png)
165174

166175
This is an example of another possible configuration.
167176
When we run Distributed CellProfiler we tend to prefer running a larger number of smaller machines.
@@ -175,4 +184,4 @@ The Config settings for this example are:
175184

176185
**DOCKER_CORES** = 4 (copies of CellProfiler to run inside a docker)
177186
**CPU_SHARES** = 4096 (number of cores for each Docker * 1024)
178-
**MEMORY** = 15000 (MB for each Docker)
187+
**MEMORY** = 15000 (MB for each Docker)

example_project/config.py

Lines changed: 1 addition & 0 deletions
@@ -21,6 +21,7 @@
2121
MACHINE_PRICE = 0.13
2222
EBS_VOL_SIZE = 22 # In GB. Minimum allowed is 22.
2323
DOWNLOAD_FILES = 'False'
24+
ASSIGN_IP = 'False' # If false, will overwrite setting in Fleet file
2425

2526
# DOCKER INSTANCE RUNNING ENVIRONMENT:
2627
DOCKER_CORES = 1 # Number of CellProfiler processes to run inside a docker container

example_project_CPG/config.py

Lines changed: 1 addition & 0 deletions
@@ -22,6 +22,7 @@
2222
MACHINE_PRICE = 0.13
2323
EBS_VOL_SIZE = 22 # In GB. Minimum allowed is 22.
2424
DOWNLOAD_FILES = 'True'
25+
ASSIGN_IP = 'False' # If false, will overwrite setting in Fleet file
2526

2627
# DOCKER INSTANCE RUNNING ENVIRONMENT:
2728
DOCKER_CORES = 1 # Number of CellProfiler processes to run inside a docker container

files/exampleFleet_us-east-1.json

Lines changed: 1 addition & 1 deletion
@@ -32,7 +32,7 @@
3232
"DeviceIndex": 0,
3333
"SubnetId": "subnet-WWWWWWWW",
3434
"DeleteOnTermination": true,
35-
"AssociatePublicIpAddress": true,
35+
"AssociatePublicIpAddress": false,
3636
"Groups": [
3737
"sg-ZZZZZZZZZ"
3838
]

files/exampleFleet_us-west-2.json

Lines changed: 1 addition & 1 deletion
@@ -32,7 +32,7 @@
3232
"DeviceIndex": 0,
3333
"SubnetId": "subnet-WWWWWWWW",
3434
"DeleteOnTermination": true,
35-
"AssociatePublicIpAddress": true,
35+
"AssociatePublicIpAddress": false,
3636
"Groups": [
3737
"sg-ZZZZZZZZZ"
3838
]

run.py

Lines changed: 6 additions & 1 deletion
@@ -21,6 +21,7 @@
2121
REQUIREMENTS_FILE = False
2222
ALWAYS_CONTINUE = 'False'
2323
JOB_RETRIES = 10
24+
ASSIGN_IP = 'True'
2425

2526
from config import *
2627

@@ -595,7 +596,11 @@ def startCluster():
595596
spotfleetConfig['LaunchSpecifications'][LaunchSpecification]["UserData"]=userData
596597
spotfleetConfig['LaunchSpecifications'][LaunchSpecification]['BlockDeviceMappings'][1]['Ebs']["VolumeSize"]= EBS_VOL_SIZE
597598
spotfleetConfig['LaunchSpecifications'][LaunchSpecification]['InstanceType'] = MACHINE_TYPE[LaunchSpecification]
598-
599+
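if str(ASSIGN_IP).lower() == 'false':  # config stores the flag as a string; a non-empty string such as 'False' is truthy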
600+
try:
601+
spotfleetConfig['LaunchSpecifications'][0]['NetworkInterfaces'][0]['AssociatePublicIpAddress'] = False
602+
except:
603+
print("Couldn't add or overwrite 'AssociatePublicIpAddress' to False in spot fleet config.")
599604

600605
# Step 2: make the spot fleet request
601606
ec2client=boto3.client('ec2')
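
Because the config files store this flag as a string (`'False'`), and any non-empty string is truthy in Python, the override is easiest to reason about when the string is parsed explicitly. The following is a minimal sketch of that idea, not the project's implementation: `flag_is_false` and `apply_assign_ip` are hypothetical names, and applying the override to every launch specification and network interface (rather than only the first) is an assumption about the intended behavior.

```python
import copy


def flag_is_false(value):
    """Return True when a config flag such as 'False' or False means 'off'."""
    return str(value).strip().lower() == "false"


def apply_assign_ip(spotfleet_config, assign_ip):
    """Return a copy of the fleet config, disabling public IPs if assign_ip is off.

    ASSUMPTION: the override should reach every launch specification and
    every network interface, not only the first one.
    """
    config = copy.deepcopy(spotfleet_config)
    if not flag_is_false(assign_ip):
        return config  # 'True' means: respect whatever the Fleet file says
    for spec in config.get("LaunchSpecifications", []):
        for interface in spec.get("NetworkInterfaces", []):
            interface["AssociatePublicIpAddress"] = False
    return config


# Example with a trimmed-down fleet dictionary shaped like the example Fleet files.
fleet = {
    "LaunchSpecifications": [
        {"NetworkInterfaces": [{"DeviceIndex": 0, "AssociatePublicIpAddress": True}]},
        {"NetworkInterfaces": [{"DeviceIndex": 0, "AssociatePublicIpAddress": True}]},
    ]
}
updated = apply_assign_ip(fleet, "False")
for spec in updated["LaunchSpecifications"]:
    print(spec["NetworkInterfaces"][0]["AssociatePublicIpAddress"])
```

Working on a deep copy keeps the loaded Fleet file untouched, which makes it straightforward to compare the requested and effective settings when debugging.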
