
Commit 299c207

Update Blog “production-ready-object-detection-model-training-workflow-with-hpe-machine-learning-development-environment”
1 parent c0735f5 commit 299c207

1 file changed: +15 −15 lines

content/blog/production-ready-object-detection-model-training-workflow-with-hpe-machine-learning-development-environment.md

@@ -181,40 +181,40 @@ Here, you will be using the SAHI library to slice our large satellite images. Sa
## 4. Upload to S3 bucket to support distributed training

- We will now upload our exported data to a publically accessible AWS S3 bucket. This will enable for a large scale distributed experiment to have access to the dataset without installing the dataset on device.
+ Now, you can upload your exported data to a publicly accessible AWS S3 bucket. This lets a large-scale distributed experiment access the dataset without installing it on each device.
View the [Determined documentation](https://docs.determined.ai/latest/training/load-model-data.html#streaming-from-object-storage) and [AWS instructions](https://codingsight.com/upload-files-to-aws-s3-with-the-aws-cli/) to learn how to upload your dataset to an S3 bucket. Review the `S3Backend` class in `data.py`.

- Once you create an S3 bucket that is publically accessible, here are example commands to upload the preprocessed dataset to S3:
+ Once you create an S3 bucket that is publicly accessible, here are example commands to upload the preprocessed dataset to S3:
* `aws s3 cp --recursive xview_dataset/train_sliced_no_neg/ s3://determined-ai-xview-coco-dataset/train_sliced_no_neg`
* `aws s3 cp --recursive xview_dataset/val_sliced_no_neg/ s3://determined-ai-xview-coco-dataset/val_sliced_no_neg`
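The `S3Backend` class mentioned above streams images from the bucket at training time instead of reading them from local disk. A minimal sketch of that idea, assuming `boto3` is installed and the bucket layout created by the commands above; the class and helper names here are illustrative, not the actual `data.py` API:

```python
import io

def s3_key(split_prefix: str, file_name: str) -> str:
    """Build the object key for an image inside a dataset split."""
    return f"{split_prefix.rstrip('/')}/{file_name}"

class S3ImageStream:
    """Illustrative stand-in for an S3-backed dataset storage class."""

    def __init__(self, bucket: str, split_prefix: str):
        import boto3  # deferred import: the key helper works without AWS credentials
        self.client = boto3.client("s3")
        self.bucket = bucket
        self.split_prefix = split_prefix

    def read_image_bytes(self, file_name: str) -> io.BytesIO:
        """Fetch one image into memory rather than installing the dataset locally."""
        buf = io.BytesIO()
        self.client.download_fileobj(
            self.bucket, s3_key(self.split_prefix, file_name), buf
        )
        buf.seek(0)
        return buf
```

A training dataloader would then decode the returned buffer (for example with PIL) instead of opening a local file path.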

- Our satellite imagery data is in an S3 bucket and is prepped for distributed training, so now we can progress to model training and inference via the NGC Container.
+ Now that the satellite imagery data is in an S3 bucket and prepped for distributed training, you can move on to model training and inference via the NGC container.
- # Part 3: End-to-End example training object detection model using NVIDIA PyTorch Container from NGC
+ # Part 3: End-to-end example of training an object detection model using the NVIDIA PyTorch container from NGC
- ## Training and Inference via NGC Container
+ ## Training and inference via the NGC container
- This notebook walks you each step to train a model using containers from the NGC Catalog. We chose the GPU optimized Pytorch container as an example. The basics of working with docker containers apply to all NGC containers.
+ This notebook walks you through each step of training a model using containers from the NGC Catalog. I chose the GPU-optimized PyTorch container for this example. The basics of working with Docker containers apply to all NGC containers.
We will show you how to:

- * Execute training a object detection on satellite imagery using TensorFlow and Jupyter Notebook
+ * Train an object detection model on satellite imagery using PyTorch and Jupyter Notebook
* Run inference on a trained object detection model using the SAHI library

- Note this Object Detection demo is based on https://github.com/pytorch/vision/tree/v0.11.3 and ngc docker image `nvcr.io/nvidia/pytorch:21.11-py3`
+ Note: this object detection demo is based on [this PyTorch repo](https://github.com/pytorch/vision/tree/v0.11.3) and the NGC Docker image `nvcr.io/nvidia/pytorch:21.11-py3`.
- We assume you completed step 2 of dataset preprocessing and have your tiled satellite imagery dataset completed and in the local directory `train_images_rgb_no_neg/train_images_300_02`
+ It is assumed that, by now, you have completed step 2 of dataset preprocessing and have your tiled satellite imagery dataset in the local directory `train_images_rgb_no_neg/train_images_300_02`.
Let's get started!
- ## Execute docker run to create NGC environment for Data Prep
+ ## Execute `docker run` to create the NGC environment for data prep
- Make sure to map host directory to docker directory, we will use the host directory again to
+ Make sure to map the host directory to a directory inside the Docker container; you will use the host directory again later:
* `docker run --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 -v /home/ubuntu:/home/ubuntu -p 8008:8888 -it nvcr.io/nvidia/pytorch:21.11-py3 /bin/bash`

- ## Run Jupyter notebook command within docker container to access it on your local browser
+ ## Run the Jupyter Notebook command within the Docker container to access it in your local browser
* `cd /home/ubuntu`
* `jupyter lab --ip=0.0.0.0 --port=8888 --NotebookApp.token='' --NotebookApp.password=''`
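Because the `docker run` command above published host port 8008 to container port 8888 (`-p 8008:8888`), JupyterLab is then reachable from the host's browser at the host-side port (adjust the port if you changed the `-p` mapping):

* `http://localhost:8008`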
@@ -224,9 +224,9 @@ Make sure to map host directory to docker directory, we will use the host direct
!pip install cython pycocotools matplotlib terminaltables
```

- ## TLDR; Run training job on 4 GPUS
+ ## TL;DR: Run the training job on 4 GPUs
- The below cell will run a multi-gpu training job. This job will train an object detection model (faster-rcnn) on a dataset of satellite imagery images that contain 61 classes of objects
+ The cell below runs a multi-GPU training job, training an object detection model (Faster R-CNN) on a satellite imagery dataset that contains 61 classes of objects.
* Change `nproc_per_node` argument to specify the number of GPUs available on your server
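For reference, with the torchvision v0.11-era detection reference scripts, a 4-GPU launch typically takes the following shape (a sketch only: the script name, dataset flags, and paths are assumptions to be matched to the actual notebook cell):

* `python -m torch.distributed.launch --nproc_per_node=4 --use_env train.py --dataset coco --model fasterrcnn_resnet50_fpn`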
