
Commit 143476c

update config for classification
1 parent 0e034a7 commit 143476c

8 files changed: +62 −31 lines changed

classification/AntBee/README.md

Lines changed: 28 additions & 10 deletions
@@ -6,18 +6,27 @@
In this example, we finetune a pretrained resnet18 for classification of images with two categories: ant and bee. This example is a PyMIC implementation of PyTorch's "Transfer Learning for Computer Vision" tutorial. The original tutorial can be found [here][torch_tutorial]. In PyMIC's implementation, we only need to edit the configuration file to run the code.

## Data and preprocessing
-1. The dataset contains about 120 training images each for ants and bees. There are 75 validation images for each class. Download the data from [here][data_link] and extract it.
-2. Set `AntBee_root` according to your computer in `write_csv_files.py`, where `AntBee_root` should be the path of `hymenoptera_data` based on the dataset you extracted.
-3. Run `python write_csv_files.py` to create two csv files storing the paths and labels of training and validation images. They are `train_data.csv` and `valid_data.csv` and saved in `./config`.
+1. The dataset contains about 120 training images each for ants and bees. There are 75 validation images for each class. Download the data from [here][data_link] and extract it to `PyMIC_data`. The paths of the training and validation sets should then be `PyMIC_data/hymenoptera_data/train` and `PyMIC_data/hymenoptera_data/val`, respectively.
+2. Run `python write_csv_files.py` to create two CSV files storing the paths and labels of the training and validation images. They are `train_data.csv` and `valid_data.csv`, saved in `./config`.

[torch_tutorial]:https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html
[data_link]:https://download.pytorch.org/tutorial/hymenoptera_data.zip

## Finetuning all layers of resnet18
-1. Here we use resnet18 for finetuning, and update all the layers. Open the configure file `config/train_test_ce1.cfg`. In the `network` section we can find details for the network. In the `dataset` section, set the value of `root_dir` as your `AntBee_root`. Then start to train by running:
+1. Here we use resnet18 for finetuning and update all the layers. Open the configuration file `config/train_test_ce1.cfg`. In the `network` section we can find the details of the network; `update_layers = 0` means updating all the layers.
+```bash
+# type of network
+net_type = resnet18
+pretrain = True
+input_chns = 3
+# finetune all the layers
+update_layers = 0
+```
+
+Then start to train by running:

```bash
-pymic_net_run train config/train_test_ce1.cfg
+pymic_run train config/train_test_ce1.cfg
```

2. During training or after training, run `tensorboard --logdir model/resnet18_ce1` and you will see a link in the output, such as `http://your-computer:6006`. Open the link in the browser and you can observe the average loss and accuracy during the training stage, as shown in the following images, where the blue and red curves are for the training set and the validation set, respectively. The iteration number that obtained the highest accuracy on the validation set was 400; it may differ depending on the hardware environment. After training, you can find the trained models in `./model/resnet18_ce1`.
@@ -26,24 +35,33 @@ pymic_net_run train config/train_test_ce1.cfg
![avg_acc](./picture/acc.png)

## Testing and evaluation
-1. Run the following command to obtain classification results of testing images. By default we use the best performing checkpoint based on the validation set. You can set `ckpt_mode` to 0 in `config/train_test.cfg` to use the latest checkpoint.
+1. Run the following command to obtain classification results of the testing images. By default, we use the best-performing checkpoint based on the validation set. You can set `ckpt_mode` to 0 in `config/train_test_ce1.cfg` to use the latest checkpoint.

```bash
mkdir result
-pymic_net_run test config/train_test_ce1.cfg
+pymic_run test config/train_test_ce1.cfg
```

2. Then run the following command to obtain quantitative evaluation results in terms of accuracy.

```bash
-pymic_evaluate_cls config/evaluation.cfg
+pymic_eval_cls config/evaluation.cfg
```

-The obtained accuracy by default setting should be around 0.9412, and the AUC will be 0.973.
+The obtained accuracy with the default setting should be around 0.9412, and the AUC will be around 0.976.

3. Run `python show_roc.py` to show the receiver operating characteristic curve.

![roc](./picture/roc.png)

## Finetuning the last layer of resnet18
-Similarly to the above example, we further try to only finetune the last layer of resnet18 for the same classification task. Use a different configure file `config/train_test_ce2.cfg` for training and testing, where `update_layers = -1` in the `network` section means updating the last layer only. Edit `config/evaluation.cfg` accordinly for evaluation. The iteration number obtained the highest accuracy on the validation set was 400 in our testing machine, and the accuracy was around 0.9543. The AUC was 0.981.
+Similar to the above example, we further try to finetune only the last layer of resnet18 for the same classification task. Use a different configuration file, `config/train_test_ce2.cfg`, for training and testing, where `update_layers = -1` in the `network` section means updating the last layer only:
+```bash
+net_type = resnet18
+pretrain = True
+input_chns = 3
+# finetune the last layer only
+update_layers = -1
+```
+
+Edit `config/evaluation.cfg` accordingly for evaluation.
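
Taken together, the updated AntBee example can be reproduced with the commands below. This is a minimal sketch based on the README steps above, assuming it is run from `classification/AntBee` and that the dataset has been extracted so that `../../PyMIC_data/hymenoptera_data` exists, as in the updated configuration files.

```bash
# create the csv files listing image paths and labels
python write_csv_files.py

# finetune all layers of resnet18 (update_layers = 0)
pymic_run train config/train_test_ce1.cfg

# optionally monitor the loss and accuracy curves
tensorboard --logdir model/resnet18_ce1

# classify the testing images with the best validation checkpoint
mkdir result
pymic_run test config/train_test_ce1.cfg

# accuracy / AUC and the ROC curve
pymic_eval_cls config/evaluation.cfg
python show_roc.py
```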

classification/AntBee/config/train_test_ce1.cfg

Lines changed: 4 additions & 3 deletions
@@ -3,7 +3,7 @@
tensor_type = float

task_type = cls
-root_dir = /home/guotai/disk2t/projects/torch_project/transfer_learning/hymenoptera_data
+root_dir = ../../PyMIC_data/hymenoptera_data
train_csv = config/train_data.csv
valid_csv = config/valid_data.csv
test_csv = config/valid_data.csv
@@ -57,17 +57,18 @@ momentum = 0.9
weight_decay = 1e-5

# for lr scheduler (MultiStepLR)
+lr_scheduler = MultiStepLR
lr_gamma = 0.1
lr_milestones = [500, 1000]

ckpt_save_dir = model/resnet18_ce1
-ckpt_save_prefix = resnet18
+ckpt_prefix = resnet18

# iteration
iter_start = 0
iter_max = 1500
iter_valid = 100
-iter_save = 500
+iter_save = 1500

[testing]
# list of gpus
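
For readability, the scheduler and checkpoint options of `config/train_test_ce1.cfg` after this patch read roughly as follows (only the lines visible in the diff are shown; the rest of the file is unchanged):

```bash
# lr scheduler (MultiStepLR)
lr_scheduler = MultiStepLR
lr_gamma = 0.1
lr_milestones = [500, 1000]

ckpt_save_dir = model/resnet18_ce1
ckpt_prefix = resnet18

# iteration
iter_start = 0
iter_max = 1500
iter_valid = 100
iter_save = 1500
```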

classification/AntBee/config/train_test_ce2.cfg

Lines changed: 4 additions & 3 deletions
@@ -3,7 +3,7 @@
tensor_type = float

task_type = cls
-root_dir = /home/guotai/disk2t/projects/torch_project/transfer_learning/hymenoptera_data
+root_dir = ../../PyMIC_data/hymenoptera_data
train_csv = config/train_data.csv
valid_csv = config/valid_data.csv
test_csv = config/valid_data.csv
@@ -58,17 +58,18 @@ momentum = 0.9
weight_decay = 1e-5

# for lr scheduler (MultiStepLR)
+lr_scheduler = MultiStepLR
lr_gamma = 0.1
lr_milestones = [500, 1000]

ckpt_save_dir = model/resnet18_ce2
-ckpt_save_prefix = resnet18
+ckpt_prefix = resnet18

# iteration
iter_start = 0
iter_max = 1500
iter_valid = 100
-iter_save = 500
+iter_save = 1500

[testing]
# list of gpus

classification/AntBee/write_csv_files.py

Lines changed: 1 addition & 1 deletion
@@ -50,7 +50,7 @@ def get_evaluation_image_pairs(test_csv, gt_seg_csv):

if __name__ == "__main__":
    # create csv files for the AntBee dataset
-    AntBee_root = '/home/guotai/disk2t/projects/torch_project/transfer_learning/hymenoptera_data'
+    AntBee_root = '../../PyMIC_data/hymenoptera_data'
    create_csv_file(AntBee_root, 'train', 'config/train_data.csv')
    create_csv_file(AntBee_root, 'val', 'config/valid_data.csv')
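
With the hard-coded path replaced by a relative one, the CSV files can be regenerated without editing the script. A minimal sketch, assuming the class sub-folders follow the layout of the PyTorch tutorial dataset:

```bash
# expected layout after extracting the data (see the AntBee README above)
ls ../../PyMIC_data/hymenoptera_data/train   # ants/  bees/
ls ../../PyMIC_data/hymenoptera_data/val     # ants/  bees/

# regenerate the csv files; outputs go to ./config
python write_csv_files.py
ls config/train_data.csv config/valid_data.csv
```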

classification/CHNCXR/README.md

Lines changed: 17 additions & 8 deletions
@@ -6,18 +6,27 @@
In this example, we finetune a pretrained resnet18 and vgg16 for classification of X-ray images with two categories: normal and tuberculosis.

## Data and preprocessing
-1. We use the Shenzhen Hospital X-ray Set for this experiment. This dataset contains images in JPEG format. There are 326 normal x-rays and 336 abnormal x-rays showing various manifestations of tuberculosis. Download the dataset from [here][data_link] and extract it, and the folder name will be "ChinaSet_AllFiles/CXR_png".
+1. We use the Shenzhen Hospital X-ray Set for this experiment. This [dataset][data_link] contains images in JPEG format. There are 326 normal x-rays and 336 abnormal x-rays showing various manifestations of tuberculosis. The images are available in `PyMIC_data/CHNCXR`.

[data_link]:https://lhncbc.nlm.nih.gov/publication/pub9931

-2. Set `image_dir` according to your computer in `write_csv_files.py`, where `image_dir` should be the path of "CXR_png" based on the dataset you extracted.
-3. Run `python write_csv_files.py` to randomly split the entire dataset into 70% for training, 10% for validation and 20% for testing. The output files are `cxr_train.csv`, `cxr_valid.csv` and `cxr_test.csv` under folder `./config`.
+2. Run `python write_csv_files.py` to randomly split the entire dataset into 70% for training, 10% for validation and 20% for testing. The output files are `cxr_train.csv`, `cxr_valid.csv` and `cxr_test.csv` under the folder `./config`.

## Finetuning resnet18
-1. First, we use resnet18 for finetuning, and update all the layers. Open the configure file `config/net_resnet18.cfg`. In the `dataset` section, set the value of `root_dir` as your path of "CXR_png". Then start to train by running:
+1. First, we use resnet18 for finetuning and update all the layers. The configuration file is `config/net_resnet18.cfg`. The network settings are:
+
+```bash
+net_type = resnet18
+pretrain = True
+input_chns = 3
+# finetune all the layers
+update_layers = 0
+```
+
+Start to train by running:

```bash
-pymic_net_run train config/net_resnet18.cfg
+pymic_run train config/net_resnet18.cfg
```

2. During training or after training, run `tensorboard --logdir model/resnet18` and you will see a link in the output, such as `http://your-computer:6006`. Open the link in the browser and you can observe the average loss and accuracy during the training stage, as shown in the following images, where the blue and red curves are for the training set and the validation set, respectively. The iteration number that obtained the highest accuracy on the validation set was 1800; it may differ depending on the hardware environment. After training, you can find the trained models in `./model/resnet18`.
@@ -30,13 +39,13 @@ pymic_net_run train config/net_resnet18.cfg

```bash
mkdir result
-pymic_net_run test config/net_resnet18.cfg
+pymic_run test config/net_resnet18.cfg
```

2. Then run the following command to obtain quantitative evaluation results in terms of accuracy.

```bash
-pymic_evaluate_cls config/evaluation.cfg
+pymic_eval_cls config/evaluation.cfg
```

The obtained accuracy with the default setting should be around 0.8571, and the AUC is 0.94.
@@ -47,4 +56,4 @@ The obtained accuracy by default setting should be around 0.8571, and the AUC is


## Finetuning vgg16
-Similarly to the above example, we further try to finetune vgg16 for the same classification task. Use a different configure file `config/net_vg16.cfg` for training and testing. Edit `config/evaluation.cfg` accordinly for evaluation. The iteration number for the highest accuracy on the validation set was 2300, and the accuracy will be around 0.8797.
+Similar to the above example, we further try to finetune vgg16 for the same classification task. Use a different configuration file, `config/net_vgg16.cfg`, for training and testing. Edit `config/evaluation.cfg` accordingly for evaluation. The iteration number with the highest accuracy on the validation set was 2300, and the accuracy will be around 0.8797.
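
For reference, a minimal sketch of the CHNCXR workflow with the renamed commands, assuming it is run from `classification/CHNCXR` and the images are available under `../../PyMIC_data/CHNCXR/CXR_png` as set in the updated configuration files:

```bash
# randomly split the data into training/validation/testing csv files
python write_csv_files.py

# finetune resnet18 (all layers), then test and evaluate it
pymic_run train config/net_resnet18.cfg
mkdir result
pymic_run test config/net_resnet18.cfg
pymic_eval_cls config/evaluation.cfg

# repeat with vgg16 (edit config/evaluation.cfg accordingly before evaluating)
pymic_run train config/net_vgg16.cfg
pymic_run test config/net_vgg16.cfg
pymic_eval_cls config/evaluation.cfg
```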

classification/CHNCXR/config/net_resnet18.cfg

Lines changed: 3 additions & 2 deletions
@@ -3,7 +3,7 @@
tensor_type = float

task_type = cls
-root_dir = /home/guotai/disk2t/data/lung/ChinaSet_AllFiles/CXR_png
+root_dir = ../../PyMIC_data/CHNCXR/CXR_png
train_csv = config/cxr_train.csv
valid_csv = config/cxr_valid.csv
test_csv = config/cxr_test.csv
@@ -55,11 +55,12 @@ momentum = 0.9
weight_decay = 1e-5

# for lr scheduler (MultiStepLR)
+lr_scheduler = MultiStepLR
lr_gamma = 0.1
lr_milestones = [1500, 3000]

ckpt_save_dir = model/resnet18
-ckpt_save_prefix = resnet18
+ckpt_prefix = resnet18

# iteration
iter_start = 0
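
After this change, the data-related options of `config/net_resnet18.cfg` read as below (taken from the lines shown in the diff; the enclosing `[dataset]` section header is assumed from the README's description):

```bash
[dataset]
task_type = cls
root_dir = ../../PyMIC_data/CHNCXR/CXR_png
train_csv = config/cxr_train.csv
valid_csv = config/cxr_valid.csv
test_csv = config/cxr_test.csv
```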

classification/CHNCXR/config/net_vgg16.cfg

Lines changed: 4 additions & 3 deletions
@@ -3,7 +3,7 @@
tensor_type = float

task_type = cls
-root_dir = /home/guotai/disk2t/data/lung/ChinaSet_AllFiles/CXR_png
+root_dir = ../../PyMIC_data/CHNCXR/CXR_png
train_csv = config/cxr_train.csv
valid_csv = config/cxr_valid.csv
test_csv = config/cxr_test.csv
@@ -55,11 +55,12 @@ momentum = 0.9
weight_decay = 1e-5

# for lr scheduler (MultiStepLR)
+lr_scheduler = MultiStepLR
lr_gamma = 0.1
lr_milestones = [1500, 3000]

-ckpt_save_dir = model/vgg16
-ckpt_save_prefix = vgg16
+ckpt_save_dir = model/vgg16
+ckpt_prefix = vgg16

# iteration
iter_start = 0

classification/CHNCXR/write_csv_files.py

Lines changed: 1 addition & 1 deletion
@@ -58,7 +58,7 @@ def random_split_dataset():

if __name__ == "__main__":
    # create csv files for the CHNCXR dataset
-    image_dir = '/home/guotai/disk2t/data/lung/ChinaSet_AllFiles/CXR_png'
+    image_dir = '../../PyMIC_data/CHNCXR/CXR_png'
    output_csv = 'config/cxr_all.csv'
    create_csv_file(image_dir, output_csv)
