Commit 230df3a

Merge pull request #93 from tsingcbx99/dev-ssl
Dev ssl
2 parents 5fefb10 + 06517c0 commit 230df3a

22 files changed: +278 -254 lines changed

docs/index.rst

Lines changed: 4 additions & 8 deletions
@@ -35,13 +35,6 @@ Transfer Learning
 
    talib/benchmarks/image_classification
 
-.. toctree::
-   :maxdepth: 2
-   :caption: Semi Supervised Learning Settings
-   :titlesonly:
-
-   ssllib/benchmarks/image_classification
-
 
 .. toctree::
    :maxdepth: 2
@@ -80,7 +73,10 @@ Transfer Learning
    :caption: Semi Supervised Learning Methods
    :titlesonly:
 
-   ssllib/semi_supervised_learning.rst
+   ssllib/consistency_regularization.rst
+   ssllib/contrastive_learning.rst
+   ssllib/holistic_methods.rst
+   ssllib/proxy_label.rst
 
 
 

docs/ssllib/benchmarks/image_classification.rst

Lines changed: 0 additions & 70 deletions
This file was deleted.

docs/ssllib/consistency_regularization.rst

Lines changed: 0 additions & 40 deletions

@@ -1,7 +1,4 @@
 =======================================
-Semi Supervised Learning
-=======================================
-
 Consistency Regularization
 =======================================
 
@@ -43,40 +40,3 @@ Unsupervised Data Augmentation (UDA)
 .. autoclass:: ssllib.uda.SupervisedUDALoss
 
 .. autoclass:: ssllib.uda.UnsupervisedUDALoss
-
-
-Pseudo Labels
-=======================================
-
-.. _PSEUDO:
-
-Pseudo Label
-------------------
-
-Given model predictions :math:`y` on unlabeled samples, we can directly utilize them to generate
-pseudo labels :math:`label=\mathop{\arg\max}\limits_{i}~y[i]`. Then we use these pseudo labels as supervision to train
-our model. Details can be found at `projects/self_tuning/pseudo_label.py`.
-
-
-Holistic Methods
-=======================================
-
-.. _FIXMATCH:
-
-FixMatch
-------------------
-
-.. autoclass:: ssllib.fix_match.FixMatchConsistencyLoss
-
-
-Contrastive Learning
-=======================================
-
-.. _SELF_TUNING:
-
-Self-Tuning
-------------------
-
-.. autoclass:: ssllib.self_tuning.Classifier
-
-.. autoclass:: ssllib.self_tuning.SelfTuning

docs/ssllib/contrastive_learning.rst

Lines changed: 12 additions & 0 deletions

@@ -0,0 +1,12 @@
+=======================================
+Contrastive Learning
+=======================================
+
+.. _SELF_TUNING:
+
+Self-Tuning
+------------------
+
+.. autoclass:: ssllib.self_tuning.Classifier
+
+.. autoclass:: ssllib.self_tuning.SelfTuning

docs/ssllib/holistic_methods.rst

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+=======================================
+Holistic Methods
+=======================================
+
+.. _FIXMATCH:
+
+FixMatch
+------------------
+
+.. autoclass:: ssllib.fix_match.FixMatchConsistencyLoss

docs/ssllib/proxy_label.rst

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+=======================================
+Proxy-Label Based Methods
+=======================================
+
+.. _PSEUDO:
+
+Pseudo Label
+------------------
+
+Given model predictions :math:`y` on unlabeled samples, we can directly utilize them to generate
+pseudo labels :math:`label=\mathop{\arg\max}\limits_{i}~y[i]`. Then we use these pseudo labels as supervision to train
+our model. Details can be found at `projects/self_tuning/pseudo_label.py`.
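
The pseudo-labeling rule described above maps directly to a few lines of PyTorch. The following is only a minimal illustrative sketch of that idea, not the code in `projects/self_tuning/pseudo_label.py`; the confidence threshold and the function name are assumptions introduced here.

```python
# Illustrative sketch of pseudo-labeling as described above; the threshold and
# helper name are assumptions, not the repository's pseudo_label.py.
import torch
import torch.nn.functional as F

def pseudo_label_loss(logits_unlabeled: torch.Tensor, threshold: float = 0.95) -> torch.Tensor:
    """Cross-entropy against arg-max pseudo labels on confident unlabeled samples."""
    probs = F.softmax(logits_unlabeled.detach(), dim=1)   # model predictions y
    confidence, pseudo_labels = probs.max(dim=1)          # label = argmax_i y[i]
    mask = (confidence >= threshold).float()              # keep only confident samples
    loss = F.cross_entropy(logits_unlabeled, pseudo_labels, reduction='none')
    return (loss * mask).mean()

# Usage: combine with the supervised loss on labeled data, e.g.
# total_loss = F.cross_entropy(logits_labeled, labels) + pseudo_label_loss(logits_unlabeled)
```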

projects/README.md

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
-Here are a few projects that are built on Trans-Learn.
-They are examples of how to use Trans-Learn as a library, to facilitate your own research.
+Here are a few projects that are built on Trans-Learn. They are examples of how to use Trans-Learn as a library, to
+facilitate your own research.
 
 ## Projects by [THUML](https://github.com/thuml)
 

projects/self_tuning/README.md

Lines changed: 13 additions & 4 deletions
@@ -33,8 +33,9 @@ Supported methods include:
 ## Experiments and Results
 
 ### SSL with supervised pre-trained model
-The shell files give the script to reproduce our [results](/docs/ssllib/benchmarks/image_classification.rst#) with specified hyper-parameters.
-For example, if you want to run baseline on CUB200 with 15% labeled samples, use the following script
+
+The shell files give the script to reproduce our [results](benchmark.md) with specified hyper-parameters. For example,
+if you want to run baseline on CUB200 with 15% labeled samples, use the following script
 
 ```shell script
 # SSL with ResNet50 backbone on CUB200.
@@ -44,24 +45,32 @@ CUDA_VISIBLE_DEVICES=0 python baseline.py data/cub200 -d CUB200 -sr 15 --seed 0
 ```
 
 ### SSL with unsupervised pre-trained model
-Take MoCo as an example.
+
+Take MoCo as an example.
+
 1. Download MoCo pretrained checkpoints from https://github.com/facebookresearch/moco
-2. Convert the format of the MoCo checkpoints to the standard format of pytorch
+2. Convert the format of the MoCo checkpoints to the standard format of pytorch
+
 ```shell
 mkdir checkpoints
 python convert_moco_to_pretrained.py checkpoints/moco_v1_200ep_pretrain.pth.tar checkpoints/moco_v1_200ep_backbone.pth checkpoints/moco_v1_200ep_fc.pth
 ```
+
 3. Start training
+
 ```shell
 CUDA_VISIBLE_DEVICES=0 python baseline.py data/cub200 -d CUB200 -sr 15 --seed 0 --log logs/baseline_moco/cub200_15 \
 --pretrained checkpoints/moco_v1_200ep_backbone.pth
 ```
 
 ## TODO
+
 Support datasets: CIFAR10, CIFAR100, ImageNet
 
 ## Citation
+
 If you use these methods in your research, please consider citing.
+
 ```
 @inproceedings{pi-model,
 title={Temporal ensembling for semi-supervised learning},
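
As background for step 2 of the README above: MoCo checkpoints keep the query encoder's weights under a `module.encoder_q.` prefix, so converting to a standard torchvision-style state dict essentially means stripping that prefix and dropping the MoCo-specific projection head. The sketch below is only an illustration of that idea under those assumptions; the repository's actual logic lives in `convert_moco_to_pretrained.py` and may differ.

```python
# Hedged sketch of the MoCo -> torchvision key conversion idea. The checkpoint
# layout and function name are assumptions; see convert_moco_to_pretrained.py
# for the repository's real implementation.
import torch

def convert_moco_backbone(moco_path: str, backbone_path: str) -> None:
    checkpoint = torch.load(moco_path, map_location='cpu')
    state_dict = checkpoint['state_dict']
    backbone_state = {}
    for key, value in state_dict.items():
        # keep only the query encoder, strip the 'module.encoder_q.' prefix,
        # and drop MoCo's projection head (fc.*)
        if key.startswith('module.encoder_q.') and not key.startswith('module.encoder_q.fc'):
            backbone_state[key[len('module.encoder_q.'):]] = value
    torch.save(backbone_state, backbone_path)
```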

projects/self_tuning/baseline.py

Lines changed: 11 additions & 1 deletion
@@ -83,8 +83,9 @@ def main(args: argparse.Namespace):
     pool_layer = nn.Identity() if args.no_pool else None
     classifier = Classifier(backbone, num_classes, pool_layer=pool_layer, finetune=not args.scratch).to(device)
 
-    # define optimizer
+    # define optimizer and lr scheduler
     optimizer = SGD(classifier.get_parameters(args.lr), args.lr, momentum=0.9, weight_decay=args.wd, nesterov=True)
+    lr_scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, args.milestones, gamma=args.lr_gamma)
 
     # resume from the best checkpoint
     if args.phase == 'test':
@@ -97,8 +98,15 @@ def main(args: argparse.Namespace):
     # start training
     best_acc1 = 0.0
     for epoch in range(args.epochs):
+        # print lr
+        print(lr_scheduler.get_lr())
+
         # train for one epoch
         train(labeled_train_iter, classifier, optimizer, epoch, args)
+
+        # update lr
+        lr_scheduler.step()
+
         # evaluate on validation set
         with torch.no_grad():
             acc1 = utils.validate(val_loader, classifier, args, device)
@@ -188,6 +196,8 @@ def train(labeled_train_iter: ForeverDataIterator, model, optimizer: SGD, epoch:
                         help='mini-batch size (default: 48)')
     parser.add_argument('--lr', '--learning-rate', default=0.01, type=float,
                         metavar='LR', help='initial learning rate', dest='lr')
+    parser.add_argument('--lr-gamma', default=0.1, type=float, help='parameter for lr scheduler')
+    parser.add_argument('--milestones', type=int, default=[5], nargs='+', help='epochs to decay lr')
     parser.add_argument('--wd', '--weight-decay', default=1e-4, type=float,
                         metavar='W', help='weight decay (default:1e-4)')
     parser.add_argument('-j', '--workers', default=2, type=int, metavar='N',
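
For context, a minimal standalone sketch (not part of the commit) of how the new `--lr`, `--milestones`, and `--lr-gamma` flags drive `MultiStepLR`: with `--lr 0.1 --milestones 3 6 9 --lr-gamma 0.1`, as used in the MoCo scripts below, the learning rate is multiplied by 0.1 at epochs 3, 6, and 9. The tiny model here is only a stand-in for the real classifier.

```python
# Minimal sketch (not from the commit): how --lr, --milestones and --lr-gamma
# map onto torch.optim.lr_scheduler.MultiStepLR in the training loop above.
import torch.nn as nn
from torch.optim import SGD
from torch.optim.lr_scheduler import MultiStepLR

model = nn.Linear(10, 2)                      # stand-in for the real classifier
optimizer = SGD(model.parameters(), lr=0.1, momentum=0.9)
scheduler = MultiStepLR(optimizer, milestones=[3, 6, 9], gamma=0.1)

for epoch in range(12):                       # --epochs 12 in the MoCo scripts
    # ... one epoch of training would run here ...
    print(epoch, optimizer.param_groups[0]['lr'])
    scheduler.step()                          # lr: 0.1 -> 0.01 -> 0.001 -> 0.0001
```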

projects/self_tuning/baseline.sh

Lines changed: 18 additions & 18 deletions
@@ -17,25 +17,25 @@ CUDA_VISIBLE_DEVICES=0 python baseline.py data/aircraft -d Aircraft -sr 50 --see
 
 # MoCo (Unsupervised Pretraining)
 # ResNet50, CUB200
-CUDA_VISIBLE_DEVICES=0 python baseline.py data/cub200 -d CUB200 -i 2000 -sr 15 --seed 0 --log logs/baseline_moco/cub200_15 \
-  --pretrained checkpoints/moco_v1_200ep_backbone.pth
-CUDA_VISIBLE_DEVICES=0 python baseline.py data/cub200 -d CUB200 -i 2000 -sr 30 --seed 0 --log logs/baseline_moco/cub200_30 \
-  --pretrained checkpoints/moco_v1_200ep_backbone.pth
-CUDA_VISIBLE_DEVICES=0 python baseline.py data/cub200 -d CUB200 -i 2000 -sr 50 --seed 0 --log logs/baseline_moco/cub200_50 \
-  --pretrained checkpoints/moco_v1_200ep_backbone.pth
+CUDA_VISIBLE_DEVICES=0 python baseline.py data/cub200 -d CUB200 --lr 0.1 --epochs 12 --milestones 3 6 9 \
+  -i 2000 -sr 15 --seed 0 --log logs/baseline_moco/cub200_15 --pretrained checkpoints/moco_v1_200ep_backbone.pth
+CUDA_VISIBLE_DEVICES=0 python baseline.py data/cub200 -d CUB200 --lr 0.1 --epochs 12 --milestones 3 6 9 \
+  -i 2000 -sr 30 --seed 0 --log logs/baseline_moco/cub200_30 --pretrained checkpoints/moco_v1_200ep_backbone.pth
+CUDA_VISIBLE_DEVICES=0 python baseline.py data/cub200 -d CUB200 --lr 0.1 --epochs 12 --milestones 3 6 9 \
+  -i 2000 -sr 50 --seed 0 --log logs/baseline_moco/cub200_50 --pretrained checkpoints/moco_v1_200ep_backbone.pth
 
 # ResNet50, StanfordCars
-CUDA_VISIBLE_DEVICES=0 python baseline.py data/stanford_cars -d StanfordCars -i 2000 -sr 15 --seed 0 --log logs/baseline_moco/car_15 \
-  --pretrained checkpoints/moco_v1_200ep_backbone.pth
-CUDA_VISIBLE_DEVICES=0 python baseline.py data/stanford_cars -d StanfordCars -i 2000 -sr 30 --seed 0 --log logs/baseline_moco/car_30 \
-  --pretrained checkpoints/moco_v1_200ep_backbone.pth
-CUDA_VISIBLE_DEVICES=0 python baseline.py data/stanford_cars -d StanfordCars -i 2000 -sr 50 --seed 0 --log logs/baseline_moco/car_50 \
-  --pretrained checkpoints/moco_v1_200ep_backbone.pth
+CUDA_VISIBLE_DEVICES=0 python baseline.py data/stanford_cars -d StanfordCars --lr 0.1 --epochs 12 --milestones 3 6 9 \
+  -i 2000 -sr 15 --seed 0 --log logs/baseline_moco/car_15 --pretrained checkpoints/moco_v1_200ep_backbone.pth
+CUDA_VISIBLE_DEVICES=0 python baseline.py data/stanford_cars -d StanfordCars --lr 0.1 --epochs 12 --milestones 3 6 9 \
+  -i 2000 -sr 30 --seed 0 --log logs/baseline_moco/car_30 --pretrained checkpoints/moco_v1_200ep_backbone.pth
+CUDA_VISIBLE_DEVICES=0 python baseline.py data/stanford_cars -d StanfordCars --lr 0.1 --epochs 12 --milestones 3 6 9 \
+  -i 2000 -sr 50 --seed 0 --log logs/baseline_moco/car_50 --pretrained checkpoints/moco_v1_200ep_backbone.pth
 
 # ResNet50, Aircraft
-CUDA_VISIBLE_DEVICES=0 python baseline.py data/aircraft -d Aircraft -i 2000 -sr 15 --seed 0 --log logs/baseline_moco/aircraft_15 \
-  --pretrained checkpoints/moco_v1_200ep_backbone.pth
-CUDA_VISIBLE_DEVICES=0 python baseline.py data/aircraft -d Aircraft -i 2000 -sr 30 --seed 0 --log logs/baseline_moco/aircraft_30 \
-  --pretrained checkpoints/moco_v1_200ep_backbone.pth
-CUDA_VISIBLE_DEVICES=0 python baseline.py data/aircraft -d Aircraft -i 2000 -sr 50 --seed 0 --log logs/baseline_moco/aircraft_50 \
-  --pretrained checkpoints/moco_v1_200ep_backbone.pth
+CUDA_VISIBLE_DEVICES=0 python baseline.py data/aircraft -d Aircraft --lr 0.1 --epochs 12 --milestones 3 6 9 \
+  -i 2000 -sr 15 --seed 0 --log logs/baseline_moco/aircraft_15 --pretrained checkpoints/moco_v1_200ep_backbone.pth
+CUDA_VISIBLE_DEVICES=0 python baseline.py data/aircraft -d Aircraft --lr 0.1 --epochs 12 --milestones 3 6 9 \
+  -i 2000 -sr 30 --seed 0 --log logs/baseline_moco/aircraft_30 --pretrained checkpoints/moco_v1_200ep_backbone.pth
+CUDA_VISIBLE_DEVICES=0 python baseline.py data/aircraft -d Aircraft --lr 0.1 --epochs 12 --milestones 3 6 9 \
+  -i 2000 -sr 50 --seed 0 --log logs/baseline_moco/aircraft_50 --pretrained checkpoints/moco_v1_200ep_backbone.pth
