@@ -161,7 +161,7 @@ For optimal results, combine depth and width pruning. This will require more tun
 After pruning, distillation is required to recover model accuracy. Below are recommended starting hyperparameters for distillation:
 
 | **Hyperparameter** | **Recommendation** |
-| :--- | :--- |
+| :---: | :---: |
 | **Sequence Length** | 8192 (or 4096 if dataset has smaller sequences) |
 | **Global Batch Size (GBS)** | 768 |
 | **Micro Batch Size (MBS)** | As large as your GPU memory can accommodate |
@@ -192,8 +192,8 @@ Check out the FastNAS pruning example usage in the [documentation](https://nvidi
 
 You can also take a look at FastNAS pruning interactive notebook [cifar_resnet](./cifar_resnet.ipynb) in this directory
 which showcases the usage of FastNAS for pruning a ResNet 20 model for the CIFAR-10 dataset. The notebook
-also how to profiling the model to understand the search space of possible pruning options and demonstrates
-the usage saving and restoring pruned models.
+also shows how to profile the model to understand the search space of possible pruning options and demonstrates
+how to save and restore pruned models.
 
 ### GradNAS Pruning for HuggingFace Language Models (e.g. BERT)
 