Disable memory benchmarking (#589)

Igor Shilov · facebook-github-bot · commit 7b280548d319 · 2023-08-01T12:12:17.000-07:00
Summary:
Our tests have been red for a while due to failing memory benchmarks.

## Issue

When benchmarking Opacus, we run the training script multiple times within one process:

```
for i in range(args.num_runs):
    run_layer_benchmark(
        ...
    )
```

We use built-in PyTorch tools to check memory stats. Crucially, we verify that `torch.cuda.memory_allocated()` is 0 before each run starts. Normally it should be 0, since all tensors from previous runs are out of scope and should have been collected.

This worked fine until something changed and some GPU memory stayed allocated between runs. I don't know why; neither explicit cache clearing nor object deletion helped.

So I gave up and disabled memory benchmarking, since it seems to have become a complicated thing to do after some PyTorch update.

Pull Request resolved: #589

Reviewed By: JohnlNguyen

Differential Revision: D45691684

Pulled By: karthikprasad

fbshipit-source-id: 82006e503240532840d3fb6dc0314f2202780973
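
For illustration only, the pre-run check described in the summary can be sketched roughly as below. This is not the code in `benchmarks/`; `find_dirty_runs` and `FakeAllocator` are hypothetical names introduced here, and the memory reader is passed in as a callable so the sketch runs without a GPU. With PyTorch installed, `torch.cuda.memory_allocated` could be passed as `memory_allocated`.

```python
from typing import Callable, List, Tuple


def find_dirty_runs(
    num_runs: int,
    run_once: Callable[[], None],
    memory_allocated: Callable[[], int],
) -> List[Tuple[int, int]]:
    """Run `run_once` repeatedly in one process and report every run
    that started with a non-zero number of allocated bytes."""
    dirty = []
    for i in range(num_runs):
        before = memory_allocated()
        if before != 0:
            # Memory from an earlier run is still allocated.
            dirty.append((i, before))
        run_once()
    return dirty


class FakeAllocator:
    """Hypothetical stand-in for the GPU allocator: each run leaves
    `leak_per_run` bytes allocated, mimicking the reported symptom."""

    def __init__(self, leak_per_run: int) -> None:
        self.allocated = 0
        self.leak_per_run = leak_per_run

    def memory_allocated(self) -> int:
        return self.allocated

    def run(self) -> None:
        self.allocated += self.leak_per_run


if __name__ == "__main__":
    fake = FakeAllocator(leak_per_run=128)
    print(find_dirty_runs(3, fake.run, fake.memory_allocated))
```

With a non-zero leak, every run after the first is flagged, which matches the failure mode the summary describes: the first run starts clean and later runs do not.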
diff --git a/.circleci/config.yml b/.circleci/config.yml
@@ -273,7 +273,6 @@ commands:
             python benchmarks/generate_report.py --path-to-results /tmp/report_layers --save-path benchmarks/results/report-${report_id}.pkl --format pkl
 
             python benchmarks/check_threshold.py --report-path "./benchmarks/results/report-"$report_id".pkl" --metric runtime --threshold <<parameters.runtime_ratio_threshold>>  --column <<parameters.report_column>>
-            python benchmarks/check_threshold.py --report-path "./benchmarks/results/report-"$report_id".pkl" --metric memory --threshold <<parameters.memory_ratio_threshold>>  --column <<parameters.report_column>>
           when: always
       - store_artifacts:
           path: benchmarks/results/
diff --git a/benchmarks/utils.py b/benchmarks/utils.py
@@ -230,7 +230,7 @@ def generate_report(path_to_results: str, save_path: str, format: str) -> None:
     pivot = results.pivot_table(
         index=["batch_size", "num_runs", "num_repeats", "forward_only", "layer"],
         columns=["gsm_mode"],
-        values=["runtime", "memory"],
+        values=["runtime"],
     )
 
     def add_ratio(df, metric, variant):
@@ -245,7 +245,6 @@ def add_ratio(df, metric, variant):
     if "baseline" in results["gsm_mode"].tolist():
         for m in set(results["gsm_mode"].tolist()) - {"baseline"}:
             add_ratio(pivot, "runtime", m)
-            add_ratio(pivot, "memory", m)
         pivot.columns = pivot.columns.set_names("value", level=1)
 
     output = pivot.sort_index(axis=1).sort_values(