Commit 5a7d78f

Updated on-gpu benchmarking with cuda event & sync
1 parent a6c87d8 commit 5a7d78f

3 files changed: 26 additions & 7 deletions


CHANGELOG.md

Lines changed: 7 additions & 3 deletions
@@ -9,20 +9,24 @@ From v1.0.0 and on, the project will adherence strictly to Semantic Versioning.

 ## [Unreleased]

+## [0.2.1] - 2022-02-11
+### Fixed
+- Updated on-gpu model benchmaking with best-practices on `cuda.Event` and `cuda.synchronize`.
+

-## [0.2.0] - 2022-02-11]
+## [0.2.0] - 2022-02-11
 ### Added
 - Overloads for benchmark parameters and functions to allow benchmark of custom classes.


-## [0.1.2] - 2022-02-10]
+## [0.1.2] - 2022-02-10
 ### Fixed
 - GPU compatibility.

 ### Removed
 - Carbon-tracker energy measurement. Library is still too immature at this point.


-## [0.1.1] - 2022-02-10]
+## [0.1.1] - 2022-02-10
 ### Added
 - Initial version.

pytorch_benchmark/benchmark.py

Lines changed: 18 additions & 3 deletions
@@ -150,14 +150,29 @@ def measure_repeated_inference_timing(
         ):
             start_on_cpu = time()
             device_sample = transfer_to_device_fn(sample, model_device)
-            start_on_device = time()
+
+            if model_device.type == "cuda":
+                start_event = torch.cuda.Event(enable_timing=True)
+                stop_event = torch.cuda.Event(enable_timing=True)
+                start_event.record()  # For GPU timing
+            start_on_device = time()  # For CPU timing
+
             device_result = model(device_sample)
-            stop_on_device = time()
+
+            if model_device.type == "cuda":
+                stop_event.record()
+                torch.cuda.synchronize()
+                elapsed_on_device = stop_event.elapsed_time(start_event)
+                stop_on_device = time()
+            else:
+                stop_on_device = time()
+                elapsed_on_device = stop_on_device - start_on_device
+
             transfer_to_device_fn(device_result, "cpu")
             stop_on_cpu = time()

             t_c2d.append(start_on_device - start_on_cpu)
-            t_inf.append(stop_on_device - start_on_device)
+            t_inf.append(elapsed_on_device)
             t_d2c.append(stop_on_cpu - stop_on_device)
             t_tot.append(stop_on_cpu - start_on_cpu)
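
Review note: the hunk above mixes two clocks. As far as the PyTorch documentation describes it, `torch.cuda.Event.elapsed_time(end_event)` is called on the earlier event and reports milliseconds, while differences of `time()` are in seconds. The following is a minimal standalone sketch of the same event-and-synchronize pattern with both branches normalized to seconds. It is not the repository's code: `time_inference` is a hypothetical helper name, the optional `torch` import exists only so the CPU path runs without PyTorch installed, and `perf_counter` is swapped in for `time()`.

```python
from time import perf_counter

try:
    import torch  # optional: only required for the CUDA branch
except ImportError:  # assumption: allow the CPU path to run without PyTorch
    torch = None


def time_inference(model, sample, device_type="cpu"):
    """Run model(sample) once and return (result, elapsed_seconds).

    Hypothetical helper for illustration; not part of pytorch_benchmark.
    """
    use_cuda = (
        device_type == "cuda" and torch is not None and torch.cuda.is_available()
    )
    if use_cuda:
        start_event = torch.cuda.Event(enable_timing=True)
        stop_event = torch.cuda.Event(enable_timing=True)
        start_event.record()
        result = model(sample)
        stop_event.record()
        # Kernels launch asynchronously, so wait until both events have completed.
        torch.cuda.synchronize()
        # elapsed_time is called on the earlier event and returns milliseconds.
        elapsed = start_event.elapsed_time(stop_event) / 1000.0
    else:
        start = perf_counter()
        result = model(sample)
        elapsed = perf_counter() - start
    return result, elapsed


if __name__ == "__main__":
    # Any callable works on the CPU path; a real use would pass an nn.Module.
    result, seconds = time_inference(lambda x: x * 2, 21)
    print(result, seconds)
```

Dividing by 1000 keeps the value appended to `t_inf` in the same unit as `t_c2d`, `t_d2c` and `t_tot`, assuming those are meant to stay in seconds.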

setup.py

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@ def from_file(file_name: str = "requirements.txt", comment_char: str = "#"):

 setup(
     name="pytorch-benchmark",
-    version="0.2.0",
+    version="0.2.1",
     description="Easily benchmark PyTorch model FLOPs, latency, throughput, max allocated memory and energy consumption in one go.",
     long_description=long_description(),
     long_description_content_type="text/markdown",
