Commit 97a1534

TroyGarden authored and facebook-github-bot committed
update docstring for num_pooling and pooling_factor (#3396)
Summary:
Pull Request resolved: #3396

# context
* The terminology around pooling is often confusing.
* For normal use cases, the pooling factor is the average ID count in a pooled embedding. For example, for the sparse feature "books a person has read", the pooling factor is the average number of books a person has read.
* For an EBF (event-based feature), a feature contains multiple events, and num_pooling is the event count in a feature. For example, for the EBF "books a person has read" over "people I met in the past week", the pooling factor is still the average number of books per person, while num_pooling is the average number of people met.

Reviewed By: mserturk

Differential Revision: D83356999

fbshipit-source-id: 7a68fb7d60026b88a467766a52edb1743be89603
1 parent 15c3d7a commit 97a1534

1 file changed

+4
-2
lines changed


torchrec/distributed/planner/shard_estimators.py

Lines changed: 4 additions & 2 deletions
@@ -299,7 +299,8 @@ def perf_func_emb_wall_time(
         world_size (int): the number of devices for all hosts.
         local_world_size (int): the number of the device for each host.
         input_lengths (List[float]): the list of the average number of lookups of each
-            input query feature.
+            input query feature. Also referred to as pooling_mean, and it's equal to
+            the pooling_factor * num_pooling.
         input_data_type_size (float): the data type size of the distributed
             data_parallel input.
         table_data_type_size (float): the data type size of the table.
@@ -308,7 +309,8 @@ def perf_func_emb_wall_time(
             data_parallel input during forward communication.
         bwd_comm_data_type_size (float): the data type size of the distributed
             data_parallel input during backward communication.
-        num_poolings (List[float]): number of poolings per sample, typically 1.0.
+        num_poolings (List[float]): number of poolings per sample, typically 1.0 for
+            non-EBF use cases. In EBF use cases, this is the number of events per sample.
         hbm_mem_bw (float): the bandwidth of the device HBM.
         ddr_mem_bw (float): the bandwidth of the system DDR memory.
         hbm_to_ddr_bw (float): the bandwidth between device HBM and system DDR.
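
The relationship stated in the updated docstring (input_lengths, or pooling_mean, equals pooling_factor * num_pooling) can be illustrated with a small worked example. This is only a sketch: the helper name `pooling_means` and all numeric values are hypothetical, chosen to mirror the "books a person has read" example from the commit message, and are not part of torchrec.

```python
from typing import List


def pooling_means(pooling_factors: List[float], num_poolings: List[float]) -> List[float]:
    """Hypothetical helper: per-feature input length = pooling_factor * num_pooling."""
    if len(pooling_factors) != len(num_poolings):
        raise ValueError("expected one num_pooling per pooling_factor")
    return [pf * n for pf, n in zip(pooling_factors, num_poolings)]


# Feature 0: plain sparse feature "books a person has read",
#   assumed to average 20 books per person, with one pooling per sample.
# Feature 1: EBF "books a person has read" over "people I met in the past week",
#   assumed to average 20 books per person and 5 people met per sample.
pooling_factors = [20.0, 20.0]
num_poolings = [1.0, 5.0]

input_lengths = pooling_means(pooling_factors, num_poolings)
print(input_lengths)  # [20.0, 100.0]
```

Under these assumed numbers, the non-EBF feature keeps input_lengths equal to its pooling factor, while the EBF feature's input length grows with the number of events per sample.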
