
Commit 4104842

more docstring
Signed-off-by: Hao Wu <[email protected]>
1 parent 05e54a6 commit 4104842

File tree

1 file changed: +5 -0 lines changed


emerging_optimizers/utils/modules.py

Lines changed: 5 additions & 0 deletions
@@ -35,6 +35,11 @@ class Conv1dFlatWeights(nn.Conv1d):

     Arguments are the same as :class:`torch.nn.Conv1d`.

+    Note:
+        This implementation potentially introduces a small overhead from splitting the weights and
+        combining their gradients. This should be trivial compared to the computational cost of LLM
+        training. If it becomes a concern, a kernel can be developed to eliminate the overhead.
+
     Note:
         Similar flattening logic can be applied to N-D convolutions. But since we don't have use cases
         for them in LLM training yet, they are not supported, even though the __init__() function is
         generalized enough to support N-D convolution.
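For readers unfamiliar with the pattern, the overhead the new note describes can be illustrated with a minimal sketch. This is a hypothetical illustration, not the actual Conv1dFlatWeights implementation: the class name FlatWeightConv1dSketch and the weight_flat parameter are invented here, and it assumes the weight is stored as a single 2-D matrix and reshaped back to Conv1d's 3-D layout on every forward pass.

import torch
import torch.nn as nn
import torch.nn.functional as F

class FlatWeightConv1dSketch(nn.Conv1d):
    """Hypothetical sketch: a Conv1d whose weight is stored flattened to 2-D.

    The weight is kept as a (out_channels, in_channels // groups * kernel_size)
    matrix so that matrix-shaped optimizers can treat it as one 2-D parameter.
    """

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Replace the 3-D convolution weight with a flat 2-D parameter.
        flat = self.weight.detach().reshape(self.out_channels, -1)
        del self.weight
        self.weight_flat = nn.Parameter(flat)

    def forward(self, input: torch.Tensor) -> torch.Tensor:
        # The per-step view() here (and the matching reshape of the gradient
        # in backward) is the small overhead the docstring note refers to.
        weight = self.weight_flat.view(
            self.out_channels, self.in_channels // self.groups, self.kernel_size[0]
        )
        return F.conv1d(
            input, weight, self.bias, self.stride, self.padding, self.dilation, self.groups
        )

Used this way, FlatWeightConv1dSketch(in_channels, out_channels, kernel_size) behaves as a drop-in replacement for nn.Conv1d, while an optimizer iterating over parameters() sees one 2-D matrix instead of a 3-D convolution weight; as the note argues, the reshape cost is trivial next to the compute of LLM training.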
