I'm curious to understand why you have GPU memory problems when operating on a graph with around 40 nodes :)

The tower argument is similar to the groups argument in torch.nn.Conv2d: the features are split into groups, and each group is transformed using only the features within that same group. This reduces the number of parameters from in_channels * out_channels to num_groups * (in_channels/num_groups) * (out_channels/num_groups), i.e. by a factor of num_groups, and, as a result, might prevent overfitting. I personally don't think that model performance is highly sensitive to the tower size.
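To make the parameter arithmetic concrete, here is a minimal sketch in plain Python. The helper names (dense_weight_count, grouped_weight_count) and the channel sizes are hypothetical and chosen only for illustration; they are not part of any library. The sketch counts weights only and ignores biases and any kernel-size factor.

```python
def dense_weight_count(in_channels: int, out_channels: int) -> int:
    """Weights of a single dense transform: every input feeds every output."""
    return in_channels * out_channels


def grouped_weight_count(in_channels: int, out_channels: int, num_groups: int) -> int:
    """Weights when features are split into num_groups independent groups.

    Each group maps in_channels/num_groups inputs to out_channels/num_groups
    outputs, so the total is num_groups * (in/num_groups) * (out/num_groups).
    """
    if in_channels % num_groups or out_channels % num_groups:
        raise ValueError("channel counts must be divisible by num_groups")
    return num_groups * (in_channels // num_groups) * (out_channels // num_groups)


# Illustrative sizes (assumed, not taken from the discussion).
dense = dense_weight_count(64, 64)          # 64 * 64 = 4096
grouped = grouped_weight_count(64, 64, 8)   # 8 * 8 * 8 = 512

print(dense, grouped, dense // grouped)     # the reduction factor equals num_groups
```

With a single group the grouped count matches the dense count, which is a quick sanity check on the formula.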

Answer selected by MattJud