How to determine the range of num_hidden_unit given the input feature dimension. #267
-
Thank you for developing such a powerful tool! I'm currently experimenting with grid search and wondering whether there are any fundamental principles for determining the number of hidden units given the input data dimensions.
Replies: 1 comment 2 replies
-
Hi @AlbertXTang , thanks for asking.
Yes, this is possible and in fact recommended. The number of hidden units, together with the model architecture, determines the capacity (number of parameters) of the model. With more capacity and very little data, the network becomes more likely to overfit. With too little capacity, you might get underfitting. The factor (1.5, 2, ...) mainly depends on how much data you have available (i.e., the number of samples/timesteps). If you have enough data to avoid overfitting, it is certainly possible to train a model with a hidden dimension of 256 or even higher. The projection to a lower-dimensional space is done in the last layer, and the number of output features is indeed much lower than the input dimensionality. The best practices notebook has some examples of how to tune the model in practice. How large is your dataset?
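To make the capacity argument concrete, here is a rough sketch of how the parameter count grows with the number of hidden units. It assumes a plain fully connected network with two hidden layers, which is not necessarily the architecture the tool actually uses; `mlp_param_count`, `hidden_unit_grid`, and the example sizes (100 input features, 8 output features, 50,000 samples) are made up for illustration.

```python
def mlp_param_count(input_dim, num_hidden_units, output_dim, num_hidden_layers=2):
    """Count weights and biases of a plain fully connected network.

    Assumption: every hidden layer has the same width (num_hidden_units).
    """
    dims = [input_dim] + [num_hidden_units] * num_hidden_layers + [output_dim]
    # Each layer contributes a (d_in x d_out) weight matrix plus d_out biases.
    return sum(d_in * d_out + d_out for d_in, d_out in zip(dims[:-1], dims[1:]))


def hidden_unit_grid(input_dim, factors=(1.5, 2, 4)):
    """Candidate hidden sizes as multiples of the input dimension."""
    return [int(round(input_dim * f)) for f in factors]


# Hypothetical dataset: 100 input features, 8 output features, 50,000 samples.
input_dim, output_dim, num_samples = 100, 8, 50_000

for h in hidden_unit_grid(input_dim):
    n_params = mlp_param_count(input_dim, h, output_dim)
    # A low samples-per-parameter ratio is a hint that overfitting is more likely.
    print(f"hidden={h:4d}  params={n_params:7d}  "
          f"samples/param={num_samples / n_params:.2f}")
```

The samples-per-parameter ratio is only a heuristic for choosing which grid values are worth trying; validation performance on held-out data should make the final call.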