How to determine the range of num_hidden_unit given the input feature dimension. #267

AlbertXTang · 2025-09-02T16:04:23Z

Thank you for developing such a powerful tool! I'm currently experimenting with grid search and wondering whether there are any general principles for choosing the number of hidden units given the input data dimensions.
Specifically, can the number of hidden units exceed the input dimensionality? For example, when the input has 70 features, is it reasonable to set the number of hidden units to 1.5 to 2 times that in order to capture more information?

Answered by stes

stes · 2025-09-02T18:22:49Z

stes
Sep 2, 2025
Maintainer

Hi @AlbertXTang , thanks for asking.

> Specifically, can the number of hidden units exceed the input data dimensions? For example, when the input data is 70, is it reasonable to consider setting 1.5 to 2 times the number of hidden units to capture more information?

Yes, this is possible and in fact recommended. The number of hidden units (alongside the model architecture) determines the capacity (number of parameters) of the model. With more capacity and very little data, the network becomes more likely to overfit. With less capacity, you might underfit. The factor (1.5, 2, ...) mainly depends on how much data you have available (samples/timesteps). If you have enough data to avoid overfitting, it is certainly possible to train a model with a hidden dim of 256 or even higher.

Projection to a lower-dimensional space is done in the last layer, and the number of output features is indeed much lower than the input dimensionality.
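To make the link between hidden units and capacity concrete, here is a rough sketch that counts parameters for a network with one fully connected hidden layer. This is an illustrative simplification and not the library's actual architecture; the function name `mlp_param_count` and the output size of 3 are assumptions chosen for the example, while the 70 input features come from the question.

```python
def mlp_param_count(n_inputs: int, n_hidden: int, n_outputs: int) -> int:
    """Weights plus biases of a network with a single fully connected hidden layer."""
    hidden_layer = n_inputs * n_hidden + n_hidden    # input -> hidden
    output_layer = n_hidden * n_outputs + n_outputs  # hidden -> low-dim output
    return hidden_layer + output_layer


n_inputs = 70   # input features, as in the question
n_outputs = 3   # assumed embedding size, for illustration only

# The hidden layer may be wider than the input; the last layer still
# projects down to a much smaller output.
for n_hidden in (70, 105, 140, 256, 512):
    params = mlp_param_count(n_inputs, n_hidden, n_outputs)
    print(f"hidden={n_hidden:4d}  params={params:7d}")
```

In this simplified setting the parameter count grows linearly with the number of hidden units, which is why the amount of available data is the main constraint on how wide the network can be.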

The best practices notebook has some examples of how to tune the model in practice!

How large is your dataset?

2 replies

AlbertXTang Sep 3, 2025
Author

Thank you for your response!
My dataset contains 70 features and approximately 40,000 samples, and the sample size seems sufficient to accommodate more hidden units. I've also discovered that using more than 70 hidden units does indeed yield clearer embeddings!

stes Sep 3, 2025
Maintainer

Yes, in that range (without knowing any additional details about data complexity), it is definitely safe to go to a hidden dim of 256-512.
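For a grid search over this range, one way to assemble the candidate settings is sketched below. It combines the 1.5x and 2x factors from the question with the 256 and 512 values suggested above. The dictionary keys and the candidate output sizes are placeholders for illustration; they would need to be mapped to whatever parameter names the model constructor actually expects.

```python
from itertools import product

n_features = 70  # input dimensionality from the question

# Hidden sizes derived from the input dimension (1x, 1.5x, 2x) ...
factors = (1.0, 1.5, 2.0)
hidden_from_factors = [round(n_features * f) for f in factors]

# ... plus the larger values suggested for roughly 40,000 samples.
hidden_fixed = [256, 512]

hidden_grid = sorted(set(hidden_from_factors + hidden_fixed))

# Assumed candidate embedding sizes; pick values that suit your analysis.
output_dims = (3, 8)

# One dict per configuration; key names are hypothetical placeholders.
grid = [
    {"num_hidden_units": h, "output_dimension": d}
    for h, d in product(hidden_grid, output_dims)
]

for config in grid:
    print(config)
```

Each configuration can then be trained and compared on held-out data, so that overfitting at the larger hidden sizes shows up as a gap between training and validation performance.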
