The meaning of "samples seen scales" #446
Replies: 3 comments 1 reply
-
Hi @yuezewang, it refers to the number of image-text pairs seen by the model during training. For example, if a model has a batch size of 100k and trains for 100k steps, it "sees" 10 billion samples during training. This number can be scaled up and down; in the paper, we have experiments ranging from 3 billion to 34 billion samples seen.
-
Thank you for the kind answer! Does this mean one needs to repeatedly sample a pre-defined number of samples (3B / 13B / 34B) from the original dataset in order to construct a "new" dataset? Or should the number of steps simply accumulate during epoch-wise training until the pre-defined number of samples (3B / 13B / 34B) is reached?
-
@yuezewang, you don't need to explicitly construct a new dataset. For example, if you train on a dataset with 300M samples for 10 epochs, that is 3B samples seen. Does this make sense?
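The arithmetic from both replies can be sketched as two equivalent views of the same count. This is a minimal illustration; the function names are made up for this example, not part of OpenCLIP:

```python
def samples_seen_from_steps(batch_size: int, steps: int) -> int:
    """Total image-text pairs processed: one batch per optimizer step."""
    return batch_size * steps


def samples_seen_from_epochs(dataset_size: int, epochs: int) -> int:
    """Equivalent count when training epoch-wise over a fixed dataset."""
    return dataset_size * epochs


# The two examples from this thread:
print(samples_seen_from_steps(100_000, 100_000))  # 10 billion
print(samples_seen_from_epochs(300_000_000, 10))  # 3 billion
```

So "samples seen" is set by the training schedule (batch size times steps, or dataset size times epochs), not by materializing a resampled dataset.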
-
Hi! May I ask what "samples seen scales" means in the OpenCLIP paper (Reproducible scaling laws for contrastive language-image learning)?