Releases: softmax1/EsperBERTo

Results for Activations and Weights with 5x data

25 Aug 14:12
bd01079

I retrained EsperBERTo with both softmax0 and softmax1 on almost five times as much data as in the initial runs.
This time I also computed the kurtosis of the activations, rather than looking only at the weights.
There is still no difference in kurtosis between the two models.
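
For readers unfamiliar with the metric, here is a minimal sketch of how a kurtosis value can be computed for a tensor of weights or activations. It is not the code used for these runs: the function name `kurtosis` is hypothetical, and the choice of the Pearson (non-excess) definition, where a Gaussian scores about 3, is an assumption.

```python
import numpy as np

def kurtosis(values):
    """Pearson kurtosis: fourth central moment divided by squared variance.

    Assumption: non-excess definition, so Gaussian data gives roughly 3.
    The input (weights or activations) is flattened into one sample.
    """
    v = np.asarray(values, dtype=np.float64).ravel()
    centered = v - v.mean()
    variance = np.mean(centered ** 2)
    return float(np.mean(centered ** 4) / variance ** 2)

# Illustration on synthetic data only (not results from the model).
rng = np.random.default_rng(0)
gaussian = rng.normal(size=100_000)            # light tails
heavy_tailed = rng.standard_t(df=5, size=100_000)  # heavier tails, more outliers
print(kurtosis(gaussian), kurtosis(heavy_tailed))
```

A larger value indicates heavier tails, that is, more outlier entries, so comparing this number between the softmax0 and softmax1 models is one way to check whether either produces more outliers.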

Associated commits on Hugging Face hub:

Initial Results

29 Jul 20:44
e9181ae

The commit tagged v0.0.1 is equivalent to the one used to produce the initial results.
If more runs are done with this model, we'll need a way to distinguish the new results from the old ones.

Commits on Hugging Face hub associated with this release: