Releases: softmax1/EsperBERTo
Results for Activations and Weights with 5x data
I retrained EsperBERTo with both softmax0 and softmax1 on almost five times as much data as in the initial runs.
This time I also computed the kurtosis of the activations, rather than looking only at the weights.
The two models still show no difference in kurtosis.
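For readers unfamiliar with the statistic, here is a minimal sketch of a kurtosis calculation over a flat list of numbers, such as flattened weights or activations. It assumes the Pearson definition (fourth central moment divided by the squared variance, using population moments); the release notes do not say which definition or library was used, so the function name `kurtosis` and the `excess` flag are illustrative only.

```python
from typing import Iterable


def kurtosis(values: Iterable[float], excess: bool = False) -> float:
    """Pearson kurtosis of a sample, using population (biased) moments.

    Assumption: this is one common definition, not necessarily the one
    used to produce the results in this release.
    """
    data = [float(v) for v in values]
    n = len(data)
    if n == 0:
        raise ValueError("kurtosis is undefined for an empty sample")

    mean = sum(data) / n
    # Second and fourth central moments.
    m2 = sum((x - mean) ** 2 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    if m2 == 0.0:
        raise ValueError("kurtosis is undefined when the variance is zero")

    k = m4 / (m2 * m2)
    # A normal distribution has kurtosis 3, so excess kurtosis subtracts 3.
    return k - 3.0 if excess else k


if __name__ == "__main__":
    evenly_spread = [1, 2, 3, 4, 5]
    with_outlier = [0] * 99 + [100]
    print(kurtosis(evenly_spread))
    print(kurtosis(with_outlier))
```

A sample with a single large outlier yields a much higher value than an evenly spread one, which is why kurtosis is a convenient summary of outlier weights or activations.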
Associated commits on Hugging Face hub:
- chriswmurphy/esperberto-softmax1, f53e3099de21468ee7bdba5297e6114949a03085
- chriswmurphy/esperberto-softmax0, 1cd186cef8671372f247950c8d94663fc2ba4f3e
- chriswmurphy/esperanto, 20fae78fd138f6977312427256004820278b0653
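To reproduce against exactly these commits, the model repositories can be fetched at a pinned revision. The sketch below assumes the `huggingface_hub` package is installed and uses its `snapshot_download` function; the names `PINNED_REVISIONS` and `download_pinned` are hypothetical helpers, not part of this project. The `chriswmurphy/esperanto` repository is left out because the notes do not state its repository type (a dataset repository would also need `repo_type="dataset"`).

```python
# Commit hashes copied from the list above (model repositories only).
PINNED_REVISIONS = {
    "chriswmurphy/esperberto-softmax1": "f53e3099de21468ee7bdba5297e6114949a03085",
    "chriswmurphy/esperberto-softmax0": "1cd186cef8671372f247950c8d94663fc2ba4f3e",
}


def download_pinned(repo_id: str) -> str:
    """Download one repository at its pinned commit and return the local path.

    Hypothetical helper; requires network access and `huggingface_hub`.
    """
    # Imported here so the mapping above can be used without the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, revision=PINNED_REVISIONS[repo_id])


if __name__ == "__main__":
    for repo in PINNED_REVISIONS:
        print(repo, "->", download_pinned(repo))
```

Pinning by commit hash rather than by branch name keeps later pushes to the same repository from silently changing what gets downloaded.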
Initial Results
The commit tagged v0.0.1 is equivalent to what was used to produce the initial results.
If more runs are done with this model, we'll need a way to distinguish the new results from the old ones.
Commits on Hugging Face hub associated with this release:
- chriswmurphy/esperberto-softmax1, 5557a704fe787fc332c355b9ce3b69cf5973629
- chriswmurphy/esperberto-softmax0, c34e937b69171e92e35a9e42c64b6db89c68fbc5
- chriswmurphy/esperanto, 0fef7d18daee7d8f3aa8dd2a011d90046d94e71f