world's stupidest MoE LLM in 103M parameters
Beens-MiniMax (base model) - https://www.kaggle.com/models/abineshmathivanan/beens-2
Beens-MiniMax (SFT-tuned LoRA) - https://www.kaggle.com/models/abineshmathivanan/beens-lora/pyTorch/default
Beens-MiniMax (notebook implementation) - https://www.kaggle.com/code/abineshmathivanan/beens-mini-llm