@@ -170,10 +170,10 @@ Note: The 3.28 target was selected to match [Andrej Karpathy's GPT-2 (small) rep
 66 | 1.595 minutes | Torch 2.10 | 01/31/26 | - | -
 67 | 1.540 minutes | [Tune fused softcap kernels and fuse fp8 quantization in LM head](https://x.com/classiclarryd/status/2021015642472869978) | 01/31/26 | [log](records/track_1_short/2026-01-24_ImprovedLMHead/record/73a071ac-522d-4ce0-a4d6-cf3955a376e4.txt), [PR](https://github.com/KellerJordan/modded-nanogpt/pull/207) | @andrewbriand8
 68 | 1.535 minutes | [Move bigram hash to GPU](https://x.com/classiclarryd/status/2021450730117460439) | 01/31/26 | [log](records/track_1_short/2026-01-31-BigramHashH2D/112c686e-b0d6-4dc8-814a-1ad1f5d5b274.txt), [PR](https://github.com/KellerJordan/modded-nanogpt/pull/216) | @dhrvji
-69 | 1.528 minutes | Kernel Optimizations | 02/02/26 | [log](records/track_1_short/2026-02-02_KernelTuning/25afb73a-332f-4d69-b9ab-f6261497f2d8.txt), [PR](https://github.com/KellerJordan/modded-nanogpt/pull/217) | @EmmettBicker & AI System [Aster](https://www.asterlab.ai/)
-70 | 1.521 minutes | Tune value embed layout and ve_gates | 02/03/26 | [log](records/track_1_short/2026-02-03_VeTuned/42cbebac-0599-4a89-a00e-26d1c4cad140.txt), [PR](https://github.com/KellerJordan/modded-nanogpt/pull/218) | @photon_mz
-71 | 1.516 minutes | Sparse bigram gradient comms and optimized loading on CPU | 02/06/26 | [log](records/track_1_short/2026-02-06_SparseBigramGradient/02fee7bd-cd22-478b-9e8e-12e857ff3152.txt), [PR](https://github.com/KellerJordan/modded-nanogpt/pull/221) | @roeeshenberg
-71 | 1.496 minutes | Increase minimum lr and add max_seq_len schedule | 02/10/26 | [log](records/track_1_short/2026-02-10_ShortWindow/Short-Window_1_1.txt), [PR](https://github.com/KellerJordan/modded-nanogpt/pull/224) | @dualverse-ai & AI System [Station](https://github.com/dualverse-ai/station)
+69 | 1.528 minutes | [Kernel Optimizations](https://x.com/classiclarryd/status/2023319358303510719) | 02/02/26 | [log](records/track_1_short/2026-02-02_KernelTuning/25afb73a-332f-4d69-b9ab-f6261497f2d8.txt), [PR](https://github.com/KellerJordan/modded-nanogpt/pull/217) | @EmmettBicker & AI System [Aster](https://www.asterlab.ai/)
+70 | 1.521 minutes | [Tune value embed layout and ve_gates](https://x.com/classiclarryd/status/2023319358303510719) | 02/03/26 | [log](records/track_1_short/2026-02-03_VeTuned/42cbebac-0599-4a89-a00e-26d1c4cad140.txt), [PR](https://github.com/KellerJordan/modded-nanogpt/pull/218) | @photon_mz
+71 | 1.516 minutes | [Sparse bigram gradient comms and optimized loading on CPU](https://x.com/classiclarryd/status/2023319358303510719) | 02/06/26 | [log](records/track_1_short/2026-02-06_SparseBigramGradient/02fee7bd-cd22-478b-9e8e-12e857ff3152.txt), [PR](https://github.com/KellerJordan/modded-nanogpt/pull/221) | @roeeshenberg
+71 | 1.496 minutes | [Increase minimum lr and add max_seq_len schedule](https://x.com/classiclarryd/status/2023319358303510719) | 02/10/26 | [log](records/track_1_short/2026-02-10_ShortWindow/Short-Window_1_1.txt), [PR](https://github.com/KellerJordan/modded-nanogpt/pull/224) | @dualverse-ai & AI System [Station](https://github.com/dualverse-ai/station)
 ## Rules
 
 New records must: