Opening High-Score (HS) Track
Rationale: The NanoGPT speedrun has been effective at optimizing training speed, but at the expense of code readability. For instance, without a thorough understanding of floating-point formats, one might struggle to understand how the following code works:

```python
acc_m_u32 = (acc_bf16_view_u16.to(torch.uint32) << 16) | mantissa.to(torch.uint32)
acc_m_u32.view(torch.float32).mul_(1 - eff_weight_decay)
acc_m_u32.view(torch.float32).add_(other=v, alpha=-eff_lr)
acc_bf16_view_u16.copy_((acc_m_u32 >> 16).to(torch.uint16))
mantissa.copy_(acc_m_u32.to(torch.uint16))
```

It is even less clear why this implementation is faster than a direct approach.
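For reference, here is a NumPy sketch of what the snippet appears to compute: the optimizer state is a full fp32 value stored as two uint16 halves (a bf16 "high" half plus the low mantissa bits that bf16 would discard), reassembled for the update and split again afterward. The function and variable names below are my own, not from the repository:

```python
import numpy as np

def split_state_update(acc_hi_u16, mantissa_u16, v, eff_lr, eff_weight_decay):
    # Reassemble the fp32 accumulator: high 16 bits come from the bf16 half,
    # low 16 bits from the separately stored mantissa tensor.
    bits = (acc_hi_u16.astype(np.uint32) << 16) | mantissa_u16.astype(np.uint32)
    acc = bits.view(np.float32)
    # The optimizer step itself, in plain fp32 arithmetic:
    # weight decay, then the update step scaled by the learning rate.
    acc = acc * np.float32(1 - eff_weight_decay) - np.float32(eff_lr) * v
    # Split the result back into its two uint16 halves.
    out_bits = acc.view(np.uint32)
    return (out_bits >> 16).astype(np.uint16), out_bits.astype(np.uint16)
```

The appeal of the split representation is that the parameter tensor stays bf16 (so the forward/backward pass is unchanged) while the update is still performed at full fp32 precision.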
I propose opening a High-Score (HS) track aimed at balancing legibility and efficiency. This is my draft:
- Models must be trained on a predefined number `x` of tokens (e.g., 2 billion). These tokens must appear in the same sequence during training. Early exiting, skipping data, or using any piece of data more than once is prohibited.
- The total number of active parameters must not exceed `y` million (`y`M).
- The total training time must not exceed `z` minutes. The value of `z` should be slightly higher than the typical runtime of a standard training run (without NanoGPT-specific optimizations). Runs that exceed the time limit without processing all tokens will be disqualified.
- Evaluate the model on `w` predefined downstream NLP benchmarks. The score will be calculated as the average accuracy across these benchmarks.
- (Optional) Penalize 0.001% of the score for every valid line of code. This can serve as a Kolmogorov complexity penalty term, encouraging concise and efficient implementations.
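With the optional penalty, the scoring rule could be read as follows. This is a sketch: the function name is hypothetical, and the multiplicative form of the penalty is my assumption, since the draft does not pin down whether the 0.001% compounds per line:

```python
def hs_score(benchmark_accuracies, lines_of_code):
    """Hypothetical HS-track score: average accuracy over the predefined
    benchmarks, reduced by 0.001% of the score per valid line of code."""
    avg = sum(benchmark_accuracies) / len(benchmark_accuracies)
    # One reading of the penalty: each line shaves off 0.001% multiplicatively.
    return avg * (1 - 0.00001) ** lines_of_code
```

At this rate the penalty is gentle (a 1,000-line submission loses about 1% of its score), so it discourages bloat without dominating benchmark accuracy.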