Skip to content

fix: guard target-param-data-ratio against zero to avoid ZeroDivision…#579

Open
suraj-self wants to merge 3 commits intokarpathy:masterfrom
suraj-self:fix-scaling-zero-division
Open

fix: guard target-param-data-ratio against zero to avoid ZeroDivision…#579
suraj-self wants to merge 3 commits intokarpathy:masterfrom
suraj-self:fix-scaling-zero-division

Conversation

@suraj-self
Copy link

This PR adds a sanity check for the --target-param-data-ratio argument. Fix for #578

  • Prevents ZeroDivisionError when the user accidentally provides 0.
  • Ensures the script fails fast with a clear message: "target-param-data-ratio must be positive (or -1 to disable)".
  • Maintains compatibility with the -1 flag used to disable scaling-based iterations.

Logs

(nanochat) ss@ss-Predator-PH315-53:~/nanochat$ python -m scripts.base_train   --depth 1   --aspect-ratio 1   --target-param-data-ratio 0

                                                       █████                █████
                                                      ░░███                ░░███
     ████████    ██████   ████████    ██████   ██████  ░███████    ██████  ███████
    ░░███░░███  ░░░░░███ ░░███░░███  ███░░███ ███░░███ ░███░░███  ░░░░░███░░░███░
     ░███ ░███   ███████  ░███ ░███ ░███ ░███░███ ░░░  ░███ ░███   ███████  ░███
     ░███ ░███  ███░░███  ░███ ░███ ░███ ░███░███  ███ ░███ ░███  ███░░███  ░███ ███
     ████ █████░░████████ ████ █████░░██████ ░░██████  ████ █████░░███████  ░░█████
    ░░░░ ░░░░░  ░░░░░░░░ ░░░░ ░░░░░  ░░░░░░   ░░░░░░  ░░░░ ░░░░░  ░░░░░░░░   ░░░░░
    
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/ss/nanochat/scripts/base_train.py", line 82, in <module>
    assert args.target_param_data_ratio > 0 or args.target_param_data_ratio == -1, "target-param-data-ratio must be positive (or -1 to disable)"
AssertionError: target-param-data-ratio must be positive (or -1 to disable)
(nanochat) ss@ss-Predator-PH315-53:~/nanochat$

@suraj-self
Copy link
Author

Hi @svlandeg
I've implemented the fix using an assert to fail fast when 0 is provided, as it prevents the ZeroDivisionError and clarifies that -1 should be used to disable scaling.

However, would you prefer the script to gracefully fall back to the default Chinchilla ratio (10.5) when 0 is detected instead of crashing?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ZeroDivisionError in scaling law math when --target-param-data-ratio is set to 0

2 participants