Startup: add argument-consistency checks & summary table (Fixes #124) #409
Summary
Adds a lightweight validation layer and a configuration summary printed at startup, inspired by GPT-NeoX, resolving Issue #124.
Key features
- megatron/arguments.py
  - _validate_and_summarize_args(args) runs sanity checks:
    - hidden_size % num_attention_heads == 0
    - global_batch_size % data_parallel_size == 0
    - pad_vocab_size_to (if set) divisible by TP size
  - Raises ValueError if any rule fails, aborting early before costly init.
Why it matters
Misconfigurations (e.g., mismatched hidden/head sizes or bad batch divisibility) now surface at startup, before any costly initialization, instead of after hours of debugging and wasted GPU time.
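For reviewers who want the rules in one place, the three checks listed under Key features can be sketched roughly as follows. This is an illustrative stand-in, not the code in this PR: the function name `validate_args_sketch` is hypothetical, and the attribute names `tensor_model_parallel_size` and `data_parallel_size` are assumed to be how the TP and DP sizes appear on `args`.

```python
from types import SimpleNamespace


def validate_args_sketch(args):
    """Hypothetical sketch of the consistency checks; raises ValueError on failure."""
    errors = []

    # Each attention head must receive an equal slice of the hidden dimension.
    if args.hidden_size % args.num_attention_heads != 0:
        errors.append(
            f"hidden_size ({args.hidden_size}) must be divisible by "
            f"num_attention_heads ({args.num_attention_heads})"
        )

    # The global batch must split evenly across data-parallel replicas.
    if args.global_batch_size % args.data_parallel_size != 0:
        errors.append(
            f"global_batch_size ({args.global_batch_size}) must be divisible by "
            f"data_parallel_size ({args.data_parallel_size})"
        )

    # Only checked when the option is set (assumed attribute names).
    pad_to = getattr(args, "pad_vocab_size_to", None)
    tp_size = args.tensor_model_parallel_size
    if pad_to is not None and pad_to % tp_size != 0:
        errors.append(
            f"pad_vocab_size_to ({pad_to}) must be divisible by TP size ({tp_size})"
        )

    # Report every violated rule at once, before any costly initialization.
    if errors:
        raise ValueError("Invalid arguments:\n  " + "\n  ".join(errors))


# Usage: a consistent configuration passes silently.
good = SimpleNamespace(
    hidden_size=1024,
    num_attention_heads=16,
    global_batch_size=64,
    data_parallel_size=4,
    tensor_model_parallel_size=2,
    pad_vocab_size_to=None,
)
validate_args_sketch(good)
```

Collecting all failures before raising is a design choice of this sketch; it lets a user fix several bad flags in one pass rather than one per launch.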
Testing
- pytest -q tests: all existing tests pass.
- pretrain_gpt_tiny.sh on 1 GPU and 4 GPU runs; summary appears once on rank 0.
- hidden_size not divisible by heads: run aborts immediately with a clear error.
Backward compatibility
Purely additive logging/validation. No impact on training logic or performance.
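As a rough picture of the rank-0 summary mentioned under Testing, a minimal sketch might look like the following. The helper names `format_summary_sketch` and `print_summary_on_rank0` are hypothetical, and the rank is passed in explicitly here to keep the example self-contained; the real code presumably obtains it from the distributed runtime.

```python
def format_summary_sketch(settings):
    """Render a name/value mapping as an aligned two-column text table."""
    # Pad every name to the longest one so the values line up in a column.
    width = max((len(name) for name in settings), default=0)
    rows = [f"{name:<{width}}  {value}" for name, value in sorted(settings.items())]
    return "\n".join(rows)


def print_summary_on_rank0(settings, rank):
    """Print the table once per job by restricting output to rank 0."""
    if rank == 0:
        print(format_summary_sketch(settings))


# Usage: only the call with rank == 0 produces output.
config = {"hidden_size": 1024, "num_attention_heads": 16, "global_batch_size": 64}
for rank in range(4):
    print_summary_on_rank0(config, rank)
```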
Fixes #124