ry2009 commented Nov 26, 2025

Summary

  • Add a safe vecenv fallback in create_environments: if breakout.so cannot be loaded or does not export the required symbols (OBS_N/ACT_N/OBS_T/ACT_T), fall back to a minimal dummy VecEnv instead of crashing.
  • Fix batched_forward state shape to {num_layers, minibatch_segments, 1, hidden_size}, matching PolicyMinGRU.
  • Remove unused #include <stdatomic.h> in vecenv.h.
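
To illustrate the fallback described in the first bullet, here is a minimal Python sketch of the general pattern: try to load the shared library, probe for the required symbols, and return a dummy env when either step fails. It is not the code in this PR. `DummyVecEnv`, `missing_symbols`, and `load_vecenv` are hypothetical names, and the dummy's interface (`reset`/`step` returning zeros) is an assumption; only the four symbol names come from the description.

```python
import ctypes

# Symbol names taken from the PR description.
REQUIRED_SYMBOLS = ("OBS_N", "ACT_N", "OBS_T", "ACT_T")


class DummyVecEnv:
    """Hypothetical placeholder env: zero observations, zero rewards, never done."""

    def __init__(self, num_envs=1, obs_n=4):
        self.num_envs = num_envs
        self.obs_n = obs_n

    def reset(self):
        return [[0.0] * self.obs_n for _ in range(self.num_envs)]

    def step(self, actions):
        # Actions are ignored; the dummy only keeps the benchmark loop running.
        return self.reset(), [0.0] * self.num_envs, [False] * self.num_envs


def missing_symbols(lib, names=REQUIRED_SYMBOLS):
    """Return the required names that `lib` does not expose."""
    return [name for name in names if not hasattr(lib, name)]


def load_vecenv(path):
    """Return (env, used_fallback); fall back to the dummy instead of crashing."""
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        # Library file is missing or cannot be loaded.
        return DummyVecEnv(), True
    if missing_symbols(lib):
        # Library loaded but lacks at least one required symbol.
        return DummyVecEnv(), True
    return lib, False
```

Returning a `used_fallback` flag alongside the env lets the caller label any SPS figure as coming from the dummy rather than the real environment.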

Why

  • Previously, python -m pufferlib.pufferl sps would segfault when the native env shared library was unavailable or did not export OBS_N/ACT_N/OBS_T/ACT_T.
  • The incorrect state shape in batched_forward could lead to shape mismatches for RNN policies.
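
As a rough illustration of the shape contract, the sketch below builds the expected state shape in the dimension order given in the summary and rejects anything else with a clear error. The helper names are hypothetical and not part of pufferlib, and the meaning of the singleton third dimension is not stated in this PR, so it is treated here as a fixed 1.

```python
def expected_state_shape(num_layers, minibatch_segments, hidden_size):
    """Shape order as given in the PR summary; the third dim is a fixed 1."""
    return (num_layers, minibatch_segments, 1, hidden_size)


def check_state_shape(shape, num_layers, minibatch_segments, hidden_size):
    """Raise early with a readable message instead of failing inside the RNN."""
    expected = expected_state_shape(num_layers, minibatch_segments, hidden_size)
    if tuple(shape) != expected:
        raise ValueError(f"state shape {tuple(shape)} != expected {expected}")
    return expected
```

For example, `check_state_shape((2, 8, 128), 2, 8, 128)` raises `ValueError` because the singleton dimension is missing.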

Notes / Perf

  • Fused CUDA kernels (including RMSNorm) build and run; on A100×2:
    • compile_puffer.py: ~7.35M inference SPS, ~2.45M train SPS (model-only microbench).
    • python -m pufferlib.pufferl sps puffer_nmmo3: runs without crashing; ~2.7M SPS with the dummy fallback.
  • To report real env SPS, a native NMMO3 vecenv .so exporting OBS_N/ACT_N/OBS_T/ACT_T is still needed.
