Hello, this is very impressive work! I’m currently trying to run visualization analyses on models such as Mamba-2, and I’d like to build everything from scratch. I noticed entries like:
"global_channel_indexes": "./remapping_configs/precomputed_tensors/mamba2-1.3b_global_channel_indexes.pth",
"per_channel_decays": "./remapping_configs/precomputed_tensors/mamba2-1.3b_per_channel_decays.pth"
Could you please share how these .pth files were computed? I couldn’t find the corresponding scripts—would it be possible to open-source them?
Additionally, I’m very curious about the lookup_func functions. I couldn’t locate their implementations either, and I see that in the mamba2.py file, the default values for several related parameters are set to None.
Thank you very much!