80B total / 13B active MoE with good benchmarks. Seems right up ik_llama.cpp's alley, i.e. expert offloading like DeepSeek.
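As a back-of-the-envelope sketch of why expert offloading fits a model like this (illustrative numbers only, not Hunyuan's actual expert/routing config): since only the routed experts' weights are touched per token, the rest can sit in slower CPU RAM.

```python
# Rough arithmetic for a sparse MoE with 80B total / 13B active params.
# Expert count and top-k below are hypothetical, just for illustration.
total_params = 80e9
active_params = 13e9

# Fraction of weights actually touched per token -- the rest are
# candidates for offloading to host memory.
active_fraction = active_params / total_params
print(f"~{active_fraction:.0%} of weights touched per token")
```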
It uses a custom architecture with good old GQA and NTK RoPE scaling. At a glance it doesn't look like anything too exotic: https://huggingface.co/tencent/Hunyuan-A13B-Instruct/tree/main
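For reference, NTK-aware RoPE scaling just inflates the rotary base so the low-frequency dimensions stretch to cover a longer context. A minimal sketch (the `alpha` exponent form is the common NTK-aware recipe; the specific values are illustrative, not Hunyuan's config):

```python
def ntk_scaled_rope_freqs(dim: int, base: float = 10000.0, alpha: float = 1.0):
    # NTK-aware scaling: raise the base by alpha^(dim/(dim-2)) so that
    # positions interpolate smoothly at extended context lengths.
    scaled_base = base * alpha ** (dim / (dim - 2))
    # Standard RoPE inverse frequencies on the scaled base.
    return [scaled_base ** (-2 * i / dim) for i in range(dim // 2)]

freqs = ntk_scaled_rope_freqs(dim=128, alpha=8.0)
```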
Relevant upstream llama.cpp issue: ggml-org/llama.cpp#14415