
Releases: PaddlePaddle/PaddleFormers

PaddleFormers v0.3

18 Sep 12:35 · 314910e


PaddleFormers 0.3 is officially released! This release introduces several key features and improvements:

✨ New Features

1. Hugging Face safetensors weight loading & saving

PaddleFormers can now load and save model weights in the Hugging Face safetensors format.

from paddleformers.transformers import AutoTokenizer, AutoModelForCausalLM

# Load Hugging Face safetensors weights, converting them for Paddle on the fly
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-0.6B-Base",
    convert_from_hf=True
)
# Save the weights back out in Hugging Face safetensors format
model.save_pretrained("Qwen/Qwen3-0.6B-Base-new", save_to_hf=True)

2. New model support

Added support for the following models (a loading example follows the list):

  • qwen2
  • qwen3
  • qwen2moe
  • qwen3moe
  • ernie4_5
  • ernie4_5_moe
  • gpt_oss
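
Any of these architectures loads through the same Auto classes and convert_from_hf path shown above. A brief sketch — the checkpoint name below is illustrative; substitute any hosted checkpoint for one of the listed architectures:

from paddleformers.transformers import AutoModelForCausalLM

# Load a newly supported architecture (here ernie4_5) directly from
# Hugging Face safetensors weights; the repo id is illustrative.
model = AutoModelForCausalLM.from_pretrained(
    "baidu/ERNIE-4.5-0.3B-PT",
    convert_from_hf=True
)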

3. Generalized large model modules (paddleformers/nn)

Introduced a generalized module library for large models that reduces the cost of adding distributed training support to new models.
Includes:

  • Attention
  • Embedding
  • Pipeline parallel model
  • Normalization
  • MLP
  • LM Head
  • Linear

You can check out the implementation details here:
https://github.com/PaddlePaddle/PaddleFormers/tree/develop/paddleformers/nn
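
As a rough sketch of why generalized blocks cut integration cost: a new model can compose parallelism-aware layers instead of re-implementing them per model. The code below is hypothetical — attention_cls, mlp_cls, and norm_cls are stand-ins for whatever paddleformers/nn actually exports; see the link above for the real class names:

import paddle.nn as nn

class DecoderLayer(nn.Layer):
    # attention_cls / mlp_cls / norm_cls stand in for the generalized
    # blocks under paddleformers/nn; actual names live in the repository.
    def __init__(self, config, attention_cls, mlp_cls, norm_cls):
        super().__init__()
        self.input_norm = norm_cls(config.hidden_size)
        self.self_attn = attention_cls(config)  # parallelism handled inside the block
        self.post_norm = norm_cls(config.hidden_size)
        self.mlp = mlp_cls(config)

    def forward(self, hidden_states, attention_mask=None):
        residual = hidden_states
        hidden_states = residual + self.self_attn(self.input_norm(hidden_states), attention_mask)
        residual = hidden_states
        return residual + self.mlp(self.post_norm(hidden_states))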

PaddleFormers v0.2

04 Sep 09:33 · c905bd9


PaddleFormers 0.2 is officially released! This release introduces several key features and improvements:

✨ New Features

1. Multi-source Model Download

  • Added support for downloading models from the Hugging Face Hub, ModelScope, and AI Studio, making model access more flexible and convenient; see the sketch below.
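
A minimal sketch of selecting a download source. The download_hub argument is an assumption modeled on PaddleFormers usage examples — verify the exact parameter name and accepted values against the documentation:

from paddleformers.transformers import AutoTokenizer

# Fetch the same tokenizer from different sources; the `download_hub`
# values ("huggingface" / "modelscope" / "aistudio") are assumed here.
tokenizer = AutoTokenizer.from_pretrained(
    "Qwen/Qwen3-0.6B-Base",
    download_hub="huggingface"
)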

2. HuggingFace Tokenizer Compatibility

  • PaddleFormers now wraps Hugging Face tokenizers, letting users leverage the Hugging Face tokenizer ecosystem directly while keeping the PaddleFormers experience consistent.
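
In practice, the familiar Hugging Face-style tokenizer API works unchanged:

from paddleformers.transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B-Base")

# Hugging Face-style calls: __call__ returns input_ids, decode round-trips them
encoded = tokenizer("PaddleFormers wraps Hugging Face tokenizers.")
print(encoded["input_ids"])
print(tokenizer.decode(encoded["input_ids"]))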

3. Lazy Import Optimization

  • Introduced a lazy-import mechanism, enabling the Tokenizer module to be used independently without requiring a Paddle installation.
  • This makes it easier to use the Tokenizer in lightweight scenarios such as preprocessing or pure inference, while improving modularity and usability; see the sketch below.
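
A hedged illustration: in an environment where paddleformers' tokenizer dependencies are installed but Paddle is not, tokenizer-only usage still works:

# Runs without Paddle installed: lazy imports keep the tokenizer module
# from pulling in paddle at import time.
import importlib.util
assert importlib.util.find_spec("paddle") is None, "demo assumes Paddle is absent"

from paddleformers.transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B-Base")
ids = tokenizer("offline preprocessing, no Paddle required")["input_ids"]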

PaddleFormers v0.1

29 Jun 13:36 · 053a5a5


PaddleFormers 0.1 is officially released! This initial version supports the SFT/DPO training paradigms and configurable distributed training via a unified Trainer API, and integrates PEFT, MergeKit, and Quantization APIs for diverse LLM applications.

Highlights

⚙️ Simplified Distributed Training

Implements 4D parallel strategies (data, sharding, tensor, and pipeline parallelism) through a unified Trainer API, lowering the barrier to distributed LLM training.
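
As a sketch, the parallel layout is expressed through training arguments rather than model code. The import path and argument names below mirror PaddleNLP's Trainer and are assumptions here, not confirmed PaddleFormers API:

from paddleformers.trainer import TrainingArguments  # import path assumed

args = TrainingArguments(
    output_dir="./checkpoints",
    per_device_train_batch_size=1,
    tensor_parallel_degree=2,    # tensor (model) parallelism
    pipeline_parallel_degree=2,  # pipeline parallelism
    sharding="stage2",           # sharded data parallelism
)
# Remaining devices form the data-parallel dimension, completing the 4D layout.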

🛠 Efficient Post-Training

Integrates Packing dataflow and FlashMask operators for SFT/DPO training, eliminating padding waste and boosting throughput.
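
The Packing idea itself is simple: concatenate variable-length samples into fixed-length blocks so no tokens are wasted on padding, with FlashMask-style attention masking keeping samples from attending across boundaries. A minimal, framework-agnostic sketch of the packing step (not PaddleFormers' actual dataflow):

def pack(samples, max_len, sep_id):
    """Greedily concatenate tokenized samples into blocks of at most max_len."""
    blocks, current = [], []
    for ids in samples:
        if current and len(current) + len(ids) + 1 > max_len:
            blocks.append(current)
            current = []
        current += ids + [sep_id]
    if current:
        blocks.append(current)
    return blocks

# Three short samples become one dense block instead of three mostly-padding sequences.
print(pack([[1, 2], [3, 4, 5], [6]], max_len=16, sep_id=0))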

💾 Industrial Storage Solution

Features Unified Checkpoint storage tools for LLMs, enabling training resumption and dynamic resource scaling. Additionally implements asynchronous storage (up to 95% faster) and Optimizer State Quantization (78% storage reduction), ensuring industrial training meets both efficiency and stability requirements.
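
These storage features are typically toggled through training arguments. The flags below follow PaddleNLP's unified-checkpoint options and should be treated as assumptions for PaddleFormers:

from paddleformers.trainer import TrainingArguments  # import path assumed

args = TrainingArguments(
    output_dir="./checkpoints",
    unified_checkpoint=True,                 # resharding-friendly unified format
    unified_checkpoint_config="async_save",  # asynchronous saving (assumed flag)
)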