Releases: SeanLee97/AnglE
v0.6.0
What's Changed
- refactor by @SeanLee97 in #112
Detailed changes:
- use uv to manage dependencies
- simplify the implementation
- Remove all imports of AngleDataTokenizer
- Remove all imports of DatasetFormats
- Remove all .map(AngleDataTokenizer(...)) calls
- Update dataset field names (text → query for Format B/C) OR use --column_rename_mapping
- Add is_llm=True to LLM model initialization
- Replace --prompt_template with --text_prompt, --query_prompt, or --doc_prompt
- Update training scripts to use accelerate launch
- Update evaluation code if using the return value
- Support input data as a list of strings. New data formats:
- A: {"text1": str | List[str], "text2": str | List[str], "label": float}
- B: {"query": str | List[str], "positive": str | List[str]}
- C: {"query": str | List[str], "positive": str | List[str], "negative": str | List[str]}
- Support fsdp training
- Update docs
Migration guide: https://github.com/SeanLee97/AnglE/blob/main/MIGRATION_GUIDE.md
Full Changelog: v0.5.6...v0.6.0
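The three data formats listed above can be illustrated with plain Python dictionaries. This is a minimal sketch, not AnglE's own code: `detect_format` and `rename_text_to_query` are hypothetical helpers written for illustration, and the sample sentences are made up. The rename helper mirrors the `text` → `query` field change for Formats B/C by hand; see the migration guide for the supported way to do this.

```python
from typing import Dict, List, Set, Union

# A field may hold a single string or a list of strings (new in v0.6.0).
Text = Union[str, List[str]]

# Keys of each format, copied from the release notes.
FORMAT_KEYS: Dict[str, Set[str]] = {
    "A": {"text1", "text2", "label"},
    "B": {"query", "positive"},
    "C": {"query", "positive", "negative"},
}


def detect_format(record: dict) -> str:
    """Hypothetical helper: return 'A', 'B', or 'C' based on the record's keys."""
    keys = set(record)
    for name, required in FORMAT_KEYS.items():
        if keys == required:
            return name
    raise ValueError(f"record keys {sorted(keys)} match no known format")


def rename_text_to_query(record: dict) -> dict:
    """Hypothetical helper: copy a record, renaming an old 'text' field to 'query'."""
    out = dict(record)
    if "text" in out and "query" not in out:
        out["query"] = out.pop("text")
    return out


# Format A: a scored text pair.
example_a = {
    "text1": "A man is playing a guitar.",
    "text2": "Someone is playing an instrument.",
    "label": 0.8,
}

# Format B: query and positive; either field may be a list of strings.
example_b = {
    "query": ["what is a sentence embedding?", "define sentence embedding"],
    "positive": "A sentence embedding maps a sentence to a fixed-size vector.",
}

# Format C built from an old-style record that still uses 'text'.
old_c = {
    "text": "capital of France",
    "positive": "Paris is the capital of France.",
    "negative": "Berlin is the capital of Germany.",
}
example_c = rename_text_to_query(old_c)
```

Calling `detect_format` on each record before training is one cheap way to catch a field that was not renamed during migration.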
v0.5.6
What's Changed
- fix AttributeError caused by last_hidden_state by @SeanLee97 in #107
Full Changelog: v0.5.5...v0.5.6
v0.5.2
What's Changed
- Fix incorrect mean pooling. Thanks to @ir2718 for the contribution in #102
- Support mean pooling (alias to avg) to be compatible with sentence-transformers.
- Support the configuration for save_total_limit in angle_trainer
Full Changelog: v0.5.1...v0.5.2
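The release notes do not say what was wrong with the earlier mean pooling. As a generic illustration only, the sketch below shows mask-aware mean pooling, where padding positions are excluded from the average (a common source of pooling bugs). The function `mean_pool` and the toy inputs are assumptions for this example and are not taken from AnglE's source.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average token vectors, counting only positions where the mask is 1."""
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("embeddings and mask must have the same length")
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    if count == 0:
        raise ValueError("attention mask selects no tokens")
    return [t / count for t in total]


# Two real tokens followed by one padding token.
tokens = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
```

Dividing by the number of unmasked tokens rather than the full sequence length keeps the result independent of how much padding a batch happens to add.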
v0.5.1
v0.4.7
v0.4.4
v0.4.0
What's Changed
- Add sphinx 📘 document: https://angle.readthedocs.io/en/latest/index.html
- Reimplement 2DMSE; rename 2DMSE to ☕️ Espresso in #72
- Support fine-tuning and inferring BiLLM-based sentence embeddings in #72
- Upgrade angle-trainer in #72
- Support converting AnglE model to sentence-transformers in #72
- Overhaul README in #69
Full Changelog: v0.3.10...v0.4.0