Releases: MinishLab/semhash
v0.4.1
v0.4.0
What's Changed
- chore: Deprecated Python 3.9 support by @Pringled in #74
- feat: Add selected_with_duplicates caching by @Pringled in #75
- feat: Add Pyversity for MMR, add configurable diversity re-ranking, deprecate lambda_param by @Pringled in #76
- chore: Deprecate 'deduplicated' and 'duplicates' attributes by @Pringled in #77
- feat: Add from_embeddings functionality, refactored utils by @Pringled in #78
- chore: Deprecated use_ann by @Pringled in #79
- docs: Moved benchmarks results to benchmarks directory by @Pringled in #80
- fix: Fixed from_embeddings exact duplicate removal by @Pringled in #82
- feat: Added multimodal support by @Pringled in #83
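As a rough illustration of what MMR-style diversity re-ranking (#76) does, the sketch below implements a generic greedy Maximal Marginal Relevance selection in plain Python. It is not the Pyversity or SemHash implementation: the function names and the meaning given to 'diversity' here (0.0 ranks purely by relevance, 1.0 purely by novelty) are assumptions made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mmr_select(embeddings, relevance, k, diversity=0.5):
    """Greedily pick up to k indices, trading relevance against redundancy.

    Hypothetical helper: 'diversity' weights the redundancy penalty
    (an assumption of this sketch, not a documented library contract).
    """
    selected = []
    candidates = list(range(len(embeddings)))
    while candidates and len(selected) < k:

        def score(i):
            # Redundancy = highest similarity to anything already picked.
            redundancy = max(
                (cosine(embeddings[i], embeddings[j]) for j in selected),
                default=0.0,
            )
            return (1.0 - diversity) * relevance[i] - diversity * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Items 0 and 1 are near-duplicates; item 2 points in a different direction.
embeddings = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]]
relevance = [0.9, 0.85, 0.5]

by_relevance = mmr_select(embeddings, relevance, k=2, diversity=0.0)
by_diversity = mmr_select(embeddings, relevance, k=2, diversity=0.7)
print(by_relevance)  # [0, 1]
print(by_diversity)  # [0, 2]
```

With no diversity weight the two near-duplicates are both returned; with a higher weight the second pick moves to the dissimilar item.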
Deprecation warnings ⚠️
- 'deduplicated' and 'duplicates' attributes are deprecated in favor of 'selected' and 'filtered'
- 'use_ann' is deprecated for SemHash constructors. Use 'ann_backend=Backend.BASIC' instead
- 'lambda_param' is deprecated for 'self_find_representative()' and 'find_representative()'. Use 'diversity' instead.
- Python 3.9 is no longer officially supported
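The attribute deprecations above follow a common pattern: the old name keeps working but emits a warning and forwards to the new name. The sketch below shows that pattern with a hypothetical stand-in class. It is not SemHash's actual result class, and the reading of 'selected' as kept records and 'filtered' as removed records is an assumption of this example.

```python
import warnings


class DeduplicationResultSketch:
    """Hypothetical result object used only to illustrate deprecated aliases."""

    def __init__(self, selected, filtered):
        self.selected = selected  # assumed: records that were kept
        self.filtered = filtered  # assumed: records that were removed

    @property
    def deduplicated(self):
        # Deprecated alias: warn, then forward to the new attribute.
        warnings.warn(
            "'deduplicated' is deprecated; use 'selected' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.selected

    @property
    def duplicates(self):
        # Deprecated alias: warn, then forward to the new attribute.
        warnings.warn(
            "'duplicates' is deprecated; use 'filtered' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.filtered


result = DeduplicationResultSketch(selected=["a", "b"], filtered=["a"])

# Record warnings so the deprecated access can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_value = result.deduplicated  # still works, but warns

print(old_value == result.selected)
print(caught[0].category.__name__)
```

Code that still reads the old attribute names can surface such warnings in its own test runs with Python's -W error::DeprecationWarning option, assuming the library raises them as DeprecationWarning.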
Full Changelog: v0.3.3...v0.4.0
v0.3.3
v0.3.2
What's Changed
- docs: edited broken link by @bravesasha in #66
- feat: Add configurable ANN backends by @Pringled in #67
New Contributors
- @bravesasha made their first contribution in #66
Full Changelog: v0.3.1...v0.3.2
v0.3.1
v0.3.0
What's Changed
- feat: Add entropy scoring functionality by @davidberenstein1957 in #25
- Refactor results data model by @davidberenstein1957 in #40
- feat: Updates for entropy filtering by @Pringled in #55
- feat: Added outlier and representative filtering based on average cosine similarity by @Pringled in #56
- Bump version by @Pringled in #58
- fix: Fixed setuptools bug by @Pringled in #59
New Contributors
- @davidberenstein1957 made their first contribution in #25
Full Changelog: 0.2.1...v0.3.0
0.2.1
What's Changed
- docs: Add threshold and explainability to docs by @Pringled in #27
- fix broken link in README by @stephantul in #28
- docs: Added pandas dataframe example by @Pringled in #33
- docs: update README.md by @eltociear in #37
- docs: Added citation by @Pringled in #38
- Fix wrong bibtex citation format by @amitness in #39
- fix: issue with exact duplicates not being returned by @stephantul in #35
- docs: Added discord badge by @Pringled in #45
- Bump version by @Pringled in #50
New Contributors
- @eltociear made their first contribution in #37
- @amitness made their first contribution in #39
Full Changelog: v0.2.0...0.2.1
v0.2.0
What's Changed
- feat: Added ci by @Pringled in #13
- docs: Add badges by @Pringled in #14
- feat: Make ann default by @Pringled in #15
- docs: Added encoder documentation and rewrote general docs by @Pringled in #16
- fix: typing, use_ann argument by @stephantul in #17
- feat: Add deduplication records by @stephantul in #19
- feat: Refactor semhash logic by @Pringled in #20
- Add rethresholding by @stephantul in #22
- Add records by @stephantul in #23
- feat: Updated documentation, added docstrings, changed least_similar functionality by @Pringled in #24
- Bumped version by @Pringled in #26
Full Changelog: v0.1.0...v0.2.0
v0.1.0
What's Changed
- feat: Add initial code by @Pringled in #1
- feat: Switch to nearest by @Pringled in #2
- feat: Switched to vicinity by @Pringled in #3
- feat: Updated vicinity integration by @Pringled in #4
- feat: Switch to column based approach by @Pringled in #5
- feat: Added encoder protocol by @Pringled in #6
- feat: Added ann support by @Pringled in #7
- feat: Added exact duplicate removal by @Pringled in #8
- feat: Added benchmarks by @Pringled in #9
- docs: Updated documentation by @Pringled in #10
- feat: Fixed path and added ann to tests by @Pringled in #11
- feat: Prepared package for release by @Pringled in #12
New Contributors
Full Changelog: https://github.com/MinishLab/semhash/commits/v0.1.0