@giladgd giladgd commented Jan 1, 2025

Description of change

  • feat: token prediction (speculative decoding)
  • feat: DraftSequenceTokenPredictor
  • feat: InputLookupTokenPredictor
  • feat: controlledEvaluate
  • feat: reranking (LlamaRankingContext)
  • feat: experimentalChunkDocument
  • feat: evaluateWithMetadata
  • feat: token confidence
  • feat: build on arm64 using LLVM, use Visual Studio's CMake when available
  • feat: try compiling with LLVM on Windows x64 when available
  • feat(minor): dynamically load llama.cpp backends
  • feat(minor): more token values support in SpecialToken
  • feat(minor): improve memory usage estimation
  • fix: check for Rosetta usage on macOS x64 when using the inspect gpu command
  • fix: detect running under Rosetta on Apple Silicon and show an error message instead of crashing
  • fix: switch from "nextTick" to "nextCycle" for the default batch dispatcher
  • fix: remove deprecated CLS token
  • fix: pipe error logs in inspect gpu command
  • docs: improve building from source
  • docs: CUDA in Docker troubleshooting
  • docs: context shift strategy
  • docs: improve type docs and types
  • docs: user input safety
  • docs: sitemap fixes
  • docs: remove Intel AMX trick, since it's being automatically used in the prebuilt binaries now
  • docs: parse custom cmake options nested under ifs
  • docs: update custom cmake options
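
The list above names `InputLookupTokenPredictor` without describing it. As a rough illustration of the general idea behind input-lookup token prediction (find the most recently generated tokens in the input, then propose the tokens that followed them there as a draft), here is a minimal sketch on plain arrays of token ids. The function `predictFromInput` and its parameters `patternLength` and `maxPredictions` are hypothetical names chosen for this example. They are not the library's API and may not reflect how the predictor added in this PR is implemented.

```typescript
// Hypothetical helper, NOT part of node-llama-cpp: illustrates the general
// input-lookup idea on plain arrays of token ids.
function predictFromInput(
    inputTokens: number[],
    generatedTokens: number[],
    patternLength: number = 2,
    maxPredictions: number = 3
): number[] {
    if (patternLength <= 0 || generatedTokens.length < patternLength)
        return [];

    // The most recently generated tokens form the pattern to search for
    const pattern = generatedTokens.slice(-patternLength);

    // Scan the input from the end so that the latest match wins
    for (let start = inputTokens.length - patternLength; start >= 0; start--) {
        let matches = true;
        for (let i = 0; i < patternLength; i++) {
            if (inputTokens[start + i] !== pattern[i]) {
                matches = false;
                break;
            }
        }

        if (matches) {
            // Propose the tokens that followed the pattern in the input
            const followStart = start + patternLength;
            return inputTokens.slice(followStart, followStart + maxPredictions);
        }
    }

    // No match found: no draft tokens to propose
    return [];
}

const input = [10, 11, 12, 13, 14, 15];
const generated = [7, 11, 12];
console.log(predictFromInput(input, generated)); // [ 13, 14, 15 ]
```

In speculative decoding, a draft like this is only a guess: the main model still validates the proposed tokens, so a wrong guess costs some wasted work but does not change the output.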

Pull-Request Checklist

  • Code is up-to-date with the master branch
  • npm run format to apply eslint formatting
  • npm run test passes with this change
  • This pull request links relevant issues as Fixes #0000
  • There are new or updated unit tests validating the change
  • Documentation has been updated to reflect this change
  • The new commits and pull request title follow conventions explained in pull request guidelines (PRs that do not follow this convention will not be merged)

* `DraftSequenceTokenPredictor`
* `InputLookupTokenPredictor`
for improved performance and compatibility
@giladgd giladgd requested a review from ido-pluto January 1, 2025 03:22

@ido-pluto ido-pluto left a comment

LGTM

@giladgd giladgd merged commit 632a7bf into master Jan 7, 2025
18 checks passed
@giladgd giladgd deleted the gilad/dynamicBackends branch January 7, 2025 00:03
github-actions bot commented Jan 8, 2025

🎉 This PR is included in version 3.4.0 🎉

The release is available on:

Your semantic-release bot 📦🚀


3 participants