Bump torch from 2.0.1 to 2.2.0 in /inference#3
Open
dependabot[bot] wants to merge 11 commits intomainfrom
Open
Bump torch from 2.0.1 to 2.2.0 in /inference#3dependabot[bot] wants to merge 11 commits intomainfrom
dependabot[bot] wants to merge 11 commits intomainfrom
Conversation
BFCL V3 release. Introducing new multi-turn dataset and state-based evaluation metric for category: `multi_turn_base`, `multi_turn_miss_func`, `multi_turn_miss_param`, `multi_turn_long_context`, `multi_turn_composite`; a significant leap towards multi-turn, and multi-step function calling (tool usage) benchmarking. BFCL V3 is a critical advancement in evaluating how Large Language Models (LLMs) interact with diverse scenarios through invoking right functions. Multi-turn function calling allows models to engage in a back-and-forth interaction with users, making it possible for LLMs to navigate through the complex tasks by asking clarifying questions. In contrast to multi-turn `(user t0, assistant t1, user t2, assistant t3, ..)`, multi-step is where the LLM can break the response down into multiple steps `(user t0, assistant t1, assistant t2,..)`. This new paradigm mimics real-world agentic behaviors where AI assistants might have to plan execution paths, request and extract critical information, and handle sequential function invokes to complete a task. To read more about the composition and construction of this live dataset, please refer to our [blog](https://gorilla.cs.berkeley.edu/blogs/13_bfcl_v3_multi_turn.html). --------- **Also in this PR**: 1. Switch to use vllm serve for OSS model inference 2. Switch to Vertex AI Python SDK for Gemini models inference 3. Split out ast_checker and executable_checker for readability 4. Several outdated or deprecated models will be excluded from the leaderboard and replaced with their updated successors to improve the leaderboard's overall maintainability. --------- Co-authored-by: Fanjia Yan <fanjiayan@berkeley.edu> Co-authored-by: Charlie Cheng-Jie Ji <charliechengjieji@berkeley.edu> Co-authored-by: Jason Huang <jasonhuang1103@berkeley.edu> Co-authored-by: Vishnu Suresh <vishnusuresh@berkeley.edu> Co-authored-by: Yixin Huang <yixinhuang1@berkeley.edu> Co-authored-by: Xiaowen Yu <yxw2002@berkeley.edu>
Last time when I contributed the `raft_local.py` in directory named `raft` there was some unnecessary were there, which I removed in this pull request. It will not confuse the developers when they read the file.
This PR separate out the change log from the READMD.md to make it more readable. Some setup instructions have also been updated. --------- Co-authored-by: Devansh Amin <devanshamin97@gmail.com>
…ishirPatil#656) There are some dataset format issues for the single turn entries. The code wraps the question field in an additional unnecessary list. Fix ShishirPatil#651
…hishirPatil#660) In the parse_nested_value function, added a check to determine whether we are dealing with another function call or if its a regular dictionary. Previous version of the code incorrectly assumed that this was always a function call and did not consider the case where the function argument is a dictionary. Fix ShishirPatil#652 --------- Co-authored-by: Huanzhi (Hans) Mao <huanzhimao@gmail.com>
Added handler for: phi-3-mini-4k-instruct phi-3-mini-128k-instruct phi-3-small-8k-instruct phi-3-small-128k-instruct phi-3-medium-4,-instruct phi-3-medium-128k-instruct phi-3.5-mini-instruct |Rank|Model |Model Link |Organization|License |AST Summary|Simple AST|Multiple AST|Parallel AST|Parallel Multiple AST|Irrelevance Detection|Relevance Detection| |----|---------------------------------|---------------------------------------------------------|------------|------------|-----------|----------|------------|------------|---------------------|---------------------|-------------------| |1 |Phi-3-small-8k-instruct (Prompt) |https://huggingface.co/microsoft/Phi-3-small-8k-instruct |Microsoft |MIT |66.39% |59.70% |64.20% |76.75% |64.92% |47.06% |87.80% | |2 |Phi-3-medium-4k-instruct (Prompt)|https://huggingface.co/microsoft/Phi-3-medium-4k-instruct|Microsoft |MIT |62.10% |66.67% |67.40% |62.00% |52.33% |46.79% |78.05% | |3 |Phi-3-mini-4k-instruct (Prompt) |https://huggingface.co/microsoft/Phi-3-mini-4k-instruct |Microsoft |MIT |66.63% |70.76% |75.67% |69.75% |50.33% |20.25% |75.61% | |4 |Phi-3.5-mini-instruct (Prompt) |https://huggingface.co/microsoft/Phi-3.5-mini-instruct |Microsoft |MIT |55.13% |64.22% |66.12% |52.00% |38.17% |64.93% |70.73% | |5 |Phi-3-mini-128k-instruct (Prompt)|https://huggingface.co/microsoft/Phi-3-mini-128k-instruct|Microsoft |MIT |51.49% |67.60% |72.50% |41.12% |24.75% |44.07% |85.37% | --------- Co-authored-by: Huanzhi (Hans) Mao <huanzhimao@gmail.com>
Bumps [torch](https://github.com/pytorch/pytorch) from 2.0.1 to 2.2.0. - [Release notes](https://github.com/pytorch/pytorch/releases) - [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md) - [Commits](pytorch/pytorch@v2.0.1...v2.2.0) --- updated-dependencies: - dependency-name: torch dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Bumps torch from 2.0.1 to 2.2.0.
Release notes
Sourced from torch's releases.
... (truncated)
Commits
8ac9b20Run docker release build on final tag (#117131) (#117182)2490352Fix cuInit test on Windows (#117095)3a44bb7[CI] Test that cuInit is not called during import (#117043)1c8ba38[CI] Use jemalloc for CUDA builds (#116900) (#116988)96d2ddbStore user model to simplify ONNXProgram.{adapt_torch_*,call} APIs (#1152...738b4a5Update ONNX's IO Adapter to support FakeTensor with ExportedProgram (#114407)...4cf10bf[Cherry-pick] [Quant] [PT2] Enable batchnorm in _move_exported_model_to_eval ...7e97e4b[AARCH64] Fall back to GEMM if mkldnn_matmul fails (#115936) (#116666)1a3e3c7[CUDA] baddmm should fall back to addmm for batch=1 (#114992) (#116518)ab7505fFix broken PyYAML 6.0 on MacOS x86 (#115956) (#116551)Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting
@dependabot rebase.Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
@dependabot rebasewill rebase this PR@dependabot recreatewill recreate this PR, overwriting any edits that have been made to it@dependabot mergewill merge this PR after your CI passes on it@dependabot squash and mergewill squash and merge this PR after your CI passes on it@dependabot cancel mergewill cancel a previously requested merge and block automerging@dependabot reopenwill reopen this PR if it is closed@dependabot closewill close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually@dependabot show <dependency name> ignore conditionswill show all of the ignore conditions of the specified dependency@dependabot ignore this major versionwill close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)@dependabot ignore this minor versionwill close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)@dependabot ignore this dependencywill close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)You can disable automated security fix PRs for this repo from the Security Alerts page.