From 0d35ab653aa302840335256bd95678e071ce33aa Mon Sep 17 00:00:00 2001
From: fpagny
Date: Wed, 23 Jul 2025 14:01:11 +0200
Subject: [PATCH] feat(genapi): add parallel tool calling support

---
 .../reference-content/model-catalog.mdx | 27 +++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/pages/managed-inference/reference-content/model-catalog.mdx b/pages/managed-inference/reference-content/model-catalog.mdx
index fec220a148..326b919234 100644
--- a/pages/managed-inference/reference-content/model-catalog.mdx
+++ b/pages/managed-inference/reference-content/model-catalog.mdx
@@ -91,6 +91,7 @@ google/gemma-3-27b-it:bf16
 ```
 | Attribute | Value |
 |-----------|-------|
+| Supports parallel tool calling | No |
 | Supported image formats | PNG, JPEG, WEBP, and non-animated GIFs |
 | Maximum image resolution (pixels) | 896x896 |
 | Token dimension (pixels)| 56x56 |
@@ -103,6 +104,7 @@ This model was optimized to have a dense knowledge and faster tokens throughput
 
 | Attribute | Value |
 |-----------|-------|
+| Supports parallel tool calling | Yes |
 | Supported images formats | PNG, JPEG, WEBP, and non-animated GIFs |
 | Maximum image resolution (pixels) | 1540x1540 |
 | Token dimension (pixels)| 28x28 |
@@ -123,6 +125,7 @@ It can analyze images and offer insights from visual content alongside text.
 
 | Attribute | Value |
 |-----------|-------|
+| Supports parallel tool calling | Yes |
 | Supported images formats | PNG, JPEG, WEBP, and non-animated GIFs |
 | Maximum image resolution (pixels) | 1024x1024 |
 | Token dimension (pixels)| 16x16 |
@@ -148,6 +151,10 @@ allenai/molmo-72b-0924:fp8
 Released December 6, 2024, Meta’s Llama 3.3 70b is a fine-tune of the [Llama 3.1 70b](/managed-inference/reference-content/model-catalog/#llama-31-70b-instruct) model.
 This model is still text-only (text in/text out). However, Llama 3.3 was designed to approach the performance of Llama 3.1 405B on some applications.
 
+| Attribute | Value |
+|-----------|-------|
+| Supports parallel tool calling | Yes |
+
 #### Model name
 ```
 meta/llama-3.3-70b-instruct:fp8
@@ -158,6 +165,10 @@ meta/llama-3.3-70b-instruct:bf16
 Released July 23, 2024, Meta’s Llama 3.1 is an iteration of the open-access Llama family.
 Llama 3.1 was designed to match the best proprietary models and outperform many of the available open source on common industry benchmarks.
 
+| Attribute | Value |
+|-----------|-------|
+| Supports parallel tool calling | Yes |
+
 #### Model names
 ```
 meta/llama-3.1-70b-instruct:fp8
@@ -168,6 +179,10 @@ meta/llama-3.1-70b-instruct:bf16
 Released July 23, 2024, Meta’s Llama 3.1 is an iteration of the open-access Llama family.
 Llama 3.1 was designed to match the best proprietary models and outperform many of the available open source on common industry benchmarks.
 
+| Attribute | Value |
+|-----------|-------|
+| Supports parallel tool calling | Yes |
+
 #### Model names
 ```
 meta/llama-3.1-8b-instruct:fp8
@@ -197,6 +212,10 @@ nvidia/llama-3.1-nemotron-70b-instruct:fp8
 Released January 21, 2025, Deepseek’s R1 Distilled Llama 70B is a distilled version of the Llama model family based on Deepseek R1.
 DeepSeek R1 Distill Llama 70B is designed to improve the performance of Llama models on reasoning use cases such as mathematics and coding tasks.
 
+| Attribute | Value |
+|-----------|-------|
+| Supports parallel tool calling | No |
+
 #### Model name
 ```
 deepseek/deepseek-r1-distill-llama-70b:fp8
@@ -247,6 +266,10 @@ Mistral Nemo is a state-of-the-art transformer model of 12B parameters, built by
 This model is open-weight and distributed under the Apache 2.0 license.
 It was trained on a large proportion of multilingual and code data.
 
+| Attribute | Value |
+|-----------|-------|
+| Supports parallel tool calling | Yes |
+
 #### Model name
 ```
 mistral/mistral-nemo-instruct-2407:fp8
@@ -302,6 +325,10 @@ kyutai/moshika-0.1-8b:fp8
 Qwen2.5-coder is your intelligent programming assistant familiar with more than 40 programming languages.
 With Qwen2.5-coder deployed at Scaleway, your company can benefit from code generation, AI-assisted code repair, and code reasoning.
 
+| Attribute | Value |
+|-----------|-------|
+| Supports parallel tool calling | No |
+
 #### Model name
 ```
 qwen/qwen2.5-coder-32b-instruct:int8
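
The attribute added throughout this patch, "Supports parallel tool calling", indicates whether a model can return several tool calls in a single assistant message rather than one call per round trip. A minimal sketch of exercising it through an OpenAI-compatible chat completions client follows; the base URL, API key, and the two tool definitions are illustrative placeholders, and only the model identifier comes from the catalog entries above.

```
# Illustrative sketch only: assumes an OpenAI-compatible chat completions endpoint.
# BASE_URL and API_KEY are placeholders; the model identifier is taken from the catalog above.
from openai import OpenAI

client = OpenAI(
    base_url="https://example.com/v1",  # placeholder endpoint
    api_key="YOUR_API_KEY",             # placeholder credential
)

# Two independent tools: a model that supports parallel tool calling may request both at once.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "get_time",
            "description": "Get the current local time for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    },
]

response = client.chat.completions.create(
    model="meta/llama-3.3-70b-instruct:fp8",  # listed above with parallel tool calling: Yes
    messages=[{"role": "user", "content": "What is the weather in Paris and the time in Tokyo?"}],
    tools=tools,
)

# With parallel tool calling, tool_calls can hold several entries in one response.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```

For models whose table says "No", a client should expect at most one tool call per response and plan a separate round trip for each tool.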