---
rank: 4
related_endpoints:
 - get_ai_agent_default
 - post_ai_text_gen
 - post_ai_ask
related_guides:
 - box-ai/ai-tutorials/prerequisites
 - box-ai/ai-tutorials/ask-questions
 - box-ai/ai-tutorials/generate-text
 - box-ai/ai-tutorials/default-agent-overrides
---

# Override AI model configuration

<Message type="notice">
Endpoints related to metadata extraction are currently a beta feature offered subject to Box’s Main Beta Agreement, and the available capabilities may change. Box AI API is available to all Enterprise Plus customers.
</Message>

The `ai_agent` configuration allows you to override the default AI model configuration. It is available for the following endpoints:

* [`POST ai/ask`][ask]
* [`POST ai/text_gen`][text-gen]
* [`POST ai/extract`][extract]
* [`POST ai/extract_structured`][extract-structured]

<Message type='tip'>

Use the [`GET ai_agent_default`][agent] endpoint to fetch the default configuration.

</Message>

The override examples include:

* Replacing the default AI model with a custom one based on your organization's needs.
* Tweaking the base `prompt` to allow a more customized user experience.
* Changing a parameter, such as `temperature`, to make the results more or less creative.
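For instance, a request body for `ai/ask` that overrides only the temperature might look like the following sketch. The file ID and prompt are placeholders, and only the parameters you want to change need to be included:

```json
{
  "mode": "single_item_qa",
  "prompt": "What is this document about?",
  "items": [
    {
      "id": "1234567890",
      "type": "file"
    }
  ],
  "ai_agent": {
    "type": "ai_agent_ask",
    "basic_text": {
      "llm_endpoint_params": {
        "type": "openai_params",
        "temperature": 0.2
      }
    }
  }
}
```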

## Sample configuration

A complete configuration for `ai/ask` is as follows:

```json
{
  "type": "ai_agent_ask",
  "basic_text": {
    "llm_endpoint_params": {
      "type": "openai_params",
      "frequency_penalty": 1.5,
      "presence_penalty": 1.5,
      "stop": "<|im_end|>",
      "temperature": 0,
      "top_p": 1
    },
    "model": "azure__openai__gpt_3_5_turbo_16k",
    "num_tokens_for_completion": 8400,
    "prompt_template": "It is `{current_date}`, consider these travel options `{content}` and answer the `{user_question}`.",
    "system_message": "You are a helpful travel assistant specialized in budget travel"
  },
  "basic_text_multi": {
    "llm_endpoint_params": {
      "type": "openai_params",
      "frequency_penalty": 1.5,
      "presence_penalty": 1.5,
      "stop": "<|im_end|>",
      "temperature": 0,
      "top_p": 1
    },
    "model": "azure__openai__gpt_3_5_turbo_16k",
    "num_tokens_for_completion": 8400,
    "prompt_template": "It is `{current_date}`, consider these travel options `{content}` and answer the `{user_question}`.",
    "system_message": "You are a helpful travel assistant specialized in budget travel"
  },
  "long_text": {
    "embeddings": {
      "model": "openai__text_embedding_ada_002",
      "strategy": {
        "id": "basic",
        "num_tokens_per_chunk": 64
      }
    },
    "llm_endpoint_params": {
      "type": "openai_params",
      "frequency_penalty": 1.5,
      "presence_penalty": 1.5,
      "stop": "<|im_end|>",
      "temperature": 0,
      "top_p": 1
    },
    "model": "azure__openai__gpt_3_5_turbo_16k",
    "num_tokens_for_completion": 8400,
    "prompt_template": "It is `{current_date}`, consider these travel options `{content}` and answer the `{user_question}`.",
    "system_message": "You are a helpful travel assistant specialized in budget travel"
  },
  "long_text_multi": {
    "embeddings": {
      "model": "openai__text_embedding_ada_002",
      "strategy": {
        "id": "basic",
        "num_tokens_per_chunk": 64
      }
    },
    "llm_endpoint_params": {
      "type": "openai_params",
      "frequency_penalty": 1.5,
      "presence_penalty": 1.5,
      "stop": "<|im_end|>",
      "temperature": 0,
      "top_p": 1
    },
    "model": "azure__openai__gpt_3_5_turbo_16k",
    "num_tokens_for_completion": 8400,
    "prompt_template": "It is `{current_date}`, consider these travel options `{content}` and answer the `{user_question}`.",
    "system_message": "You are a helpful travel assistant specialized in budget travel"
  }
}
```

### Differences in parameter sets

The set of parameters available for the `ask`, `text_gen`, `extract`, and `extract_structured` endpoints differs slightly, depending on the API call.

 * The agent configuration for the `ask` endpoint includes the `basic_text`, `basic_text_multi`, `long_text`, and `long_text_multi` parameters. This is because the `mode` parameter specifies whether the request is for a single item or multiple items. If you select `multiple_item_qa` as the `mode`, you can also use the `multi` parameters for overrides.

 * The agent configuration for `text_gen` includes the `basic_gen` parameter, which is used to generate text.
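As a sketch, a `text_gen` request overriding the `basic_gen` model and system message might look like the following. The file ID and prompt are placeholders, and `ai_agent_text_gen` is the agent `type` corresponding to this endpoint:

```json
{
  "prompt": "Draft a short follow-up email about this itinerary.",
  "items": [
    {
      "id": "1234567890",
      "type": "file"
    }
  ],
  "ai_agent": {
    "type": "ai_agent_text_gen",
    "basic_gen": {
      "model": "azure__openai__gpt_3_5_turbo_16k",
      "system_message": "You are a helpful travel assistant specialized in budget travel"
    }
  }
}
```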

### LLM endpoint params

The `llm_endpoint_params` configuration options differ depending on whether the underlying AI model is [Google][google-params], [OpenAI][openai-params], or [AWS][aws-params] based.

For example, all `llm_endpoint_params` objects accept a `temperature` parameter, but the outcome differs depending on the model.

For Google and AWS models, the [`temperature`][google-temp] is used for sampling during response generation, which occurs when `top-P` and `top-K` are applied. Temperature controls the degree of randomness in the token selection.

For OpenAI models, [`temperature`][openai-temp] is the sampling temperature with values between 0 and 2. Higher values like 0.8 make the output more random, while lower values like 0.2 make it more focused and deterministic. When introducing your own configuration, use `temperature` or `top_p`, but not both.
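To illustrate the difference, here are two `llm_endpoint_params` sketches. The values are illustrative only; the exact accepted fields for each provider are listed in the provider-specific references. A Google-based model can combine temperature with `top_k` and `top_p`:

```json
{
  "llm_endpoint_params": {
    "type": "google_params",
    "temperature": 0.7,
    "top_k": 40,
    "top_p": 0.95
  }
}
```

An OpenAI-based model should set either `temperature` or `top_p`, not both:

```json
{
  "llm_endpoint_params": {
    "type": "openai_params",
    "temperature": 0.8
  }
}
```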

### System message

The `system_message` parameter helps the LLM understand its role and what it’s supposed to do.
For example, if your solution is processing travel itineraries, you can add a system message saying:

```
You are a travel agent aid. You are going to help support staff process large amounts of schedules, tickets, etc.
```

This message is separate from the content you send in, but it can improve the results.

### Number of tokens for completion

The `num_tokens_for_completion` parameter represents the number of [tokens][openai-tokens] Box AI can return. This number can vary based on the model used.
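For example, a hypothetical override capping completion length for `ai/ask` might look like this; the value is illustrative, so check the limits of your chosen model:

```json
{
  "ai_agent": {
    "type": "ai_agent_ask",
    "basic_text": {
      "num_tokens_for_completion": 2048
    }
  }
}
```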

[ask]: e://post_ai_ask#param_ai_agent
[text-gen]: e://post_ai_text_gen#param_ai_agent
[extract]: e://post_ai_extract#param_ai_agent
[extract-structured]: e://post_ai_extract_structured#param_ai_agent
[google-params]: r://ai-llm-endpoint-params-google
[openai-params]: r://ai-llm-endpoint-params-openai
[openai-tokens]: https://help.openai.com/en/articles/4936856-what-are-tokens-and-how-to-count-them
[agent]: e://get_ai_agent_default
[google-temp]: https://ai.google.dev/gemini-api/docs/models/generative-models#model-parameters
[openai-temp]: https://community.openai.com/t/temperature-top-p-and-top-k-for-chatbot-responses/295542
[aws-params]: r://ai-llm-endpoint-params-aws