
Conversation

@jiacheng-xu (Collaborator)

WIP

Known issue: tool calling doesn't work in the customized generation module.

Signed-off-by: Jiacheng Xu <[email protected]>
@jiacheng-xu (Collaborator, Author)

Hi @Kipok, I need some help here. I found that in the generation task I designed, the model can't use tools. The conversation always ends with a tool call (e.g. import numpy), and that's the end of the conversation.
I also didn't find a very elegant way to handle all the if-conditions like if self.cfg.code_execution: or parse_reasoning. Is there a centralized util function to handle all of these? (See the sketch after this comment for the kind of thing I mean.)

Thanks a lot!

file: nemo_skills/inference/eval/critpt.py
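No single such utility is referenced in this thread, but here is a minimal sketch of the kind of centralized helper being asked about. GenerationConfig, postprocess_generation, and the </think> delimiter are all illustrative assumptions, not NeMo-Skills APIs:

```python
# Hypothetical sketch, not an existing NeMo-Skills utility: a single helper
# that owns all config-conditional post-processing, so call sites don't need
# scattered `if self.cfg.code_execution:` / `parse_reasoning` checks.
from dataclasses import dataclass


@dataclass
class GenerationConfig:
    code_execution: bool = False   # legacy sandboxed Python execution
    parse_reasoning: bool = False  # split the reasoning trace from the answer


def postprocess_generation(output: dict, cfg: GenerationConfig) -> dict:
    """Apply every enabled post-processing step in one place."""
    if cfg.parse_reasoning and "</think>" in output["generation"]:
        # Assumed convention: reasoning is separated from the final answer
        # by a closing think tag; the real delimiter depends on the model.
        reasoning, _, answer = output["generation"].partition("</think>")
        output["reasoning"] = reasoning.strip()
        output["generation"] = answer.strip()
    if cfg.code_execution:
        # Placeholder for code-execution-specific cleanup (stripping
        # sandbox scaffolding, counting executions, etc.).
        output["generation"] = output["generation"].strip()
    return output
```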

@gwarmstrong (Collaborator)

@jiacheng-xu can you show an example ns generate/ns eval script that you are using? I suspect the issue is that you are setting ++code_execution=True. That's probably not what you actually want here, as it is an older way of invoking Python execution from before we had tool parsers/structured tool calls integrated.
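For context on why the legacy path behaves differently from structured tool calls, here is a rough sketch of the kind of loop ++code_execution=True implies: generation stops at a code tag, the sandbox runs the code, and the output is spliced back in before resuming. The tags and function signatures are illustrative, not the actual NeMo-Skills implementation; a mismatch anywhere in such a loop (e.g. tags the model never emits) would show up exactly as a conversation that ends at the first tool call.

```python
# Illustrative sketch of a legacy code-execution loop (not the actual
# NeMo-Skills code). llm_complete and sandbox_run stand in for real clients.
CODE_BEGIN, CODE_END = "<code>", "</code>"  # placeholders; cf. ++code_tags


def generate_with_code_execution(prompt, llm_complete, sandbox_run,
                                 max_code_executions=100):
    """llm_complete(text, stop) -> (completion, stop_reason);
    sandbox_run(code) -> stdout."""
    text = prompt
    for _ in range(max_code_executions):
        completion, stop_reason = llm_complete(text, stop=[CODE_END])
        text += completion
        if stop_reason != "stop_string":
            return text  # model finished without asking to run code
        # Execute the last code block and splice its output back in,
        # then resume generation from the extended context.
        code = text.rsplit(CODE_BEGIN, 1)[-1]
        text += f"{CODE_END}\n<output>\n{sandbox_run(code)}\n</output>\n"
    return text
```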

@jiacheng-xu (Collaborator, Author)

> @jiacheng-xu can you show an example ns generate/ns eval script that you are using? I suspect the issue is that you are setting ++code_execution=True. That's probably not what you actually want here, as it is an older way of invoking Python execution from before we had tool parsers/structured tool calls integrated.

Yes, I am using code_execution=True. Here is the config I'm running:

```json
{
  "benchmarks": "critpt:1",
  "split": "test",
  "num_chunks": 1,
  "server_type": "vllm",
  "server_gpus": 4,
  "server_args": "--async-scheduling",
  "model": "/hf_models/gpt-oss-20b",
  "with_sandbox": true,
  "expname": "critpttest-ghb-model_gpt_oss_20b-oci-debug",
  "cluster": "oci",
  "output_dir": "/workspace/critpttest-ghb-model_gpt_oss_20b-oci-debug",
  "wandb_project": "critpt",
  "wandb_name": "012918-critpttest-ghb-model_gpt_oss_20b-oci",
  "__ctx_args": "++max_samples=10 ++inference.temperature=1.0 ++inference.tokens_to_generate=65536 ++code_tags=gpt-oss ++server.code_execution.max_code_executions=100 ++inference.endpoint_type=text --config-path=/nemo_run/code/nemo_skills_stem/configs --config-name=gpt_oss ++chat_template_kwargs.builtin_tools=[python] ++chat_template_kwargs.reasoning_effort=high ++parse_reasoning=True ++code_execution=true"
}
```

Could you share the recommended way to do tool calling, one for endpoint='text' (like gpt-oss) and one for endpoint='chat'? Thanks!
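Not speaking for the maintainers, but as a sketch of how the two request shapes typically differ against an OpenAI-compatible server such as vLLM (assuming it was launched with a tool parser enabled): with a text endpoint the chat template, including built-in tool definitions, is rendered client-side into one prompt string and the client must handle tool-call tokens itself, while with a chat endpoint tools are passed structured and come back as parsed tool calls. The URL, prompt string, and tool schema below are illustrative.

```python
# Illustrative request shapes against an OpenAI-compatible server (e.g. vLLM
# started with a tool parser enabled). URL and tool schema are assumptions.
import requests

BASE = "http://localhost:8000/v1"

# endpoint_type=text: the chat template (with built-in tools like `python`
# inlined) is applied client-side; the server just continues the string, so
# the client must detect and act on any tool-call tokens itself.
rendered_prompt = "<|start|>user<|message|>What is 2**20?<|end|>..."  # placeholder
text_resp = requests.post(f"{BASE}/completions", json={
    "model": "gpt-oss-20b",
    "prompt": rendered_prompt,
    "max_tokens": 1024,
})

# endpoint_type=chat: messages and tools are sent structured; the server's
# tool parser returns any tool invocation as choices[0].message.tool_calls.
chat_resp = requests.post(f"{BASE}/chat/completions", json={
    "model": "gpt-oss-20b",
    "messages": [{"role": "user", "content": "What is 2**20?"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "python",  # illustrative tool definition
            "description": "Execute Python code and return stdout.",
            "parameters": {
                "type": "object",
                "properties": {"code": {"type": "string"}},
                "required": ["code"],
            },
        },
    }],
})
```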
