doraemonaaaa/FreeAskAgent

FreeAskAgent: Human Cognition-inspired Zero-Shot Vision-Language Navigation in Dynamic Environments

🌟 Why FreeAskAgent?

FreeAskAgent is a trainable, tool-integrated agentic framework designed to overcome the scalability and generalization limits of today's tool-augmented reasoning approaches and embodied AI agent frameworks.

Unlike prevailing approaches such as Search-R1, which train a single LLM to interleave reasoning steps with tool calls, FreeAskAgent introduces a modular agentic system with four specialized modules: Planner, Executor, Verifier, and Generator.

Agent Framework

🚀 Key Features

  • 🧩 Modular Agentic System – Four specialized agent modules (Planner, Executor, Verifier, Generator) that coordinate via evolving memory and integrated tools across multiple turns.
  • 🔗 Multi-Tool Integration – Seamlessly connect with diverse tool ecosystems, including base_generator, python_coder, google_search, wikipedia_search, web_search, Grounded_SAN2, and more.
  • 🎯 Flow-GRPO Algorithm – Enables in-the-flow agent optimization for long-horizon reasoning tasks with sparse rewards.

⚙️ Setup

Prerequisites

  • Python 3.11 (recommended)
bash setup.sh  # set up environment automatically
git submodule update --init --recursive  # Download submodule

# To set up the ROS2 version, follow the steps in these READMEs:
#   closed_loop/ros2_agent_baseline.md
#   closed_loop/ros2.md

Installation

bash setup.sh
source .venv/bin/activate
# (Optional) Install `parallel` for running benchmark experiments in parallel:
sudo apt-get update
sudo apt-get install parallel

Install the low-level module by following low_level/NavDP/README.md.

Setup Environment Variables

Copy FreeAskAgent/.env.template to FreeAskAgent/.env, then update the following variables with your own API keys:

  • OPENAI_API_KEY (for judging responses)
  • GOOGLE_API_KEY (for the Google Search tool)
  • DASHSCOPE_API_KEY (for calling Qwen-2.5-7B-Instruct as the engine for agents and tools)
  • TOGETHER_API_KEY (an alternative for calling Qwen-2.5-7B-Instruct as the engine for agents and tools; recommended for international users)
  • Alternatively, serve the Qwen2.5-7B-Instruct model locally with vLLM (see serve_vllm_local.md for details).

Please check API Key Setup Guide for detailed instructions on how to obtain these keys.

cp FreeAskAgent/.env.template FreeAskAgent/.env
# Then edit FreeAskAgent/.env with your API keys
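
Before running anything, it can help to confirm that the keys are visible to the Python process. The following is a minimal sketch, not part of the repository: the helper `missing_keys` is hypothetical, the key names come from the list above, and it assumes the values from .env have already been exported into the environment. Which keys are actually required depends on the engines and tools you enable.

```python
import os

# Key names are taken from the list above. Treating all four as expected
# is an assumption of this sketch; trim the list to the engines you use.
EXPECTED_KEYS = [
    "OPENAI_API_KEY",
    "GOOGLE_API_KEY",
    "DASHSCOPE_API_KEY",
    "TOGETHER_API_KEY",
]


def missing_keys(env=None, expected=EXPECTED_KEYS):
    """Return the expected keys that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [key for key in expected if not env.get(key, "").strip()]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing or empty:", ", ".join(missing))
    else:
        print("All expected API keys are set.")
```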

⚡ Quick Start on FreeAskAgent Inference

FreeAskAgent provides a modular agentic system with four specialized modules (planner, executor, verifier, generator) that coordinate through evolving memory and a toolkit over multiple turns to solve complex reasoning tasks.
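
The coordination described above can be pictured as a simple multi-turn loop. The sketch below is schematic only and does not reproduce the repository's classes: `Memory`, `run_agent`, and the callable signatures are hypothetical names chosen for illustration, and the modules are passed in as plain functions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Memory:
    """Evolving memory: a growing log of actions and their results."""
    records: List[dict] = field(default_factory=list)


def run_agent(query, planner, executor, verifier, generator, tools, max_turns=5):
    """Run a plan -> execute -> verify loop, then generate the final answer."""
    memory = Memory()
    for _ in range(max_turns):
        action = planner(query, memory)    # pick a tool and its input
        result = executor(action, tools)   # call the chosen tool
        memory.records.append({"action": action, "result": result})
        if verifier(query, memory):        # stop once memory is sufficient
            break
    return generator(query, memory)


# Toy stand-ins for the four modules (hypothetical, for illustration only).
toy_tools = {"echo": lambda text: text.upper()}
toy_planner = lambda query, memory: {"tool": "echo", "input": query}
toy_executor = lambda action, tools: tools[action["tool"]](action["input"])
toy_verifier = lambda query, memory: len(memory.records) >= 2
toy_generator = lambda query, memory: " | ".join(r["result"] for r in memory.records)

answer = run_agent("go to the door", toy_planner, toy_executor,
                   toy_verifier, toy_generator, toy_tools)
```

In this toy run the verifier stops the loop after two turns, so the generator sees two memory records.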

To quickly experience the system in action, run the command below (don’t forget to set up your API key):

python quick_start_embodied.py

💥 Quick Start on FreeAskAgent Flow-GRPO Training

For effective planning and tool use, the framework directly optimizes the planner agent within the system in an online fashion using Flow-GRPO. Below is a quick start for training.
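
This README does not spell out the Flow-GRPO update itself. As a rough illustration of the group-relative idea that GRPO-style methods rely on, the sketch below normalizes the final rewards of a group of rollouts for the same task and copies each rollout's advantage to all of its planner turns. Both function names are hypothetical, and the per-turn broadcast is an assumption about how a sparse trajectory-level reward reaches individual turns, not a description of the training code in train/.

```python
from statistics import mean, pstdev


def group_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of rollouts for the same task."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


def broadcast_to_turns(advantage, num_turns):
    """Assign one trajectory-level advantage to every turn of that rollout."""
    return [advantage] * num_turns


# Four rollouts of one task with a sparse 0/1 outcome reward.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_advantages(rewards)
per_turn = broadcast_to_turns(advantages[0], num_turns=3)
```

Successful rollouts receive a positive advantage and failed ones a negative advantage; when every rollout in a group gets the same reward, all advantages are zero and the group contributes no learning signal.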

Flow-GRPO Training

Start FreeAskAgent training using Flow-GRPO with tmux:

# Create tmux session and start FreeAskAgent service (Window 0)
tmux new-session -s FreeAskAgent
bash train/serve_with_logs.sh

# Create new window (Ctrl+B then C) and start training (Window 1)
bash train/train_with_logs.sh

Configuration: All training hyperparameters are in train/config.yaml (model settings, tools, RL parameters, resources, etc.)

Logging: Comprehensive logging is provided for monitoring training. See logs.md for more details.

🎯 FreeAskWorld Benchmark

Communication with FreeAskWorld is based on ROS2; the main package is in closed_loop/ros2/src/vln_connector. To run the benchmark, start the simulator first, then:

bash closed_loop/ros2server.bash # Then start the FreeAskWorld simulator

Run other baselines on FreeAskWorld

Vint

InstructNav

🧩 Use Your Own Model in FreeAskAgent

FreeAskAgent supports different LLM engines for each agent module. See llm_engine.md for supported models and factory.py for the corresponding model_string configuration:

Planner Agent:

Other Agents (Executor, Verifier, Generator):

self.llm_engine_fixed = create_llm_engine(model_string="your-engine", is_multimodal=False, temperature=temperature)

and

# Instantiate Executor
executor = Executor(
    # llm_engine_name=llm_engine_name,
    llm_engine_name="dashscope",
    root_cache_dir=root_cache_dir,
    verbose=verbose,
    # base_url=base_url,
    temperature=temperature
)
  • For detailed information on supported engines and model_string formats, see llm_engine.md

Acknowledgement

FreeAskAgent is built upon AgentFlow and RTAB-Map. We sincerely thank the developers of these projects for their significant contributions, which made this work possible.
