feat(tool): incorporate open-source tools from MiroThinker#60
BinWang28 merged 20 commits into MiroMindAI:miroflow-v0.3
- Resolved formatting conflicts in utils/extract_futurex_results.py
- Resolved formatting conflicts in utils/prepare_benchmark/gen_futurex.py
- Resolved formatting conflicts in utils/progress_check/check_futurex_progress.py

All conflicts were due to code formatting differences (whitespace, line breaks, trailing commas). Functionality remains identical between branches.
Pull Request Overview
This PR adapts and incorporates open-source tools from MiroThinker, adding three new MCP servers that provide vision, reasoning, and audio processing capabilities using open-source models.
- Added three new open-source MCP servers (vision, reasoning, and audio) with robust error handling
- Created comprehensive documentation for deploying and using the open-source models
- Added YAML configuration files to integrate the new tools into the existing tool system
Reviewed Changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/tool/mcp_servers/vision_mcp_server_os.py | New vision MCP server for VQA using open-source models like Qwen2.5-VL |
| src/tool/mcp_servers/reasoning_mcp_server_os.py | New reasoning MCP server with retry logic for complex problem solving |
| src/tool/mcp_servers/audio_mcp_server_os.py | New audio transcription server using open-source Whisper models |
| docs/mkdocs/mkdocs.yml | Updated navigation to include documentation for new open-source tools |
| docs/mkdocs/docs/tool_vqa_os.md | Documentation for open-source vision tool deployment and usage |
| docs/mkdocs/docs/tool_reasoning_os.md | Documentation for open-source reasoning tool deployment and usage |
| docs/mkdocs/docs/tool_audio_os.md | Documentation for open-source audio tool deployment and usage |
| config/tool/tool-reasoning-os.yaml | Configuration file for reasoning tool integration |
| config/tool/tool-image-video-os.yaml | Configuration file for vision tool integration |
| config/tool/tool-audio-os.yaml | Configuration file for audio tool integration |
    payload = {"model": VISION_MODEL_NAME, "messages": messages_for_llm}

    response = requests.post(VISION_BASE_URL, json=payload, headers=headers)
Using synchronous requests.post in an async function can block the event loop. Consider using aiohttp.ClientSession().post() instead since you're already importing and using aiohttp elsewhere in the function.
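A minimal sketch of the non-blocking variant this comment suggests, assuming the endpoint returns a JSON body. The helper names `post_chat_completion` and `example`, their parameters, and the timeout value are hypothetical and not taken from the PR. The session is passed in so that a single `aiohttp.ClientSession` can be reused across calls.

```python
from typing import Any


async def post_chat_completion(
    session: Any, url: str, payload: dict, headers: dict
) -> dict:
    """POST `payload` as JSON without blocking the event loop.

    `session` is expected to be an aiohttp.ClientSession
    (hypothetical helper, not part of the PR).
    """
    async with session.post(url, json=payload, headers=headers) as response:
        # Raise on 4xx/5xx instead of trying to parse an error body.
        response.raise_for_status()
        return await response.json()


async def example(url: str, payload: dict, headers: dict) -> dict:
    # Imported lazily so the helper above stays importable without aiohttp.
    import aiohttp

    # Assumed timeout; tune it for the model server being called.
    timeout = aiohttp.ClientTimeout(total=600)
    async with aiohttp.ClientSession(timeout=timeout) as session:
        return await post_chat_completion(session, url, payload, headers)
```

While the request is in flight, the `await` points hand control back to the event loop, so other MCP tool calls can keep running.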
            if duration > 0:
                return duration
        except Exception as e:
            return f"[ERROR]: Failed to get audio duration: {e}"
The function _get_audio_duration should return a float according to its type hint and usage context, but this exception handler returns a string. This could cause type errors when the returned value is used in calculations.
Suggested change:

    - return f"[ERROR]: Failed to get audio duration: {e}"
    + return 0.0
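A minimal sketch of a duration helper that returns a float on every path, as the comment asks. It assumes WAV input and uses the standard-library `wave` module; the PR's actual `_get_audio_duration` may rely on a different audio library, and the name `get_audio_duration` here is hypothetical.

```python
import wave


def get_audio_duration(path: str) -> float:
    """Return the duration of a WAV file in seconds, or 0.0 if it cannot be read."""
    try:
        with wave.open(path, "rb") as wav_file:
            frame_rate = wav_file.getframerate()
            if frame_rate > 0:
                return wav_file.getnframes() / float(frame_rate)
    except Exception:
        # Fall through to the numeric sentinel so callers can keep doing arithmetic.
        pass
    return 0.0
```

Callers can then treat `0.0` as "duration unknown" without first checking whether they received a string.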
    @mcp.tool()
    async def reasoning(question: str) -> str:
        """You can use this tool use solve hard math problem, puzzle, riddle and IQ test question that requires a lot of chain of thought efforts.
Grammar error: 'use solve' should be 'to solve'. The sentence should read: 'You can use this tool to solve hard math problem...'
Suggested change:

    - """You can use this tool use solve hard math problem, puzzle, riddle and IQ test question that requires a lot of chain of thought efforts.
    + """You can use this tool to solve hard math problem, puzzle, riddle and IQ test question that requires a lot of chain of thought efforts.
docs/mkdocs/docs/tool_audio_os.md
Outdated
    ---

    !!! info "Documentation Info"
        **Last Updated:** January 2025 · **Doc Contributor:** Team @ MiroMind AI
…I#60)

* upd: add futurex evaluation support.
* upd: support multiple eval for futurex and add relavent doc.
* upd: fix bugs with doc for futurex.
* debug: fix wrong calling path.
* add preparation for finsearchcomp.
* update a premature version of finsearchcomp benchmark.
* clean redundent code in merging.
* upd: modify yaml to use Mirothinker as the main agent, add check progress file to exclude T1.
* upd: check_progress function for finsearchcomp now consider globe and greater china respectively.
* upd: add docs and shell script for multiple runs.
* fix: check_finsearchcomp_progress not displaying results from greater china region.
* fix: catch ContextLimitError in more observed cases.
* initialize open source tools for audio, vision and reasoning.
* upd: docs for open-source tools.
* fix wrong date.
Describe this PR
Adapted open-source tools from MiroThinker and added relevant docs on deploying open-source models.
Checklist for PR
Must Do
- PR title follows the Angular commit message format, e.g. `feat(agent): add pdf tool via mcp`, `perf: make llm client async` and `fix(utils): load custom config via importlib`. CI job `check-pr-title` enforces this format on the PR title.
- Run `make precommit` locally. CI job `lint` enforces ruff default format/lint rules on all new code.
- Run `make pytest`. Check the test summary (located at `report.html`) and the coverage report (located at `htmlcov/index.html`) for new code.

Nice To Have
- `/tests` for `feat` and `test` PR.
- `/docs` for `docs` and `ci` PR.
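
For a quick local check before pushing, the title rule can be approximated with a regular expression. This is a sketch only: the `check-pr-title` job's real implementation and its allowed types are not shown in this PR, so `ALLOWED_TYPES`, `TITLE_PATTERN`, and `is_valid_pr_title` are hypothetical names with an assumed type list.

```python
import re

# Assumed set of commit types; the real check-pr-title job may allow a different list.
ALLOWED_TYPES = ("feat", "fix", "docs", "perf", "test", "ci", "refactor", "chore")

# type, optional "(scope)", then ": " and a non-empty subject.
TITLE_PATTERN = re.compile(
    r"^(?P<type>" + "|".join(ALLOWED_TYPES) + r")"
    r"(\((?P<scope>[^()\s]+)\))?"
    r": (?P<subject>\S.*)$"
)


def is_valid_pr_title(title: str) -> bool:
    """Return True if `title` looks like an Angular-style commit header."""
    return TITLE_PATTERN.match(title) is not None
```

Under these assumptions, this PR's own title, `feat(tool): incorporate open-source tools from MiroThinker`, passes the check.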