
build(requirements): bump the prod group with 6 updates (#4)

Closed
dependabot[bot] wants to merge 999 commits into main from dependabot/pip/prod-7ae5172c28

Conversation


@dependabot dependabot bot commented on behalf of github Mar 8, 2026

Updates the requirements on azure-identity, dill, pillow, psutil, scipy and snowballstemmer to permit the latest version.
Updates azure-identity from 1.17.1 to 1.25.2

Release notes

Sourced from azure-identity's releases.

azure-identity_1.25.2

1.25.2 (2026-02-10)

Bugs Fixed

  • Fixed an issue with certain credentials not bypassing the token cache when claims are provided in get_token or get_token_info calls. (#44552) (#44815)
  • Fixed an issue where an unhelpful TypeError was raised during Entra ID token requests that returned empty responses. Now, a ClientAuthenticationError is raised with the full response for better troubleshooting. (#44258)

Other Changes

  • Bumped minimum dependency on msal to >=1.31.0.
  • Added debug logging of access token cache hits in several credentials to improve troubleshooting of token cache behavior. (#44963)
  • Replace instances of azure.core.pipeline.transport.HttpRequest with azure.core.rest.HttpRequest. (#44993)
Commits

Updates dill from 0.3.9 to 0.4.1

Commits

Updates pillow from 10.4.0 to 12.1.1

Release notes

Sourced from pillow's releases.

12.1.1

https://pillow.readthedocs.io/en/stable/releasenotes/12.1.1.html

Dependencies

Other changes

12.1.0

https://pillow.readthedocs.io/en/stable/releasenotes/12.1.0.html

Deprecations

Documentation

Dependencies

Testing

... (truncated)

Changelog

Sourced from pillow's changelog.

Changelog (Pillow)

11.1.0 and newer

See GitHub Releases:

11.0.0 (2024-10-15)

  • Update licence to MIT-CMU #8460 [hugovk]

  • Conditionally define ImageCms type hint to avoid requiring core #8197 [radarhere]

  • Support writing LONG8 offsets in AppendingTiffWriter #8417 [radarhere]

  • Use ImageFile.MAXBLOCK when saving TIFF images #8461 [radarhere]

  • Do not close provided file handles with libtiff when saving #8458 [radarhere]

  • Support ImageFilter.BuiltinFilter for I;16* images #8438 [radarhere]

  • Use ImagingCore.ptr instead of ImagingCore.id #8341 [homm, radarhere, hugovk]

  • Updated EPS mode when opening images without transparency #8281 [Yay295, radarhere]

  • Use transparency when combining P frames from APNGs #8443 [radarhere]

  • Support all resampling filters when resizing I;16* images #8422 [radarhere]

  • Free memory on early return #8413 [radarhere]

  • Cast int before potentially exceeding INT_MAX #8402 [radarhere]

... (truncated)

Commits

Updates psutil from 6.1.0 to 7.2.2

Changelog

Sourced from psutil's changelog.

7.2.2

2026-01-28

Enhancements

  • 2705_: [Linux]: Process.wait()_ now uses pidfd_open() + poll() for waiting, resulting in no busy loop and faster response times. Requires Linux >= 5.3 and Python >= 3.9. Falls back to traditional polling if unavailable.
  • 2705_: [macOS], [BSD]: Process.wait()_ now uses kqueue() for waiting, resulting in no busy loop and faster response times.
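The two entries above describe replacing psutil's old busy-wait loop with event-driven waiting. As an illustration only (psutil's real implementation lives in C and differs in detail), here is a hypothetical `wait_for_pid` helper showing the same idea: block on `os.pidfd_open()` + `poll()` where available (Linux >= 5.3, Python >= 3.9), otherwise fall back to polling `os.waitpid()`:

```python
import os
import select
import subprocess
import sys
import time
from typing import Optional

def wait_for_pid(pid: int, timeout: Optional[float] = None) -> Optional[int]:
    """Wait for a child PID to exit without a busy loop where possible.

    Illustrative sketch: uses pidfd + poll() on Linux, else a short-sleep
    waitpid() loop (the pre-7.2.2 style fallback). Returns the exit code,
    or None on timeout.
    """
    if hasattr(os, "pidfd_open"):
        fd = os.pidfd_open(pid)
        try:
            poller = select.poll()
            poller.register(fd, select.POLLIN)
            # poll() blocks until the process exits; no busy loop.
            ms = None if timeout is None else int(timeout * 1000)
            if not poller.poll(ms):
                return None  # timed out
        finally:
            os.close(fd)
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        # Reap the child; WNOHANG returns (0, 0) while it is still running.
        wpid, status = os.waitpid(pid, os.WNOHANG)
        if wpid == pid:
            return os.waitstatus_to_exitcode(status)
        if deadline is not None and time.monotonic() >= deadline:
            return None
        time.sleep(0.01)  # legacy polling fallback

proc = subprocess.Popen([sys.executable, "-c", "pass"])
print(wait_for_pid(proc.pid))
```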

Bug fixes

  • 2701_, [macOS]: fix compilation error on macOS < 10.7. (patch by Sergey Fedorov)
  • 2707_, [macOS]: fix potential memory leaks in error paths of Process.memory_full_info() and Process.threads().
  • 2708_, [macOS]: Process.cmdline()_ and Process.environ()_ may fail with ``OSError: [Errno 0] Undefined error`` (from ``sysctl(KERN_PROCARGS2)``). They now raise AccessDenied_ instead.

7.2.1

2025-12-29

Bug fixes

  • 2699_, [FreeBSD], [NetBSD]: heap_info()_ does not detect small allocations (<= 1K). In order to fix that, we now flush internal jemalloc cache before fetching the metrics.

7.2.0

2025-12-23

Enhancements

  • 1275_: new heap_info()_ and heap_trim()_ functions, providing direct access to the platform's native C heap allocator (glibc, mimalloc, libmalloc). Useful to create tools to detect memory leaks.
  • 2403_, [Linux]: publish wheels for Linux musl.
  • 2680_: unit tests are no longer installed / part of the distribution. They now live under tests/ instead of psutil/tests.

Bug fixes

... (truncated)

Commits
  • 9eea97d Pre-release
  • 938ac64 Rm sphinxcontrib.googleanalytics; override layout.html
  • 9dcbb7e Add sphinxcontrib-googleanalytics to requirements.txt
  • 76eaf9a Try to add google analytics to doc
  • de1cafa Update doc mentioning Process.wait() internal details
  • bb30943 Refact can_use_pidfd_open() and can_use_kqueue()
  • a571717 #2708, macos / cmdline / environ; raise AD instead of OSError(0) (#2709)
  • 8b98c3e Pre-release
  • 700b7e6 [macOS] fix potential leaks in error paths (#2707)
  • 7cc7923 Windows / cmdline(): be more defensive in free()ing in case of error
  • Additional commits viewable in compare view

Updates scipy from 1.14.1 to 1.15.3

Release notes

Sourced from scipy's releases.

SciPy 1.15.3 Release Notes

SciPy 1.15.3 is a bug-fix release with no new features compared to 1.15.2.

For the complete issue and PR lists see the raw release notes.

Authors

  • Name (commits)
  • aiudirog (1) +
  • Nickolai Belakovski (1)
  • Florian Bourgey (1) +
  • Richard Strong Bowen (2) +
  • Jake Bowhay (1)
  • Dietrich Brunn (2)
  • Evgeni Burovski (1)
  • Lucas Colley (1)
  • Ralf Gommers (1)
  • Saarthak Gupta (1) +
  • Matt Haberland (4)
  • Chengyu Han (1) +
  • Lukas Huber (1) +
  • Nick ODell (2)
  • Ilhan Polat (4)
  • Tyler Reddy (52)
  • Neil Schemenauer (1) +
  • Dan Schult (1)
  • sildater (1) +
  • Gagandeep Singh (4)
  • Albert Steppi (2)
  • Matthias Urlichs (1) +
  • David Varela (1) +
  • ਗਗਨਦੀਪ ਸਿੰਘ (Gagandeep Singh) (3)

A total of 24 people contributed to this release. People with a "+" by their names contributed a patch for the first time. This list of names is automatically generated, and may not be fully complete.

SciPy 1.15.2 Release Notes

SciPy 1.15.2 is a bug-fix release with no new features compared to 1.15.1. Free-threaded Python 3.13 wheels for Linux ARM platform are available on PyPI starting with this release.

Authors

... (truncated)

Commits
  • e29dcb6 REL: 1.15.3 rel commit [wheel build]
  • 61e6aa1 Merge pull request #22840 from tylerjereddy/treddy_1.15.3_backports
  • 18c4ca8 MAINT: PR 22840 wheel build [wheel build]
  • bd0f132 MAINT: PR 22840 revisions
  • 033b138 MAINT: PR 22840 revisions
  • 7a283cc DOC: PR 22840 revisions
  • 3d1ea40 BUG: spatial.HalfspaceIntersection: raise on non-feasible half space (#20035)
  • d01b984 BUG: ndimage.median_filter: fix segfault when using mode='mirror' (#22608)
  • 0879108 MAINT: special.logsumexp: fix bug when weight of largest magnitude component ...
  • 9b3b2d8 Merge pull request #22869 from smurfix/main
  • Additional commits viewable in compare view

Updates snowballstemmer to 3.0.1

Changelog

Sourced from snowballstemmer's changelog.

Snowball 3.0.1 (2025-05-09)

Python

  • The __init__.py in 3.0.0 was incorrectly generated due to a missing build dependency, and the list of algorithms was empty. First reported by laymonage. Thanks to Dmitry Shachnev, Henry Schreiner and Adam Turner for diagnosing and fixing. (#229, #230, #231)

  • Add trove classifiers for Armenian and Yiddish which have now been registered with PyPI. Thanks to Henry Schreiner and Dmitry Shachnev. (#228)

  • Update documented details of Python 2 support in old versions.
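The 3.0.0 regression above meant `snowballstemmer.algorithms()` returned an empty list. A quick sanity check plus typical usage (API names per the snowballstemmer package):

```python
import snowballstemmer

# 3.0.0 shipped an __init__.py whose algorithm list was empty;
# 3.0.1 restores it, so this check passes again.
algorithms = snowballstemmer.algorithms()
assert "english" in algorithms

stemmer = snowballstemmer.stemmer("english")
print(stemmer.stemWords(["running", "cats"]))
```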

Snowball 3.0.0 (2025-05-08)

Ada

  • Bug fixes:

    • Fix invalid Ada code generated for Snowball loop (it was partly Pascal!) None of the stemmers shipped in previous releases triggered this bug, but the Turkish stemmer now does.

    • The Ada runtime was not tracking the current length of the string but instead used the current limit value or some other substitute, which manifested as various incorrect behaviours for code inside of setlimit.

    • size was incorrectly returning the difference between the limit and the backwards limit.

    • lenof or sizeof on a string variable generated Ada code that didn't even compile.

    • Fix incorrect preconditions on some methods in the runtime.

    • Fix bug in runtime code used by attach, insert, <- and string variable assignment when a (sub)string was replaced with a larger string. This bug was triggered by code in the Kraaij-Pohlmann Dutch stemmer implementation (which was previously not enabled by default but is now the standard Dutch stemmer).

    • Fix invalid code generated for insert, <- and string variable assignment. This bug was triggered by code in the Kraaij-Pohlmann Dutch stemmer implementation (which was previously not enabled by default but is now the standard Dutch stemmer).

... (truncated)

Commits
  • e4b3efb Update for 3.0.1
  • bbd3319 Protect empty languages dict
  • 298ff9f Update details of Python 2 support in old versions
  • 53fe098 python: Specify correct dependencies for $(python_output_dir)/__init__.py
  • 00a22de Stop excluding classifiers for Armenian and Yiddish
  • abd9adc Update for 3.0.0
  • d23d356 Back out incomplete ESM support for 3.0.0
  • ff42274 Update draft NEWS entry
  • cd61f01 tamil: remove_tense_suffix signals if ending removed
  • edfe576 nepali: Reformat amongs to be clearer
  • Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore <dependency name> major version will close this group update PR and stop Dependabot creating any more for the specific dependency's major version (unless you unignore this specific dependency's major version or upgrade to it yourself)
  • @dependabot ignore <dependency name> minor version will close this group update PR and stop Dependabot creating any more for the specific dependency's minor version (unless you unignore this specific dependency's minor version or upgrade to it yourself)
  • @dependabot ignore <dependency name> will close this group update PR and stop Dependabot creating any more for the specific dependency (unless you unignore this specific dependency or upgrade to it yourself)
  • @dependabot unignore <dependency name> will remove all of the ignore conditions of the specified dependency
  • @dependabot unignore <dependency name> <ignore condition> will remove the ignore condition of the specified dependency and ignore conditions

📚 Documentation preview 📚: https://RDAgent--4.org.readthedocs.build/en/4/

XianBW and others added 30 commits June 13, 2025 17:49
* remove state.times in old ui

* remove "r" tag

* remove "d" tag

* remove "ef" tag

* remove "init" tag

* fix CI

* remove old tag in app UI

* fix bugs

* fix CI

* some updates

* filter tags
* docs: update explanation for separate config use in litellm

* docs: update default backend to `rdagent.oai.backend.LiteLLMAPIBackend`

* docs: update .rst format

* Update installation_and_configuration.rst
* fix log caller_info

* make the env info output prettier
* add custom data setting for the data science scene

* fix ci?

* fix ci

* add custom data as an example

* fix ci

* add package

* fix test_import ci error
* raise loop termination in execute_loop

* add SENTINEL
* use simple stdout and stderr
* add live_output config in LocalConf
* refactor rdagent(q) conf files

* fix

* fix ci
* feat: parameterize cache paths with USER to avoid conflicts

* guide for missing training_hyperparameters

* guidance for KeyError: 'concise_reason'

* fixed three bugs in the test

* fix general_model task bug

* fixed some bugs in the med_model scenario

* delete comments

* format with black

* fix mypy error

* fix ruff error

* fix isort error

* sync code

* revert cache_path code

* revert cache_path code

* delete data mining scenario

* fix factor report loop

* fix LiteLLMAPIBackend log_llm_chat_content setting

* refine fin factor report scenario

* remove unused LogColors

* fix UI

* remove medical scenario docs

* change **kaggle** to **data_science**

* remove default dataset_path in create_debug_data

* remove KAGGLE_SETTINGS in kaggle_crawler

* limit litellm versions

* reformat with black

* change README

* fix_data_science_docs

* make hypothesis observations string

* Hiding old versions of kaggle docs

* hiding kaggle agent docs

---------

Co-authored-by: Young <afe.young@gmail.com>
Co-authored-by: Bowen Xian <xianbowen@outlook.com>
Co-authored-by: yuanteli <1957922024@qq.com>
Release-As: 0.5.0
* add coder version

* merge coder and feedback prompts

* align v2 and v3 proposal prompts

* fix a small bug

* fix a bug

* fix another bug

* support both function calling and json mode in v2 proposal

* fix minor bug

* reformat

* remove proposal v3

* fix a small bug in json mode

* fix CI

* remove tmp file

* remove v3 check

---------

Co-authored-by: Xu Yang <xuyang1@microsoft.com>
…down (#975)

* Initial plan for issue

* Fix Docker container cleanup issue by using try-finally block

Co-authored-by: peteryang1 <25981102+peteryang1@users.noreply.github.com>

* Fix additional Docker container leaks in health_check and GPU test functions

Co-authored-by: peteryang1 <25981102+peteryang1@users.noreply.github.com>

* Remove temporary test files and finalize Docker container cleanup fix

Co-authored-by: peteryang1 <25981102+peteryang1@users.noreply.github.com>

* Refactor container cleanup code to reduce duplication as requested in review feedback

Co-authored-by: peteryang1 <25981102+peteryang1@users.noreply.github.com>

* Refactor container cleanup to use shared function and always stop before remove

Co-authored-by: peteryang1 <25981102+peteryang1@users.noreply.github.com>
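The cleanup fix described above boils down to a try-finally that always stops a container before removing it. A generic sketch (the `run_with_cleanup` helper is illustrative, not the repo's actual code; `stop`/`remove`/`containers.run` mirror the Docker SDK method names):

```python
def run_with_cleanup(client, image, command):
    """Run a container and guarantee it is stopped and removed,
    even if execution raises -- the leak these commits fix."""
    container = client.containers.run(image, command, detach=True)
    try:
        container.wait()
        return container.logs()
    finally:
        # Always stop before remove, as the shared cleanup function does.
        container.stop()
        container.remove()
```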

* fix CI

* Fix mypy type checking errors for Docker container cleanup

Co-authored-by: peteryang1 <25981102+peteryang1@users.noreply.github.com>

* fix CI

* Remove unnecessary _cleanup_container wrapper method in DockerEnv class

Co-authored-by: peteryang1 <25981102+peteryang1@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: peteryang1 <25981102+peteryang1@users.noreply.github.com>
Co-authored-by: Xu Yang <xuyang1@microsoft.com>
* feat: add parquet preview and extract common DataFrame preview logic

* refactor: improve error messages, prompts, regex, and session loading

* lint
* merge support more traces
* use feedback from all traces
Co-authored-by: Xu Yang <xuyang1@microsoft.com>
* refactor: rename failed_exp_and_feedback_list to include _after_sota suffix

* refactor: merge prompts_v3 into prompts_v2 and update references
* start to work on multi-trace + async

* init ver of async-multi-trace, to test

* add eng-ver log

* complete version of async+ mul-trace

* debug

* fix bug on DS_RD_SETTING.get()

* update

* fix bug + simplify the usage of async in multi-trace

* fix mini bug of arg_name

* Move local_selection into class Experiment & clean the code
* refactor: convert direct_exp_gen to async and enforce parallel limit

* fix bug

* change coroutine function position

* fix fin_quant's direct_exp_gen

* format with isort

---------

Co-authored-by: Bowen Xian <xianbowen@outlook.com>
Co-authored-by: SunsetWolf <Lv.Linlang@hotmail.com>
Release-As: 0.6.0
* fix the dag_parant_index bug caused by the wrong calling order of

* auto-lint
* docs: document extra_volumes dict format in DockerConf

* feat: accept dict values in extra_volumes to specify bind and mode
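The commit above extends extra_volumes from plain strings to dicts. The exact schema DockerConf accepts is not shown in this log, but the dict form presumably mirrors the Docker SDK's volumes mapping (`bind` target plus `mode`):

```python
# String form: host path mapped straight to a container path.
extra_volumes = {"/host/data": "/container/data"}

# Dict form (what the commit adds): also specify the bind target
# and mount mode, mirroring docker-py's volumes mapping.
extra_volumes = {
    "/host/data": {"bind": "/container/data", "mode": "ro"},
}
print(extra_volumes["/host/data"]["mode"])
```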

* fix: skip invalid PDF reports to prevent infinite loop

* from break to raise self.LoopTerminationError

* format with black

---------

Co-authored-by: Young <afe.young@gmail.com>
Release-As: 0.6.1
…on and add output example (#999)

* feat: Enhance data folder description for clarity and robustness

* fix bug

* fix present bugs

* delete useless files

* add output example and refactor the whole util.py

* fix bug for file tree

* add corner case example

* delete useless file
* feat: add code change summary and dict_get_with_warning util

* feat: support code_change_summary in feedback classes

* lint

* feat: validate response_format using BaseModel and warn unknown formats
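The last commit validates responses against a BaseModel and warns on unknown formats. A hypothetical sketch of that shape, assuming pydantic v2 (`CodeChangeSummary` and `validate_response` are illustrative names, not the repo's):

```python
import warnings
from pydantic import BaseModel

class CodeChangeSummary(BaseModel):
    summary: str
    files_changed: int

def validate_response(payload, response_format):
    """Validate an LLM response dict against a BaseModel subclass;
    warn (rather than fail) when the format is not a known model."""
    if not (isinstance(response_format, type)
            and issubclass(response_format, BaseModel)):
        warnings.warn(f"unknown response_format: {response_format!r}")
        return payload
    return response_format.model_validate(payload)

result = validate_response({"summary": "refactor", "files_changed": 2},
                           CodeChangeSummary)
```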
XianBW and others added 21 commits October 24, 2025 12:30
* fix: handle mixed str and dict types in code_list

* fix: handle missing token_costs entry for loop 0 in summarize_win
Release-As: 0.8.0
…… (#1285)

* fix: avoid triggering errors like "RuntimeError: dictionary changed size during iteration"

* style: reformat run_in_executor call for improved readability
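The error fixed above comes from mutating a dict while iterating over it; the standard remedy is to iterate over a snapshot. Minimal repro of the pattern:

```python
# Mutating a dict inside `for k in d:` raises
# "RuntimeError: dictionary changed size during iteration".
# Copying the items first makes the loop safe.
d = {"a": 1, "b": 2, "c": 3}
for key, value in list(d.items()):  # list() snapshots before iterating
    if value % 2 == 1:
        del d[key]
print(d)  # only even values remain
```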
… (#1288)

* docs: add execution environment configuration guide (Docker vs Conda)

* docs: extend execution environment configuration with additional scenario support
…raints (#1313)

* fix(collect_info): parse package names safely from requirements constraints

* chore(collect_info): replace custom requirement parser with packaging.Requirement

* chore(collect_info): improve variable naming when parsing package requirements
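These commits replace a custom parser with packaging's Requirement class. A sketch of extracting bare package names from requirement constraints (the `package_names` helper is illustrative; `packaging.requirements.Requirement` is the real API the commit adopts):

```python
from packaging.requirements import Requirement

def package_names(lines):
    """Extract canonical package names from requirement strings,
    ignoring version constraints, extras, and comments."""
    names = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop inline comments
        if not line:
            continue
        names.append(Requirement(line).name)
    return names

print(package_names(["scipy>=1.14,<2", "pillow[tests]==12.1.1", "# comment", ""]))
```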
* refactor: unify qlib experiment configs, runners, and templates

* fix: use PropSetting instances instead of class attributes in qlib runners

* docs: add configurable train/valid/test time segments for fintech scenarios
* fix: prevent calendar index overflow when signal data ends early

* fix: make test_end optional to resolve Qlib backtest calendar misalignment

* fix: enhance GPU information output in get_gpu_info function

* fix: improve GPU information output in get_gpu_info function for better clarity

---------

Co-authored-by: Xu Yang <peteryang@vip.qq.com>
…(#1326)

* fix: preserve null end_time when rendering dataset segments template

* deps(qlib): bump qlib revision to 2fb9380

* fix: lint error
* refine prompt

* small update

* fix a small bug

* remove debug config after execution

* fix: only remove <think> at start

* feat: support creating dataset & multi-eval frame (#1302)

* feat: add iterative evolve and evaluation support with partial chain stop

* feat: add FTDataEvaluator and support multiple implement functions in finetune

* feat: data implement for pre-proposal and proposal and add datasets (#1303)

* feat:(1) support for multi layer dataset extraction (2) add category.json for dataset in datasets/

* fix: fix bug for generate category.json

* feat: add get_dataset_folder_desc

* init data proposal and merge qzli/ft

* update data proposal prompts and add max_position_embeddings and resolve conflicts

* remove sample counts in data proposal

* turn data and train to unified hypo_gen

* refine prompts

* remove category.json and add it to dataset_info

* fix jinja problem and proposal done

* lint

* add ai-generated description and raw readme into dataset_info.json

* update prompt for description

* add datasets

* initial fix for proposal of data

* final version for data proposal

* lint

* feat: add stats in dataset_info, and enable data coder (#1306)

* refactor(dataset): add stats into dataset_info.json, and remove dataset from gitignore_folder

* feat: enable data coder and run data process

* feat: Merge data coder (#1307)

* feat: implement finetune data coding, evaluation, and config improvements

* fix: deepspeed config path

* fix: dataset info columns

---------

Co-authored-by: Young <afe.young@gmail.com>

* replace str length with token_limit

* add readme to dataset_info and remove useless blank lines in scenario description

* feat: dataset prepare

* fix: extract params script name

* feat: add loss&predictions samples to feedback

* remove duplicate envs and add llm_api_preferences and enhance reasoning token limits

* feat: network for ft_env

* fix: remove gpt-4o, which has low quota

* feat: a simple ui

* feat: merge data and train task type (#1309)

* feat: filter redundant params of lf

* fix: ui bug caused by removing task_type

* fix: force agent to use high concurrency, and remove redundant prompt

* feat: extract info from llama factory log, and check data exists before download

* fix: add compatibility rules

* feat: llm evaluator for data coder

* feat: openai package in ft docker, and refine prompt

* feat: refine ft ui, add more info

* feat: add raw logs

* refine data coder prompt(for feedback debug)

* feat: select dataset in scen init

* fix: ui for docker log separately

* feat: sync log through blob

* improve ui, and add llm feedback in Runner&Exp2FB (#1312)

* fix: ui bug to visualize docker log, and lint

* feat: unified docker log for ft env, and some refactor

* fix bugs and improve ui

* feat: save log of evaluator(single feedback)

* feat: add evaluator, set cleanup docker log

* feat: call llm in RunnerEvaluator and Feedback

* fix: extract structured error message in RunnerEvaluator

* feat: feedback improve, and fix some bugs

* feat: feedback improve when runner fails

* small update

* feat(UI): add running info and benchmark metric in loop expander

* feat(UI): add render markdown toggle

* feat: refine prompts and add error type in exp2fb

* feat: add filtered params reason, set default benchmark timeout to infinite, and refine train loss expression

* recover dataset deepscaler

* feat: set timeout in .env

* refactor: unified ft_env timeout

* feat: debug mode for data coder

* feat: deliver data_stats after generate debug_data

* feat: use gpt-5.1 as judge model, set judge_retry, and refine debug mode prompt

* refine prompt

* refactor: llama factory manager logic, and refine data processing prompt

* feat(DockerEnv): support GPU selection via CUDA_VISIBLE_DEVICES

* feat: set api concurrency via .env

* fix: ft env timeout bug

* feat: enable CondaEnv run

* fix: can't update bin path in first run, and path bug in lf manager

* feat(ui): set log path through .env

* refactor(ui): wrap_lines, remove css

* feat(coder): retry when parse code-block fail

* fix: refine single-fb in ui, and fix path bug(not allow proposal to decide path)

* fix: opencompass CondaEnv torch compatible with vllm

* fix: refine error text in coding

* feat: deepspeed config for CondaEnv

* feat: memory estimator

* fix: deepspeed package for condaenv

* fix: use `client.chat.completions.create()` only

* feat: flash attention for condaenv

* feat: strong and weak models interface

* fix: condaenv package dependency

* use multi round conversation in llm finetune proposal

* refine prompt for data processing

* enable evolving in data coder

* maximize output token size

* fix: refine ui

* fix: optional packages for llama factory

* fix: torch dependency for b200

* fix: opencompass dependency

* update cot prompts

* skip the sub implement

* skip conda preparation if env exists

* update chemcot datasets

* fix: unify docker to use litellm

* update readme and instructions

* fix: set CUDA_VISIBLE_DEVICES for CondaEnv

* feat: add panorama dataset, refactor dataset interface

* feat: calculate token using tiktoken, and ndarray bug

* fix: download subtasks of chemcotdataset separately

* feat: customized prepare func for datasets

* feat: update new benchmarks

* add datasets package

* docs: readme for llm finetune

* feat: download raw data directly, with post-process function

* feat: analyze raw dataset

* suppress litellm debug info

* feat(ui): summary page

* feat: run multi-jobs

* feat: improve ui

* feat: add path and checkout options to LLM finetune loop entrypoint

* feat: add FinanceIQ_ppl benchmark with auto-download and dataset desc rendering

* refactor: remove unused imports and dead code, fix session folder logging

* feat: enable tablebench and tableInstruct dataset

* refine dataset readme, and coder prompt

* refine proposal and coder prompt

* fix: ui path (default log path)

* feat: add automatic LoRA model merging for benchmarking with vLLM

* refactor: reorganize finetune benchmark and merge modules under benchmark dir

* refactor: modularize benchmark config and error extraction for finetune scenario

* fix: update benchmark import paths and disable env cache for device info

* refactor docker&conda env and fix import bugs

* modify init python file

* feat: add FinanceIQ dataset split utility and integrate with pipeline

* feat: set weak and strong model by env, distribute workload across models

* feat: sample dataset and rm params for tensorboard, wandb

* update script to run jobs

* refine proposal prompt, remove specific dataset name

* fix(ui): auto switch log folder

* fix: estimate the processed full data after sample

* feat: filter raw data more aggressively, and lower data_eval standard

* feat: sync workspace to blob

* feat: rdkit for chemcotbench

* update qwen2.5&llama3.1 context

* fix: force failure on validation error and remove try/except in validator

* feat: unified error sample extraction (with test scripts)

* feat: set conda cache with .env

* feat: skip data eval if data pass in last evo

* fix: rm redundant param

* fix ui bug

* refactor: centralize assign_code_list_to_evo in MultiProcessEvolvingStrategy

* feat: add test_params.yaml generation and workspace cleanup improvements for finetune

* refactor: replace get_clear_ws_cmd with clear_workspace and update prompts for hard check criteria

* add bioprobench dataset

* fix: handle commas in training config extraction and refactor prompt includes

* bioprobench description

* add bioprobench readme

* feat: merge lora adapter for blackwell gpu

* feat: support for multi benchmarks in one job

* change difficulty-aware content for training

* update difficulty-aware and logging principles

* fix: resolve variable name conflict in FTRunnerEvaluator

* set job id accuracy to minute

* feat(ui): display one selected metric per benchmark

* feat: store sota exp, and fix ws_ckp bug

* fix: truncate data.json in feedback

* fix: opencompass data for conda env

* fix: save only the last model

* feat: set log path and ws path

* fix: set overwrite_cache to avoid lock contention(through injecting params)

* feat: redirect stdout to file in localenv

* add pickle cache to dataset desc

* fix CI

* fix: remove redundant wrapper

* feat: set python_unbuffered

* move redirect stdout to env run

* fix a small bug

* move model folder

* feat(ui): display benchmark baseline

* fix: enrich scenario and benchmark description

* fix: rewrite runner eval to accept easier

* feat: compare with baseline when no SOTA

* update tablebench readme

* fix: switch back to single benchmark (for baseline)

* feat(ui): add ws path in ui

* refactor: update SOTA tracking to use DAG traversal and parent selection

* fix: prioritize local_selection in trace and refactor sibling retrieval logic

* refactor: unify error handling in feedback generation and update workspace injection

* feat: add skip_loop_error_stepname to control error skip step in LoopBase

* fix: set local_selection to NEW_ROOT for experiments without parent

* feat: set different ports for jobs

* feat: set different ports for jobs

* feat: add upper data size limit for LLM fine-tuning and update related prompts

* fix: replace get_truncated_stdout() with stdout for consistent output handling

* refactor: remove data.json from cache and workspace logic, focus on script-based reuse

* fix: rm target_scenario

* feat: add selective cache extraction and custom cache key for data processing

* fix(ui): bug when displaying tablebench

* fix: filter config in dataset_info.json

* feat: add test set, set valid set

* feat(ui): update test score, and set color for final decision

* feat: add test score for baseline and update ui

* fix: use [-100:] as test range

* feat: update data_stats in runner

* feat: wait for opencompass init when run multi jobs

* fix: adjust test&valid split

* feat: force to generate COT(with <think> token), and add answer format in scenarios.json

* feat: improve ui

* fix: unify benchmark volume mounts and set extra_volumes for conda env

* fix(ui): number color

* fix: update GPU memory handling to use total memory in GB and streamline code

* fix: set use_cot_postprocessor

* feat: add env_dict to config classes and merge env vars in Env run

* fix: let coder obey proposal

* fix(ui): direction bug and update chemcot core metric

* fix: set consistent benchmark mount points and env vars for docker and conda

* fix: additional target for LoRA

* feat: workspace dir log for benchmark running

* fix: tableInstruct path bug and update benchmark description

* feat: timeout for whole job

* fix: align FinanceIQ import to opencompass

* feat: use llm_judge for FinanceIQ

* feat: switch to turn on <think> or not

* feat: using scripts to redirect stdout, and run in different windows

* feat: sync litellm log

* fix: gpu memory format

* fix: escape special characters in benchmark desc

* fix: set data processing timeout to 1h

* feat: set valid_loss and save_best_model

* fix: inject timeout and stage

* fix: loss history extract logic

* feat: inject output dir

* feat: inject eval batch size

* feat: inject save_total_limit

* feat: update data prompt

* fix: escape shell special characters

* fix: tablebench visualization UI

* fix: move implementation validation to coder, and ignore injected params

* feat: README for FinanceIQ dataset

* fix: bioprobench desc error

* fix: remove task alignment when coder eval

* fix: FinanceIQ now extracts last capital as answer

* fix: stdout contains binary data

* feat: recover estimate full output and set eval setting automatically

* fix(ui): precision for summary table

* fix(ui): import error

* feat: try to use lora

* fix(api): fix litellm bug for code block

* fix: refine prompts to give agent more decision space

* chore(ci): fix mypy typing issues

* chore(ci): format code with black

* chore(ci): fix ruff lint violations

* chore(ci): sort imports with isort

* chore(ci): format code with black

* test: temporarily skip extract_parameters imports due to numpy pin

* fix: compatibility issues for qlib scenarios on finetune branch

* fix(fin_factor): skip to fb for coder error

* fix(loop): default skip to feedback step on skip_loop_error

When a skip_loop_error exception happens and skip_loop_error_stepname is not
explicitly set, default to jumping to the 'feedback' step if it exists;
otherwise fall back to the last step (record).

This prevents KeyError when record step tries to access feedback data that
doesn't exist because we skipped the feedback phase.

Also removed redundant skip_loop_error_stepname from finetune loop since
it's now the default behavior.

* add 'skip to record' to DS scenario like other scenarios

* fix 2 scenarios bug about rd_loop class

* fix: lint(mypy, ruff, black) error

* fix: mypy lint error

* fix data science scenario bug

---------

Co-authored-by: Xu Yang <peteryang@vip.qq.com>
Co-authored-by: Qizheng Li <jenssenlee@163.com>
Co-authored-by: you-n-g <you-n-g@users.noreply.github.com>
Co-authored-by: amstrongzyf <201840057@smail.nju.edu.cn>
Co-authored-by: Young <afe.young@gmail.com>
Co-authored-by: amstrongzyf <amstrongzyf@126.com>
Co-authored-by: chelsea97 <zhuowbrown@gmail.com>
Co-authored-by: SunsetWolf <Lv.Linlang@hotmail.com>
…SSOT

- artifact_utils: create_run_dir, create_round_dir, resolve/load/save helpers
- ClaudeCodeAPIBackend: compatibility shim (chat→CLI, token→LiteLLM, embedding→fail-fast)
- 4 stub adapters: HypothesisGen, H2E, Coder, Summarizer (factor scenario)
- Tests: 9 artifact, 6 shim, 2-round scenario, 4-checkpoint resume (19 passed)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… definitions

- trace_view.py: Trace → lightweight JSON (SOTA, recent rounds, failed hypotheses)
- planner.py: LLM-driven hypothesis+experiment generation with retry/validation
- evaluator.py: LLM-driven feedback with information separation (no source code)
- Tests: 6 trace_view, 13 planner/evaluator (mock LLM), all passing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
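The Trace-to-lightweight-JSON idea above can be illustrated with a toy model. All names here (Round, Trace, trace_to_view) are assumptions for illustration; the real trace_view.py may use different types, and the key point is only that the view carries SOTA, recent rounds, and failed hypotheses rather than the full trace or source code:

```python
# Illustrative sketch of a trace -> lightweight JSON view.
import json
from dataclasses import dataclass, field


@dataclass
class Round:
    hypothesis: str
    score: float
    succeeded: bool


@dataclass
class Trace:
    rounds: list[Round] = field(default_factory=list)


def trace_to_view(trace: Trace, recent: int = 3) -> str:
    """Serialize only what the planner needs: the best succeeded
    hypothesis (SOTA), the most recent rounds, and the failures."""
    ok = [r for r in trace.rounds if r.succeeded]
    sota = max(ok, key=lambda r: r.score, default=None)
    view = {
        "sota": sota.hypothesis if sota else None,
        "recent": [r.hypothesis for r in trace.rounds[-recent:]],
        "failed": [r.hypothesis for r in trace.rounds if not r.succeeded],
    }
    return json.dumps(view)
```

Keeping the view this small is what makes the information separation in evaluator.py possible: the LLM sees outcomes, not source code.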
Phase 2 (core migration):
- llm_conf.py: chat_model -> anthropic/claude-sonnet-4-20250514, token_limit -> 200k
- Delete deprec.py (491-line deprecated OpenAI/Azure/Llama2/GCR backend)
- Replace tiktoken with litellm.token_counter
- Migrate embedding default to voyage/voyage-3
- Replace langchain with direct pypdf usage
- Add anthropic to pydantic_ai PROVIDER_TO_ENV_MAP

Phase 3 (public preparation):
- Add CLAUDE.md for project documentation
- Update README/docs with Anthropic setup examples
- Add adapter-tests and openai-allowlist CI jobs
- Update kaggle_environment.yaml (openai->anthropic)

Phase 4 (legacy cleanup):
- Replace openai exceptions in base.py with litellm equivalents
- Zero import openai / import tiktoken in rdagent/

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
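The tiktoken-to-litellm migration in Phase 2 can be sketched as a thin wrapper. `litellm.token_counter(model=..., text=...)` is a real litellm utility; the fallback heuristic and the wrapper name are assumptions for illustration, not the project's actual code:

```python
# Hedged sketch of counting tokens via litellm, with a rough
# fallback when the optional dependency is absent.
def count_tokens(model: str, text: str) -> int:
    try:
        from litellm import token_counter  # optional dependency
        return token_counter(model=model, text=text)
    except ImportError:
        # Crude approximation: ~4 characters per token is a common
        # rule of thumb for English text (assumption, not exact).
        return max(1, len(text) // 4)
```

A wrapper like this lets the 200k token_limit check work the same way whether or not the llm extra is installed.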
…controllable

- Remove openai, litellm, pydantic-ai-slim from requirements.txt
- Isolate them in requirements/llm.txt (restored via pip install rdagent[llm])
- Add an llm optional extra to pyproject.toml
- rdagent/oai/backend/__init__.py: remove top-level imports (defer to dynamic loading)
- rdagent/oai/utils/embedding.py: guard litellm with try/except
- rdagent/scenarios/finetune/scen/utils.py: guard litellm with try/except
- rdagent/log/ui/ds_trace.py: guard litellm with try/except
- rdagent/app/utils/health_check.py: guard litellm with try/except
- rdagent/utils/workflow/loop.py: add use_pickle_session flag

Claude Code is the LLM itself, so a Python-level SDK is unnecessary.
The Claudex factor scenario runs without the SDK.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
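The try/except guard applied to the modules above follows a standard optional-dependency pattern. This is a generic sketch; the flag name and error message are illustrative, though the `pip install rdagent[llm]` extra is the one named in the commit:

```python
# Generic optional-import guard: probe for litellm once at import
# time, record availability, and fail fast with an actionable
# message only when a feature actually needs it.
try:
    import litellm
    LITELLM_AVAILABLE = True
except ImportError:
    litellm = None
    LITELLM_AVAILABLE = False


def require_litellm() -> None:
    """Raise a helpful error when the llm extra is missing."""
    if not LITELLM_AVAILABLE:
        raise ImportError(
            "litellm is not installed; install it with "
            "`pip install rdagent[llm]`"
        )
```

Guarding at the call site rather than at import time is what lets the SDK-free factor scenario run with the slimmed-down requirements.txt.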
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Updates the requirements on [azure-identity](https://github.com/Azure/azure-sdk-for-python), [dill](https://github.com/uqfoundation/dill), [pillow](https://github.com/python-pillow/Pillow), [psutil](https://github.com/giampaolo/psutil), [scipy](https://github.com/scipy/scipy) and [snowballstemmer](https://github.com/snowballstem/snowball) to permit the latest version.

Updates `azure-identity` from 1.17.1 to 1.25.2
- [Release notes](https://github.com/Azure/azure-sdk-for-python/releases)
- [Commits](Azure/azure-sdk-for-python@azure-identity_1.17.1...azure-identity_1.25.2)

Updates `dill` from 0.3.9 to 0.4.1
- [Release notes](https://github.com/uqfoundation/dill/releases)
- [Commits](uqfoundation/dill@0.3.9...0.4.1)

Updates `pillow` from 10.4.0 to 12.1.1
- [Release notes](https://github.com/python-pillow/Pillow/releases)
- [Changelog](https://github.com/python-pillow/Pillow/blob/main/CHANGES.rst)
- [Commits](python-pillow/Pillow@10.4.0...12.1.1)

Updates `psutil` from 6.1.0 to 7.2.2
- [Changelog](https://github.com/giampaolo/psutil/blob/master/HISTORY.rst)
- [Commits](giampaolo/psutil@release-6.1.0...release-7.2.2)

Updates `scipy` from 1.14.1 to 1.15.3
- [Release notes](https://github.com/scipy/scipy/releases)
- [Commits](scipy/scipy@v1.14.1...v1.15.3)

Updates `snowballstemmer` to 3.0.1
- [Changelog](https://github.com/snowballstem/snowball/blob/master/NEWS)
- [Commits](snowballstem/snowball@v2.0.0...v3.0.1)

---
updated-dependencies:
- dependency-name: azure-identity
  dependency-version: 1.25.2
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod
- dependency-name: dill
  dependency-version: 0.4.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod
- dependency-name: pillow
  dependency-version: 12.1.1
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: prod
- dependency-name: psutil
  dependency-version: 7.2.2
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: prod
- dependency-name: scipy
  dependency-version: 1.15.3
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod
- dependency-name: snowballstemmer
  dependency-version: 3.0.1
  dependency-type: direct:production
  dependency-group: prod
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot bot added dependencies Pull requests that update a dependency file python Pull requests that update python code labels Mar 8, 2026

dependabot bot commented on behalf of github Mar 8, 2026

This pull request was built based on a group rule. Closing it will not ignore any of these versions in future pull requests.

To ignore these dependencies, configure ignore rules in dependabot.yml

@dependabot dependabot bot deleted the dependabot/pip/prod-7ae5172c28 branch March 8, 2026 06:39