feat(classic): update classic autogpt a bit to make it more useful for my day to day#11797
feat(classic): update classic autogpt a bit to make it more useful for my day to day#11797
Conversation
- Add Claude 3.5 v2, Claude 4 Sonnet, Claude 4 Opus, and Claude 4.5 Opus models - Add rolling aliases (CLAUDE_SONNET, CLAUDE_OPUS, CLAUDE_HAIKU) - Fix deprecated beta.tools.messages.create API call to use standard messages.create - Update anthropic SDK from ^0.25.1 to >=0.40,<1.0 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Remove deprecated Flutter frontend (replaced by autogpt_platform) - Remove shell scripts (run, setup, autogpt.sh, etc.) - Remove tutorials (outdated) - Remove CLI-USAGE.md and FORGE-QUICKSTART.md - Add CLAUDE.md files for Claude Code guidance Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add --workspace option to CLI that defaults to current working directory, allowing users to run `autogpt` from any folder. Agent data is now stored in `.autogpt/` subdirectory of the workspace instead of a hardcoded path. Changes: - Add -w/--workspace CLI option to run and serve commands - Remove dependency on forge package location for PROJECT_ROOT - Update config to use workspace instead of project_root - Store agent data in .autogpt/ within workspace directory - Update pyproject.toml files with proper PyPI metadata - Fix outdated tests to match current implementation Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The .autogpt/ directory is where AutoGPT stores agent data when running from any directory. This should not be committed to version control. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
Important Review skippedToo many files! This PR contains 221 files, which is 71 over the limit of 150. You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
This PR targets the Automatically setting the base branch to |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## dev #11797 +/- ##
======================================
Coverage ? 49.06%
======================================
Files ? 176
Lines ? 14235
Branches ? 1624
======================================
Hits ? 6985
Misses ? 7060
Partials ? 190
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Add a layered permission system that controls agent command execution:
- Create autogpt.yaml in .autogpt/ folder with default allow/deny rules
- File operations in workspace allowed by default
- Sensitive files (.env, .key, .pem) blocked by default
- Dangerous shell commands (sudo, rm -rf) blocked by default
- Interactive prompts for unknown commands (y=agent, Y=workspace, n=deny)
- Agent-specific permissions stored in .autogpt/agents/{id}/permissions.yaml
Files added:
- forge/forge/config/workspace_settings.py - Pydantic models for settings
- forge/forge/permissions.py - CommandPermissionManager with pattern matching
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Remove references to deleted files (./run, cli.py, setup.py, frontend/) from CI workflows. Replace ./run agent start with direct poetry commands to start agent servers in background. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add a task management component modeled after Claude Code's TodoWrite: - TodoItem with recursive sub_items for hierarchical task structure - todo_write: atomic list replacement with sub-items support - todo_read: retrieve current todos with nested structure - todo_clear: clear all todos - todo_decompose: use smart LLM to break down tasks into sub-steps Features: - Hierarchical task tracking with independent status per sub-item - MessageProvider shows todos in LLM context with proper indentation - DirectiveProvider adds best practices for task management - Graceful fallback when LLM provider not configured Integrates with: - original_autogpt Agent (full LLM decomposition support) - ForgeAgent (basic task tracking, no decomposition) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add 6 new utility components to expand agent functionality: - ArchiveHandlerComponent: ZIP/TAR archive operations (create, extract, list) - ClipboardComponent: In-memory clipboard for copy/paste operations - DataProcessorComponent: CSV/JSON data manipulation and analysis - HTTPClientComponent: HTTP requests (GET, POST, PUT, DELETE) - MathUtilsComponent: Mathematical calculations and statistics - TextUtilsComponent: Text processing (regex, diff, encoding, hashing) All components follow the forge component pattern with: - CommandProvider for exposing commands - DirectiveProvider for resources/best practices - Comprehensive parameter validation Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add specialized exception classes for better error reporting: - CodeTimeoutError: For code execution timeouts - HTTPError: For HTTP request failures with status code/URL - DataProcessingError: For JSON/CSV processing errors Each exception includes helpful hints for users. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
CodeExecutorComponent: - Add timeout and env_vars parameters to execution commands - Add execute_shell_popen for streaming output - Improve error handling with CodeTimeoutError FileManagerComponent: - Add file_info, file_search, file_copy, file_move commands - Add directory_create, directory_list_tree commands - Better path validation and error messages GitOperationsComponent: - Add git_log, git_show, git_branch commands - Add git_stash, git_stash_pop, git_stash_list commands - Add git_cherry_pick, git_revert, git_reset commands - Add git_remote, git_fetch, git_pull, git_push commands UserInteractionComponent: - Add ask_multiple_choice for structured options - Add notify_user for non-blocking notifications - Add confirm_action for yes/no confirmations WebSearchComponent: - Minor error handling improvements WebSeleniumComponent: - Add get_page_content, execute_javascript commands - Add take_element_screenshot command - Add wait_for_element, scroll_page commands - Improve element interaction reliability Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Environment loading: - Search for .env in multiple locations (cwd, ~/.autogpt, ~/.config/autogpt) - Allows running autogpt from any directory - Document search order in .env.template Setup simplification: - Remove interactive AI settings revision (was broken/unused) - Simplify to just printing current settings - Clean up unused imports Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Remove asctime from log formats since terminal output already has timestamps from the logging infrastructure. Makes logs cleaner and easier to read. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Update gitignore to use glob pattern for settings.local.json files in any .claude directory. Also untrack the existing file. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Remove openai_functions config option - native tool calling is now always enabled - Remove use_functions_api from BaseAgentConfiguration and prompt strategy - Add use_prefill config to disable prefill for Anthropic (prefill + tools incompatible) - Update anthropic dependency to ^0.45.0 for tools API support - Simplify prompt strategy to always expect tool_calls from LLM response This fixes the N/A command loop bug where models would output "N/A" as a command name when function calling was disabled. With native tool calling always enabled, models are forced to pick from valid tools only. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
… 3.14 - Update Python version constraint from ^3.10 to ^3.12 in all pyproject.toml - Update classifiers to reflect Python 3.12, 3.13, 3.14 support - Update dependencies for Python 3.13+ compatibility: - chromadb: ^0.4.10 -> ^1.4.0 - numpy: >=1.26.0,<2.0.0 -> >=2.0.0 - watchdog: 4.0.0 -> ^6.0.0 - spacy: ^3.0.0 -> ^3.8.0 (numpy 2.x compatibility) - en-core-web-sm model: 3.7.1 -> 3.8.0 - httpx (benchmark): ^0.24.0 -> ^0.27.0 - Update tool configuration: - Black target-version: py310 -> py312 - Pyright pythonVersion: 3.10 -> 3.12 - Update Dockerfiles to use Python 3.12 - Update CI workflows to test on Python 3.12, 3.13, and 3.14 - Regenerate all poetry.lock files Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implement four new prompt strategies based on research papers: - ReWOO: Reasoning Without Observation (5x token efficiency) - Plan-and-Execute: Separate planning from execution phases - Reflexion: Verbal reinforcement learning with episodic memory - Tree of Thoughts: Deliberate problem solving with tree search Each strategy extends a new BaseMultiStepPromptStrategy base class with shared utilities. Strategies are selectable via PROMPT_STRATEGY environment variable or config.prompt_strategy setting. Fix JSONSchema generation issue where Optional/Union types created anyOf schemas without direct type field - resolved by storing plan/phase state in strategy instances rather than ActionProposal. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The strategy was stuck in a loop because it tracked plan steps but never advanced them - the record_step_success() method existed but was never called by the agent's execution loop. Fix by using a _pending_step_advance flag to track when an action has been proposed. On the next parse_response_content() call, advance the previous step before processing the new response. This keeps step tracking self-contained in the strategy without requiring agent changes. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Adds a custom Rich-based interactive selector for the command approval workflow. Features include: - Arrow key navigation for selecting approval options - Tab to add context to any selection (e.g., "Once + also check file x") - Dedicated inline feedback option with shadow placeholder text - Quick select with number keys 1-5 - Works within existing asyncio event loop (no prompt_toolkit dependency) Also adds UIProvider abstraction pattern for future UI implementations. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly. |
...benchmark/challenges/verticals/code/2_password_generator/artifacts_out/password_generator.py
Dismissed
Show dismissed
Hide dismissed
Some LLM providers (notably Anthropic) don't support system messages in the middle of a conversation. Changed ChatMessage.system() to ChatMessage.user() for all mid-conversation context messages across components (action history, context, skills, system clock, todo, error reporting, LATS, and multi-agent debate strategies). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
6ee7ead to
f56abce
Compare
|
This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request. |
# Conflicts: # .github/workflows/classic-frontend-ci.yml # .gitignore # classic/frontend/.gitignore
|
Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly. |
- Add 'playwright install chromium' step to Forge CI workflow - Auto-detect default model from available API keys (ANTHROPIC_API_KEY, OPENAI_API_KEY, GROQ_API_KEY) in direct_benchmark harness - Prefer Claude > OpenAI > Groq, fallback to OpenAI if no keys found
|
Too many files changed for review. ( |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
| # Include forge so it can be used as a path dependency | ||
| COPY forge/ ../forge | ||
|
|
||
| # Include frontend |
There was a problem hiding this comment.
Dockerfile references removed pyproject.toml and poetry.lock files
High Severity
The COPY original_autogpt/pyproject.toml original_autogpt/poetry.lock ./ instruction references files that no longer exist. The PR consolidates all Poetry projects into a single classic/pyproject.toml, removing the per-subpackage pyproject.toml and poetry.lock files. The Dockerfile was partially updated (Python 3.12, frontend removal) but this COPY instruction still points to the old, now-nonexistent files, which will cause all Docker builds to fail.


Summary
This PR modernizes AutoGPT Classic to make it more useful for day-to-day autonomous agent development. Major changes include consolidating the project structure, adding new prompt strategies, modernizing the benchmark system, and improving the development experience.
Note: AutoGPT Classic is an experimental, unsupported project preserved for educational/historical purposes. Dependencies will not be actively updated.
Changes 🏗️
Project Structure & Build System
forge/,original_autogpt/, and benchmark packages into a singlepyproject.tomlatclassic/rootagbenchmarkpackage (3000+ lines) in favor of the newdirect_benchmarkharnessbenchmark/frontend/React app (no longer needed)New Direct Benchmark System
direct_benchmarkharness - New streamlined benchmark runner with:Multiple Prompt Strategies
Added pluggable prompt strategy system supporting:
LLM Provider Improvements
Web Components
Agent Capabilities
Developer Experience
autogptfrom any project directoryBug Fixes
Test Plan
poetry installfromclassic/directorypoetry run autogptand verify CLI startspoetry run direct-benchmark run --tests ReadFileto verify benchmark worksNotes
Note
Medium Risk
CI and developer-tooling changes are broad and could unintentionally skip coverage (e.g., dropping multi-OS testing) or break installs if the consolidated
classicPoetry setup diverges from reality; functional runtime behavior is largely untouched in this diff.Overview
AutoGPT Classic’s automation is refactored to assume a single consolidated Poetry project rooted at
classic/: GitHub Actions workflows drop the multi-OS matrix, standardize on Ubuntu + Python 3.12, useclassic/poetry.lockfor caching, and run tests/coverage from the unified layout (including addingANTHROPIC_API_KEYand consistent MinIO setup).Benchmark CI is rewritten to run
direct-benchmarksmoke tests (Read/WriteFile), category/strategy filtering checks, and a gated (master/dev) maintain/regression run; legacyagbenchmark-based workflow logic and related per-subproject lint/type plumbing are removed in favor of singleclassic-scoped pre-commit hooks and a consolidatedclassic/.flake8.Repository hygiene/docs are updated:
.gitmodulesand the Classic frontend workflow are removed, Classic docs are refreshed (classic/README.md, newclassic/CLAUDE.md), and.gitignoreexpands to cover new workspace/report artifacts (e.g.,.autogpt/, benchmark workspaces,test.db) plus Claude local settings.Written by Cursor Bugbot for commit b075495. This will update automatically on new commits. Configure here.