Skip to content

Introduce langchain based interpreter for AFM#7

Merged
MaryamZi merged 65 commits intowso2:mainfrom
RadCod3:feat/langchain-py
Feb 9, 2026
Merged

Introduce langchain based interpreter for AFM#7
MaryamZi merged 65 commits intowso2:mainfrom
RadCod3:feat/langchain-py

Conversation

@RadCod3
Copy link
Contributor

@RadCod3 RadCod3 commented Feb 5, 2026

$subject

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced langchain-interpreter: a new AFM implementation supporting multi-interface agents (console, web, webhook), MCP tool integration, template-based prompts, and flexible model provider support (OpenAI, Anthropic).
  • Documentation

    • Added README with quick-start guide, configuration instructions, and project structure overview.
  • Chores

    • Added project configuration, Dockerfile, GitHub Actions CI/CD pipeline, and comprehensive test suite.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 5, 2026

Walkthrough

This PR introduces a complete LangChain-based interpreter implementation for the AFM specification. It adds a full Python project with agent orchestration, multiple interface types (console, web, webhook), MCP tool integration, comprehensive testing, and CI/CD workflows for building and deploying the interpreter.

Changes

Cohort / File(s) Summary
Project Configuration
langchain-interpreter/.gitignore, .python-version, pyproject.toml, .dockerignore
Adds project metadata, Python version specification (3.11), dependency configuration with LangChain/FastAPI/Pydantic stack, and Docker build exclusions.
Docker & Deployment
langchain-interpreter/Dockerfile, .github/workflows/langchain-interpreter.yml
Implements multi-stage Alpine-based Docker build with uv for dependency management and CI workflow for testing and multi-arch image publishing to GHCR.
License & Documentation
langchain-interpreter/LICENSE, README.md, langchain-interpreter/README.md, README.md
Adds Apache 2.0 license, project README with features/setup, and updates main repo README to list langchain-interpreter as active implementation.
Core Models & Data Structures
src/afm/models.py, src/afm/__init__.py
Defines comprehensive Pydantic models for AFM representation: agents, interfaces (console/web/webhook), authentication, MCP servers, JSON schemas, templates, and signatures with strict validation.
Parsing & Validation
src/afm/parser.py, src/afm/variables.py, src/afm/exceptions.py
Implements AFM file parsing with YAML frontmatter extraction, variable resolution (environment), HTTP-variable validation, and rich exception hierarchy (AFMError, ProviderError, MCPError, etc.).
Agent Orchestration
src/afm/agent.py
Core Agent class managing model invocation, tool binding/execution, session history, message composition with system prompts, input/output validation, and iteration limits.
Schema Validation & Templates
src/afm/schema_validator.py, src/afm/templates.py
JSON schema validation for agent I/O, template compilation and evaluation for dynamic prompts with payload/header variable support, and output coercion logic.
Model Providers
src/afm/providers.py
Factory for creating LLM providers (OpenAI, Anthropic) with API key resolution from authentication config or environment, and error handling for missing credentials/packages.
Interface: Console Chat
src/afm/interfaces/console_chat.py, console_chat.tcss
Textual-based interactive chat UI with message rendering, command support (help/clear/exit), session management, agent communication, and comprehensive CSS styling.
Interface: Web Chat
src/afm/interfaces/web_chat.py, src/afm/resources/chat-ui.html
FastAPI router for web chat with HTML UI template, string/schema-based endpoints, session tracking via headers, error handling, and health/metadata endpoints.
Interface: Webhook
src/afm/interfaces/webhook.py
Webhook handler with signature verification (SHA256/SHA1), WebSub subscription lifecycle management, template evaluation for prompts, auto-subscription with retry, and graceful shutdown.
Interface Utilities
src/afm/interfaces/base.py, src/afm/interfaces/__init__.py
Helpers for interface selection, type-guarded retrieval, HTTP path resolution, and default interface handling.
MCP Integration
src/afm/tools/mcp.py, src/afm/tools/__init__.py
Multi-server MCP client with pluggable auth (Bearer/API-Key/Basic), tool filtering, aggregated tool retrieval, caching, and error propagation with MCPConnectionError/MCPAuthenticationError.
CLI & App Creation
src/afm/cli.py
Click-based CLI for running AFM agents with dry-run validation, HTTP+console hybrid mode, interface extraction, formatted output, and unified FastAPI app builder with CORS/lifespan management.
Test Fixtures
tests/conftest.py, tests/__init__.py, tests/fixtures/*
Pytest configuration with path fixtures, environment variable manager, and sample AFM files (agent, console, webhook, minimal, no-frontmatter variants).
Unit Tests: Core
tests/test_parser.py, tests/test_variables.py, tests/test_agent.py
Tests for AFM parsing, variable resolution with env var handling, and agent creation/execution with mock models and system prompt composition.
Unit Tests: Interfaces
tests/test_console_chat.py, tests/test_web_chat.py, tests/test_webhook.py
Tests for Textual chat app lifecycle/commands, web chat endpoint behavior with string/object I/O and session handling, and webhook signature verification/WebSub flows.
Unit Tests: Integration
tests/test_cli.py, tests/test_mcp.py
CLI argument parsing, dry-run validation, app creation paths, HTTP server lifecycle, MCP authentication/filtering, multi-server aggregation, and error scenarios.
Workflow Integration
.github/workflows/release-common.yml, .github/workflows/release.yml
Extends release workflows to support langchain-interpreter: conditional Python setup with uv, version extraction from pyproject.toml, and build/test steps.

Sequence Diagrams

sequenceDiagram
    participant User
    participant Agent
    participant Model
    participant MCPClient
    participant Tool as Tool Executor

    User->>Agent: arun(input, session_id)
    Agent->>Agent: Validate input against signature
    Agent->>Agent: Build system prompt + history
    Agent->>Model: invoke(messages)
    
    alt Model returns tool use
        Model-->>Agent: ToolCall (tool_name, args)
        Agent->>Agent: Find tool by name
        Agent->>MCPClient: Execute tool
        MCPClient->>Tool: Call via HTTP
        Tool-->>MCPClient: Result
        MCPClient-->>Agent: Tool result
        Agent->>Agent: Append ToolMessage
        Agent->>Model: invoke(messages + ToolMessage)
    end
    
    Model-->>Agent: Final response (text)
    Agent->>Agent: Coerce to output schema
    Agent->>Agent: Validate output
    Agent->>Agent: Store in session history
    Agent-->>User: Return result
Loading
sequenceDiagram
    participant Client
    participant WebhookApp as Webhook App
    participant WebSubHub as WebSub Hub
    participant Subscriber as WebSubSubscriber
    participant Agent

    WebhookApp->>Subscriber: Subscribe to hub/topic with callback
    Subscriber->>WebSubHub: POST subscribe request
    WebSubHub-->>Subscriber: Verification challenge
    Subscriber->>WebhookApp: Verify challenge response
    WebhookApp->>Subscriber: Challenge validated
    
    Client->>WebhookApp: POST /webhook (signed payload)
    WebhookApp->>WebhookApp: Verify signature
    WebhookApp->>WebhookApp: Evaluate template with payload
    WebhookApp->>Agent: arun(prompt, session_id)
    Agent-->>WebhookApp: Result
    WebhookApp-->>Client: JSON/text response
Loading
sequenceDiagram
    participant Client
    participant WebChat as Web Chat UI
    participant FastAPI
    participant Agent
    participant Model

    Client->>WebChat: Open chat page
    WebChat->>WebChat: Generate/load session ID
    WebChat->>FastAPI: GET /chat/ui
    FastAPI-->>WebChat: HTML with agent metadata
    
    Client->>WebChat: Type message
    WebChat->>FastAPI: POST /chat (message, X-Session-Id)
    FastAPI->>Agent: arun(message, session_id)
    Agent->>Model: invoke(messages)
    Model-->>Agent: response
    Agent-->>FastAPI: result
    FastAPI-->>WebChat: Response (text or JSON)
    WebChat->>Client: Display agent message
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Poem

🐰 A new interpreter hops into sight,
With LangChain and FastAPI shining bright,
MCP tools dance on webhook and web,
Console chats flow with each agent thread,
AFM specs find their Python flight!

🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Description check ⚠️ Warning The PR description contains only a placeholder ($subject) with no actual content, missing all required template sections. Fill in the PR description using the template provided, including Purpose, Goals, Approach, User stories, Release note, Documentation, and other required sections.
Docstring Coverage ⚠️ Warning Docstring coverage is 55.50% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely describes the main addition: a LangChain-based interpreter for AFM framework.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@RadCod3 RadCod3 marked this pull request as ready for review February 6, 2026 06:54
@RadCod3 RadCod3 requested a review from MaryamZi as a code owner February 6, 2026 06:54
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 17

🤖 Fix all issues with AI agents
In `@langchain-interpreter/src/langchain_interpreter/agent.py`:
- Line 411: The assignment "max_iterations = self.max_iterations or 10"
incorrectly treats 0 as falsy; update both occurrences (in the run() method
around the max_iterations assignment and the analogous line in arun()) to use an
explicit None check so an explicit 0 is preserved (e.g., set max_iterations to
self.max_iterations if self.max_iterations is not None else 10), referencing the
max_iterations variable and the run()/arun() methods to locate the fixes.

In `@langchain-interpreter/src/langchain_interpreter/cli.py`:
- Around line 440-452: The async context manager around the agent in
_run_http_and_console is redundant with the FastAPI lifespan (both call
agent.connect()/agent.disconnect()); remove the outer "async with agent:" block
and its enter/exit handling so the FastAPI lifespan (startup/shutdown) is the
single place managing agent lifecycle, leaving server start/stop and
async_run_console_chat(agent) intact; ensure references to agent inside the
function remain valid and that server_task/startup still run while the lifespan
performs connect/disconnect.

In `@langchain-interpreter/src/langchain_interpreter/interfaces/__init__.py`:
- Around line 48-57: The __all__ includes "InterfaceNotFoundError" but the
symbol isn't imported from the base module; import InterfaceNotFoundError from
.base alongside get_console_interface, get_http_path, etc., so that from
langchain_interpreter.interfaces import InterfaceNotFoundError works, and verify
the exported list (the __all__ near the bottom) matches the imported symbols
(also update the duplicate export block around the later 81–91 area if present
to include InterfaceNotFoundError).

In `@langchain-interpreter/src/langchain_interpreter/interfaces/base.py`:
- Around line 41-69: The function get_interface_by_type currently accepts both
InterfaceType and str at runtime but its type annotation only lists
InterfaceType; update the signature to accept InterfaceType | str (or
Union[InterfaceType, str]) to match the isinstance check and runtime behavior
(similar to has_interface_type), ensuring any necessary typing import is added
or supported by the project's Python version.

In `@langchain-interpreter/src/langchain_interpreter/interfaces/console_chat.py`:
- Around line 265-269: The loop is calling the blocking input_fn(user_prompt)
(defaulting to input()) inside an async coroutine which will block the event
loop; replace the direct call with an offloaded call using the event loop's
executor (e.g., get_running_loop().run_in_executor(None, input_fn, user_prompt))
so the input runs in a thread pool and is awaited, leaving the loop free for
other tasks (this change should be made where the main chat loop invokes
input_fn; references: input_fn and the coroutine run that executes alongside
_run_http_and_console).

In `@langchain-interpreter/src/langchain_interpreter/interfaces/web_chat.py`:
- Around line 23-26: The module currently reads CHAT_UI_TEMPLATE at import time
which will raise FileNotFoundError if resources/chat-ui.html is missing; change
this to lazy-load and handle missing files by replacing the module-level
CHAT_UI_TEMPLATE assignment with a function like get_chat_ui_template() that
reads CHAT_UI_TEMPLATE_PATH on first call, caches the content, and catches
FileNotFoundError to either return a clear fallback string or raise a more
informative exception; update all callers to use get_chat_ui_template() instead
of the CHAT_UI_TEMPLATE constant and keep CHAT_UI_TEMPLATE_PATH as the path
symbol for locating the file.
- Around line 92-103: The agent metadata (raw_icon_url -> icon_url, agent_name,
agent_description) are inserted directly into CHAT_UI_TEMPLATE causing XSS; fix
by importing and using html.escape() on these user-derived values before
constructing icon_style and before calling .replace() on CHAT_UI_TEMPLATE
(escape raw_icon_url when computing icon_url, then compute icon_style from the
escaped/normalized value, and escape agent.name and agent.description
variables), ensuring all three substituted symbols (agent_name,
agent_description, AGENT_ICON_URL) are escaped prior to template replacement.

In `@langchain-interpreter/src/langchain_interpreter/interfaces/webhook.py`:
- Around line 569-587: _resolve_secret currently swallows variable resolution
failures and returns the raw `${env:...}` template, causing opaque HMAC
failures; update _resolve_secret to catch VariableResolutionError (or generic
exceptions from resolve_variables) and either log a warning via the module
logger (including the secret key name/template and the exception message) or
re-raise a clear configuration error so misconfigured env secrets are surfaced;
reference the _resolve_secret function and the call to
resolve_variables/VariableResolutionError to locate the change and ensure the
code does not silently return the unexpanded template.
- Around line 496-505: The lifespan asynccontextmanager discards the task
created for _subscribe_with_retry, risking lost exceptions and premature GC;
capture the returned task (e.g., assign to a variable like subscribe_task) when
calling asyncio.create_task(_subscribe_with_retry(websub_subscriber)), add
subscribe_task.add_done_callback that logs exceptions (inspect task.exception())
and on shutdown cancel/wait for subscribe_task if still pending before calling
websub_subscriber.unsubscribe(); update references in the lifespan function and
ensure proper cleanup when websub_subscriber is present and verified.

In `@langchain-interpreter/src/langchain_interpreter/parser.py`:
- Around line 126-148: The except blocks drop original exceptions; update the
three raise statements to chain the original exception with "from e" so
tracebacks are preserved: in the yaml.YAMLError handler when raising
AFMParseError, and in the ValidationError handler when raising
AFMValidationError (both the branch that builds msg+field from first_error and
the fallback raise), keep the same messages but append "from e" to each raise;
target the exception variable e from yaml.safe_load and
AgentMetadata.model_validate/ValidationError to preserve context.
- Around line 180-189: The parser currently uses startswith checks on the
lowercased heading (heading = stripped[2:].lower()) which allows unintended
matches like "Roleplay" or "Instructions for..."; update the logic in the block
that sets in_role and in_instructions to perform exact equality checks against
the normalized heading (e.g., compute heading = stripped[2:].strip().lower() and
then use heading == "role" and heading == "instructions") so only the explicit
"# Role" and "# Instructions" headings trigger setting in_role/in_instructions;
keep the existing variables in_role and in_instructions and the continue
behavior unchanged.

In `@langchain-interpreter/src/langchain_interpreter/providers.py`:
- Line 27: DEFAULT_ANTHROPIC_MODEL is set to an invalid Anthropic model ID
("claude-sonnet-4"); update this constant to a valid snapshot ID or alias such
as "claude-sonnet-4-20250514" or an alias like "claude-sonnet-4-0" (or for
Sonnet 4.5 use "claude-sonnet-4-5" / "claude-sonnet-4-5-20250929") so the
Anthropic client will accept the model identifier.

In `@langchain-interpreter/src/langchain_interpreter/resources/chat-ui.html`:
- Around line 54-56: The template substitutions in web_chat.py currently insert
user-controlled values via plain str.replace (e.g. {{AGENT_NAME}},
{{AGENT_DESCRIPTION}}, {{AGENT_ICON_URL}}, {{AGENT_ICON_STYLE}}, {{CHAT_PATH}})
which allows XSS; fix by HTML-escaping each value before substitution using
Python's html.escape() (escape agent.name and agent.description, escape
raw_icon_url for AGENT_ICON_URL, escape the computed icon style for
AGENT_ICON_STYLE, and escape path for CHAT_PATH) so all replacements are safe
for their HTML/attribute/JS contexts; update the code paths that compute
agent_name, agent_description, icon_url, icon_style and path to call escape(...)
prior to performing str.replace().
- Line 80: The chatPath assignment injects raw {{CHAT_PATH}} into a JS string
literal allowing quote/backslash/newline injection; update the template
substitution to safely JS-escape characters (at least single quote, backslash,
and newline) before placing into the chatPath string or, alternatively, change
the template to assign chatPath via a JSON.stringify-encoded value (e.g. use the
JSON-encoded path produced in web_chat.py instead of raw insertion). Locate the
chatPath declaration in chat-ui.html and the template rendering in web_chat.py
(the code that substitutes CHAT_PATH) and implement one of these safe-encoding
approaches so the final emitted JavaScript cannot be broken or injected.

In `@langchain-interpreter/src/langchain_interpreter/templates.py`:
- Around line 138-167: The docstring for evaluate_template incorrectly claims it
raises TemplateEvaluationError while the function currently swallows missing
payload/header values; either remove the Raises section or implement the error
behavior: add or import TemplateEvaluationError and make evaluate_template (or
the helpers _handle_payload_variable and _handle_header_variable) detect missing
required variables in compiled.segments (e.g., when a PayloadVariable or
HeaderVariable cannot be resolved) and raise TemplateEvaluationError with a
clear message including the variable name and context; update callers/tests
accordingly if you choose to raise instead of editing the docstring.
- Around line 308-316: The except block that handles a ValueError when
converting bracket_content to int should preserve the original exception via
exception chaining; change the except clause to capture the ValueError (e.g.,
"except ValueError as err:") and re-raise JSONAccessError (the custom exception)
using "raise JSONAccessError(f'Invalid array index: {bracket_content}',
path=remaining) from err" so the original traceback is kept; this touches the
block that references bracket_content, remaining, and JSONAccessError in
templates.py.

In `@langchain-interpreter/src/langchain_interpreter/variables.py`:
- Around line 259-281: The function _json_schema_contains_http_variable misses
arbitrary extra fields allowed on JSONSchema (models.py sets extra="allow"), so
update it to also iterate over the schema's extra/unmodeled attributes (e.g.,
via schema.__dict__ or schema.__pydantic_extra) and recursively inspect those
values: if a value is a JSONSchema instance, call
_json_schema_contains_http_variable; if it's a list/tuple/dict, iterate and
check contained items; if it's a string, run contains_http_variable on it;
otherwise skip. Keep the existing checks for
type/properties/required/items/description and add this extra-attributes pass so
enum/default/const and other extras are covered.
🧹 Nitpick comments (33)
langchain-interpreter/src/langchain_interpreter/variables.py (1)

37-37: VARIABLE_PATTERN is duplicated in templates.py Line 29.

Both variables.py and templates.py define VARIABLE_PATTERN = re.compile(r"\$\{([^}]+)\}"). Consider extracting it to a shared location to avoid drift.

langchain-interpreter/src/langchain_interpreter/interfaces/base.py (2)

12-12: Inconsistent import style — use a relative import.

All other imports in this file use relative paths (from ..models import ...), but InterfaceNotFoundError is imported with an absolute path from the package root. This couples the module to whatever __init__.py re-exports and is inconsistent with the rest of the file.

Proposed fix
-from langchain_interpreter import InterfaceNotFoundError
+from ..exceptions import InterfaceNotFoundError

84-86: assert is stripped under python -O; prefer an explicit check.

assert isinstance(...) is used for type narrowing after get_interface_by_type. If the interpreter is run with optimizations (-O/-OO), these assertions are removed and the functions would silently return an incorrectly-typed value. Given the discriminated-union mechanics this is unlikely to trigger, but replacing with an explicit guard is safer for a library.

Example fix for get_console_interface
     interface = get_interface_by_type(afm, InterfaceType.CONSOLE_CHAT)
-    assert isinstance(interface, ConsoleChatInterface)
+    if not isinstance(interface, ConsoleChatInterface):
+        raise TypeError(f"Expected ConsoleChatInterface, got {type(interface).__name__}")
     return interface

Also applies to: 101-103, 118-120

langchain-interpreter/src/langchain_interpreter/providers.py (2)

25-35: DEFAULT_OPENAI_URL and DEFAULT_ANTHROPIC_URL are unused.

These constants are defined but never referenced anywhere in this module (the base_url kwarg is only set when model.url is present). Either remove them or use them as actual fallbacks.


96-103: API key is passed as a plain str.

ChatOpenAI (and ChatAnthropic) typically declare api_key as SecretStr. Passing a raw string works due to Pydantic coercion in current LangChain versions, but if that changes this will break. Not a blocking issue, just noting the coupling.

langchain-interpreter/src/langchain_interpreter/templates.py (1)

55-92: Manual find("${") / find("}") parsing vs. regex.

The compile loop uses manual string scanning rather than VARIABLE_PATTERN (defined on Line 29 but unused). The manual approach works but diverges from the regex-based approach in variables.py. If the variable syntax ever changes, two different parsers must be updated.

Not a bug — just noting the inconsistency for maintainability.

langchain-interpreter/src/langchain_interpreter/models.py (1)

350-378: get_filtered_tools: consider using a set for deny lookups.

The if tool not in deny check on Line 378 is O(m) per element, making the overall filter O(n×m). For small lists this is fine, but a set makes it O(n).

Proposed fix
     # Both specified - return allow minus deny
-    return [tool for tool in allow if tool not in deny]
+    deny_set = set(deny)
+    return [tool for tool in allow if tool not in deny_set]
langchain-interpreter/tests/test_console_chat.py (1)

33-43: Unused fixture sample_afm.

This fixture is defined but never referenced by any test in this file. Consider removing it to keep the test module clean.

🧹 Remove unused fixture
-@pytest.fixture
-def sample_afm() -> AFMRecord:
-    """Create a sample AFM record for testing."""
-    return AFMRecord(
-        metadata=AgentMetadata(
-            name="Test Agent",
-            description="A test agent",
-        ),
-        role="You are a helpful assistant.",
-        instructions="Be helpful and concise.",
-    )
-

Also remove the unused imports on line 13:

-from langchain_interpreter import AFMRecord, AgentMetadata
langchain-interpreter/tests/test_parser.py (1)

243-298: Consider adding tests for duplicate webhook and single webhook extraction.

The TestValidateAndExtractInterfaces class covers duplicate errors for consolechat and webchat, and single-interface extraction for those two types, but is missing analogous coverage for webhook. Adding test_single_webhook and test_duplicate_webhook_error would make the suite symmetric and complete.

🧪 Suggested additional test cases
def test_single_webhook(self) -> None:
    """Test extracting a single webhook interface."""
    interfaces = [WebhookInterface(subscription=Subscription(protocol="websub"))]
    console, web, webhook = validate_and_extract_interfaces(interfaces)

    assert console is None
    assert web is None
    assert webhook is not None
    assert isinstance(webhook, WebhookInterface)

def test_duplicate_webhook_error(self) -> None:
    """Test that duplicate webhook interfaces raise an error."""
    interfaces = [
        WebhookInterface(subscription=Subscription(protocol="websub")),
        WebhookInterface(subscription=Subscription(protocol="websub")),
    ]
    with pytest.raises(AFMValidationError) as exc_info:
        validate_and_extract_interfaces(interfaces)
    assert "Multiple interfaces" in str(exc_info.value)

This requires adding Subscription to the imports:

 from langchain_interpreter import (
     AFMParseError,
     AFMValidationError,
     ConsoleChatInterface,
+    Subscription,
     WebChatInterface,
     WebhookInterface,
     parse_afm,
     parse_afm_file,
     validate_and_extract_interfaces,
 )
langchain-interpreter/tests/test_agent_tools.py (1)

1-5: Missing copyright header.

Other test files in this PR include a copyright header (e.g., # Copyright (c) 2025 / # Licensed under the Apache License, Version 2.0). This file is missing it.

📝 Add copyright header
+# Copyright (c) 2025
+# Licensed under the Apache License, Version 2.0
+
+"""Tests for Agent tool schema fixing for OpenAI compatibility."""
+
 from unittest.mock import MagicMock, patch
langchain-interpreter/tests/test_cli.py (1)

371-392: Consider asserting result.exit_code in the HTTP server test.

All other CLI tests assert on exit_code, but test_cli_starts_http_server_for_webchat only checks whether uvicorn.run or uvicorn.Config was called. If the CLI throws an unexpected error before reaching uvicorn, this test would silently pass the uvicorn assertion (since called on a fresh mock defaults to False but the assertion uses or, so if one is True it passes — however neither might be called on failure).

🧪 Suggested improvement
         runner.invoke(main, [str(sample_agent_path), "--port", "9000"])
 
+        assert result.exit_code == 0
         # Should have called uvicorn.run
         assert mock_uvicorn.run.called or mock_uvicorn.Config.called

Note: you'd also need to capture the result: result = runner.invoke(...).

langchain-interpreter/src/langchain_interpreter/resources/chat-ui.html (2)

114-127: Basic markdown parser has known limitations with nested/overlapping syntax.

The regex-based parser processes patterns sequentially, so overlapping markdown constructs (e.g., bold inside code blocks, or * inside inline code) can produce incorrect output. The escapeHtml call before markdown parsing is a good safety measure. This is acceptable for a basic chat UI, but worth noting.


194-204: Error re-throw logic based on message prefix is fragile.

The check at Line 199 relies on error.message.startsWith(...) matching the exact wording of previously thrown errors. If error messages are changed, legitimate errors may be swallowed and replaced with a generic message.

Consider storing a flag or using a custom error class/property instead:

♻️ Suggested approach
-            if (error instanceof Error) {
-                if (error.name === 'TypeError' && error.message.includes('fetch')) {
-                    throw new Error('Unable to connect to the chat service. Please check your connection.');
-                }
-                if (error.message.startsWith('Invalid') || error.message.startsWith('Authentication') || error.message.startsWith('Too many') || error.message.startsWith('Service') || error.message.startsWith('Unable')) {
-                    throw error;
-                }
-            }
-            throw new Error('An unexpected error occurred. Please try again.');
+            if (error instanceof Error) {
+                if (error.name === 'TypeError' && error.message.includes('fetch')) {
+                    throw new Error('Unable to connect to the chat service. Please check your connection.');
+                }
+                throw error;
+            }
+            throw new Error('An unexpected error occurred. Please try again.');

Since errors in the try block are already user-friendly messages thrown by the code above, re-throwing them directly is sufficient. The only case that needs special handling is the TypeError from fetch.

langchain-interpreter/tests/test_web_chat.py (1)

354-382: Consider also testing that a disallowed origin is rejected.

The CORS tests verify that CORS is disabled by default and that a preflight from an allowed origin succeeds. An additional test asserting that a request from a non-allowed origin does not get the access-control-allow-origin header would strengthen the coverage.

langchain-interpreter/tests/test_agent.py (2)

57-67: Unused fixture: afm_with_interface.

This fixture is defined but not referenced by any test in this file. If it's intended for future tests, consider adding a TODO or removing it to keep the test module clean.


380-394: Weak assertion: >= 4 instead of == 4.

Line 394 uses assert len(second_call_messages) >= 4, but the expected message list for the second call is exactly: [SystemMessage, HumanMessage("First message"), AIMessage, HumanMessage("Second message")] — so == 4 would be more precise and would catch regressions where extra messages are inserted.

♻️ Suggested tightening
-        assert len(second_call_messages) >= 4
+        assert len(second_call_messages) == 4
langchain-interpreter/src/langchain_interpreter/interfaces/console_chat.py (2)

114-218: Significant code duplication between sync and async chat loops.

The two functions are nearly identical (~100 lines each) — the only substantive difference is agent.run(...) vs await agent.arun(...). Consider extracting the shared command-parsing and I/O logic into a common helper that accepts a callback for the agent invocation step.

Also applies to: 221-319


199-202: Move import json to the module top level.

Placing import json inside the loop body on every non-string response is unconventional. Python caches it, so there's no performance concern, but it hurts readability. Same applies to the duplicate on line 301.

langchain-interpreter/src/langchain_interpreter/parser.py (1)

207-209: Use typed tuple return annotation for validate_and_extract_interfaces.

The bare list and tuple annotations lose type information for callers and type-checkers.

Proposed fix
 def validate_and_extract_interfaces(
-    interfaces: list,
-) -> tuple:
+    interfaces: list[ConsoleChatInterface | WebChatInterface | WebhookInterface],
+) -> tuple[ConsoleChatInterface | None, WebChatInterface | None, WebhookInterface | None]:

Note: this would require moving the imports from line 224 to the top of the function or using TYPE_CHECKING.

langchain-interpreter/tests/test_mcp.py (1)

352-363: Fragile test setup: bypassing Pydantic immutability with object.__setattr__.

Line 360 uses object.__setattr__ to mutate a frozen Pydantic model, which couples this test to Pydantic's internal behavior. If Pydantic changes how immutability is enforced, this test will silently stop testing the intended scenario. Consider constructing an MCPServer with a mock Transport object instead.

langchain-interpreter/src/langchain_interpreter/schema_validator.py (1)

155-162: Redundant branch — both paths are identical.

Lines 155-158 (if schema.type != "object") and lines 160-162 do the exact same thing: validate_output(data, schema) then return data. The conditional can be collapsed.

Proposed fix
-    # Handle non-object schema types
-    if schema.type != "object":
-        validate_output(data, schema)
-        return data
-
-    # Validate against schema
+    # Validate against schema
     validate_output(data, schema)
     return data
langchain-interpreter/src/langchain_interpreter/cli.py (2)

238-280: extract_interfaces duplicates validate_and_extract_interfaces from parser.py.

parser.py already has validate_and_extract_interfaces (lines 207-250) that does the same classification and duplicate-type check. This function repeats the logic but throws click.ClickException instead. Consider reusing the parser function and catching AFMValidationError to convert it:

Proposed approach
from .parser import validate_and_extract_interfaces

def extract_interfaces(afm):
    interfaces = get_interfaces(afm)
    try:
        return validate_and_extract_interfaces(interfaces)
    except AFMValidationError as e:
        raise click.ClickException(str(e)) from e

356-361: Chain exceptions with raise ... from for CLI error paths.

Both raise click.ClickException(...) statements lose the original traceback. While Click swallows tracebacks by default, chaining preserves context for --verbose debugging or logging.

Proposed fix
     except AFMError as e:
-        raise click.ClickException(f"Failed to parse AFM file: {e}")
+        raise click.ClickException(f"Failed to parse AFM file: {e}") from e
     except Exception as e:
-        raise click.ClickException(f"Unexpected error parsing AFM file: {e}")
+        raise click.ClickException(f"Unexpected error parsing AFM file: {e}") from e
langchain-interpreter/src/langchain_interpreter/tools/mcp.py (3)

358-359: _multi_client field is never used.

self._multi_client is initialized to None on line 359 but is never assigned or read anywhere in MCPManager. This appears to be dead code from an earlier design.

Proposed fix
         self._servers = servers
         self._clients: list[MCPClient] = []
         self._tools: list[BaseTool] | None = None
-        self._multi_client: MultiServerMCPClient | None = None

425-427: Use logger.exception instead of logger.error to include the traceback.

When a server fails, the traceback is valuable for debugging. logger.exception automatically attaches it.

Proposed fix
             except MCPConnectionError as e:
                 errors.append(str(e))
-                logger.error(f"Failed to get tools from server '{client.name}': {e}")
+                logger.exception(f"Failed to get tools from server '{client.name}': {e}")

76-162: Duplicated validation logic between build_auth_headers and build_httpx_auth.

Both functions replicate the same type-dispatch and field-presence checks (if auth.token is None, if auth.username is None or auth.password is None, etc.). If a new auth type is added, both must be updated in lockstep. Consider extracting a shared validation step, or having build_auth_headers delegate to build_httpx_auth (or vice versa).

langchain-interpreter/src/langchain_interpreter/interfaces/web_chat.py (3)

38-63: ErrorResponse and HealthResponse are duplicated in webhook.py.

These identical Pydantic models are defined in both web_chat.py and webhook.py. Consider extracting them into a shared location (e.g., a models.py or base.py in the interfaces package) to keep a single source of truth.


297-300: Accessing private attribute agent._signature.

Line 300 reaches into Agent._signature directly. Since _signature is a private attribute, this couples create_webchat_app to Agent internals. Expose it via a public property or method on Agent instead.

♻️ Proposed fix

In agent.py, add a public property (if not already present):

`@property`
def signature(self) -> Signature:
    """The active input/output signature."""
    return self._signature

Then in web_chat.py:

-        signature = agent._signature  # Use agent's default signature
+        signature = agent.signature  # Use agent's default signature

164-175: Use raise ... from to preserve exception chains.

In both chat_string (lines 167–175) and chat_object (lines 239–247), json.JSONDecodeError and generic Exception are caught and re-raised as HTTPException without chaining. This discards the original traceback, making debugging harder.

♻️ Proposed fix (showing chat_string; apply same pattern to chat_object)
             except json.JSONDecodeError:
-                raise HTTPException(
+                raise HTTPException(
                     status_code=400,
                     detail="Invalid JSON in request body",
-                )
+                ) from None
             except Exception as e:
                 raise HTTPException(
                     status_code=500,
                     detail=str(e),
-                )
+                ) from e

Also applies to: 236-247

langchain-interpreter/src/langchain_interpreter/agent.py (3)

374-473: run() and arun() are ~100 lines of near-identical code.

The only differences are self._model.invoke vs await self._model.ainvoke and tool.invoke vs await tool.ainvoke. Consider extracting the shared logic (input prep, message building, tool-call loop structure, output coercion, history update) into a common helper, parameterized by sync/async invocation callables.

Also applies to: 475-574


170-185: Bare except: pass silently swallows all errors in tool schema fixups.

Lines 170–173 and 182–185 catch all exceptions and discard them. If a tool's schema is broken in an unexpected way, this produces no log output, making debugging tool-binding failures very difficult. At minimum, log a warning.

♻️ Proposed fix
                     try:
                         args_schema["properties"] = {}
-                    except Exception:
-                        pass
+                    except Exception:
+                        logger.warning("Failed to fix args_schema for tool %s", tool.name, exc_info=True)
             # Also handle tools with no args_schema
             elif args_schema is None:
 
                 class EmptySchema(BaseModel):
                     """Empty schema for tools with no arguments."""
 
                     pass
 
                 try:
                     tool.args_schema = EmptySchema
-                except Exception:
-                    pass
+                except Exception:
+                    logger.warning("Failed to set EmptySchema for tool %s", tool.name, exc_info=True)

416-450: When max_iterations is exhausted while tool calls are still pending, the agent silently returns a potentially empty response.

If the LLM keeps requesting tool calls and the loop exits at max_iterations, response is the last AIMessage which has tool_calls but may have empty or partial content. The code proceeds to extract and coerce that content without any warning or error, which could yield an empty or nonsensical result to the user.

Consider logging a warning when the loop exits due to iteration exhaustion while tool calls are still pending, so operators can detect and tune the limit.

♻️ Proposed fix (showing run; apply same to arun)
                 iterations += 1
 
+            if response is not None and response.tool_calls:
+                logger.warning(
+                    "Max iterations (%d) reached with pending tool calls",
+                    max_iterations,
+                )
+
             if response is None:
                 raise AgentError("No response from LLM")

Also applies to: 517-551

langchain-interpreter/src/langchain_interpreter/__init__.py (1)

116-116: Eagerly importing cli at package level pulls in click, uvicorn, and other CLI-only dependencies for all consumers.

from .cli import create_unified_app, main executes cli.py at import langchain_interpreter time. Since cli.py imports click, uvicorn, and heavy dependencies, any library consumer (e.g., using only parse_afm_file or Agent) pays the import cost. Consider lazy-loading CLI symbols or moving them out of the top-level __init__.py.

♻️ Proposed fix

Remove the eager import and __all__ entries, keeping CLI accessible via langchain_interpreter.cli:

-from .cli import create_unified_app, main
 __all__ = [
     ...
-    # CLI
-    "main",
-    "create_unified_app",
 ]

Users who need CLI symbols can import directly:

from langchain_interpreter.cli import main, create_unified_app

- Fix max_iterations falsey check
- Add missing InterfaceNotFoundError import
- Remove unnecessary str check in get_interface_by_type
- Escape html injection strings to avoid XSS
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@langchain-interpreter/src/langchain_interpreter/agent.py`:
- Around line 470-475: The except block currently re-raises InputValidationError
and AgentError but wraps any other Exception (including OutputValidationError)
into AgentError, preventing callers from catching OutputValidationError; update
the error handling in the synchronous run() and asynchronous arun() functions so
that OutputValidationError is explicitly re-raised alongside
InputValidationError (i.e., check isinstance(e, OutputValidationError) and
re-raise it) instead of wrapping it in AgentError, preserving the docstring
contract.

In `@langchain-interpreter/src/langchain_interpreter/interfaces/base.py`:
- Line 12: Replace the top-level import "from langchain_interpreter import
InterfaceNotFoundError" in the base module with a direct relative import from
the exceptions module (e.g., "from .exceptions import InterfaceNotFoundError")
so the InterfaceNotFoundError symbol is imported directly; update the import
statement in the langchain_interpreter.interfaces.base module and remove the
indirect top-level import to avoid depending on package __init__ execution
order.
🧹 Nitpick comments (6)
langchain-interpreter/src/langchain_interpreter/interfaces/__init__.py (1)

82-106: __all__ is not sorted (RUF022).

Ruff flags this. Not blocking, but sorting __all__ alphabetically within each group would improve maintainability and reduce merge conflicts.

langchain-interpreter/src/langchain_interpreter/agent.py (2)

150-188: Silent except: pass blocks hide failures in tool schema patching.

The bare except Exception: pass on Lines 170-173 and 182-185 means that if a tool's schema can't be fixed (e.g., frozen/immutable dict), it will silently proceed and likely cause an OpenAI API error later — far from the root cause. At minimum, log a warning so operators can diagnose.

Proposed fix
                     try:
                         args_schema["properties"] = {}
-                    except Exception:
-                        pass
+                    except Exception:
+                        logger.warning(
+                            "Could not patch args_schema for tool '%s'", tool.name
+                        )
             # Also handle tools with no args_schema
             elif args_schema is None:
 
                 class EmptySchema(BaseModel):
                     """Empty schema for tools with no arguments."""
 
                     pass
 
                 try:
                     tool.args_schema = EmptySchema
-                except Exception:
-                    pass
+                except Exception:
+                    logger.warning(
+                        "Could not set args_schema for tool '%s'", tool.name
+                    )

374-475: run() and arun() are near-identical (~100 lines each) — consider extracting shared logic.

The two methods differ only in invoke vs await ainvoke. The duplicated input validation, message building, tool loop structure, output coercion, history update, and error handling are all identical. A refactor to extract the shared orchestration (e.g., a generator or shared helper that yields at the LLM/tool call points) would reduce the maintenance surface.

Not blocking, but any bug fix in one method must be mirrored in the other — a recipe for drift.

Also applies to: 477-578

langchain-interpreter/src/langchain_interpreter/interfaces/web_chat.py (3)

193-204: Exception handlers lose traceback context and may leak internal details.

Lines 195-199: json.JSONDecodeError is re-raised as HTTPException without from e, losing the original traceback for debugging. Lines 200-204: the broad except Exception exposes str(e) directly in the HTTP response, which could leak internal details (file paths, DB errors, etc.) to clients in production.

Proposed improvement
             except json.JSONDecodeError:
                 raise HTTPException(
                     status_code=400,
                     detail="Invalid JSON in request body",
-                )
+                ) from None
             except Exception as e:
+                logger.exception("Unexpected error in chat_string endpoint")
                 raise HTTPException(
                     status_code=500,
-                    detail=str(e),
-                )
+                    detail="Internal server error",
+                ) from e

265-276: Same exception handling concerns as in chat_string — apply fixes here too.

The json.JSONDecodeError re-raise without from and the broad except Exception leaking str(e) apply identically to this code path (Lines 267-276). The fix from the earlier comment should be mirrored here.


329-329: Accessing private attribute agent._signature.

agent._signature is a private/protected attribute of Agent. This creates tight coupling between create_webchat_app and the Agent internals. Consider exposing a public signature property on Agent (similar to the existing name, description properties).

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@langchain-interpreter/src/langchain_interpreter/interfaces/webhook.py`:
- Line 386: The code flattens Starlette's case-insensitive, multi-value
request.headers into a plain dict (headers: dict[str, str | list[str]] =
dict(request.headers)), losing duplicate values and case-insensitivity; change
this to preserve all values and lookup semantics by building headers as a
mapping of header name to list of values (e.g., iterate
request.headers.multi_items() or request.scope['headers'] and decode
keys/values) so headers becomes dict[str, list[str]] (or keep the original
Headers object) before passing into evaluate_template, and update the type hint
to dict[str, list[str]] (or the appropriate Headers type) to reflect the real
shape.
- Around line 339-340: The code directly sets websub_subscriber._verified = True
for hub_mode == "subscribe", which duplicates logic and leaves
websub_subscriber._challenge unset; replace that direct mutation with a call to
websub_subscriber.verify_challenge(hub_challenge) (or the appropriate challenge
variable) so the method performs the verification, sets _challenge, and logs the
event; if verify_challenge returns a boolean, use its result to decide whether
to mark the subscriber verified instead of assigning _verified directly.
🧹 Nitpick comments (3)
langchain-interpreter/src/langchain_interpreter/interfaces/webhook.py (3)

146-148: Use logging.exception to preserve stack traces in error handlers.

In both subscribe() (line 147) and unsubscribe() (line 181), logger.error(...) is used inside except Exception blocks. logging.exception automatically includes the traceback, which is valuable for diagnosing network issues with the WebSub hub.

♻️ Proposed fix
         except Exception as e:
-            logger.error(f"WebSub subscription error: {e}")
+            logger.exception("WebSub subscription error")
             return False

Apply the same pattern at line 181:

         except Exception as e:
-            logger.error(f"WebSub unsubscription error: {e}")
+            logger.exception("WebSub unsubscription error")
             return False

376-420: Chain exceptions with raise ... from for proper traceback propagation.

Both except blocks re-raise new exceptions without chaining, suppressing the original cause. This makes debugging harder.

♻️ Proposed fix
         except json.JSONDecodeError:
-            raise HTTPException(
+            raise HTTPException(
                 status_code=400,
                 detail="Invalid JSON payload",
-            )
+            ) from None
         except Exception as e:
-            logger.error(f"Agent execution error: {e}")
+            logger.exception("Agent execution error")
             raise HTTPException(
                 status_code=500,
                 detail=str(e),
-            )
+            ) from e

475-479: Chain the re-raised exception to preserve the original cause.

The ValueError is raised inside an except InterfaceNotFoundError block without exception chaining. This loses the original error context.

♻️ Proposed fix
     except InterfaceNotFoundError:
-        raise ValueError(
+        raise ValueError(
             "Agent must have a webhook interface to create a webhook app. "
             "Add a webhook interface to the agent's metadata."
-        )
+        ) from None

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@langchain-interpreter/pyproject.toml`:
- Line 12: The Repository URL in the pyproject.toml currently points to the
contributor's fork; update the urls entry (urls = { Repository =
"https://github.com/RadCod3/reference-implementations-afm" }) to reference the
official wso2 organization repository (e.g., change Repository to
"https://github.com/wso2/reference-implementations-afm") so the project metadata
points to the target org repository.

In `@langchain-interpreter/src/afm_cli/interfaces/webhook.py`:
- Around line 223-236: The split/len check inside the signature parsing is
redundant and makes the inner else branch unreachable; in the signature parsing
block that uses signature_header and sig_parts, remove the len(sig_parts) == 2
check and its unreachable else, and simply do algo, provided_sig =
signature_header.split("=", 1) when "=" is present, then set algorithm =
algo.lower() if algo.lower() in ("sha1","sha256","sha512"), otherwise leave
provided_sig unchanged; keep the outer else that assigns provided_sig =
signature_header when "=" is not present.
🧹 Nitpick comments (11)
langchain-interpreter/pyproject.toml (1)

13-27: Dependencies use open-ended lower bounds — consider pinning upper bounds for LangChain ecosystem packages.

The mcp package latest version is around 1.25.0–1.26.0, and the LangChain ecosystem evolves rapidly with breaking changes across minor versions. Using >= without upper bounds for all dependencies is acceptable for an application, but the LangChain packages (langchain, langchain-openai, langchain-anthropic, langchain-mcp-adapters) are particularly prone to breaking API changes between minor releases. Consider adding upper-bound constraints (e.g., <2.0) for at least these packages to prevent unexpected breakage.

langchain-interpreter/src/afm_cli/variables.py (1)

250-261: Shallow check on model_dump() won't catch nested structures in extra fields.

ClientAuthentication uses extra="allow", so extra fields could be dicts or lists. The current iteration only checks top-level string values from model_dump(). If a custom auth type includes nested objects, http: variables within them would be missed.

Consider reusing the recursive _check_value approach from _json_schema_contains_http_variable.

♻️ Proposed fix
 def _authentication_contains_http_variable(
     auth: ClientAuthentication | None,
 ) -> bool:
     """Check if authentication config contains http: variables."""
     if auth is None:
         return False
 
-    # Check all fields in the authentication object
-    for value in auth.model_dump().values():
-        if isinstance(value, str) and contains_http_variable(value):
-            return True
-    return False
+    def _check_value(value: Any) -> bool:
+        if isinstance(value, str):
+            return contains_http_variable(value)
+        elif isinstance(value, dict):
+            return any(_check_value(v) for v in value.values())
+        elif isinstance(value, list):
+            return any(_check_value(item) for item in value)
+        return False
+
+    return _check_value(auth.model_dump())
langchain-interpreter/src/afm_cli/interfaces/web_chat.py (2)

330-338: Permissive CORS configuration when origins are provided.

allow_methods=["*"] and allow_headers=["*"] is quite open. This is acceptable for local development and demo usage, but if this server is intended for production, consider restricting to only the methods and headers actually needed.


248-259: Silent fallback when JSON parsing fails for object output.

When output_is_string is False and the agent returns a string that isn't valid JSON (line 256-257), the response silently wraps it as {"response": response} instead of signaling a schema mismatch. This could mask agent output bugs — the caller expects a structured object but gets a wrapper with a raw string.

Consider logging a warning here to aid debugging.

♻️ Proposed addition
                        except json.JSONDecodeError:
+                            logger.warning(
+                                "Agent returned non-JSON string for object output schema; "
+                                "wrapping in {\"response\": ...}"
+                            )
                            return JSONResponse(content={"response": response})
langchain-interpreter/src/afm_cli/models.py (1)

327-346: get_filtered_tools returns None for deny-only filters, which is non-obvious.

When only deny is specified (no allow), the function returns None — same as "no filter". The caller must separately apply deny filtering against the actual available tools. This behavior isn't documented in the docstring and could confuse consumers.

Consider adding a brief docstring note explaining that None means "all tools (caller applies deny filtering)" vs returning a concrete list.

langchain-interpreter/tests/test_variables.py (1)

36-42: Test relies on NONEXISTENT_VAR_XYZ not being in the environment.

This is fragile if an environment happens to have this variable set. Consider using the env_vars fixture to explicitly unset it.

♻️ Proposed fix
-    def test_missing_env_var_error(self) -> None:
+    def test_missing_env_var_error(self, env_vars) -> None:
         """Test that missing environment variables raise an error."""
+        env_vars.unset("NONEXISTENT_VAR_XYZ")
         content = "Config: ${NONEXISTENT_VAR_XYZ}"
         with pytest.raises(VariableResolutionError) as exc_info:
             resolve_variables(content)
langchain-interpreter/src/afm_cli/tools/mcp.py (1)

249-254: Unused fields: _multi_client and _build_connections are dead code.

MCPManager._multi_client (line 254) is initialized to None but never read or assigned elsewhere. Similarly, _build_connections (lines 281-285) is never called — get_tools delegates to individual MCPClient.get_tools() calls instead. The MCPClient._tools field (line 155) is also initialized but never read or set in MCPClient.get_tools.

These appear to be leftover scaffolding. Consider removing them to avoid confusion.

♻️ Proposed cleanup
 class MCPClient:
     def __init__(self, ...):
         ...
-        self._tools: list[BaseTool] | None = None
 
 class MCPManager:
     def __init__(self, servers: list[MCPServer]) -> None:
         self._servers = servers
         self._clients: list[MCPClient] = []
         self._tools: list[BaseTool] | None = None
-        self._multi_client: MultiServerMCPClient | None = None
         ...
 
-    def _build_connections(self) -> dict[str, Any]:
-        connections: dict[str, Any] = {}
-        for client in self._clients:
-            connections[client.name] = client._build_connection_config()
-        return connections
langchain-interpreter/tests/test_mcp.py (4)

116-142: Consider adding negative-path tests for build_httpx_auth.

The positive cases are well covered, but there are no tests for error paths — e.g., bearer auth missing token, basic auth missing credentials, or unsupported types like "oauth2" / "jwt" that raise MCPAuthenticationError. These paths exist in the production code and would be worth covering.


289-299: Fragile transport type override via object.__setattr__.

Using object.__setattr__ to bypass Pydantic's frozen model and inject "websocket" as the transport type works but is brittle — it will silently break if the model structure changes. A more robust approach would be to create a minimal mock or stub that has .type = "websocket" and pass it directly to the MCPClient constructor.

♻️ Suggested alternative
     def test_from_mcp_server_unsupported_transport_raises_error(self):
         """from_mcp_server with unsupported transport type raises MCPError."""
-        server = MCPServer(
-            name="test",
-            transport=HttpTransport(url="http://localhost:8080"),
-        )
-        # Manually override the type to simulate an unsupported transport
-        object.__setattr__(server.transport, "type", "websocket")
+        mock_transport = MagicMock()
+        mock_transport.type = "websocket"
+        server = MagicMock(spec=MCPServer)
+        server.name = "test"
+        server.transport = mock_transport
+        server.tool_filter = None
 
         with pytest.raises(MCPError, match="Unsupported transport type"):
             MCPClient.from_mcp_server(server)

543-565: Verify patch.object auto-detects async for get_tools.

patch.object(manager._clients[0], "get_tools", return_value=[...]) relies on unittest.mock.patch auto-detecting that get_tools is an async def and creating an AsyncMock rather than a MagicMock. This works in Python ≥ 3.8 but is implicit. If there's ever a case where the detection fails (e.g., the attribute is already replaced), await on a regular MagicMock return value would raise TypeError. Using an explicit new=AsyncMock(return_value=[...]) would make the intent clearer, though this is a minor point.


730-830: Agent integration tests provide good end-to-end coverage.

The context-manager lifecycle, tool binding, and external-tools tests are solid. One note: lines 749, 775, 804, and 824 import Agent inside each test method. This is presumably to ensure the import happens after create_model_provider is patched. If that's the case, a brief comment explaining why would help future readers.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@langchain-interpreter/src/afm_cli/interfaces/webhook.py`:
- Around line 560-571: The warning handlers in resolve_secret are leaking secret
contents by interpolating the secret string into log messages; update
resolve_secret to never log the raw secret or its template name—either log a
generic message (e.g., "failed to resolve secret template") or log a safely
masked/indicator value (no actual secret or env var name). Specifically modify
the exception blocks in resolve_secret (handling VariableResolutionError and
Exception) to remove f"'{secret}'" from logger.warning and replace with a
non-sensitive placeholder or masked representation, keeping the existing
exception object for context.

In `@langchain-interpreter/tests/test_mcp.py`:
- Around line 267-278: The test
test_from_mcp_server_unsupported_transport_raises_error uses object.__setattr__
to mutate server.transport.type to "stdio" to simulate an unsupported transport;
add a brief inline comment next to that object.__setattr__ call explaining that
this mutation is necessary because Transport currently allows runtime mutation
(not frozen) and that the approach is intentionally fragile—if Transport becomes
frozen (e.g., frozen=True) this hack will break and the test should be updated
to construct an invalid transport another way; reference MCPServer, Transport,
and MCPClient.from_mcp_server in the comment for clarity.
- Around line 12-30: The tests reference Transport (used by make_mcp_server at
the top of the file and again near line 272) but it isn't imported, causing a
NameError; add Transport to the imports from the module that provides it
(include Transport in the existing from afm_cli.tools.mcp import ... list
alongside ApiKeyAuth, BearerAuth, build_httpx_auth) so make_mcp_server and other
tests can access the Transport symbol.
🧹 Nitpick comments (5)
langchain-interpreter/src/afm_cli/interfaces/webhook.py (1)

459-472: Secret is resolved twice — once here and again inside create_webhook_router (line 291).

resolve_secret(subscription.secret) is called in create_webhook_app (line 461) and independently in create_webhook_router (line 291). If the secret is backed by an environment variable or external source, the two calls could theoretically return different values, leading to a mismatch between the WebSub subscriber's secret and the signature verification secret. Consider resolving once and passing it through.

♻️ Sketch: pass resolved secret to the router

Add a secret parameter to create_webhook_router and remove the internal resolution:

 def create_webhook_router(
     agent: Agent,
     interface: WebhookInterface,
     path: str = "/webhook",
     *,
     verify_signatures: bool = True,
+    secret: str | None = None,
 ) -> APIRouter:
     ...
-    secret = resolve_secret(subscription.secret)
+    if secret is None:
+        secret = resolve_secret(subscription.secret)

Then in create_webhook_app:

     webhook_router = create_webhook_router(
-        agent, interface, webhook_path, verify_signatures=verify_signatures
+        agent, interface, webhook_path, verify_signatures=verify_signatures, secret=secret
     )
langchain-interpreter/src/afm_cli/interfaces/web_chat.py (2)

248-259: List responses are silently wrapped instead of returned as top-level JSON arrays.

If agent.arun returns a list (a valid JSON root type), it falls through to the else branch and gets wrapped as {"response": [...]}. This could violate the output schema contract when the agent legitimately returns a list.

Consider handling list alongside dict:

Suggested fix
                 else:
                     # Object output - return as-is or wrap
-                    if isinstance(response, dict):
+                    if isinstance(response, (dict, list)):
                         return JSONResponse(content=response)
                     elif isinstance(response, str):

195-196: Nit: redundant {e} in logger.exception() calls.

logger.exception() already appends the full exception and traceback. The {e} in the f-string duplicates the exception message. You can simplify to:

-logger.exception(f"Error in chat_string for session {session_id}: {e}")
+logger.exception("Error in chat_string for session %s", session_id)

(Same for chat_object at Line 269.) This also avoids the minor overhead of f-string formatting before the log level check.

Also applies to: 268-269

langchain-interpreter/tests/test_mcp.py (2)

98-124: Consider adding error-path tests for build_httpx_auth.

The happy paths are well covered, but there are no tests for unsupported auth types (oauth2, jwt, unknown) or missing required fields (e.g., bearer without token). These paths raise MCPAuthenticationError in production code and would benefit from coverage.


577-577: Hoist Agent import to the top of the file.

from afm_cli import Agent is repeated in four test methods (lines 577, 603, 632, 652). Since Agent itself isn't being patched, there's no reason to defer the import. Move it to the top-level imports to reduce duplication.

♻️ Proposed fix

Add to the existing top-level imports:

 from afm_cli import (
     AFMRecord,
+    Agent,
     AgentMetadata,
     ClientAuthentication,
     ...
 )

Then remove the four from afm_cli import Agent lines inside the test methods.

Also applies to: 603-603, 632-632, 652-652

- Cancel background subscription tasks during app shutdown in both
  webhook.py and cli.py lifespan handlers to prevent dangling coroutines
- Fix nested JSONSchema extra fields not being recursively converted in
  schema_validator.py (e.g., additionalProperties)
- Fix MCPManager caching partial failures, preventing retries of failed
  servers; now only caches on full success
- Remove unused _multi_client field and _build_connections method from
  MCPManager
- Add comprehensive tests for all fixes including lifespan cancellation,
  schema conversion, and caching behavior
- Fix type checker warnings in test_cli.py by using patch.object()
  instead of direct assignment
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@langchain-interpreter/src/afm_cli/cli.py`:
- Around line 94-103: The auto-generated WebSub callback URL currently uses the
incoming host value, which can be "0.0.0.0" and produce an unroutable URL; when
building the callback (where subscription.callback is None and callback_url is
set using get_http_path and host) detect if host == "0.0.0.0" (or host is falsy)
and instead use "localhost" for the URL (mirroring create_webhook_app behavior),
and emit a warning via the same logger to indicate we replaced 0.0.0.0 with
localhost before constructing the WebSubSubscriber (hub/topic/callback/secret).

In `@langchain-interpreter/src/afm_cli/interfaces/console_chat.py`:
- Around line 145-150: The agent_prefix passed into the class constructor is not
used when rendering agent messages: update the message construction in the block
that creates msg_widget (and the Vertical wrapper mount) to prepend agent_prefix
(e.g., f"{self.agent_prefix}{response}" or similar) to the response string
before escaping so the rendered agent messages include the configured prefix;
ensure you reference the instance attribute name used in the constructor
(agent_prefix or self.agent_prefix) and apply it where msg_widget is created and
mounted so customization takes effect.

In `@langchain-interpreter/src/afm_cli/tools/mcp.py`:
- Around line 184-206: The current get_tools method wraps every exception into
MCPConnectionError, hiding upstream MCPError details; modify get_tools to catch
MCPError from client.get_tools (or any MCP client call) and re-raise it
unchanged, while keeping the broad except Exception as e path to wrap non-MCP
errors into MCPConnectionError; reference the method name get_tools, the
MultiServerMCPClient usage and client.get_tools call, and ensure MCPError is
imported/used so auth/config validation errors surface verbatim and only
unexpected exceptions get wrapped into MCPConnectionError.
🧹 Nitpick comments (1)
langchain-interpreter/src/afm_cli/interfaces/webhook.py (1)

223-229: Consider enforcing the configured signature algorithm.

Right now the header algorithm overrides the algorithm argument. If you intend to enforce the configured algorithm (and prevent downgrade), reject mismatches or document the override behavior.

♻️ Optional hardening
-        if algo.lower() in ("sha1", "sha256", "sha512"):
-            algorithm = algo.lower()
+        if algo.lower() in ("sha1", "sha256", "sha512"):
+            if algo.lower() != algorithm.lower():
+                return False
+            algorithm = algo.lower()

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@langchain-interpreter/src/afm_cli/agent.py`:
- Around line 193-198: The code currently mutates user_input by appending
schema_instruction (built via build_output_schema_instruction and checking
self._signature.output.type) and then saves that augmented string into session
history; instead, keep the original_user_input separate, use a local
augmented_input only for the LLM call (e.g., pass augmented_input into the
prompt/agent execution), and when adding to session history (where
session/history methods are invoked) store original_user_input instead; apply
the same change in both the synchronous method that builds user_input
(referencing self._signature.output and build_output_schema_instruction) and the
asynchronous arun() path so history entries do not contain the appended schema
instructions.

In `@langchain-interpreter/tests/test_cli.py`:
- Around line 448-494: Test uses FastAPI internal app.router.lifespan_context;
switch to asgi-lifespan.LifespanManager to manage async lifespan. Import
LifespanManager from asgi_lifespan and replace usages of
app.router.lifespan_context with "async with LifespanManager(app):" in
test_lifespan_cancels_subscription_task_on_shutdown, leaving the setup and
assertions (accessing app.state.subscription_task, short sleep, and post-context
cancellation checks) unchanged; ensure the new import is added at the top of the
test file.
🧹 Nitpick comments (11)
langchain-interpreter/tests/test_schema_validator.py (2)

155-168: Consider strengthening the path assertion.

test_error_includes_path only asserts schema_path is not None but doesn't verify the actual path content (e.g., that it contains "user" or "age"). This weakens the test — it would pass even if the implementation returned a bogus path.

         with pytest.raises(InputValidationError) as exc_info:
             validate_input({"user": {"age": "not a number"}}, schema)
-        assert exc_info.value.schema_path is not None
+        assert exc_info.value.schema_path is not None
+        # Verify the path points to the invalid nested field
+        assert "age" in str(exc_info.value.schema_path)

418-428: Missing newline at end of file.

The file doesn't end with a trailing newline, which is a POSIX convention and may cause noise in diffs or linter warnings.

         assert result["additionalProperties"] == {"type": "boolean"}
+
langchain-interpreter/src/afm_cli/interfaces/webhook.py (2)

315-319: Accepting unverified WebSub challenges when no subscriber is configured.

When websub_subscriber is None and the attribute doesn't exist on app.state at all (i.e., the router is used standalone), any GET request with a valid hub.mode will have its hub.challenge echoed back without topic verification. This is likely intentional for composability, but could be surprising.

Consider adding a brief inline comment explaining the intent of this fallback branch.


585-596: Minor: missing colon separator in warning messages.

The log messages read "Failed to resolve secret template <error>" — a colon would improve readability.

Suggested fix
     except VariableResolutionError as e:
-        logger.warning(f"Failed to resolve secret template {e}")
+        logger.warning(f"Failed to resolve secret template: {e}")
         raise
     except Exception as e:
-        logger.warning(f"Unexpected error resolving secret template {e}")
+        logger.warning(f"Unexpected error resolving secret template: {e}")
         raise
langchain-interpreter/tests/test_mcp.py (1)

99-126: Consider adding tests for unsupported auth types.

build_httpx_auth has branches for oauth2, jwt, and unknown types that raise MCPAuthenticationError. Adding test cases for these would complete the coverage of build_httpx_auth.

langchain-interpreter/src/afm_cli/agent.py (1)

215-325: run() and arun() are ~100 lines of near-identical logic.

The only differences are invoke vs ainvoke and tool.invoke vs tool.ainvoke. This duplication increases the maintenance burden — any bug fix or logic change must be applied in both places.

This is a well-known challenge in Python's sync/async world; extracting the shared loop logic into a helper (or using asyncio.run internally for run() to delegate to arun()) could reduce duplication but may introduce its own complexity. Acceptable to defer.

Also applies to: 327-437

langchain-interpreter/src/afm_cli/interfaces/console_chat.py (1)

80-118: User input is logged at debug level — may contain sensitive data.

Lines 82 and 85 log user input verbatim at DEBUG level. If the agent handles sensitive information (PII, credentials, etc.), this could be a compliance concern when debug logging is enabled.

langchain-interpreter/tests/test_cli.py (1)

116-131: The "invalid YAML" test content may parse successfully with some YAML parsers.

The string invalid: yaml: with: colons: everywhere could be interpreted as valid (though oddly nested) YAML by some parsers. Consider using truly unparseable YAML (e.g., unbalanced brackets) for a more robust test:

---
name: [unclosed bracket
---

This is a minor concern since the test currently passes.

langchain-interpreter/tests/test_webhook.py (3)

448-448: Redundant local imports of WebSubSubscriber.

WebSubSubscriber is already imported at line 25. The local re-imports on lines 448 and 478 are unnecessary.

♻️ Remove redundant imports
-        from afm_cli.interfaces.webhook import WebSubSubscriber
-
         app.state.websub_subscriber = WebSubSubscriber(

Also applies to: 478-478


581-589: Consider extracting the repeated httpx.AsyncClient mock setup into a shared fixture.

The mock setup for httpx.AsyncClient (constructing mock_instance with post, __aenter__, __aexit__) is duplicated across all three async tests. A parameterized fixture or helper function would reduce duplication.

Also applies to: 607-616, 631-639


711-771: Lifespan test re-implements production logic instead of exercising the actual app lifespan.

The test defines its own test_lifespan context manager (lines 736–761) that duplicates the startup/shutdown logic from the production code. This means the test validates the copy, not the actual lifespan handler — if the production lifespan changes, this test won't catch regressions.

Instead, use the real app with TestClient as a context manager (or async with LifespanManager(app) from asgi-lifespan) so the actual lifespan is exercised:

# Using asgi-lifespan (preferred)
from asgi_lifespan import LifespanManager

async with LifespanManager(app):
    task = app.state.subscription_task
    assert not task.done()
# After exit, task should be cancelled
assert task.cancelled()

Alternatively, TestClient as a context manager triggers lifespan events automatically:

with TestClient(app) as client:
    task = app.state.subscription_task
    assert not task.done()
assert task.cancelled()

- agent.py: Store original user input in session history before schema
augmentation. Previously, the schema instruction suffix was appended to
user_input and then stored in history, causing token waste and potential
confusion on subsequent turns.
- cli.py: Fix auto-generated WebSub callback URL to use localhost when
host is 0.0.0.0, matching the pattern in webhook.py. 0.0.0.0 is not
externally routable.
- console_chat.py: Remove unused agent_prefix parameter. The Textual TUI
already differentiates messages via CSS classes, making text prefixes
redundant.
- mcp.py: Preserve MCPError subclasses (like MCPAuthenticationError)
when raised by build_httpx_auth(). These were being wrapped in generic
MCPConnectionError, hiding diagnostic details.
- tests: Use asgi-lifespan.LifespanManager for async lifespan testing
instead of internal FastAPI attributes. This follows official testing
guidance.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @.github/workflows/release.yml:
- Around line 134-142: The bump-version step can proceed with an empty
VERSION_FILE if neither Ballerina.toml nor pyproject.toml exists; add the same
guard used in the validate step: after the if/elif that sets VERSION_FILE, check
that VERSION_FILE is non-empty and, if it is empty, print a clear error message
mentioning IMPLEMENTATION and NEXT_VERSION and exit with a non-zero status so
sed and git add are not called with an empty path; update the bump-version step
to fail-fast when VERSION_FILE is unset.

In `@langchain-interpreter/src/afm_cli/agent.py`:
- Around line 153-157: The _get_primary_interface currently always returns
interfaces[0], causing wrong schema selection when multiple interfaces exist;
change _get_primary_interface to pick the interface that matches the actual
interface type for the current request (e.g., match Interface.type or a
passed-in identifier) instead of unconditionally returning the first
element—either add an optional parameter interface_type (or use an existing
attribute like self._current_interface_type if present) and search
self._afm.metadata.interfaces for the interface whose type/name matches that
value, returning that match and falling back to None or the single interface
only if no match is found.
🧹 Nitpick comments (14)
README.md (1)

14-14: Add language identifier to the fenced code block.

The fenced code block displaying the repository structure should specify a language identifier for better markdown compatibility and tooling support.

📝 Suggested fix

Add a language identifier (e.g., plaintext or text) to the code fence:

-```
+```plaintext
 reference-implementations-afm/
 ├── ballerina-interpreter/   # Ballerina-based AFM interpreter

As per coding guidelines, markdownlint recommends specifying a language for fenced code blocks (MD040).

langchain-interpreter/pyproject.toml (1)

13-27: Consider adding upper-bound constraints for fast-moving LangChain dependencies.

All dependencies use open-ended >= lower bounds. The LangChain ecosystem (langchain, langchain-openai, langchain-anthropic, langchain-mcp-adapters) is known for frequent breaking changes across minor versions. For a CLI tool shipping to users, this risks silent breakage when a new incompatible release lands.

Consider pinning at least the LangChain-family packages to a compatible range (e.g., langchain>=1.2.8,<2), or using a lock file (e.g., uv.lock) committed to the repo for reproducible installs.

.github/workflows/release-common.yml (1)

87-100: Explicitly pin Python version in the setup-uv action for consistency.

The setup-uv@v5 action supports a python-version input parameter. Since the project declares requires-python = ">=3.11" in pyproject.toml, explicitly setting the Python version in CI ensures consistent behavior and prevents unexpected mismatches between local development and CI environments. While uv sync respects the requires-python constraint from pyproject.toml, being explicit in the workflow improves clarity and reliability.

Suggested change
      - name: Install uv
        if: inputs.implementation == 'langchain-interpreter'
        uses: astral-sh/setup-uv@v5
        with:
          enable-cache: true
+         python-version: "3.11"
langchain-interpreter/tests/test_webhook.py (2)

448-454: Redundant local import of WebSubSubscriber.

WebSubSubscriber is already imported at line 25. The local re-imports on lines 448 and 478 are unnecessary.

♻️ Remove redundant imports
-        from afm_cli.interfaces.webhook import WebSubSubscriber
-
         app.state.websub_subscriber = WebSubSubscriber(

Also applies to: 478-484


571-646: Consider extracting the repeated httpx.AsyncClient mock setup into a helper.

The same 8-line mock-client setup pattern appears three times (lines 581–589, 607–616, 631–639). A small fixture or helper would reduce noise and make intent clearer.

langchain-interpreter/src/afm_cli/templates.py (3)

52-89: Template parser doesn't handle } inside bracket-notation paths.

template.find("}", dollar_pos) at line 63 picks the first } after ${. If the path contains bracket notation with a field containing } (e.g., ${http:payload.['key}name']}), the parser would split incorrectly. This is a very unlikely scenario in practice, so just noting for awareness.


155-163: No exhaustive match/default case for template segment dispatch.

The match statement handles the three known segment kinds but has no wildcard case. If a new TemplateSegment variant were added, it would be silently ignored during evaluation.

♻️ Proposed defensive default case
             case HeaderVariable():
                 _handle_header_variable(headers, parts, segment)
+            case _:
+                raise TemplateEvaluationError(
+                    f"Unknown segment type: {type(segment).__name__}",
+                    template="",
+                )

267-317: Bracket notation parser assumes no ] inside quoted keys.

remaining.find("]") at line 270 finds the first ], which would mismatch on a key like ['key]value']. A minor robustness gap — real-world JSON keys with ] are rare.

langchain-interpreter/src/afm_cli/cli.py (1)

62-82: Docstring Args section omits host and port parameters.

The host and port parameters are in the signature but not documented in the Args block.

📝 Proposed docstring update
         startup_event: Optional event to signal when MCP connection is complete.
             Useful for coordinating with other async tasks that need the agent
             to be connected before proceeding.
+        host: Host to bind the server to. Defaults to "0.0.0.0".
+        port: Port to bind the server to. Defaults to 8000.
 
     Returns:
langchain-interpreter/src/afm_cli/tools/mcp.py (1)

36-50: Missing return type annotations on auth_flow generators.

Both BearerAuth.auth_flow and ApiKeyAuth.auth_flow lack return type annotations. The httpx Auth.auth_flow protocol expects Generator[Request, Response, None].

♻️ Proposed fix
+from typing import Generator
+
 class BearerAuth(httpx.Auth):
     """Bearer token authentication for httpx."""

     def __init__(self, token: str) -> None:
         self.token = token

-    def auth_flow(self, request: httpx.Request):
+    def auth_flow(self, request: httpx.Request) -> Generator[httpx.Request, httpx.Response, None]:
         request.headers["Authorization"] = f"Bearer {self.token}"
         yield request

Apply the same to ApiKeyAuth.auth_flow.

langchain-interpreter/src/afm_cli/agent.py (1)

215-328: Significant duplication between run() and arun().

The synchronous run() and asynchronous arun() share nearly identical logic (input prep, message building, tool loop, output coercion, history update, error handling). Consider extracting the shared pre/post-processing and tool-loop structure into private helpers, with only the invoke/ainvoke calls differing.

Also applies to: 330-443

langchain-interpreter/src/afm_cli/interfaces/console_chat.py (1)

156-158: Redundant pass after logger.debug().

The pass on line 158 is unnecessary since the logger.debug() call already serves as the except-block body.

♻️ Remove redundant pass
                 except Exception:
                     logger.debug("Failed to remove thinking indicator")
-                    pass
langchain-interpreter/tests/test_console_chat.py (2)

51-76: Single pilot.pause() may be insufficient for async worker completion.

_send_message is run as an async worker. A single pilot.pause() yields one event-loop tick, which may not be enough if the worker hasn't finished scheduling its UI updates. Consider using await pilot.pause(delay=...) or app.workers.wait_for_complete() before asserting on the agent response to reduce flakiness.

This applies to several other tests as well (Lines 117, 137, 159, 170, 177, 197, 216).


79-89: @pytest.mark.asyncio is unnecessary here — run_console_chat is synchronous.

The function under test (run_console_chat) is a synchronous wrapper that calls app.run(). While marking the test async still works, it adds unnecessary overhead. A plain synchronous test would be clearer.

Suggested simplification
-@pytest.mark.asyncio
-async def test_run_console_chat(mock_agent: MagicMock) -> None:
+def test_run_console_chat(mock_agent: MagicMock) -> None:
     """Test wrapper function."""
     with patch("afm_cli.interfaces.console_chat.ChatApp") as MockApp:
         run_console_chat(

Comment on lines +134 to +142
# Determine version file
if [ -f "$IMPLEMENTATION/Ballerina.toml" ]; then
VERSION_FILE="$IMPLEMENTATION/Ballerina.toml"
elif [ -f "$IMPLEMENTATION/pyproject.toml" ]; then
VERSION_FILE="$IMPLEMENTATION/pyproject.toml"
fi

sed -i "s/^version = \".*\"/version = \"$NEXT_VERSION\"/" "$VERSION_FILE"
git add "$VERSION_FILE"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Missing error handling when neither version file is found.

Unlike the validate step (lines 58–61), the bump-version step silently proceeds with an unset VERSION_FILE if neither Ballerina.toml nor pyproject.toml exists. This would cause sed and git add to receive an empty path argument, producing a confusing failure.

Proposed fix: add the same guard as the validate step
           # Determine version file
           if [ -f "$IMPLEMENTATION/Ballerina.toml" ]; then
             VERSION_FILE="$IMPLEMENTATION/Ballerina.toml"
           elif [ -f "$IMPLEMENTATION/pyproject.toml" ]; then
             VERSION_FILE="$IMPLEMENTATION/pyproject.toml"
+          else
+            echo "::error::No version file found for implementation: $IMPLEMENTATION"
+            exit 1
           fi
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Determine version file
if [ -f "$IMPLEMENTATION/Ballerina.toml" ]; then
VERSION_FILE="$IMPLEMENTATION/Ballerina.toml"
elif [ -f "$IMPLEMENTATION/pyproject.toml" ]; then
VERSION_FILE="$IMPLEMENTATION/pyproject.toml"
fi
sed -i "s/^version = \".*\"/version = \"$NEXT_VERSION\"/" "$VERSION_FILE"
git add "$VERSION_FILE"
# Determine version file
if [ -f "$IMPLEMENTATION/Ballerina.toml" ]; then
VERSION_FILE="$IMPLEMENTATION/Ballerina.toml"
elif [ -f "$IMPLEMENTATION/pyproject.toml" ]; then
VERSION_FILE="$IMPLEMENTATION/pyproject.toml"
else
echo "::error::No version file found for implementation: $IMPLEMENTATION"
exit 1
fi
sed -i "s/^version = \".*\"/version = \"$NEXT_VERSION\"/" "$VERSION_FILE"
git add "$VERSION_FILE"
🤖 Prompt for AI Agents
In @.github/workflows/release.yml around lines 134 - 142, The bump-version step
can proceed with an empty VERSION_FILE if neither Ballerina.toml nor
pyproject.toml exists; add the same guard used in the validate step: after the
if/elif that sets VERSION_FILE, check that VERSION_FILE is non-empty and, if it
is empty, print a clear error message mentioning IMPLEMENTATION and NEXT_VERSION
and exit with a non-zero status so sed and git add are not called with an empty
path; update the bump-version step to fail-fast when VERSION_FILE is unset.

Comment on lines +153 to +157
def _get_primary_interface(self) -> Interface | None:
interfaces = self._afm.metadata.interfaces
if interfaces and len(interfaces) > 0:
return interfaces[0]
return None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's understand the repository structure
find . -type f -name "*.py" | head -20

Repository: wso2/reference-implementations-afm

Length of output: 1083


🏁 Script executed:

# Run the suggested verification command
rg -n --type=py 'agent\.(a?run|_prepare_input|_build_messages|_get_signature)' -g '!**/test_*'

Repository: wso2/reference-implementations-afm

Length of output: 718


🏁 Script executed:

# Search for the Agent class and _get_primary_interface usage
rg -n '_get_primary_interface' --type=py

Repository: wso2/reference-implementations-afm

Length of output: 284


🏁 Script executed:

# Look at the Agent class definition and its methods
rg -n 'class Agent' --type=py -A 5

Repository: wso2/reference-implementations-afm

Length of output: 2513


🏁 Script executed:

# Check for interface definitions and signatures
rg -n 'interface' --type=py -i | head -30

Repository: wso2/reference-implementations-afm

Length of output: 2871


🏁 Script executed:

# View the Agent.__init__ and surrounding code around line 60
sed -n '38,100p' langchain-interpreter/src/afm_cli/agent.py

Repository: wso2/reference-implementations-afm

Length of output: 2177


🏁 Script executed:

# View the arun method
rg -n 'def arun|def run' langchain-interpreter/src/afm_cli/agent.py -A 15

Repository: wso2/reference-implementations-afm

Length of output: 1325


🏁 Script executed:

# View the _get_primary_interface method and surrounding context
sed -n '150,180p' langchain-interpreter/src/afm_cli/agent.py

Repository: wso2/reference-implementations-afm

Length of output: 1112


🏁 Script executed:

# Check the Interface base class
cat langchain-interpreter/src/afm_cli/interfaces/base.py

Repository: wso2/reference-implementations-afm

Length of output: 2569


🏁 Script executed:

# Check specific interface implementations for signature differences
sed -n '1,100p' langchain-interpreter/src/afm_cli/interfaces/console_chat.py

Repository: wso2/reference-implementations-afm

Length of output: 3088


🏁 Script executed:

# Check webhook interface for signature
sed -n '1,100p' langchain-interpreter/src/afm_cli/interfaces/webhook.py

Repository: wso2/reference-implementations-afm

Length of output: 3003


🏁 Script executed:

# Check the Interface model definitions
rg -n 'class.*Interface' langchain-interpreter/src/afm_cli/models.py -A 20

Repository: wso2/reference-implementations-afm

Length of output: 1584


🏁 Script executed:

# Check the Signature model
rg -n 'class Signature' langchain-interpreter/src/afm_cli/models.py -A 15

Repository: wso2/reference-implementations-afm

Length of output: 577


🏁 Script executed:

# View how interfaces are used in the agent's arun method
sed -n '330,380p' langchain-interpreter/src/afm_cli/agent.py

Repository: wso2/reference-implementations-afm

Length of output: 1800


🏁 Script executed:

# Check if interfaces can truly have different signatures in AFM spec
grep -r "signature" langchain-interpreter/src/afm_cli/interfaces/ --include="*.py" -n

Repository: wso2/reference-implementations-afm

Length of output: 4615


🏁 Script executed:

# Check if web_chat uses interface signature vs agent signature more clearly
sed -n '310,370p' langchain-interpreter/src/afm_cli/interfaces/web_chat.py

Repository: wso2/reference-implementations-afm

Length of output: 1943


🏁 Script executed:

# Check webhook interface signature usage
sed -n '270,350p' langchain-interpreter/src/afm_cli/interfaces/webhook.py

Repository: wso2/reference-implementations-afm

Length of output: 3218


🏁 Script executed:

# Check if there are any AFM files with multiple interfaces with different signatures
find . -name "*.md" -o -name "*.yaml" -o -name "*.yml" | grep -E "(afm|test)" | head -10

Repository: wso2/reference-implementations-afm

Length of output: 663


🏁 Script executed:

# Check test fixtures for multiple interfaces
grep -r "interfaces:" langchain-interpreter/tests/ --include="*.py" -A 10 | head -50

Repository: wso2/reference-implementations-afm

Length of output: 2637


🏁 Script executed:

# Check if any AFM test files define multiple interfaces with different signatures
cat langchain-interpreter/tests/fixtures/sample_agent.afm.md

Repository: wso2/reference-implementations-afm

Length of output: 1135


🏁 Script executed:

# Check test cases that use multiple interfaces
rg -n "interfaces.*-.*type" langchain-interpreter/tests/ --include="*.py" -B 2 -A 10

Repository: wso2/reference-implementations-afm

Length of output: 519


🏁 Script executed:

# Check if Agent is instantiated once per interface or once per AFM
rg -n "Agent(" langchain-interpreter/src/afm_cli/interfaces/ --include="*.py" -B 3 -A 3

Repository: wso2/reference-implementations-afm

Length of output: 519


🏁 Script executed:

# Look for any documentation about multiple interface configurations
cat langchain-interpreter/tests/fixtures/sample_webhook_agent.afm.md

Repository: wso2/reference-implementations-afm

Length of output: 1027


🏁 Script executed:

# Check the actual input validation in web_chat endpoint
rg -n 'create_webchat_router\|@.*router\.' langchain-interpreter/src/afm_cli/interfaces/web_chat.py -A 20 | head -80

Repository: wso2/reference-implementations-afm

Length of output: 60


🏁 Script executed:

# Check the create_webchat_router function implementation
rg -n 'def create_webchat_router' langchain-interpreter/src/afm_cli/interfaces/web_chat.py -A 50 | head -70

Repository: wso2/reference-implementations-afm

Length of output: 2162


🏁 Script executed:

# Look at how interfaces are actually used - check if each interface gets its own agent or shares
rg -n 'self.agent\|Agent(' langchain-interpreter/src/afm_cli/interfaces/ -t py -B 2 -A 2

Repository: wso2/reference-implementations-afm

Length of output: 164


🏁 Script executed:

# Check if multiple interfaces can coexist in one AFM file and how they're handled
grep -r "interfaces:" langchain-interpreter/tests/fixtures/ -A 10

Repository: wso2/reference-implementations-afm

Length of output: 2986


🏁 Script executed:

# Check the actual validation that happens in _prepare_input
sed -n '165,185p' langchain-interpreter/src/afm_cli/agent.py

Repository: wso2/reference-implementations-afm

Length of output: 829


🏁 Script executed:

# Check if agent.arun() is called directly from interfaces or only through interface handlers
rg -n 'agent\.arun\|agent\.run' langchain-interpreter/src/afm_cli/interfaces/ -B 3 -A 3

Repository: wso2/reference-implementations-afm

Length of output: 60


🏁 Script executed:

# Look at how console_chat actually calls the agent
sed -n '115,145p' langchain-interpreter/src/afm_cli/interfaces/console_chat.py

Repository: wso2/reference-implementations-afm

Length of output: 1172


🏁 Script executed:

# Check if there's any place where the agent's _signature is actually used versus interface-specific signatures
rg -n '_signature\|self\.signature' langchain-interpreter/src/afm_cli/agent.py

Repository: wso2/reference-implementations-afm

Length of output: 60


🏁 Script executed:

# Look for any tests that verify multiple interfaces
grep -r "multiple.*interface\|interfaces.*multiple" langchain-interpreter/tests/ -i

Repository: wso2/reference-implementations-afm

Length of output: 283


🏁 Script executed:

# Check the test that mentions "Multiple interfaces"
rg -n "Multiple interfaces" langchain-interpreter/tests/test_parser.py -B 10 -A 5

Repository: wso2/reference-implementations-afm

Length of output: 1219


🏁 Script executed:

# Check the parser to understand the constraint
rg -n "Multiple interfaces" langchain-interpreter/src/ -r "py"

Repository: wso2/reference-implementations-afm

Length of output: 498


🏁 Script executed:

# Check validate_and_extract_interfaces function
rg -n "validate_and_extract_interfaces" langchain-interpreter/src/ -B 5 -A 20

Repository: wso2/reference-implementations-afm

Length of output: 5843


🏁 Script executed:

# Check the full validate_and_extract_interfaces implementation
sed -n '207,260p' langchain-interpreter/src/afm_cli/parser.py

Repository: wso2/reference-implementations-afm

Length of output: 1635


🏁 Script executed:

# See where console_chat, web_chat and webhook get their interface
rg -n 'get_console_interface|get_webchat_interface|get_webhook_interface' langchain-interpreter/src/afm_cli/interfaces/

Repository: wso2/reference-implementations-afm

Length of output: 1469


🏁 Script executed:

# Check how interfaces list is used in practice across handlers
rg -n 'metadata\.interfaces' langchain-interpreter/src/afm_cli/interfaces/ -B 2 -A 2

Repository: wso2/reference-implementations-afm

Length of output: 686


🏁 Script executed:

# See if Agent._interface is actually used for validation or just for defaults
rg -n 'self\._interface' langchain-interpreter/src/afm_cli/agent.py

Repository: wso2/reference-implementations-afm

Length of output: 229


🏁 Script executed:

# Check if interface-specific routers use their passed signature or the agent's signature
sed -n '129,200p' langchain-interpreter/src/afm_cli/interfaces/web_chat.py

Repository: wso2/reference-implementations-afm

Length of output: 3014


🏁 Script executed:

# Check webhook router to see how it uses signature
sed -n '250,320p' langchain-interpreter/src/afm_cli/interfaces/webhook.py

Repository: wso2/reference-implementations-afm

Length of output: 2955


🏁 Script executed:

# Check the actual validation in routers - do they validate before calling agent.arun?
rg -n 'validate_input\|validate_output' langchain-interpreter/src/afm_cli/interfaces/ -B 2 -A 2

Repository: wso2/reference-implementations-afm

Length of output: 60


🏁 Script executed:

# Check if agent.arun is called with pre-validated data from interface handlers
rg -n 'agent\.arun\|agent\.run' langchain-interpreter/src/afm_cli/interfaces/ -B 10 | head -60

Repository: wso2/reference-implementations-afm

Length of output: 60


🏁 Script executed:

# Check the webhook router's receive_webhook endpoint - does it call agent.arun?
sed -n '330,380p' langchain-interpreter/src/afm_cli/interfaces/webhook.py

Repository: wso2/reference-implementations-afm

Length of output: 1923


🏁 Script executed:

# Look for where agent.arun is actually called in webhook
rg -n 'agent\.arun\|await agent' langchain-interpreter/src/afm_cli/interfaces/webhook.py -B 5 -A 5

Repository: wso2/reference-implementations-afm

Length of output: 60


🏁 Script executed:

# Check if console_chat also gets its signature from interface or from agent
sed -n '110,130p' langchain-interpreter/src/afm_cli/interfaces/console_chat.py

Repository: wso2/reference-implementations-afm

Length of output: 827


🏁 Script executed:

# Check the AFM specification or documentation about interface signatures
find . -name "*.md" -type f | xargs grep -l "interface.*signature" | head -3

Repository: wso2/reference-implementations-afm

Length of output: 60


🏁 Script executed:

# Verify that ConsoleChatInterface, WebChatInterface, and WebhookInterface can all have different signatures
grep -A 5 "class ConsoleChatInterface\|class WebChatInterface\|class WebhookInterface" langchain-interpreter/src/afm_cli/models.py

Repository: wso2/reference-implementations-afm

Length of output: 602


🏁 Script executed:

# Check if there's any test that defines multiple interfaces with different signatures
cat langchain-interpreter/tests/fixtures/sample_agent.afm.md | grep -A 30 "^interfaces:"

Repository: wso2/reference-implementations-afm

Length of output: 757


🏁 Script executed:

# Verify the actual validation flow - does agent._prepare_input validate or does interface handler?
sed -n '165,180p' langchain-interpreter/src/afm_cli/agent.py

Repository: wso2/reference-implementations-afm

Length of output: 637


🏁 Script executed:

# Check the actual validate_input function to see what it checks against
rg -n "def validate_input" langchain-interpreter/src/afm_cli/ -A 10

Repository: wso2/reference-implementations-afm

Length of output: 1189


Primary interface selection always picks the first interface, causing validation mismatches with multiple interfaces.

_get_primary_interface returns interfaces[0], so the signature used for input/output validation is always from the first declared interface. This is problematic when an AFM declares multiple interfaces with different signatures (e.g., consolechat with string I/O and webhook with object I/O): the agent will validate all inputs against the first interface's schema, not the specific interface handling the request. The codebase enforces only one interface per type, allowing one ConsoleChatInterface, one WebChatInterface, and one WebhookInterface coexist—each potentially with a different signature. When these have incompatible schemas, validation will fail or produce unexpected results.

🤖 Prompt for AI Agents
In `@langchain-interpreter/src/afm_cli/agent.py` around lines 153 - 157, The
_get_primary_interface currently always returns interfaces[0], causing wrong
schema selection when multiple interfaces exist; change _get_primary_interface
to pick the interface that matches the actual interface type for the current
request (e.g., match Interface.type or a passed-in identifier) instead of
unconditionally returning the first element—either add an optional parameter
interface_type (or use an existing attribute like self._current_interface_type
if present) and search self._afm.metadata.interfaces for the interface whose
type/name matches that value, returning that match and falling back to None or
the single interface only if no match is found.

│ ├── providers.py # LLM provider handling
│ └── templates.py # Prompt templates
├── tests/ # Unit and integration tests
├── afm-samples/ # Example AFM definitions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have this, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need the LICENSE here, noh? We have it at the repo level.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we remove irrelevant stuff?


if webhook_interface is not None:
subscription = webhook_interface.subscription
secret = resolve_secret(subscription.secret)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any field can contain a variable IMO. We shouldn't have to special case secret like this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

"""Check if exposure config contains http: variables."""
if exposure.http and contains_http_variable(exposure.http.path):
return True
return False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can just be a return?

@RadCod3 RadCod3 closed this Feb 9, 2026
@RadCod3 RadCod3 deleted the feat/langchain-py branch February 9, 2026 12:16
@RadCod3 RadCod3 restored the feat/langchain-py branch February 9, 2026 12:16
@RadCod3 RadCod3 reopened this Feb 9, 2026
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

🤖 Fix all issues with AI agents
In `@langchain-interpreter/src/afm/__init__.py`:
- Line 4: The package runtime version in
langchain-interpreter/src/afm/__init__.py is out of sync: update the __version__
symbol from "0.1.0" to match pyproject.toml's "0.1.1" (or refactor to a single
source of truth such as reading importlib.metadata.version in __init__ to avoid
future drift); locate and change the __version__ assignment in the afm package
accordingly so runtime and distribution metadata match.

In `@langchain-interpreter/src/afm/agent.py`:
- Around line 216-219: The code allows max_iterations to be 0 which prevents the
while loop in Agent.run from executing and causes AgentError("No response from
LLM"); fix by guarding non-positive values: when computing max_iterations (the
local variable derived from self.max_iterations / afm.metadata.max_iterations),
clamp it to at least 1 (e.g., max_iterations = max(1, self.max_iterations or 0))
or explicitly raise a clear ValueError if a caller-supplied non-positive value
is invalid; update the calculation where max_iterations is set and ensure any
callers or metadata validation reflect this constraint.

In `@langchain-interpreter/src/afm/interfaces/web_chat.py`:
- Around line 111-167: The chat_string handler logs the user-supplied session_id
directly in logger.exception, which may expose PII; update the chat_string
function to avoid logging raw X-Session-Id by either omitting it from the error
message or replacing it with a non-reversible identifier (e.g., a
truncated/hashed fingerprint) before passing to logger.exception; locate the
logger.exception call in chat_string and change the logged value from session_id
to the sanitized/omitted form and ensure any related error details still provide
enough context without exposing the original session header.
- Around line 245-296: create_webchat_app currently never calls the agent
lifecycle methods so MCP tools won't be available; add a FastAPI lifespan
context manager (using an asynccontextmanager passed as the lifespan argument to
FastAPI) that calls await agent.connect() on startup and await
agent.disconnect() on shutdown (mirroring create_unified_app behavior) so the
agent's MCP connections are established when the app starts and cleaned up when
it stops; ensure you reference the existing create_webchat_app and agent.connect
/ agent.disconnect symbols when making the change.

In `@langchain-interpreter/src/afm/interfaces/webhook.py`:
- Around line 329-428: The lifespan context in create_webhook_app only manages
WebSub subscription; update the lifespan function to initialize and tear down
the agent lifecycle by calling/awaiting agent.connect() during startup (before
spawning subscribe_with_retry) and calling/awaiting agent.disconnect() during
shutdown (after cancelling subscription_task and before exiting), handling both
sync and async connect/disconnect (await if coroutine) and catching/logging
exceptions so failures don't crash the app; mirror the behavior used in
create_webchat_app for ordering and error handling.

In `@langchain-interpreter/src/afm/models.py`:
- Line 187: The max_iterations attribute on the Pydantic model currently allows
0 or negative values; update its definition to enforce a positive integer (e.g.,
use Field(gt=0) or add a validator) and import Field from pydantic if needed so
that max_iterations: int | None = Field(default=None, gt=0) (or an equivalent
validator on max_iterations) prevents non-positive values while keeping None
allowed.

In `@langchain-interpreter/src/afm/providers.py`:
- Around line 103-136: The _get_api_key function currently treats empty-string
tokens as missing and may return them from auth.model_dump; update _get_api_key
(and its checks of ClientAuthentication.auth.type/auth.token and the
post-model_dump scan) to treat empty strings as absent: when matching auth.type
== "bearer" require auth.token is a non-empty string (e.g., auth.token not None
and auth.token != ""), and in the model_dump loop only accept values that are
non-empty strings (not just truthy) for keys ["token","api_key","key","apiKey"];
if no non-empty value is found, fall back to the env_var and ultimately raise
ProviderError as before. Ensure ProviderError messages and provider argument
remain unchanged.

In `@langchain-interpreter/src/afm/resources/chat-ui.html`:
- Line 80: The template currently injects the CHAT_PATH token raw into the JS
var declaration (const chatPath = {{CHAT_PATH}}), which is fragile and can cause
syntax errors or injection; change the template to include explicit JS string
delimiters around the token (use const chatPath = '{{CHAT_PATH}}'; in
chat-ui.html) and update the server-side substitution that fills CHAT_PATH to
emit only the path value (HTML/JS-escaped) without surrounding quotes so the
contract is clear and safe.

In `@langchain-interpreter/src/afm/templates.py`:
- Around line 190-239: The bracket parser in _handle_bracket_access incorrectly
uses remaining.find("]") which breaks when a quoted field contains ] (e.g.,
["a]b"]); implement a quote-aware scanner (e.g., a helper _find_closing_bracket)
that advances from the opening '[' and, if it sees a quote, scans until the
matching quote and the closing ']' while skipping brackets inside the quotes,
return the index of that closing ']', and then use that index instead of
remaining.find("]") in _handle_bracket_access to correctly extract
bracket_content and new_remaining; ensure the helper returns -1 on failure so
the existing JSONAccessError handling remains unchanged.

In `@langchain-interpreter/tests/test_console_chat.py`:
- Around line 39-63: Tests that trigger the _send_message worker (decorated with
`@work`(exclusive=True)) need explicit worker synchronization; update
test_user_message_flow, test_agent_error_display, and test_json_response to
await app.workers.wait_for_complete() before calling await pilot.pause() so the
async worker finishes reliably (insert await app.workers.wait_for_complete()
immediately before existing await pilot.pause() in each test while keeping the
rest of the assertions, including mock_agent.arun.assert_called_once()).
🧹 Nitpick comments (16)
langchain-interpreter/src/afm/resources/chat-ui.html (1)

1-49: All CSS is inlined in a single minified block — consider readability.

The entire stylesheet is compressed into single-line declarations, making it very hard to review, maintain, or diff. For a template embedded in a Python package (not a build artifact), readability matters more than payload size. Consider formatting the CSS across multiple lines or extracting it into a separate static file.

langchain-interpreter/tests/test_agent.py (1)

16-26: Consider moving the inner imports to module level.

AIMessage and AsyncMock are imported inside the fixture body. These are stable, well-known imports — hoisting them to the top of the file with the other imports improves readability and follows Python conventions.

Proposed fix
 from unittest.mock import MagicMock, patch
+from unittest.mock import AsyncMock
 
 import pytest
+from langchain_core.messages import AIMessage
 
 from afm.agent import Agent
 from afm.models import (
@@ -16,9 +18,6 @@
 `@pytest.fixture`
 def mock_chat_model() -> MagicMock:
-    from langchain_core.messages import AIMessage
-    from unittest.mock import AsyncMock
-
     model = MagicMock()
     model.invoke.return_value = AIMessage(content="Hello! I'm here to help.")
     model.ainvoke = AsyncMock(
langchain-interpreter/src/afm/parser.py (1)

82-117: Any unrecognized # Heading silently terminates the current section and drops subsequent content.

Content under headings other than # Role and # Instructions (e.g., # Notes, # Examples) is silently discarded. If this is intentional per the AFM spec, it's fine — but consider logging a debug/warning message when content is dropped so users get feedback that their sections were ignored.

             else:
-                # Different heading - stop current section
-                in_role = False
-                in_instructions = False
+                # Unrecognized heading - stop current section
+                import logging
+                logger = logging.getLogger(__name__)
+                logger.debug(f"Ignoring unrecognized section heading: '{stripped}'")
+                in_role = False
+                in_instructions = False

(The logger should be a module-level constant in practice.)

langchain-interpreter/src/afm/schema_validator.py (2)

96-103: Redundant conditional — both branches are identical.

The if schema.type != "object" and the implicit else branch perform the same operations (validate_output + return data). The distinction was likely intended to allow object-specific handling but currently adds dead complexity.

♻️ Suggested simplification
-    # Handle non-object schema types
-    if schema.type != "object":
-        validate_output(data, schema)
-        return data
-
-    # Validate against schema
     validate_output(data, schema)
     return data

35-44: Extra-field handling may silently produce incorrect dict values for nested JSONSchema inside collections.

model_dump() serializes nested JSONSchema objects as plain dicts. The code at Line 41 checks isinstance(original, JSONSchema) and recursively converts it, which is correct for direct JSONSchema-typed extra fields. However, if an extra field is a list[JSONSchema] or dict[str, JSONSchema], the isinstance check won't match, and the already-serialized (but potentially incomplete) dict from model_dump will be used instead of being recursively converted via json_schema_to_dict.

This is unlikely to be hit today given typical JSON Schema keywords, but worth noting for future extensibility.

langchain-interpreter/src/afm/agent.py (1)

43-43: Unbounded session history growth.

_sessions accumulates messages indefinitely per session. For long-running agents (e.g., behind a webhook), this can lead to increasing memory usage and ever-growing LLM context windows. Consider adding a configurable history limit or a sliding window.

langchain-interpreter/src/afm/interfaces/console_chat.py (1)

114-118: Inline import json inside method.

Consider moving import json to the top of the file alongside the other imports for consistency.

langchain-interpreter/src/afm/interfaces/base.py (1)

38-47: assert for type narrowing can be stripped by python -O.

assert isinstance(interface, WebChatInterface) and assert isinstance(interface, WebhookInterface) are used for type narrowing after get_interface_by_type. While the preceding lookup makes this safe in practice, assert statements are removed when Python runs with optimization flags (-O). If this code could run in optimized mode, consider replacing with an explicit if not isinstance(...): raise.

langchain-interpreter/src/afm/variables.py (1)

163-173: Field names in validate_http_variables don't disambiguate between multiple MCP servers.

When iterating over metadata.tools.mcp, field names like "tools.mcp.name" are used without including the server name or index. If multiple servers have http: variables, the error message will list duplicate field names without indicating which server is affected.

♻️ Suggested improvement
     if metadata.tools and metadata.tools.mcp:
-        for server in metadata.tools.mcp:
-            if contains_http_variable(server.name):
-                errored_fields.append("tools.mcp.name")
-            if contains_http_variable(server.transport.url):
-                errored_fields.append("tools.mcp.transport.url")
-            if _authentication_contains_http_variable(server.transport.authentication):
-                errored_fields.append("tools.mcp.transport.authentication")
-            if _tool_filter_contains_http_variable(server.tool_filter):
-                errored_fields.append("tools.mcp.tool_filter")
+        for i, server in enumerate(metadata.tools.mcp):
+            prefix = f"tools.mcp[{server.name or i}]"
+            if contains_http_variable(server.name):
+                errored_fields.append(f"{prefix}.name")
+            if contains_http_variable(server.transport.url):
+                errored_fields.append(f"{prefix}.transport.url")
+            if _authentication_contains_http_variable(server.transport.authentication):
+                errored_fields.append(f"{prefix}.transport.authentication")
+            if _tool_filter_contains_http_variable(server.tool_filter):
+                errored_fields.append(f"{prefix}.tool_filter")
langchain-interpreter/src/afm/cli.py (2)

466-515: Well-designed async orchestration with proper race handling.

The asyncio.wait pattern to detect server failure before startup, the secondary race between server and console tasks, and the finally cleanup with server.should_exit = True are all solid. One minor note: if both tasks in done have exceptions (line 505-508), only the first is raised and the second is lost. Consider logging the other exception before raising.

♻️ Optional: log all exceptions before raising
         for task in done:
             exc = task.exception()
-            if exc is not None and not isinstance(exc, SystemExit):
-                raise exc
+            if exc is not None and not isinstance(exc, SystemExit):
+                # Log and raise the first real exception; log any others
+                other_excs = [
+                    t.exception()
+                    for t in done
+                    if t is not task and t.exception() and not isinstance(t.exception(), SystemExit)
+                ]
+                for other in other_excs:
+                    logger.error("Additional task error", exc_info=other)
+                raise exc

426-435: has_console parameter defaults to False but the function is only called when has_console is True.

The parameter has_console: bool = False on line 433 is only used to suppress uvicorn logging (line 451), yet the caller at line 415 always passes it as True (since _run_http_and_console is only invoked when has_http and has_console). Consider removing the default or making the intent clearer.

langchain-interpreter/src/afm/models.py (5)

22-26: Sensitive credentials stored as plain str — consider SecretStr.

token, password, and api_key will be fully visible in .model_dump(), repr(), and log output. Pydantic v2's SecretStr masks these by default, reducing the risk of accidental credential leakage in logs or serialized payloads.

The same concern applies to Subscription.secret on Line 126.

Suggested change
+from pydantic import SecretStr
 ...
 class ClientAuthentication(BaseModel):
     model_config = ConfigDict(extra="allow")

     type: str
-    token: str | None = None
-    username: str | None = None
-    password: str | None = None
-    api_key: str | None = None
+    token: SecretStr | None = None
+    username: str | None = None
+    password: SecretStr | None = None
+    api_key: SecretStr | None = None

Note: callers that need the raw value would use .get_secret_value(). The validator comparisons against None remain unchanged since SecretStr | None still defaults to None.


28-42: Unknown auth types silently pass validation.

The match statement only validates bearer, basic, and api-key. Any other type value (e.g., a typo like "barer") skips all checks and succeeds silently. If extensibility is intentional (consistent with extra="allow"), a comment clarifying this would help. Otherwise, a default case raising ValueError for unrecognized types would catch misconfigurations early.


66-70: No mutual-exclusion or precedence validation for allow/deny.

Both allow and deny can be set simultaneously, which leaves precedence semantics ambiguous. If the intended behavior is "deny wins" or "only one is permitted," a model_validator would enforce that at parse time rather than deferring it to consumers.


104-105: Stray empty comment — remove or replace with a meaningful section header.

Line 104 has an orphaned # with no text. If it's meant as a section separator, consider a descriptive comment like # --- Exposure & Interface Models ---, or simply remove it.


54-56: Remove unused TransportType enum.

TransportType is defined but never referenced anywhere in the codebase. The Transport.type field uses Literal["http"] instead, making the enum dead code.

Comment on lines +103 to +136
def _get_api_key(
auth: ClientAuthentication | None,
env_var: str,
provider: str,
) -> str:
if auth is not None:
auth_type = auth.type.lower()

if auth_type == "bearer" and auth.token:
return auth.token
elif auth_type == "api-key" and auth.api_key:
return auth.api_key
elif auth_type == "basic":
raise ProviderError(
"Basic authentication is not supported for LLM providers. "
"Use 'bearer' or 'api-key' authentication type.",
provider=provider,
)
else:
auth_dict = auth.model_dump(exclude_none=True)
for key in ["token", "api_key", "key", "apiKey"]:
if key in auth_dict and auth_dict[key]:
return auth_dict[key]

# Fall back to environment variable
api_key = os.environ.get(env_var)
if api_key:
return api_key

raise ProviderError(
f"No API key found. Provide authentication in the model config "
f"or set the {env_var} environment variable.",
provider=provider,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Empty-string auth tokens silently fall through to the env-var/scan fallback.

If auth.type == "bearer" and auth.token == "", the truthiness check on line 111 (auth.token) is False, so it falls to the else branch (line 121) which scans model_dump. Since token: "" is excluded by exclude_none=True but not by a falsiness filter, the empty string is returned by the scan loop on line 125.

The Pydantic validator on ClientAuthentication only rejects None, not empty strings. This means an empty bearer token would be silently passed to the LLM provider.

🛡️ Proposed fix — validate non-empty after type match
         if auth_type == "bearer" and auth.token:
             return auth.token
         elif auth_type == "api-key" and auth.api_key:
             return auth.api_key
         elif auth_type == "basic":
             raise ProviderError(
                 "Basic authentication is not supported for LLM providers. "
                 "Use 'bearer' or 'api-key' authentication type.",
                 provider=provider,
             )
+        elif auth_type == "bearer" and not auth.token:
+            raise ProviderError(
+                "Bearer token is empty. Provide a valid token.",
+                provider=provider,
+            )
+        elif auth_type == "api-key" and not auth.api_key:
+            raise ProviderError(
+                "API key is empty. Provide a valid API key.",
+                provider=provider,
+            )
         else:
🤖 Prompt for AI Agents
In `@langchain-interpreter/src/afm/providers.py` around lines 103 - 136, The
_get_api_key function currently treats empty-string tokens as missing and may
return them from auth.model_dump; update _get_api_key (and its checks of
ClientAuthentication.auth.type/auth.token and the post-model_dump scan) to treat
empty strings as absent: when matching auth.type == "bearer" require auth.token
is a non-empty string (e.g., auth.token not None and auth.token != ""), and in
the model_dump loop only accept values that are non-empty strings (not just
truthy) for keys ["token","api_key","key","apiKey"]; if no non-empty value is
found, fall back to the env_var and ultimately raise ProviderError as before.
Ensure ProviderError messages and provider argument remain unchanged.

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@MaryamZi MaryamZi merged commit fcfcbce into wso2:main Feb 9, 2026
3 checks passed
@RadCod3 RadCod3 deleted the feat/langchain-py branch February 27, 2026 14:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants