114 changes: 114 additions & 0 deletions .pr/after_fix.log
@@ -0,0 +1,114 @@
================================================================================
AFTER FIX: Codex subscription mode — working end-to-end
================================================================================
Command: python scratch.py
Model: openai/gpt-5.2-codex (subscription mode via LLM.subscription_login)
Date: 2026-04-10

scratch.py contents:
llm = LLM.subscription_login(vendor="openai", model="gpt-5.2-codex")
agent = Agent(llm=llm, tools=[Terminal, FileEditor, TaskTracker])
conversation.send_message("Write 3 facts about the current project into FACTS.txt.")
conversation.run()

================================================================================

Message from User ──────────────────────────────────────────────────────────────

Write 3 facts about the current project into FACTS.txt.

Tokens: ↑ input 0 • cache hit N/A • ↓ output 0 • $ 0.00

Agent Action ───────────────────────────────────────────────────────────────────

Summary: List files in repository root

Reasoning:
$ ls

Tokens: ↑ input 6.49K • cache hit 0.00% • ↓ output 53 • $ 0.0121

Observation ────────────────────────────────────────────────────────────────────

Tool: terminal
Result:
AGENTS.md CONTRIBUTING.md
DEVELOPMENT.md FACTS.txt
LICENSE MAINTAINERS
MANIFEST.in MagicMock
Makefile README.md
examples openhands-agent-server
openhands-sdk openhands-tools
openhands-workspace proposals
pyproject.toml scratch.py
scripts tests
uv.lock workspace

📁 Working directory: /Users/VascoSchiavo/git/software-agent-sdk
✅ Exit code: 0

Agent Action ───────────────────────────────────────────────────────────────────

Summary: Display current FACTS.txt content

$ cat FACTS.txt

Tokens: ↑ input 13.2K • cache hit 0.00% • ↓ output 89 • $ 0.0243

Observation ────────────────────────────────────────────────────────────────────

Tool: terminal
Result:
1. The OpenHands Software Agent SDK provides Python and REST APIs for building
code-focused agents.
2. The repository includes multiple packages such as openhands-sdk,
openhands-tools, openhands-workspace, and openhands-agent-server.
3. The examples/ directory contains standalone SDK, remote agent server, and
GitHub workflow demos.

📁 Working directory: /Users/VascoSchiavo/git/software-agent-sdk
✅ Exit code: 0

Agent Action ───────────────────────────────────────────────────────────────────

Summary: Update FACTS.txt with three current project facts

Reasoning:
Action: FileEditorAction

Arguments:
command: "str_replace"
path: "/Users/VascoSchiavo/git/software-agent-sdk/FACTS.txt"
old_str: <original 3 lines>
new_str: <updated 3 lines>

Tokens: ↑ input 20.07K • cache hit 31.88% • reasoning 64 • ↓ output 370 • $ 0.0302

Observation ────────────────────────────────────────────────────────────────────

Tool: file_editor
Result:
[File /Users/VascoSchiavo/git/software-agent-sdk/FACTS.txt edited with 1 changes.]
[begin of edit 1 / 1]
(content before edit)
-1|1. The OpenHands Software Agent SDK provides Python and REST APIs for
building code-focused agents.
-2|2. The repository includes multiple packages such as openhands-sdk,
openhands-tools, openhands-workspace, and openhands-agent-server.
-3|3. The examples/ directory contains standalone SDK, remote agent server, and
GitHub workflow demos.

(content after edit)
+1|1. The repository root includes packages like openhands-sdk and
openhands-tools.
+2|2. There is a tests/ directory alongside examples/ and scripts/.
+3|3. Project metadata is managed via pyproject.toml at the root.
[end of edit 1 / 1]

Message from Agent ─────────────────────────────────────────────────────────────

Updated FACTS.txt with three current project facts.

Tokens: ↑ input 27.27K • cache hit 23.47% • reasoning 64 • ↓ output 384 • $ 0.0430

All done!
72 changes: 72 additions & 0 deletions .pr/before_fix.log
@@ -0,0 +1,72 @@
================================================================================
BEFORE FIX: Codex subscription mode — empty responses loop
================================================================================
Command: python scratch.py
Model: openai/gpt-5.2-codex (subscription mode via LLM.subscription_login)
Date: 2026-04-10

scratch.py contents:
llm = LLM.subscription_login(vendor="openai", model="gpt-5.2-codex")
agent = Agent(llm=llm, tools=[Terminal, FileEditor, TaskTracker])
conversation.send_message("Write 3 facts about the current project into FACTS.txt.")
conversation.run()

================================================================================

Message from User ──────────────────────────────────────────────────────────────

Write 3 facts about the current project into FACTS.txt.

Tokens: ↑ input 0 • cache hit N/A • ↓ output 0 • $ 0.00

Message from Agent ─────────────────────────────────────────────────────────────

[no text content]

Tokens: ↑ input 6.49K • cache hit 98.60% • ↓ output 72 • $ 0.0033

Message from User ──────────────────────────────────────────────────────────────

Your last response did not include a function call or a message. Please use a
tool to proceed with the task.

Message from Agent ─────────────────────────────────────────────────────────────

[no text content]

Tokens: ↑ input 13.01K • cache hit 98.39% • ↓ output 69 • $ 0.0044

Message from User ──────────────────────────────────────────────────────────────

Your last response did not include a function call or a message. Please use a
tool to proceed with the task.

Message from Agent ─────────────────────────────────────────────────────────────

[no text content]

Tokens: ↑ input 19.55K • cache hit 98.19% • ↓ output 53 • $ 0.0067

Message from User ──────────────────────────────────────────────────────────────

Your last response did not include a function call or a message. Please use a
tool to proceed with the task.

Message from Agent ─────────────────────────────────────────────────────────────

[no text content]

Tokens: ↑ input 26.13K • cache hit 97.99% • ↓ output 65 • $ 0.0089

Message from User ──────────────────────────────────────────────────────────────

Your last response did not include a function call or a message. Please use a
tool to proceed with the task.

Message from Agent ─────────────────────────────────────────────────────────────

[no text content]

Tokens: ↑ input 32.73K • cache hit 98.18% • ↓ output 70 • $ 0.0110

... (loops indefinitely with [no text content])
27 changes: 27 additions & 0 deletions openhands-sdk/openhands/sdk/llm/llm.py
@@ -62,6 +62,7 @@
RefusalDeltaEvent,
ResponseCompletedEvent,
ResponsesAPIResponse,
ResponsesAPIStreamEvents,
)
from litellm.types.utils import (
Delta,
@@ -999,7 +1000,21 @@ def _one_attempt(**retry_kwargs) -> ResponsesAPIResponse:
)

stream_callback = on_token if user_enable_streaming else None
# Collect output items from streaming events.
# Some endpoints (e.g., Codex subscription) send output
# items as separate events but the final response.completed
# event has output=[]. We accumulate them here and patch
# the completed response if needed.
collected_output_items: list[Any] = []
for event in ret:
if event is None:
continue
# Collect finished output items
evt_type = getattr(event, "type", None)
if evt_type == ResponsesAPIStreamEvents.OUTPUT_ITEM_DONE:
item = getattr(event, "item", None)
if item is not None:
collected_output_items.append(item)
if stream_callback is None:
continue
if isinstance(
@@ -1034,6 +1049,10 @@ def _one_attempt(**retry_kwargs) -> ResponsesAPIResponse:

completed_resp = completed_event.response

# Patch empty output with items collected from stream
if not completed_resp.output and collected_output_items:
completed_resp.output = collected_output_items

self._telemetry.on_response(completed_resp)
return completed_resp

@@ -1418,6 +1437,14 @@ def format_messages_for_responses(
"""
msgs = copy.deepcopy(messages)

# Subscription mode (store=false): strip reasoning items from prior
# assistant turns. The Codex endpoint doesn't persist items, so
# referencing their IDs in follow-up requests causes a 404.
if self.is_subscription:
for m in msgs:
if m.role == "assistant" and m.responses_reasoning_item is not None:
m.responses_reasoning_item = None

# Determine vision based on model detection
vision_active = self.vision_is_active()

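The subscription-mode stripping in `format_messages_for_responses` boils down to the following sketch, where `Msg` is a hypothetical stand-in for the SDK's `Message` model:

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Msg:
    # Hypothetical stand-in for the SDK's Message model.
    role: str
    responses_reasoning_item: Any = None


def strip_reasoning_items(msgs: list[Msg], is_subscription: bool) -> list[Msg]:
    """With store=false the Codex endpoint does not persist reasoning
    items, so replaying their IDs in a follow-up request 404s; drop
    them from prior assistant turns before sending."""
    if is_subscription:
        for m in msgs:
            if m.role == "assistant" and m.responses_reasoning_item is not None:
                m.responses_reasoning_item = None
    return msgs
```

API-key mode (`is_subscription=False`) passes messages through unchanged, preserving the existing encrypted-reasoning round-trip.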
76 changes: 50 additions & 26 deletions openhands-sdk/openhands/sdk/llm/message.py
@@ -650,40 +650,64 @@ def from_llm_responses_output(
tool_calls: list[MessageToolCall] = []
responses_reasoning_item: ReasoningItemModel | None = None

# Helper to access fields from typed Pydantic objects, generic
# litellm base objects (BaseLiteLLMOpenAIResponseObject), or dicts.
def _get(obj: Any, key: str, default: Any = None) -> Any:
if isinstance(obj, dict):
return obj.get(key, default)
return getattr(obj, key, default)

for item in output or []:
item_type = _get(item, "type")

if (
isinstance(item, GenericResponseOutputItem)
or isinstance(item, ResponseOutputMessage)
) and item.type == "message":
for part in item.content or []:
if part.type == "output_text" and part.text:
assistant_text_parts.append(part.text)
isinstance(item, (GenericResponseOutputItem, ResponseOutputMessage))
or item_type == "message"
) and item_type == "message":
content = _get(item, "content")
for part in content or []:
part_type = _get(part, "type")
part_text = _get(part, "text")
if part_type == "output_text" and part_text:
assistant_text_parts.append(part_text)
elif (
isinstance(item, (OutputFunctionToolCall, ResponseFunctionToolCall))
and item.type == "function_call"
and item_type == "function_call"
):
tc = MessageToolCall.from_responses_function_call(item)
tool_calls.append(tc)
elif isinstance(item, ResponseReasoningItem) and item.type == "reasoning":
# Parse OpenAI typed Responses "reasoning" output item
# (Pydantic BaseModel)
rid = item.id
summaries = item.summary or []
contents = item.content or []
enc = item.encrypted_content
status = item.status

summary_list: list[str] = [s.text for s in summaries]
content_texts: list[str] = [c.text for c in contents]
content_list: list[str] | None = content_texts or None

responses_reasoning_item = ReasoningItemModel(
id=rid,
summary=summary_list,
content=content_list,
encrypted_content=enc,
status=status,
elif item_type == "function_call":
Collaborator:

nit: I am a bit worried that these `item_type` checks would break the type checking we had before. E.g., previously Python would know that when `item.type == "reasoning"`, `item` is an instance of `ResponseReasoningItem`, and would type-check accordingly.

Contributor Author:

Good point.

I refactored it so that it preserves type safety for the standard API path while still handling the generic streaming types from the Codex endpoint.

# Handle generic objects (e.g., BaseLiteLLMOpenAIResponseObject
# from streaming) or dicts with function_call type
tc = MessageToolCall(
id=_get(item, "call_id") or _get(item, "id", ""),
name=_get(item, "name", ""),
arguments=_get(item, "arguments", ""),
origin="responses",
)
tool_calls.append(tc)
elif item_type == "reasoning":
if isinstance(item, ResponseReasoningItem):
# Typed path: preserves type narrowing for standard API
responses_reasoning_item = ReasoningItemModel(
id=item.id,
summary=[s.text for s in (item.summary or [])],
content=[c.text for c in (item.content or [])] or None,
encrypted_content=item.encrypted_content,
status=item.status,
)
else:
# Generic fallback for BaseLiteLLMOpenAIResponseObject
# or dicts (e.g., streaming items from Codex subscription)
summaries = _get(item, "summary") or []
contents = _get(item, "content") or []
responses_reasoning_item = ReasoningItemModel(
id=_get(item, "id"),
summary=[_get(s, "text", "") for s in summaries],
content=[_get(c, "text", "") for c in contents] or None,
encrypted_content=_get(item, "encrypted_content"),
status=_get(item, "status"),
)

assistant_text = "\n".join(assistant_text_parts).strip()
return Message(
32 changes: 19 additions & 13 deletions openhands-sdk/openhands/sdk/llm/options/responses_options.py
@@ -39,24 +39,30 @@ def select_responses_options(

# Include encrypted reasoning only when the user enables it on the LLM,
# and only for stateless calls (store=False). Respect user choice.
include_list = list(include) if include is not None else []
# Note: include and reasoning are not supported in subscription mode
# (the Codex subscription endpoint silently returns empty output when
# these parameters are present).
Collaborator:

@OpenHands Find the recent OpenAI docs and verify this statement. It seems surprising to me; there is reasoning, I think, but maybe you can find out under what conditions? Read the official docs, and maybe also find GitHub issues in the openai/codex repos or user reports. Respond directly on GitHub, here.

Reply:

@enyst your session has expired. Please login again at OpenHands Cloud and try again.

Contributor Author:

I listed the relevant open issues and docs in the issue related to this PR ;) @enyst
if not llm.is_subscription:
include_list = list(include) if include is not None else []

if not out.get("store", False) and llm.enable_encrypted_reasoning:
if "reasoning.encrypted_content" not in include_list:
include_list.append("reasoning.encrypted_content")
if include_list:
out["include"] = include_list
if not out.get("store", False) and llm.enable_encrypted_reasoning:
if "reasoning.encrypted_content" not in include_list:
include_list.append("reasoning.encrypted_content")
if include_list:
out["include"] = include_list

# Include reasoning effort only if explicitly set
if llm.reasoning_effort:
out["reasoning"] = {"effort": llm.reasoning_effort}
# Optionally include summary if explicitly set (requires verified org)
if llm.reasoning_summary:
out["reasoning"]["summary"] = llm.reasoning_summary
# Include reasoning effort only if explicitly set
if llm.reasoning_effort:
out["reasoning"] = {"effort": llm.reasoning_effort}
# Optionally include summary if explicitly set (requires verified org)
if llm.reasoning_summary:
out["reasoning"]["summary"] = llm.reasoning_summary

# Send prompt_cache_retention only if model supports it
# Note: prompt_cache_retention is not supported in subscription mode
if (
get_features(llm.model).supports_prompt_cache_retention
not llm.is_subscription
and get_features(llm.model).supports_prompt_cache_retention
and llm.prompt_cache_retention
):
out["prompt_cache_retention"] = llm.prompt_cache_retention