@aaronsteers aaronsteers commented Oct 10, 2025

feat: Add OPENAI_SESSION_BACKEND environment variable for session backend selection

Summary

This PR introduces the OPENAI_SESSION_BACKEND environment variable to switch between OpenAI Conversations API (default) and SQLite session storage. When using the OpenAI backend, the system now prints session URLs for real-time observation of agent conversations, addressing the need for better online visibility after the previous trace URL approach broke with Arize evals integration.

Key Changes:

  • Added create_session() helper function that reads OPENAI_SESSION_BACKEND env var (defaults to "openai")
  • Supports "sqlite" as fallback option for local session storage
  • Prints session URLs when using OpenAI backend: https://platform.openai.com/conversations/{conversation_id}
  • Applied to both interactive and manager-developer build modes
  • Updated exit/error messages to include session URLs

⚠️ Breaking Change: The default session backend is now OpenAI instead of SQLite. Users who want the old behavior must set OPENAI_SESSION_BACKEND=sqlite.

Review & Testing Checklist for Human

This PR changes default behavior and introduces untested OpenAI integration. Please verify:

  • Test OpenAI backend (default): Run a connector build without setting the env var. Verify the session URL is printed, accessible, and shows the conversation in OpenAI Platform.
  • Verify session URL format: The URL format https://platform.openai.com/conversations/{conversation_id} is an educated guess based on API patterns and has not been verified against the actual OpenAI UI. Confirm that this URL actually works.
  • Test SQLite fallback: Set OPENAI_SESSION_BACKEND=sqlite and verify session storage still works as before.
  • Check private API usage: Code calls await session._get_session_id() (note the underscore). Verify if there's a public API we should use instead.
  • Consider default change impact: Is changing the default from SQLite to OpenAI acceptable? Should we default to SQLite for backward compatibility?
  • Test error scenarios: What happens if the OpenAI API is unavailable or credentials are missing? Should we have better error handling or an automatic fallback to SQLite?

Recommended Test Plan

  1. Run connector-builder-mcp in interactive mode without setting env var (tests OpenAI default)
  2. Click the printed session URL and verify it opens a conversation view in OpenAI Platform
  3. Run again with export OPENAI_SESSION_BACKEND=sqlite (tests SQLite fallback)
  4. Try with invalid value: export OPENAI_SESSION_BACKEND=invalid (should raise ValueError)
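
Steps 1, 3, and 4 above can also be checked without touching a real backend. The following is a minimal sketch, assuming the backend name is read straight from the environment; `select_backend` and `backend_for` are hypothetical helpers for illustration and are not functions from this PR.

```python
import os
from typing import Optional
from unittest.mock import patch

VALID_BACKENDS = ("openai", "sqlite")  # the two values named in this PR


def select_backend() -> str:
    """Hypothetical helper: return the configured backend or raise ValueError."""
    backend = os.environ.get("OPENAI_SESSION_BACKEND", "openai")
    if backend not in VALID_BACKENDS:
        raise ValueError(
            f"Invalid OPENAI_SESSION_BACKEND {backend!r}; expected one of {VALID_BACKENDS}"
        )
    return backend


def backend_for(value: Optional[str]) -> str:
    """Resolve the backend with the env var set to `value` (unset when None)."""
    with patch.dict(os.environ, {}):  # restores os.environ when the block exits
        if value is None:
            os.environ.pop("OPENAI_SESSION_BACKEND", None)
        else:
            os.environ["OPENAI_SESSION_BACKEND"] = value
        return select_backend()
```

Calling `backend_for(None)`, `backend_for("sqlite")`, and `backend_for("invalid")` mirrors the three cases in the plan; the last one is expected to raise `ValueError`.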

Notes

Summary by CodeRabbit

  • New Features

    • Configurable session backend selection (e.g., OpenAI or SQLite).
    • OpenAI conversation sessions available as a backend.
  • Improvements

    • Session pages provide a shareable URL and the app attempts to open it in your browser.
    • Progress logs now surface session and trace links at start, end, on interrupts, and on errors for clearer tracking.
    • Consistent behavior across interactive and manager flows.

- Add environment variable OPENAI_SESSION_BACKEND (default: 'openai')
- Support 'sqlite' as fallback option for local session storage
- Implement create_session() helper function for backend selection
- Print session URLs for OpenAI-backed sessions for real-time observation
- Update exit/error messages to include session URLs when available
- Apply to both interactive and manager-developer modes

Co-Authored-By: AJ Steers <[email protected]>

Original prompt from AJ Steers
@Devin - We're currently using SQLite session in connector builder MCP. Can you consult if we get better online visibility by moving to an OpenAI-managed Session API rather than local SQLite session info?
Thread URL: https://airbytehq-team.slack.com/archives/D089P0UPVT4/p1760134325882529?thread_ts=1760134325.882529


🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@github-actions github-actions bot added the enhancement New feature or request label Oct 10, 2025

👋 Welcome to the Airbyte Connector Builder MCP!

Thank you for your contribution! Here are some helpful tips and reminders for your convenience.

Testing This Branch via MCP

To test the changes in this specific branch with an MCP client like Claude Desktop, use the following configuration:

{
  "mcpServers": {
    "connector-builder-mcp-dev": {
      "command": "uvx",
      "args": ["--from", "git+https://github.com/airbytehq/connector-builder-mcp.git@devin/1760135522-openai-session-backend", "connector-builder-mcp"]
    }
  }
}

Testing This Branch via CLI

You can test this version of the MCP Server using the following CLI snippet:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/connector-builder-mcp.git@devin/1760135522-openai-session-backend#egg=airbyte-connector-builder-mcp' --help

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /autofix - Fixes most formatting and linting issues
  • /build-connector - Builds the default connector on-demand using the AI builder
  • /build-connector prompt="<your prompt>" - Builds a connector on-demand using the AI builder
  • /poe <command> - Runs any poe command in the uv virtual environment

AI Builder Evaluations

AI builder evaluations run automatically under the following conditions:

  • When a PR is marked as "ready for review"
  • When a PR is reopened

A set of standardized evaluations also run on a schedule (Mon/Wed/Fri at midnight UTC) and can be manually triggered via workflow dispatch.

Helpful Resources

If you have any questions, feel free to ask in the PR comments or join our Slack community.


coderabbitai bot commented Oct 10, 2025

Walkthrough

Adds a create_session(session_id) factory that selects SQLiteSession or OpenAIConversationsSession via OPENAI_SESSION_BACKEND (raises ValueError on invalid backends). Replaces direct SQLiteSession instantiations, tracks a session URL for OpenAI sessions, logs and surfaces it on completion, interrupts, and errors, and attempts to open it in a browser.

Changes

| Cohort / File(s) | Summary of Changes |
|---|---|
| Session factory & exports: connector_builder_agents/src/run.py | Added create_session(session_id: str) selecting SQLiteSession or OpenAIConversationsSession based on OPENAI_SESSION_BACKEND; raises ValueError for invalid backends. Added os import and exported OpenAIConversationsSession. |
| Interactive & manager flows (session handling, logging, browser open): connector_builder_agents/src/run.py | Replaced SQLiteSession(session_id=...) calls with create_session(session_id). For OpenAIConversationsSession, obtain conversation/session id, compose session_url, log and attempt to open it in the browser, and include session_url alongside trace URL in progress logs on normal end, KeyboardInterrupt, and exception paths. Minor control-flow/logging adjustments to surface session_url in multiple termination/error branches. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant Run as run.py
  participant Env as Env (OPENAI_SESSION_BACKEND)
  participant SessionFactory as create_session
  participant Session as Session (SQLite/OpenAI)
  participant Browser as System Browser
  participant Log as Logger

  User->>Run: start interactive/manager flow(session_id)
  Run->>Env: read OPENAI_SESSION_BACKEND
  Run->>SessionFactory: create_session(session_id)
  alt backend == "sqlite"
    SessionFactory-->>Session: instantiate SQLiteSession(session_id)
  else backend == "openai"
    SessionFactory-->>Session: instantiate OpenAIConversationsSession()
    Session-->>Run: return conversation/session id
    Run->>Run: compose session_url
    Run->>Log: log session_url
    Run->>Browser: attempt to open session_url
  else invalid
    SessionFactory-->>Run: raise ValueError
    Run->>Log: log ValueError and exit
  end

  Note over Run,Session: main processing loop

  alt Normal completion
    Run->>Log: log session_url (if any) and trace URL
    Run-->>User: exit normally
  else KeyboardInterrupt or Error
    Run->>Log: log session_url (if any) and trace URL
    Run-->>User: exit with error
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The title succinctly describes the addition of the OPENAI_SESSION_BACKEND environment variable for selecting the session backend, directly matching the PR’s main objective of enabling configurable session storage and behavior. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

Comment @coderabbitai help to get the list of available commands and usage tips.


github-actions bot commented Oct 10, 2025

PyTest Results (Fast)

0 tests  ±0   0 ✅ ±0   0s ⏱️ ±0s
0 suites ±0   0 💤 ±0 
0 files   ±0   0 ❌ ±0 

Results for commit 4c20701. ± Comparison against base commit 8b926ff.

♻️ This comment has been updated with latest results.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8b926ff and 9155ae8.

📒 Files selected for processing (1)
  • connector_builder_agents/src/run.py (8 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
connector_builder_agents/src/run.py (2)
connector_builder_agents/src/tools.py (1)
  • update_progress_log (191-215)
connector_builder_agents/src/_util.py (1)
  • open_if_browser_available (15-26)
🪛 GitHub Actions: Linters
connector_builder_agents/src/run.py

[error] 64-69: Ruff format check failed. 1 file would be reformatted; run 'ruff format' to fix code style issues. Command: 'uv run ruff format --diff .'

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Pytest (Fast)

Comment on lines 281 to 287
        session_url = None
        if isinstance(session, OpenAIConversationsSession):
            conversation_id = await session._get_session_id()
            session_url = f"https://platform.openai.com/conversations/{conversation_id}"
            update_progress_log(f"🔗 Session URL: {session_url}", session_state)
            open_if_browser_available(session_url)

⚠️ Potential issue | 🟠 Major

Apply the same resilience when fetching the manager session URL.

Same concern as above: wrap _get_session_id() so a logging failure doesn’t cancel the entire manager/developer build run.

-        if isinstance(session, OpenAIConversationsSession):
-            conversation_id = await session._get_session_id()
-            session_url = f"https://platform.openai.com/conversations/{conversation_id}"
-            update_progress_log(f"🔗 Session URL: {session_url}", session_state)
-            open_if_browser_available(session_url)
+        if isinstance(session, OpenAIConversationsSession):
+            try:
+                conversation_id = await session._get_session_id()
+            except Exception as exc:
+                update_progress_log(
+                    f"⚠️ Unable to retrieve session URL: {exc}",
+                    session_state,
+                )
+            else:
+                session_url = f"https://platform.openai.com/conversations/{conversation_id}"
+                update_progress_log(f"🔗 Session URL: {session_url}", session_state)
+                open_if_browser_available(session_url)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
        session_url = None
        if isinstance(session, OpenAIConversationsSession):
            conversation_id = await session._get_session_id()
            session_url = f"https://platform.openai.com/conversations/{conversation_id}"
            update_progress_log(f"🔗 Session URL: {session_url}", session_state)
            open_if_browser_available(session_url)

        session_url = None
        if isinstance(session, OpenAIConversationsSession):
            try:
                conversation_id = await session._get_session_id()
            except Exception as exc:
                update_progress_log(
                    f"⚠️ Unable to retrieve session URL: {exc}",
                    session_state,
                )
            else:
                session_url = f"https://platform.openai.com/conversations/{conversation_id}"
                update_progress_log(f"🔗 Session URL: {session_url}", session_state)
                open_if_browser_available(session_url)
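
The effect of the suggested guard can be reproduced in isolation. This is a rough sketch, not the code in run.py: `FlakySession` is a hypothetical stand-in for a session whose ID lookup may fail, a plain list stands in for `update_progress_log`, and the URL uses the `/conversations/` path quoted in this comment (a later commit in this thread changes that path).

```python
import asyncio
from typing import List, Optional


class FlakySession:
    """Hypothetical stand-in for a session whose ID lookup can fail."""

    def __init__(self, conversation_id: Optional[str]) -> None:
        self._conversation_id = conversation_id

    async def _get_session_id(self) -> str:
        if self._conversation_id is None:
            raise RuntimeError("API unavailable")
        return self._conversation_id


async def resolve_session_url(session: FlakySession, log: List[str]) -> Optional[str]:
    """Return the session URL, or None if the lookup fails; never raise."""
    try:
        conversation_id = await session._get_session_id()
    except Exception as exc:  # deliberate catch-all: the URL is optional instrumentation
        log.append(f"Unable to retrieve session URL: {exc}")
        return None
    session_url = f"https://platform.openai.com/conversations/{conversation_id}"
    log.append(f"Session URL: {session_url}")
    return session_url


if __name__ == "__main__":
    messages: List[str] = []
    print(asyncio.run(resolve_session_url(FlakySession(None), messages)))
    print(messages)
```

With this shape, a failed lookup leaves a warning in the log and the run continues with no session URL, which is the behavior the review asks for.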

aaronsteers commented Oct 10, 2025

/build-connector

🛠️ Building connector...

Link to job logs.

❌ Poe command evals run --connector source-jsonplaceholder failed. Please inspect the logs.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8b926ff and e96ab73.

📒 Files selected for processing (1)
  • connector_builder_agents/src/run.py (8 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
connector_builder_agents/src/run.py (2)
connector_builder_agents/src/tools.py (1)
  • update_progress_log (191-215)
connector_builder_agents/src/_util.py (1)
  • open_if_browser_available (15-26)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Pytest (Fast)

User-reported working URL format uses /logs/conv_{conversation_id}
instead of /conversations/{conversation_id}. Updated both interactive
and manager-developer modes.

Co-Authored-By: AJ Steers <[email protected]>

The conversation_id returned by session._get_session_id() already
includes the 'conv_' prefix, so we should not add it again.

Fixes URL construction to be:
https://platform.openai.com/logs/{conversation_id}
instead of:
https://platform.openai.com/logs/conv_{conversation_id}

Co-Authored-By: AJ Steers <[email protected]>
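
The difference between the two constructions is easy to see side by side. The helper names and the sample ID below are hypothetical and chosen for illustration; only the URL shapes come from the commit message above.

```python
LOGS_BASE_URL = "https://platform.openai.com/logs"


def build_session_url(conversation_id: str) -> str:
    """Build the logs URL from an ID that already carries the 'conv_' prefix."""
    return f"{LOGS_BASE_URL}/{conversation_id}"


def build_session_url_double_prefix(conversation_id: str) -> str:
    """The earlier construction, which prepended 'conv_' a second time."""
    return f"{LOGS_BASE_URL}/conv_{conversation_id}"


if __name__ == "__main__":
    sample_id = "conv_abc123"  # hypothetical ID, already prefixed
    print(build_session_url(sample_id))
    print(build_session_url_double_prefix(sample_id))
```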
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (3)
connector_builder_agents/src/run.py (3)

50-68: Ruff formatting failure persists.

CI still reports formatting issues in this function. Run ruff format connector_builder_agents/src/run.py and commit the changes.

Based on past review comment at lines 50-68.


168-174: Unguarded _get_session_id() can abort interactive runs.

The call to await session._get_session_id() is unguarded and can raise exceptions (auth failures, API unavailable, rate limits), preventing the interactive session from starting. Since session URL logging is optional instrumentation, failures should be non-fatal.

Apply this diff to wrap the session URL logic in a try/except:

         session_url = None
         if isinstance(session, OpenAIConversationsSession):
-            conversation_id = await session._get_session_id()
-            session_url = f"https://platform.openai.com/logs/{conversation_id}"
-            update_progress_log(f"🔗 Session URL: {session_url}", session_state)
-            open_if_browser_available(session_url)
+            try:
+                conversation_id = await session._get_session_id()
+                session_url = f"https://platform.openai.com/logs/{conversation_id}"
+                update_progress_log(f"🔗 Session URL: {session_url}", session_state)
+                open_if_browser_available(session_url)
+            except Exception as exc:
+                update_progress_log(
+                    f"⚠️ Unable to retrieve session URL: {exc}",
+                    session_state,
+                )

Based on past review comments at lines 168-174.


280-286: Guard session URL lookup in manager/developer flow.

Same concern as the interactive mode: the unguarded await session._get_session_id() call can raise and prevent the manager/developer build from starting. Session URL retrieval is optional instrumentation and should not block the build.

Apply this diff to make the session URL lookup resilient:

         session_url = None
         if isinstance(session, OpenAIConversationsSession):
-            conversation_id = await session._get_session_id()
-            session_url = f"https://platform.openai.com/logs/{conversation_id}"
-            update_progress_log(f"🔗 Session URL: {session_url}", session_state)
-            open_if_browser_available(session_url)
+            try:
+                conversation_id = await session._get_session_id()
+                session_url = f"https://platform.openai.com/logs/{conversation_id}"
+                update_progress_log(f"🔗 Session URL: {session_url}", session_state)
+                open_if_browser_available(session_url)
+            except Exception as exc:
+                update_progress_log(
+                    f"⚠️ Unable to retrieve session URL: {exc}",
+                    session_state,
+                )

Based on past review comments at lines 280-286.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7e66ab5 and e9b878a.

📒 Files selected for processing (1)
  • connector_builder_agents/src/run.py (8 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
connector_builder_agents/src/run.py (2)
connector_builder_agents/src/tools.py (1)
  • update_progress_log (191-215)
connector_builder_agents/src/_util.py (1)
  • open_if_browser_available (15-26)
🔇 Additional comments (2)
connector_builder_agents/src/run.py (2)

209-211: LGTM!

The conditional session URL logging in the exit flow is correctly implemented.


218-220: LGTM!

The conditional session URL logging in error and interrupt handlers is correctly implemented across all paths.

Also applies to: 334-336, 340-342

aaronsteers commented Oct 10, 2025

/build-connector

🛠️ Building connector...

Link to job logs.

🟦 Poe command evals run --connector source-jsonplaceholder completed successfully.

OpenAIConversationsSession.__init__ accepts a conversation_id parameter
that enables session continuity. Previously the session_id was only
passed to SQLiteSession but not to OpenAIConversationsSession, which
could break session continuity.

Addresses PR feedback: #135 (comment)

Co-Authored-By: AJ Steers <[email protected]>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e9b878a and 4c20701.

📒 Files selected for processing (1)
  • connector_builder_agents/src/run.py (8 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
connector_builder_agents/src/run.py (2)
connector_builder_agents/src/tools.py (1)
  • update_progress_log (191-215)
connector_builder_agents/src/_util.py (1)
  • open_if_browser_available (15-26)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (6)
connector_builder_agents/src/run.py (6)

4-4: LGTM: Imports support the new session backend selection.

The os and OpenAIConversationsSession imports are necessary for the environment-variable-based backend selection introduced in this PR.

Also applies to: 11-11


50-68: LGTM: Session factory correctly passes session identifiers to both backends.

The create_session() function properly:

  • Reads the environment variable with a sensible default ("openai")
  • Passes session_id to SQLiteSession and conversation_id to OpenAIConversationsSession (matching their respective APIs)
  • Raises a clear ValueError for invalid backend values

The default change from SQLite to OpenAI is a breaking change, but it's documented in the PR objectives.
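
The selection logic described above might look roughly like the following sketch. The two dataclasses are stand-ins so the example runs without the agents library; the real SQLiteSession and OpenAIConversationsSession take more than these fields, and the actual create_session() in run.py may differ in its details.

```python
import os
from dataclasses import dataclass
from typing import Union


@dataclass
class SQLiteSession:
    """Stand-in for the real class; only the session_id argument is modeled."""

    session_id: str


@dataclass
class OpenAIConversationsSession:
    """Stand-in for the real class; only the conversation_id argument is modeled."""

    conversation_id: str


def create_session(session_id: str) -> Union[SQLiteSession, OpenAIConversationsSession]:
    """Pick a session backend from OPENAI_SESSION_BACKEND (default: "openai")."""
    backend = os.environ.get("OPENAI_SESSION_BACKEND", "openai")
    if backend == "openai":
        # The identifier is passed as conversation_id so the session can be resumed.
        return OpenAIConversationsSession(conversation_id=session_id)
    if backend == "sqlite":
        return SQLiteSession(session_id=session_id)
    raise ValueError(
        f"Invalid OPENAI_SESSION_BACKEND {backend!r}; expected 'openai' or 'sqlite'"
    )
```

Passing the same identifier to both backends keeps callers unaware of which one is active, which is why the flows only need to call create_session(session_id).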


209-210: LGTM: Session URLs properly surfaced at exit points.

The session URLs are consistently logged alongside trace URLs when users exit the interactive session (both normal exit and KeyboardInterrupt). The conditional checks prevent logging when session_url is None (i.e., when using SQLite backend).

Also applies to: 218-219
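
The conditional check can be pictured with a small sketch. `exit_messages` and its message wording are hypothetical; the point is only that the session URL line is skipped when `session_url` is None, as it is with the SQLite backend.

```python
from typing import List, Optional


def exit_messages(trace_url: str, session_url: Optional[str]) -> List[str]:
    """Collect the links to log at an exit point; skip the session URL when unset."""
    messages = [f"Trace URL: {trace_url}"]
    if session_url:
        messages.append(f"Session URL: {session_url}")
    return messages
```

The same helper shape would serve the normal exit, KeyboardInterrupt, and exception paths, since each one logs the same pair of links.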


280-286: Consistent session URL handling in manager-developer mode.

The session URL retrieval and logging follows the same pattern as the interactive mode, maintaining consistency across both execution paths.

Note: The URL format and private API concerns flagged for lines 168-174 also apply here.


334-335: LGTM: Session URLs properly surfaced in error and interrupt handlers.

The session URLs are consistently logged in both the KeyboardInterrupt handler and the unexpected exception handler, ensuring users can review their sessions even when builds are interrupted or fail. The conditional checks are appropriate.

Also applies to: 340-341


170-170: Verify use of private session API
No matching public method was found locally; check the external agents library docs for a public session/conversation-ID API or document why session._get_session_id() is required.

@aaronsteers aaronsteers merged commit b850a66 into main Oct 10, 2025
14 checks passed
@aaronsteers aaronsteers deleted the devin/1760135522-openai-session-backend branch October 10, 2025 23:41