Langchain instrumentor implementation #3662

Draft
wants to merge 8 commits into
base: main

Conversation

haneric00

Description

OpenTelemetry has published an initial version of the AI Agent semantic conventions, which cover the fundamental attributes for AI Agent spans. However, existing third-party instrumentation solutions (OpenInference, OpenLLMetry, OpenLIT) have significant gaps relative to the official OpenTelemetry GenAI semantic conventions:

  1. Non-standard Attribute Naming
    1. Current: llm.vendor, llm.usage.prompt_tokens, custom prefixes
    2. Official: gen_ai.system, gen_ai.usage.input_tokens, standardized gen_ai.* prefix
  2. Missing Agent-Level Semantics
    1. Current: No standardized agent operations, conversation tracking, or agent metadata
    2. Official: create_agent, invoke_agent, gen_ai.agent.*, gen_ai.conversation.id
  3. Inconsistent Span Naming
    1. Vendor-specific patterns, no clear hierarchy
    2. Official: Standardized format like invoke_agent {agent_name}, execute_tool {tool_name}
  4. Limited Tool Execution Tracking
    1. Current: Custom or missing tool instrumentation
    2. Official: Standardized execute_tool operation with proper context
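
As a rough illustration of the naming gap described in the list above, the sketch below renames legacy attribute keys to their gen_ai.* equivalents and builds span names in the "operation target" format. Only the two key pairs and the span name patterns quoted in the list come from this description; the helper names `normalize_attributes` and `span_name` are hypothetical and are not part of this PR.

```python
# Hypothetical helpers, not code from this PR.
# Only the two key pairs below are taken from the list above.
LEGACY_TO_GEN_AI = {
    "llm.vendor": "gen_ai.system",
    "llm.usage.prompt_tokens": "gen_ai.usage.input_tokens",
}


def normalize_attributes(attrs: dict) -> dict:
    """Return a copy of attrs with known legacy keys renamed.

    Keys that are not in LEGACY_TO_GEN_AI are passed through unchanged.
    """
    return {LEGACY_TO_GEN_AI.get(key, key): value for key, value in attrs.items()}


def span_name(operation: str, target: str) -> str:
    """Build a span name such as 'invoke_agent {agent_name}'."""
    return f"{operation} {target}"
```

For example, `span_name("execute_tool", "search")` yields `execute_tool search`, where `search` is a made-up tool name.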

Proposed Solution

Build an OpenTelemetry instrumentation library for LangChain that:

  • Automatically instruments LangChain operations without code changes
  • Follows OpenTelemetry best practices and GenAI semantic conventions
  • Provides end-to-end tracing of agent workflows, LLM calls, and tool executions
  • Integrates seamlessly with existing observability infrastructure
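
A minimal sketch of how end-to-end tracing can be tied together, assuming each callback receives a `run_id` and an optional `parent_run_id` (the diff excerpts later in this thread look up spans through `self.span_mapping[run_id]`). The `SpanRecord` and `RunSpanTracker` names are hypothetical stand-ins written in plain Python; a real instrumentor would create OpenTelemetry spans instead of these records.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional
from uuid import UUID, uuid4


@dataclass
class SpanRecord:
    """Hypothetical stand-in for a span, used only for illustration."""
    name: str
    parent_run_id: Optional[UUID] = None
    attributes: Dict[str, Any] = field(default_factory=dict)
    ended: bool = False


class RunSpanTracker:
    """Tracks one record per run_id so child runs can find their parent."""

    def __init__(self) -> None:
        self.span_mapping: Dict[UUID, SpanRecord] = {}

    def start(self, run_id: UUID, name: str,
              parent_run_id: Optional[UUID] = None) -> SpanRecord:
        record = SpanRecord(name=name, parent_run_id=parent_run_id)
        self.span_mapping[run_id] = record
        return record

    def set_attribute(self, run_id: UUID, key: str, value: Any) -> None:
        # Skip unknown runs and None values instead of raising.
        if run_id in self.span_mapping and value is not None:
            self.span_mapping[run_id].attributes[key] = value

    def end(self, run_id: UUID) -> Optional[SpanRecord]:
        # Remove the record so the mapping does not grow without bound.
        record = self.span_mapping.pop(run_id, None)
        if record is not None:
            record.ended = True
        return record


# Usage: an agent run with one nested tool run (names are made up).
tracker = RunSpanTracker()
agent_run, tool_run = uuid4(), uuid4()
tracker.start(agent_run, "invoke_agent planner")
tracker.start(tool_run, "execute_tool search", parent_run_id=agent_run)
tracker.set_attribute(tool_run, "gen_ai.operation.name", "execute_tool")
finished_tool = tracker.end(tool_run)
```

Removing the record in `end` is a design choice in this sketch: it keeps the mapping small, at the cost of making late attribute updates for a finished run a silent no-op.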

This PR contains a LangChain instrumentation solution that is currently tailored to work with AWS products such as CloudWatch and Bedrock. I want to get this solution upstream first and then continuously update it to work universally over time.

Type of change

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

How Has This Been Tested?

  • Manual testing of the core functionality across different scenarios
  • Two new test files that cover the main functionality paths:
    1. Test file 1: test instrumentation works on an agentic application of langchain
    2. Test file 2: test instrumentation works on normal chain usage of langchain

Does This PR Require a Core Repo Change?

  • Yes. - Link to PR:
  • No.

Checklist:

See contributing.md for styleguide, changelog guidelines, and more.

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

@haneric00 haneric00 requested a review from a team as a code owner July 30, 2025 09:46

linux-foundation-easycla bot commented Jul 30, 2025

CLA Missing ID CLA Not Signed

@xrmx
Contributor

xrmx commented Jul 30, 2025

@haneric00 I think you missed that we already have the skeleton for a LangChain implementation. Please get in touch with @wrisa and the other GenAI people so we don't duplicate effort there.

@haneric00 haneric00 closed this Jul 30, 2025
@haneric00 haneric00 reopened this Jul 30, 2025
@haneric00
Author

@haneric00 I think you missed we have already the skeleton for a langchain implementation. Please get in touch with @wrisa and the other genai people so we don't duplicate effort there.

Thanks, Riccardo. I agree with avoiding duplicate effort. I've left a message in the #otel-genai-instrumentation channel to sync with the relevant folks.
cc: @yiyuan-he @mxiamxia

@aabmass
Member

aabmass commented Jul 31, 2025

@haneric00 did you get alignment? Is this ready for review?

@haneric00
Author

@haneric00 did you get alignment? Is this ready for review?

Hi, we will be meeting to discuss alignment today at 2pm PST.

@haneric00
Author

@haneric00 did you get alignment? Is this ready for review?

@aabmass We are aligned: we will focus on agentic workflow spans and they will focus on LLM invocation spans. For the time being, we will wait for their open PRs to be merged, so I am moving mine to draft mode and will rebase after their work is merged.

@haneric00 haneric00 marked this pull request as draft July 31, 2025 21:49
if run_id in self.span_mapping:
    span = self.span_mapping[run_id].span

    _set_span_attribute(span, "gen_ai.agent.tool.input", tool_input)
Contributor

Are these attributes already in the semantic conventions? I do not see them here: https://github.com/open-telemetry/semantic-conventions/blob/main/docs/gen-ai/gen-ai-agent-spans.md

Author

Sorry, these are placeholders until an alternative to the deprecated attribute is ready.

self._handle_error(error, run_id, parent_run_id, **kwargs)


def on_agent_action(self,
Contributor

I see the example is using a LangChain agent, which is not recommended for production applications; please see https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/agents/react/agent.py#L33.
Have you tried LangGraph agents? The on_agent_action and on_agent_finish callbacks are never called when LangGraph agents are used. Do we know what should be done there?

Author

We wanted to keep these callbacks available to cover a wider range of use cases. We have been testing on both LangGraph and LangChain applications. I am a little confused about what the issue is. Should these callback handlers be removed?

8 participants