@vsraccubits
Contributor

  • Convert otel-collector from Deployment to DaemonSet for per-node collectors
  • Add hostPort (4317/4318) to expose OTEL endpoints on each node
  • Add tolerations for control-plane/master nodes
  • Reduce resource limits for distributed deployment (512Mi/500m per node)
  • Inject NODE_IP via Kubernetes Downward API into microservices
  • Update OTEL endpoints in values.yaml to use $(NODE_IP) instead of service name

Services updated: budgateway, budmetrics, budprompt, budcluster

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@gemini-code-assist
Contributor

Summary of Changes

Hello @vsraccubits, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request changes how OpenTelemetry (OTEL) telemetry is collected in the Kubernetes environment. The OTEL Collector moves from a Deployment to a DaemonSet, so one collector runs on each node. The collector's endpoints are exposed through host ports, its resource allocations are reduced for per-node operation, and the microservices are configured to send telemetry to the collector on their own node by using the node's IP address.

Highlights

  • OTEL Collector Architecture Change: The OpenTelemetry Collector has been migrated from a Kubernetes Deployment to a DaemonSet, ensuring that a collector instance runs on every node in the cluster.
  • OTEL Endpoint Exposure: Host ports (4317 for gRPC and 4318 for HTTP) have been added to the OTEL Collector DaemonSet, allowing direct access to the collector from the host node.
  • Node IP Injection: The NODE_IP environment variable is now injected into microservices (budgateway, budmetrics, budprompt, budcluster) using the Kubernetes Downward API, providing the host IP address to each pod.
  • OTEL Endpoint Configuration Update: Microservices have been reconfigured to use the injected $(NODE_IP) for their OTEL exporter endpoints, directing telemetry data to the local OTEL Collector instance rather than a service name.
  • Resource Optimization: The default resource limits and requests for the OTEL Collector have been reduced to better suit a distributed, per-node deployment model (e.g., memory limits from 2Gi to 512Mi, CPU from 1000m to 500m).
  • Control Plane Tolerations: Tolerations for control-plane and master nodes have been added to the OTEL Collector DaemonSet, allowing it to schedule pods on these nodes if necessary.
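
The highlights above can be pictured as one manifest. The following is a minimal sketch, not the chart's actual template: the resource name, the `app: otel-collector` label, and the image are assumptions made for illustration, while the ports, tolerations, and limits mirror the values listed in this pull request.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: otel-collector                # assumed name
spec:
  selector:
    matchLabels:
      app: otel-collector             # assumed label
  template:
    metadata:
      labels:
        app: otel-collector
    spec:
      tolerations:
        # Let the collector schedule onto control-plane/master nodes.
        - key: node-role.kubernetes.io/control-plane
          operator: Exists
          effect: NoSchedule
        - key: node-role.kubernetes.io/master
          operator: Exists
          effect: NoSchedule
      containers:
        - name: otel-collector
          image: otel/opentelemetry-collector-contrib:latest   # placeholder image
          ports:
            # hostPort binds each port on the node's own IP address.
            - name: otlp-grpc
              containerPort: 4317
              hostPort: 4317
              protocol: TCP
            - name: otlp-http
              containerPort: 4318
              hostPort: 4318
              protocol: TCP
          resources:
            limits:
              memory: 512Mi           # per node, down from 2Gi
              cpu: 500m               # per node, down from 1000m
```

Because a DaemonSet runs one pod per node, total collector capacity scales with node count, which is why the per-pod limits can be smaller than for a single shared Deployment.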

@gemini-code-assist (bot) left a comment

Code Review

This pull request correctly migrates the OpenTelemetry collector from a Deployment to a DaemonSet, which is a good architectural improvement for collecting telemetry on a per-node basis. The changes to inject the node IP into application pods and update the OTEL endpoint accordingly are implemented correctly across most services. My review includes a few points: a high-severity security concern regarding the use of hostPort, a medium-severity suggestion to remove a now-redundant Kubernetes Service, and a medium-severity comment about an incomplete configuration for the budmetrics service.

Comment on lines 137 to +142
  containerPort: 4317
  hostPort: 4317
  protocol: TCP
- name: otlp-http
  containerPort: 4318
  hostPort: 4318

Severity: high (security)

Using hostPort exposes the OpenTelemetry collector ports on the node's IP address. If the nodes have publicly routable IP addresses, this could expose the collector to the internet, which can be a security risk. It is highly recommended to use NetworkPolicies to restrict access to these ports, allowing traffic only from within the cluster or trusted sources.
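
One way to apply this recommendation is sketched below. It is an illustrative NetworkPolicy, not part of this pull request: the policy name and the `app: otel-collector` pod label are assumptions and must match the labels actually set on the collector pods. Whether and how a policy is enforced for traffic arriving through a hostPort depends on the CNI plugin, so treat this as a starting point to verify in the target cluster.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: otel-collector-ingress        # hypothetical name
spec:
  podSelector:
    matchLabels:
      app: otel-collector             # assumed collector pod label
  policyTypes:
    - Ingress
  ingress:
    - from:
        # An empty namespaceSelector matches every namespace,
        # so only pods inside the cluster are allowed.
        - namespaceSelector: {}
      ports:
        - protocol: TCP
          port: 4317                  # OTLP gRPC
        - protocol: TCP
          port: 4318                  # OTLP HTTP
```

If nodes have publicly routable addresses, a node-level firewall or cloud security group rule on ports 4317 and 4318 is a useful second layer, since it does not depend on CNI behavior.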

  ---
  apiVersion: apps/v1
- kind: Deployment
+ kind: DaemonSet

Severity: medium

With the collector now running as a DaemonSet and services configured to send telemetry to the collector on the same node via $(NODE_IP), the ClusterIP service for otel-collector defined later in this file (lines 178-201) appears to be redundant. If this service is no longer used by any component, consider removing it to simplify the configuration.

Comment on lines +46 to +49
- name: NODE_IP
  valueFrom:
    fieldRef:
      fieldPath: status.hostIP

Severity: medium

The NODE_IP environment variable is being injected into the budmetrics container, but it doesn't appear to be used. The values.yaml file does not define an OTEL_..._ENDPOINT variable for the budmetrics service that would consume this NODE_IP. Please verify if budmetrics is intended to send telemetry to the OpenTelemetry collector and update values.yaml accordingly. If it's not needed, this environment variable injection should be removed.
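
For reference, a container that does consume the injected value might be wired roughly as follows. This is a sketch under assumptions: `OTEL_EXPORTER_OTLP_ENDPOINT` is the standard OpenTelemetry SDK variable and is used here only as an example, since the variable budmetrics would actually read is not stated in this review, and the choice of port 4317 (gRPC) over 4318 (HTTP) is likewise illustrative.

```yaml
env:
  # NODE_IP must appear before any variable that references it:
  # Kubernetes expands $(VAR) only from entries defined earlier in the list.
  - name: NODE_IP
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP      # IP of the node running this pod
  - name: OTEL_EXPORTER_OTLP_ENDPOINT # assumed variable name
    value: "http://$(NODE_IP):4317"   # collector on the same node, gRPC port
```

If the ordering is reversed, the reference is not expanded and the container receives the literal string `$(NODE_IP)` in its endpoint.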

@vsraccubits requested a review from sinanmohd on January 5, 2026, 06:41