@phooq phooq commented Dec 17, 2025

Issue #, if available:

  1. The current way of passing args through the NeuronMonitor CRD is problematic. It causes the neuron-monitor daemonset pod to be terminated immediately after launch and then recreated, repeating this cycle a few times before the pod finally reaches a RUNNING state.

    args:
        port: "{{ .Values.neuronMonitor.service.port }}"
        cert-file: "/etc/amazon-cloudwatch-observability-neuron-cert/server.crt"
        key-file: "/etc/amazon-cloudwatch-observability-neuron-cert/server.key"
    
  2. Update the neuron-monitor pod image from version 1.6.0 to 1.7.0. Note that neuron-monitor 1.7.0 has NOT been released yet, so please don't merge this PR until it is released.

  3. Add the neuron-monitor-blocker volume as an additional read-only volume mount in the neuron-monitor pod.

Description of changes:

  1. neuron-monitor daemonset manifest:
    Removed the args field and merged its contents into a single command field that carries all the arguments.

  2. Updated the neuron-monitor version tag from 1.6.0 to 1.7.0

  3. Added the neuron-monitor-blocker volume and its read-only volume mount
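
A rough sketch of how changes 1 and 3 could look in the daemonset container spec. This is an illustration only, not the actual diff: the container name, the entrypoint placeholder, the `--flag=value` syntax, and the mount path are all assumptions, and the source of the volume is not stated in this description, so it is left out.

```yaml
# Illustrative fragment only; names marked "assumed" or "hypothetical" are not from this PR.
containers:
  - name: neuron-monitor                      # assumed container name
    command:
      - "<neuron-monitor-entrypoint>"         # placeholder; the real binary path is not given here
      # Entries of the former args map folded into the command list.
      # The --key=value flag form is an assumption.
      - "--port={{ .Values.neuronMonitor.service.port }}"
      - "--cert-file=/etc/amazon-cloudwatch-observability-neuron-cert/server.crt"
      - "--key-file=/etc/amazon-cloudwatch-observability-neuron-cert/server.key"
    volumeMounts:
      - name: neuron-monitor-blocker
        mountPath: "<blocker-mount-path>"     # hypothetical; not stated in this PR
        readOnly: true
volumes:
  - name: neuron-monitor-blocker
    # Volume source (hostPath, configMap, ...) is not stated in this PR description.
```

With everything in one command list, the container's full invocation is visible in a single field of the manifest instead of being split between command and a separately rendered args map.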

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
