[llm-d] Keep working #914
Changes from all commits: 88cd5e8, 14105ff, c57b55a, dd68002, 3175abb
| @@ -0,0 +1,48 @@ | ||
| :orphan: | ||
|
|
||
| .. | ||
| _Auto-generated file, do not edit manually ... | ||
| _Toolbox generate command: repo generate_toolbox_rst_documentation | ||
| _ Source component: Cluster.capture_servicemonitor_metrics | ||
|
|
||
|
|
||
| cluster capture_servicemonitor_metrics | ||
| ====================================== | ||
|
|
||
| Captures ServiceMonitor or PodMonitor YAML and status for a given service | ||
|
|
||
| Captures the ServiceMonitor/PodMonitor configuration and status information for | ||
| a specific service in a namespace, including related service/pod and | ||
| endpoints information for troubleshooting monitoring setup. | ||
|
|
||
|
|
||
| Parameters | ||
| ---------- | ||
|
|
||
|
|
||
| ``service_name`` | ||
|
|
||
| * Name of the service to capture ServiceMonitor/PodMonitor metrics for | ||
|
|
||
|
|
||
| ``namespace`` | ||
|
|
||
| * Namespace where the service and ServiceMonitor/PodMonitor are located (empty string auto-detects current namespace) | ||
|
|
||
|
|
||
| ``capture_frequency`` | ||
|
|
||
| * How often to capture metrics in seconds (default: 60) | ||
|
|
||
| * default value: ``60`` | ||
|
|
||
|
|
||
| ``is_podmonitor`` | ||
|
|
||
| * Whether to use PodMonitor instead of ServiceMonitor (default: False) | ||
|
|
||
|
|
||
| ``finalize`` | ||
|
|
||
| * Whether to finalize (capture logs and delete) an existing pod instead of creating a new one (default: False) | ||
|
|
||
|
|
@@ -547,3 +547,23 @@ def enable_userworkload_monitoring(self, namespaces: list = []): | |||||
| """ | ||||||
|
|
||||||
| return RunAnsibleRole(locals()) | ||||||
|
|
||||||
| @AnsibleRole("cluster_capture_servicemonitor_metrics") | ||||||
| @AnsibleMappedParams | ||||||
| def capture_servicemonitor_metrics(self, service_name, namespace="", capture_frequency=60, is_podmonitor=False, finalize=False): | ||||||
| """ | ||||||
| Captures ServiceMonitor or PodMonitor YAML and status for a given service | ||||||
|
|
||||||
| Captures the ServiceMonitor/PodMonitor configuration and status information for | ||||||
| a specific service in a namespace, including related service/pod and | ||||||
| endpoints information for troubleshooting monitoring setup. | ||||||
|
|
||||||
| Args: | ||||||
| service_name: Name of the service to capture ServiceMonitor/PodMonitor metrics for | ||||||
| namespace: Namespace where the service and ServiceMonitor/PodMonitor are located (empty string auto-detects current namespace) | ||||||
| capture_frequency: How often to capture metrics in seconds (default: 60) | ||||||
|
Review comment: the docstring originally stated "default: 15", but the actual default is 60.

Proposed fix:
- capture_frequency: How often to capture metrics in seconds (default: 15)
+ capture_frequency: How often to capture metrics in seconds (default: 60)
||||||
| is_podmonitor: Whether to use PodMonitor instead of ServiceMonitor (default: False) | ||||||
| finalize: Whether to finalize (capture logs and delete) an existing pod instead of creating a new one (default: False) | ||||||
| """ | ||||||
|
|
||||||
| return RunAnsibleRole(locals()) | ||||||
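The `@AnsibleMappedParams` decorator (part of the project's toolbox framework, not shown in this diff) maps each Python keyword argument onto a role-prefixed Ansible variable, which is how the defaults file below gets names like `cluster_capture_servicemonitor_metrics_capture_frequency`. A minimal sketch of that naming convention, for illustration only — the real decorator does considerably more:

```python
# Hypothetical approximation of the parameter-to-variable mapping implied by
# the generated defaults file; names mirror the diff, the helper is invented.
def map_params_to_ansible_vars(role_name, **params):
    """Prefix every Python parameter with the Ansible role name."""
    return {f"{role_name}_{key}": value for key, value in params.items()}

ansible_vars = map_params_to_ansible_vars(
    "cluster_capture_servicemonitor_metrics",
    service_name="my-service",
    namespace="",
    capture_frequency=60,
    is_podmonitor=False,
    finalize=False,
)
# ansible_vars now holds e.g.
# cluster_capture_servicemonitor_metrics_capture_frequency = 60
```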
| @@ -0,0 +1,24 @@ | ||
| # Auto-generated file, do not edit manually ... | ||
| # Toolbox generate command: repo generate_ansible_default_settings | ||
| # Source component: Cluster.capture_servicemonitor_metrics | ||
|
|
||
| # Parameters | ||
| # Name of the service to capture ServiceMonitor/PodMonitor metrics for | ||
| # Mandatory value | ||
| cluster_capture_servicemonitor_metrics_service_name: | ||
|
|
||
| # Namespace where the service and ServiceMonitor/PodMonitor are located (empty string auto-detects current namespace) | ||
| cluster_capture_servicemonitor_metrics_namespace: | ||
|
|
||
| # How often to capture metrics in seconds (default: 60) | ||
| cluster_capture_servicemonitor_metrics_capture_frequency: 60 | ||
|
Comment on lines +13 to +14:

Review comment: documentation mismatch — the comment originally said "default: 15", but the actual default is 60.

Proposed fix:
-# How often to capture metrics in seconds (default: 15)
+# How often to capture metrics in seconds (default: 60)
 cluster_capture_servicemonitor_metrics_capture_frequency: 60
||
|
|
||
| # Whether to use PodMonitor instead of ServiceMonitor (default: False) | ||
| cluster_capture_servicemonitor_metrics_is_podmonitor: false | ||
|
|
||
| # Whether to finalize (capture logs and delete) an existing pod instead of creating a new one (default: False) | ||
| cluster_capture_servicemonitor_metrics_finalize: false | ||
|
|
||
| # Default Ansible variables | ||
| # Default value for ansible_os_family to ensure role remains standalone | ||
| ansible_os_family: Linux | ||
| @@ -0,0 +1,98 @@ | ||
| --- | ||
| - name: Get current namespace if not specified | ||
| command: oc project -q | ||
| register: current_namespace_result | ||
| when: cluster_capture_servicemonitor_metrics_namespace == "" | ||
|
|
||
| - name: Set the target namespace | ||
| set_fact: | ||
| target_namespace: "{{ current_namespace_result.stdout if cluster_capture_servicemonitor_metrics_namespace == '' else cluster_capture_servicemonitor_metrics_namespace }}" | ||
|
|
||
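The two tasks above implement a simple fallback: an empty `namespace` parameter means "use whatever `oc project -q` reports". The same logic as a small Python sketch (illustrative only; the role does this with `set_fact` and a Jinja conditional):

```python
def resolve_namespace(requested: str, current: str) -> str:
    """Fall back to the current namespace when none was requested."""
    return current if requested == "" else requested
```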
| - name: Create capture directory | ||
| file: | ||
| path: "{{ artifact_extra_logs_dir }}/artifacts" | ||
| state: directory | ||
| mode: '0755' | ||
|
|
||
| - name: "[Finalize mode] capture logs and delete existing pod" | ||
| when: cluster_capture_servicemonitor_metrics_finalize | ||
| block: | ||
| - name: Check if pod exists for finalization | ||
| shell: | | ||
| oc get pod topsail-metrics-capture-{{ cluster_capture_servicemonitor_metrics_service_name }} -n {{ target_namespace }} --no-headers -o name | ||
| register: pod_exists_check | ||
|
|
||
| - name: Capture pod logs | ||
| shell: | | ||
| oc logs topsail-metrics-capture-{{ cluster_capture_servicemonitor_metrics_service_name }} -n {{ target_namespace }} > "{{ artifact_extra_logs_dir }}/artifacts/metrics_capture_logs.txt" | ||
|
Comment on lines +20 to +27:

Review comment: make finalize mode idempotent when the capture pod never existed — the probe should not fail the play, and the log capture should only run when the pod was found.

Suggested fix:
 - name: Check if pod exists for finalization
   shell: |
     oc get pod topsail-metrics-capture-{{ cluster_capture_servicemonitor_metrics_service_name }} -n {{ target_namespace }} --no-headers -o name
   register: pod_exists_check
+  failed_when: false
+  changed_when: false

 - name: Capture pod logs
   shell: |
     oc logs topsail-metrics-capture-{{ cluster_capture_servicemonitor_metrics_service_name }} -n {{ target_namespace }} > "{{ artifact_extra_logs_dir }}/artifacts/metrics_capture_logs.txt"
+  when: pod_exists_check.rc == 0
||
|
|
||
| - name: Delete metrics capture pod | ||
| shell: | | ||
| oc delete pod topsail-metrics-capture-{{ cluster_capture_servicemonitor_metrics_service_name }} -n {{ target_namespace }} --grace-period=0 --ignore-not-found | ||
|
|
||
| - name: "[Finalize mode] capture logs and delete existing pod" | ||
| when: cluster_capture_servicemonitor_metrics_finalize | ||
| meta: end_play | ||
|
|
||
| # Normal deployment mode: discover resources and create pod | ||
| - name: Include ServiceMonitor tasks | ||
| include_tasks: servicemonitor.yml | ||
| when: not cluster_capture_servicemonitor_metrics_is_podmonitor | ||
|
|
||
| - name: Include PodMonitor tasks | ||
| include_tasks: podmonitor.yml | ||
| when: cluster_capture_servicemonitor_metrics_is_podmonitor | ||
|
|
||
| # Ensure auth_secret_name is always defined with proper structure (fallback for edge cases) | ||
| - name: Set default auth secret name if not defined | ||
| set_fact: | ||
| auth_secret_name: "{% if auth_secret_name_cmd is defined %}{{ auth_secret_name_cmd.stdout }}{% endif %}" | ||
|
Comment on lines +46 to +49:

Review comment: this normalization step clears PodMonitor auth secrets. The ServiceMonitor path registers `auth_secret_name_cmd`, but the PodMonitor path registers `auth_secret_name`; when `auth_secret_name_cmd` is undefined, the fallback overwrites the PodMonitor result with an empty string.

Suggested fix:
-# Ensure auth_secret_name is always defined with proper structure (fallback for edge cases)
-- name: Set default auth secret name if not defined
-  set_fact:
-    auth_secret_name: "{% if auth_secret_name_cmd is defined %}{{ auth_secret_name_cmd.stdout }}{% endif %}"
+# Normalize the extracted auth secret name from either monitor path
+- name: Normalize auth secret name
+  set_fact:
+    auth_secret_name: >-
+      {{ auth_secret_name_cmd.stdout if auth_secret_name_cmd is defined
+         else (auth_secret_name.stdout if auth_secret_name is defined else '') }}
||
|
|
||
| # Common tasks for deployment mode | ||
| - name: Read metrics URL | ||
| shell: cat "{{ artifact_extra_logs_dir }}/artifacts/metrics_url.txt" | ||
| register: metrics_url_content | ||
|
|
||
| - name: Check if auth secret exists | ||
| shell: | | ||
| oc get secret {{ auth_secret_name }} -n {{ target_namespace }} --no-headers -o name | ||
| register: auth_secret_exists | ||
| when: auth_secret_name != "" | ||
|
Comment on lines +56 to +60:

Review comment: the auth-secret probe is still fatal. The next task branches on the probe's result, so a missing secret should not abort the play; mark the check as never-failing.

Suggested fix:
 - name: Check if auth secret exists
   shell: |
     oc get secret {{ auth_secret_name }} -n {{ target_namespace }} --no-headers -o name
   register: auth_secret_exists
   when: auth_secret_name != ""
+  failed_when: false
+  changed_when: false

Also applies to: lines 62-80.
||
|
|
||
| - name: Save authentication info | ||
| shell: | | ||
| {% if cluster_capture_servicemonitor_metrics_is_podmonitor %} | ||
| echo "PodMonitor: {{ cluster_capture_servicemonitor_metrics_service_name }}" > "{{ artifact_extra_logs_dir }}/artifacts/auth_info.txt" | ||
| {% else %} | ||
| echo "ServiceMonitor: {{ cluster_capture_servicemonitor_metrics_service_name }}" > "{{ artifact_extra_logs_dir }}/artifacts/auth_info.txt" | ||
| {% endif %} | ||
| echo "Auth secret name: {{ auth_secret_name | default('none') }}" >> "{{ artifact_extra_logs_dir }}/artifacts/auth_info.txt" | ||
| if [ -n "{{ auth_secret_name }}" ]; then | ||
| {% if auth_secret_name != '' and auth_secret_exists.rc == 0 %} | ||
| echo "Secret exists: yes" >> "{{ artifact_extra_logs_dir }}/artifacts/auth_info.txt" | ||
| echo "Secret will be mounted at: /var/run/secrets/auth/token" >> "{{ artifact_extra_logs_dir }}/artifacts/auth_info.txt" | ||
| {% else %} | ||
| echo "Secret exists: no" >> "{{ artifact_extra_logs_dir }}/artifacts/auth_info.txt" | ||
| echo "WARNING: Secret not found!" >> "{{ artifact_extra_logs_dir }}/artifacts/auth_info.txt" | ||
| {% endif %} | ||
| else | ||
| echo "No authentication required" >> "{{ artifact_extra_logs_dir }}/artifacts/auth_info.txt" | ||
| fi | ||
|
|
||
| - name: Create metrics capture pod manifest | ||
| template: | ||
| src: metrics_capture_pod.yaml.j2 | ||
| dest: "{{ artifact_extra_logs_dir }}/artifacts/metrics_capture_pod.yaml" | ||
| mode: '0644' | ||
| vars: | ||
| metrics_url: "{{ metrics_url_content.stdout }}" | ||
| auth_secret_name: "{{ auth_secret_name | default('') }}" | ||
| capture_frequency: "{{ cluster_capture_servicemonitor_metrics_capture_frequency }}" | ||
|
|
||
| - name: Create metrics capture pod | ||
| shell: | | ||
| oc create -f "{{ artifact_extra_logs_dir }}/artifacts/metrics_capture_pod.yaml" | ||
|
|
||
| - name: Wait for pod to start | ||
| shell: | | ||
| oc wait --for=condition=Ready pod/topsail-metrics-capture-{{ cluster_capture_servicemonitor_metrics_service_name }} -n {{ target_namespace }} --timeout=60s | ||
| @@ -0,0 +1,75 @@ | ||
| --- | ||
| # PodMonitor-specific tasks | ||
|
|
||
| - name: Capture PodMonitor YAML | ||
| shell: | | ||
| oc get podmonitor {{ cluster_capture_servicemonitor_metrics_service_name }} -n {{ target_namespace }} -oyaml > "{{ artifact_extra_logs_dir }}/artifacts/podmonitor.yaml" | ||
|
|
||
| - name: Get PodMonitor status | ||
| shell: | | ||
| oc get podmonitor {{ cluster_capture_servicemonitor_metrics_service_name }} -n {{ target_namespace }} > "{{ artifact_extra_logs_dir }}/artifacts/podmonitor.status" | ||
|
|
||
| - name: Capture pod targets by PodMonitor selector | ||
| shell: | | ||
| oc get pod -l "app.kubernetes.io/component in (llminferenceservice-workload,llminferenceservice-workload-prefill,llminferenceservice-workload-worker,llminferenceservice-workload-leader,llminferenceservice-workload-leader-prefill,llminferenceservice-workload-worker-prefill),app.kubernetes.io/part-of=llminferenceservice" -n {{ target_namespace }} > "{{ artifact_extra_logs_dir }}/artifacts/target_pods.status" | ||
|
|
||
| - name: Capture pod targets by PodMonitor selector YAML | ||
| shell: | | ||
| oc get pod -l "app.kubernetes.io/component in (llminferenceservice-workload,llminferenceservice-workload-prefill,llminferenceservice-workload-worker,llminferenceservice-workload-leader,llminferenceservice-workload-leader-prefill,llminferenceservice-workload-worker-prefill),app.kubernetes.io/part-of=llminferenceservice" -n {{ target_namespace }} -oyaml > "{{ artifact_extra_logs_dir }}/artifacts/target_pods.yaml" | ||
|
|
||
| - name: Get all target pod IPs and names | ||
| shell: | | ||
| oc get pod -l "app.kubernetes.io/component in (llminferenceservice-workload,llminferenceservice-workload-prefill,llminferenceservice-workload-worker,llminferenceservice-workload-leader,llminferenceservice-workload-leader-prefill,llminferenceservice-workload-worker-prefill),app.kubernetes.io/part-of=llminferenceservice" -n {{ target_namespace }} --no-headers -o custom-columns=":metadata.name,:status.podIP" | ||
| register: target_pods_info | ||
|
|
||
| - name: Extract scheme from PodMonitor | ||
| shell: | | ||
| oc get podmonitor {{ cluster_capture_servicemonitor_metrics_service_name }} -n {{ target_namespace }} -o jsonpath='{.spec.podMetricsEndpoints[0].scheme}' 2>/dev/null || echo "http" | ||
| register: metrics_scheme | ||
|
|
||
| - name: Extract target port from PodMonitor | ||
| shell: | | ||
| oc get podmonitor {{ cluster_capture_servicemonitor_metrics_service_name }} -n {{ target_namespace }} -o jsonpath='{.spec.podMetricsEndpoints[0].targetPort}' 2>/dev/null || echo "9090" | ||
| register: metrics_port | ||
|
|
||
| - name: Build metrics URLs for all matching pods | ||
| shell: | | ||
| set -o pipefail; | ||
|
|
||
| SCHEME="{{ metrics_scheme.stdout | trim | default('http') }}" | ||
| PORT="{{ metrics_port.stdout | trim | default('9090') }}" | ||
|
|
||
| # Count total pods and initialize files | ||
| TOTAL_PODS=$(echo "{{ target_pods_info.stdout }}" | wc -l) | ||
|
|
||
| # Initialize files | ||
| echo "" > "{{ artifact_extra_logs_dir }}/artifacts/metrics_url.txt" | ||
| echo "PodMonitor target pods (found: $TOTAL_PODS):" > "{{ artifact_extra_logs_dir }}/artifacts/metrics_info.txt" | ||
| echo "Scheme: $SCHEME" >> "{{ artifact_extra_logs_dir }}/artifacts/metrics_info.txt" | ||
| echo "Port: $PORT" >> "{{ artifact_extra_logs_dir }}/artifacts/metrics_info.txt" | ||
| echo "" >> "{{ artifact_extra_logs_dir }}/artifacts/metrics_info.txt" | ||
|
|
||
| # Build URL list for each pod | ||
| while IFS= read -r line; do | ||
| if [ -n "$line" ]; then | ||
| POD_NAME=$(echo "$line" | awk '{print $1}') | ||
| POD_IP=$(echo "$line" | awk '{print $2}') | ||
| if [ -n "$POD_IP" ] && [ "$POD_IP" != "<none>" ]; then | ||
| URL="$SCHEME://$POD_IP:$PORT/metrics" | ||
| echo "$URL" >> "{{ artifact_extra_logs_dir }}/artifacts/metrics_url.txt" | ||
| echo "Pod: $POD_NAME ($POD_IP) -> $URL" >> "{{ artifact_extra_logs_dir }}/artifacts/metrics_info.txt" | ||
| else | ||
| echo "Pod: $POD_NAME (no IP available)" >> "{{ artifact_extra_logs_dir }}/artifacts/metrics_info.txt" | ||
| fi | ||
| fi | ||
| done <<< "{{ target_pods_info.stdout }}" | ||
|
|
||
| # Count valid URLs and add summary | ||
| VALID_URLS=$(grep -c "^http" "{{ artifact_extra_logs_dir }}/artifacts/metrics_url.txt" || echo "0") | ||
| echo "" >> "{{ artifact_extra_logs_dir }}/artifacts/metrics_info.txt" | ||
| echo "Total metrics URLs: $VALID_URLS" >> "{{ artifact_extra_logs_dir }}/artifacts/metrics_info.txt" | ||
|
|
||
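The shell loop above turns `oc get pod` custom-column output ("name IP" pairs) into one metrics URL per pod, skipping pods that have no IP yet. A hedged Python sketch of the same transformation (the default port `9090` matches the task's fallback; everything else is illustrative):

```python
def build_pod_metrics_urls(pod_lines: str, scheme: str = "http", port: str = "9090"):
    """Turn 'name ip' lines into per-pod metrics URLs, skipping pods without an IP."""
    urls = []
    for line in pod_lines.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # blank or malformed line
        _name, ip = fields[0], fields[1]
        if ip and ip != "<none>":  # oc prints <none> for pods with no IP assigned
            urls.append(f"{scheme}://{ip}:{port}/metrics")
    return urls
```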
| - name: Extract authorization secret name from PodMonitor | ||
| shell: | | ||
| oc get podmonitor {{ cluster_capture_servicemonitor_metrics_service_name }} -n {{ target_namespace }} -o jsonpath='{.spec.podMetricsEndpoints[0].authorization.credentials.name}' 2>/dev/null || echo "" | ||
| register: auth_secret_name |
| @@ -0,0 +1,76 @@ | ||
| --- | ||
| # ServiceMonitor-specific tasks | ||
|
|
||
| - name: Capture ServiceMonitor YAML | ||
| shell: | | ||
| oc get servicemonitor {{ cluster_capture_servicemonitor_metrics_service_name }} -n {{ target_namespace }} -oyaml > "{{ artifact_extra_logs_dir }}/artifacts/servicemonitor.yaml" | ||
|
|
||
| - name: Get ServiceMonitor status | ||
| shell: | | ||
| oc get servicemonitor {{ cluster_capture_servicemonitor_metrics_service_name }} -n {{ target_namespace }} > "{{ artifact_extra_logs_dir }}/artifacts/servicemonitor.status" | ||
|
|
||
| - name: Capture service target by ServiceMonitor selector | ||
| shell: | | ||
| oc get service -l "app.kubernetes.io/component=llminferenceservice-router-scheduler,app.kubernetes.io/part-of=llminferenceservice" -n {{ target_namespace }} > "{{ artifact_extra_logs_dir }}/artifacts/target_service.status" | ||
|
|
||
| - name: Capture service target by ServiceMonitor selector YAML | ||
| shell: | | ||
| oc get service -l "app.kubernetes.io/component=llminferenceservice-router-scheduler,app.kubernetes.io/part-of=llminferenceservice" -n {{ target_namespace }} -oyaml > "{{ artifact_extra_logs_dir }}/artifacts/target_service.yaml" | ||
|
|
||
| - name: Get target service name | ||
| shell: | | ||
| set -o pipefail; | ||
| oc get service -l "app.kubernetes.io/component=llminferenceservice-router-scheduler,app.kubernetes.io/part-of=llminferenceservice" -n {{ target_namespace }} --no-headers -o custom-columns=":metadata.name" | head -1 | ||
| register: target_service_name | ||
|
|
||
| - name: Extract port name from ServiceMonitor | ||
| shell: | | ||
| oc get servicemonitor {{ cluster_capture_servicemonitor_metrics_service_name }} -n {{ target_namespace }} -o jsonpath='{.spec.endpoints[0].port}' 2>/dev/null || echo "metrics" | ||
| register: metrics_port_name | ||
|
|
||
| - name: Get port number from Service by name | ||
| shell: | | ||
| SERVICE_NAME='{{ target_service_name.stdout | trim }}' | ||
| PORT_NAME='{{ metrics_port_name.stdout | trim | default("metrics") }}' | ||
| oc get service "$SERVICE_NAME" -n {{ target_namespace }} -o jsonpath="{.spec.ports[?(@.name=='$PORT_NAME')].port}" 2>/dev/null || echo "" | ||
| register: named_port_result | ||
|
|
||
| - name: Get first port as fallback | ||
| shell: | | ||
| SERVICE_NAME='{{ target_service_name.stdout | trim }}' | ||
| oc get service "$SERVICE_NAME" -n {{ target_namespace }} -o jsonpath='{.spec.ports[0].port}' 2>/dev/null || echo "9090" | ||
| register: first_port_result | ||
| when: named_port_result.stdout == "" | ||
|
|
||
| - name: Set final port number | ||
| set_fact: | ||
| final_port: "{{ named_port_result.stdout if named_port_result.stdout != '' else first_port_result.stdout | default('9090') }}" | ||
|
|
||
| - name: Determine scheme from port | ||
| set_fact: | ||
| final_scheme: >- | ||
| {{ | ||
| 'https' if ( | ||
| final_port in ['443', '8443', '6443'] or | ||
| (metrics_port_name.stdout | trim | default('metrics')) is match('.*(https|secure|tls).*') | ||
| ) else 'http' | ||
| }} | ||
|
|
||
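The scheme heuristic above picks `https` either for well-known TLS ports or when the endpoint's port name hints at TLS. The same decision as a standalone Python sketch (mirrors the Jinja expression; port list and regex are taken from the task):

```python
import re

def pick_scheme(port: str, port_name: str = "metrics") -> str:
    """Guess https for well-known TLS ports or TLS-flavored port names."""
    if port in ("443", "8443", "6443"):
        return "https"
    if re.search(r"https|secure|tls", port_name):
        return "https"
    return "http"
```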
| - name: Build metrics URL for ServiceMonitor | ||
| shell: | | ||
| SERVICE_NAME='{{ target_service_name.stdout | trim }}' | ||
| PORT_NAME='{{ metrics_port_name.stdout | trim | default("metrics") }}' | ||
| PORT_NUMBER='{{ final_port }}' | ||
| SCHEME='{{ final_scheme }}' | ||
|
|
||
| echo "$SCHEME://$SERVICE_NAME.{{ target_namespace }}.svc:$PORT_NUMBER/metrics" > "{{ artifact_extra_logs_dir }}/artifacts/metrics_url.txt" | ||
| echo "Service: $SERVICE_NAME" > "{{ artifact_extra_logs_dir }}/artifacts/metrics_info.txt" | ||
| echo "Port name: $PORT_NAME" >> "{{ artifact_extra_logs_dir }}/artifacts/metrics_info.txt" | ||
| echo "Port number: $PORT_NUMBER" >> "{{ artifact_extra_logs_dir }}/artifacts/metrics_info.txt" | ||
| echo "Scheme: $SCHEME" >> "{{ artifact_extra_logs_dir }}/artifacts/metrics_info.txt" | ||
| echo "URL: $SCHEME://$SERVICE_NAME.{{ target_namespace }}.svc:$PORT_NUMBER/metrics" >> "{{ artifact_extra_logs_dir }}/artifacts/metrics_info.txt" | ||
|
|
||
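Unlike the PodMonitor path, which targets pod IPs directly, the ServiceMonitor path builds a single cluster-internal DNS URL of the form `scheme://service.namespace.svc:port/metrics`. A minimal sketch of that composition (illustrative helper, not part of the role):

```python
def build_service_metrics_url(service: str, namespace: str, port: str,
                              scheme: str = "http") -> str:
    """Compose the cluster-internal service DNS metrics URL."""
    return f"{scheme}://{service}.{namespace}.svc:{port}/metrics"
```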
| - name: Extract authorization secret name from ServiceMonitor | ||
| shell: | | ||
| oc get servicemonitor {{ cluster_capture_servicemonitor_metrics_service_name }} -n {{ target_namespace }} -o jsonpath='{.spec.endpoints[0].authorization.credentials.name}' 2>/dev/null || echo "" | ||
| register: auth_secret_name_cmd |
Review comment: same documentation mismatch — "default: 15" where the actual value is 60. This inconsistency originates from the docstring in projects/cluster/toolbox/cluster.py line 564; fixing the source docstring corrects both this RST file and the Ansible defaults file upon regeneration.