Please provide an in-depth description of the question you have:
What do you think about this question?:
Inside a pod that uses a Huawei Ascend910B4 vNPU instance, running npu-smi info fails with the following error: "DrvMngGetConsoleLogLevel failed. (g_conLogLevel=3)
dcmi module initialize failed. ret is -8005".
(base) root@root:/data1/yaml/npu-test# kubectl get pod -A
NAMESPACE     NAME                                                     READY   STATUS                   RESTARTS         AGE
default       alertmanager-prometheus-kube-prometheus-alertmanager-0   0/2     Init:ErrImagePull        0                3d23h
default       npu-3                                                    1/1     Running                  0                9s
default       prometheus-grafana-6b88b59b7c-2lghj                      0/3     ContainerStatusUnknown   0                5d12h
default       prometheus-grafana-6b88b59b7c-q8qlw                      0/3     ImagePullBackOff         0                3d23h
default       prometheus-kube-prometheus-operator-745c84d57d-frwwp     0/1     ContainerStatusUnknown   0                5d12h
default       prometheus-kube-prometheus-operator-745c84d57d-lpwft     0/1     ImagePullBackOff         0                3d23h
default       prometheus-kube-state-metrics-596bc6cb7c-pvc68           0/1     ContainerStatusUnknown   0                5d12h
default       prometheus-kube-state-metrics-596bc6cb7c-zwhrh           0/1     ImagePullBackOff         0                3d23h
default       prometheus-prometheus-kube-prometheus-prometheus-0       0/2     Init:ImagePullBackOff    0                3d23h
default       prometheus-prometheus-node-exporter-rzxbw                0/1     ImagePullBackOff         0                3d23h
kube-system   calico-kube-controllers-658d97c59c-mpz4z                 1/1     Running                  2 (56d ago)      76d
kube-system   calico-node-gwmz6                                        1/1     Running                  1 (75d ago)      76d
kube-system   coredns-5884d58d84-82vwj                                 1/1     Running                  0                71d
kube-system   coredns-5884d58d84-zmn6k                                 1/1     Running                  0                71d
kube-system   etcd-root                                                1/1     Running                  2 (75d ago)      76d
kube-system   hami-ascend-device-plugin-hn9dp                          1/1     Running                  0                3d16h
kube-system   hami-scheduler-5654fd7b7-jmbh9                           2/2     Running                  0                9d
kube-system   kube-apiserver-root                                      1/1     Running                  32 (75d ago)     76d
kube-system   kube-controller-manager-root                             1/1     Running                  1495 (39d ago)   76d
kube-system   kube-proxy-x9p22                                         1/1     Running                  1 (75d ago)      76d
kube-system   kube-scheduler-root                                      1/1     Running                  1489 (39d ago)   76d
(base) root@root:/data1/yaml/npu-test# kubectl -n default exec -it npu-3 -- bash
HwHiAiUser@npu-3:~$ npu-smi info
DrvMngGetConsoleLogLevel failed. (g_conLogLevel=3)
dcmi module initialize failed. ret is -8005
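A quick way to narrow this down might be to check which Ascend device nodes are actually visible inside the container. The sketch below is only an illustration: the function name check_ascend_devs is hypothetical, and the node names (davinci_manager, devmm_svm, hisi_hdc, davinciN, vdavinciN) are assumptions based on common Ascend driver layouts, so they may differ on a given driver version.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report which Ascend device nodes exist under a /dev-like directory.
# All node names here are assumptions about the usual Ascend driver layout.
check_ascend_devs() {
  local dev_dir="${1:-/dev}"
  local missing=0
  local name path

  # Management nodes that are commonly mounted alongside the NPU itself (assumed).
  for name in davinci_manager devmm_svm hisi_hdc; do
    if [ -e "$dev_dir/$name" ]; then
      echo "found   $dev_dir/$name"
    else
      echo "missing $dev_dir/$name"
      missing=1
    fi
  done

  # Whole NPUs usually appear as davinciN, vNPU slices as vdavinciN (assumed).
  local found_chip=0
  for path in "$dev_dir"/davinci[0-9]* "$dev_dir"/vdavinci[0-9]*; do
    if [ -e "$path" ]; then
      echo "found   $path"
      found_chip=1
    fi
  done
  if [ "$found_chip" -eq 0 ]; then
    echo "missing $dev_dir/davinciN or $dev_dir/vdavinciN"
    missing=1
  fi

  # Exit status 0 when everything expected is present, 1 otherwise.
  return "$missing"
}
```

To run it inside the pod, the function definition can be passed through kubectl exec, for example: kubectl -n default exec npu-3 -- bash -c "$(declare -f check_ascend_devs); check_ascend_devs". A "missing" line would suggest a mount problem rather than a scheduling problem, but that is only a hint, not a diagnosis.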
Could this be caused by the following? The node reports 32 allocatable huawei.com/Ascend910B4 devices:
(base) root@root:/data1/yaml# kubectl get node root -o json | jq -r '.status.allocatable | to_entries[] | select(.key|contains("huawei.com"))'
{
"key": "huawei.com/Ascend910B4",
"value": "32"
}
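To put that number in context, it may help to compare what the node advertises with what the npu-3 pod actually requested. The following is a minimal sketch that reuses the jq filter from the command above; the function name huawei_resources is hypothetical, and the assumption that the NPU request lives in the first container's limits may not match the actual pod spec.

```shell
# Hypothetical helper: read a JSON object of resource quantities on stdin and
# print every entry whose key contains "huawei.com" as key=value.
huawei_resources() {
  jq -r 'to_entries[] | select(.key | contains("huawei.com")) | "\(.key)=\(.value)"'
}

# Example usage (assumes node "root" and pod "npu-3" as in this report):
#   kubectl get node root -o json | jq '.status.capacity' | huawei_resources
#   kubectl get node root -o json | jq '.status.allocatable' | huawei_resources
#   kubectl -n default get pod npu-3 -o json \
#     | jq '.spec.containers[0].resources.limits // {}' | huawei_resources
```

If the pod's limits show no huawei.com entries at all, the container was probably not given a device by the plugin, which would be worth ruling out before looking at the allocatable count.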
(base) root@root:/data1/yaml# cd npu-test/
Environment:
- HAMi version: 2.7.0
- Kubernetes version: v1.28.15
- Others: ascend-device-plugin: projecthami/ascend-device-plugin:v1.1.0