Bug: neonvm-runner doesn't recover from CPU state inconsistent with neonvm-daemon #1375

@sharnoff

Description

Environment

Production

Steps to reproduce

I was unable to reproduce locally, but it seems like the following sequence of events took place:

  1. VM is created with min CU = 0.25, current CU = 56, max CU = 56, with CPU scaling via sysfs
  2. neonvm-controller tells neonvm-runner to set CPU to 56
  3. neonvm-runner times out on the initial PUT request to neonvm-daemon, but neonvm-daemon appears to have applied the change anyway (AFAICT)

Expected result

After we get to this inconsistent state, neonvm-runner should be able to recover on subsequent requests from neonvm-controller to get the current CPU.
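
One possible shape for that recovery, as a rough sketch rather than a proposed patch: when the value from the daemon differs from the stored one, adopt the daemon's value so later requests agree. The `MilliCPU` type, `cpuState` struct, and `currentCPU` method are hypothetical names for illustration and are not taken from neonvm-runner. The sketch also assumes the daemon's value of 56 means 56 whole CPUs, i.e. 56000m.

```go
package main

import "fmt"

// MilliCPU is a hypothetical type for this sketch: CPU in thousandths of a core.
type MilliCPU uint32

// cpuState is a hypothetical stand-in for the value neonvm-runner keeps in memory.
type cpuState struct {
	stored MilliCPU // last value the runner believes it applied
}

// currentCPU returns the value to report to the controller and whether a
// mismatch was detected. On mismatch the daemon is treated as the source of
// truth and the stored value is overwritten, so the mismatch does not repeat.
func (s *cpuState) currentCPU(fromDaemon MilliCPU) (MilliCPU, bool) {
	mismatch := s.stored != fromDaemon
	if mismatch {
		s.stored = fromDaemon
	}
	return fromDaemon, mismatch
}

func main() {
	s := &cpuState{stored: 250} // 0.25 CPU, the stored value in the warning
	for i := 1; i <= 3; i++ {
		cur, mismatch := s.currentCPU(56000) // assumed: 56 CPUs from the daemon
		fmt.Printf("request %d: current=%dm mismatch=%v\n", i, cur, mismatch)
	}
}
```

With this approach only the first request after the inconsistency would report a mismatch. Whether overwriting the stored value is safe depends on how neonvm-controller reconciles state, which this sketch does not model.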

Actual result

On every subsequent request where neonvm-controller fetches the current CPU, neonvm-runner emits the following warning:

{"level":"warn","ts":1746021781.7511058,"logger":"neonvm-runner.http-handlers.cpu_current","caller":"cmd/main.go:501","msg":"CPU from NeonVM Daemon does not match stored value, returning daemon value to let controller reconcile correct state","stored":"250m","current":56}

Other logs, links
