
How to deploy and restart runners

LaiRuiqi edited this page Dec 21, 2023 · 10 revisions

Deploy Runners

Physical node configuration for the runners: three nodes, each 4C-8G (4 CPU cores, 8 GB of memory) with 100 GB of storage. Suggested system image: ubuntu-20.04-2nic

One node runs the integration tests, one runs the gvisor-cri tests, and one runs the firecracker-cri tests.

How to deploy the three nodes:

  1. On a machine that can access all three runner nodes, clone the repository:
     git clone https://github.com/vhive-serverless/vHive.git
  2. Build the runner deployer:
     cd vHive/scripts/github_runner/
     go build .
  3. Modify conf.json.

Edit conf.json so that it follows this format:

{
  "ghOrg": "<GitHub account>",
  "ghPat": "<GitHub PAT>",
  "hostUsername": "<username>",
  "runners": {
    "<hostname-1>": {
      "type": "cri",
      "sandbox": "firecracker"
    },
    "<hostname-2>": {
      "type": "cri",
      "sandbox": "gvisor"
    },
    "<hostname-3>": {
      "type": "integ",
      "num": 2,
      "restart": false
    }
  }
}

In conf.json, set ghOrg to vhive-serverless and ghPat to a Personal Access Token for your own GitHub account. The account must have the correct permissions for the vhive-serverless org.

<username> is the SSH username and <hostname-1>, <hostname-2> and <hostname-3> are the SSH hostnames of the runner nodes. If you use SCSE cloud nodes as runners, use their IP addresses as the hostnames.
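Before running the deployer, it can help to check that conf.json parses and has the expected fields. The following is a minimal sketch, not the deployer's own code: the type names Config and Runner and the function ParseConfig are hypothetical, and the accepted values for "type" and "sandbox" are assumed to be only the ones shown in the example above. Go's encoding/json rejects trailing commas, so a stray comma after the last key of an object is reported as an error.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
)

// Runner mirrors one entry under "runners" in conf.json.
// The JSON keys come from the example format; the Go names are hypothetical.
type Runner struct {
	Type    string `json:"type"`
	Sandbox string `json:"sandbox,omitempty"`
	Num     int    `json:"num,omitempty"`
	Restart bool   `json:"restart,omitempty"`
}

// Config mirrors the top level of conf.json.
type Config struct {
	GhOrg        string            `json:"ghOrg"`
	GhPat        string            `json:"ghPat"`
	HostUsername string            `json:"hostUsername"`
	Runners      map[string]Runner `json:"runners"`
}

// ParseConfig decodes conf.json and applies a few sanity checks.
// Assumption: "cri" and "integ" are the only runner types, and
// "firecracker" and "gvisor" are the only sandboxes.
func ParseConfig(data []byte) (*Config, error) {
	var cfg Config
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, fmt.Errorf("conf.json is not valid JSON: %w", err)
	}
	if cfg.GhOrg == "" || cfg.GhPat == "" || cfg.HostUsername == "" {
		return nil, errors.New("ghOrg, ghPat and hostUsername must all be set")
	}
	if len(cfg.Runners) == 0 {
		return nil, errors.New("no runners defined")
	}
	for host, r := range cfg.Runners {
		switch r.Type {
		case "cri":
			if r.Sandbox != "firecracker" && r.Sandbox != "gvisor" {
				return nil, fmt.Errorf("runner %q: unexpected sandbox %q", host, r.Sandbox)
			}
		case "integ":
			// Integration runners carry no sandbox in the example format.
		default:
			return nil, fmt.Errorf("runner %q: unexpected type %q", host, r.Type)
		}
	}
	return &cfg, nil
}

func main() {
	data, err := os.ReadFile("conf.json")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	cfg, err := ParseConfig(data)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Print the SSH targets implied by the config.
	for host := range cfg.Runners {
		fmt.Printf("runner target: %s@%s\n", cfg.HostUsername, host)
	}
}
```

Run it from the directory that contains conf.json. It only reads the file and prints the SSH targets; it does not contact the nodes or GitHub.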

After editing conf.json, deploy the runners remotely by running:

./deploy_runners

Restart Runners

On SCSE cloud, rebuild the three nodes, then redeploy the runners by following the steps in Deploy Runners.

When to Restart Runners

For the firecracker and gvisor CRI tests, restart the runners when a test gets stuck on the message "helloworld is waiting for a Revision to be ready" bc67c34ef2308282b8285077534667f

This usually means that the firecracker and gvisor CRI runners need to be restarted (you can also restart only the affected runner). However, if a firecracker or gvisor CRI test passes the "Setup vHive CRI test environment" step and fails in the "Run vHive CRI tests" step, the failure is typically sporadic and can be resolved by re-running the tests. Clicking the re-run button on the GitHub web page is enough.
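The rule above can be written down as a small decision helper. This is only an illustrative sketch: the function NextAction, the Action type and the Investigate fallback are hypothetical, and the log check is a plain substring match on the message quoted above.

```go
package main

import (
	"fmt"
	"strings"
)

// Step names as they appear in the CRI test jobs described above.
const (
	StepSetup = "Setup vHive CRI test environment"
	StepRun   = "Run vHive CRI tests"
)

// stuckMessage is the log text that indicates a runner needs a restart.
const stuckMessage = "is waiting for a Revision to be ready"

// Action is a hypothetical label for what to do after a failed job.
type Action string

const (
	RestartRunner Action = "rebuild and redeploy the affected runner"
	RerunJob      Action = "re-run the job from the GitHub web page"
	Investigate   Action = "inspect the logs manually" // assumption: not covered by the page
)

// NextAction maps a failed step and its log output to a suggested action.
func NextAction(failedStep, logText string) Action {
	switch {
	case strings.Contains(logText, stuckMessage):
		// The test is stuck waiting for the Revision: restart the runner.
		return RestartRunner
	case failedStep == StepRun:
		// Setup passed and the test run failed: usually a sporadic failure.
		return RerunJob
	default:
		return Investigate
	}
}

func main() {
	fmt.Println(NextAction(StepSetup, "helloworld is waiting for a Revision to be ready"))
	fmt.Println(NextAction(StepRun, "a test failed"))
}
```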
