Closed
Labels
question: Further information is requested
Description
I created a cluster with Magic Castle 14.2.1 on OpenStack. Slurm seems to work fine, but I cannot find any running JupyterHub: Jupyter is not installed, nothing is listening on port 8080, and nothing is served on port 443.
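The port checks mentioned above can be repeated with a small helper. This is only a sketch: it assumes `ss` from iproute2 is available on the login node, and `port_listening` is a hypothetical name chosen for illustration. The function reads `ss -tln` output on standard input and succeeds only if some TCP socket is listening on the given port.

```bash
#!/usr/bin/env bash
# Hypothetical helper: reads `ss -tln` output on stdin and returns 0
# if a TCP socket is in LISTEN state on the port given as $1.
port_listening() {
  awk -v suffix=":$1" '
    # Column 4 of `ss -tln` is "Local Address:Port".
    $1 == "LISTEN" && $4 ~ (suffix "$") { found = 1 }
    END { exit (found ? 0 : 1) }
  '
}

# Typical use on the login node (left commented out so the sketch has no side effects):
# ss -tln | port_listening 443  && echo "443: listening"  || echo "443: nothing listening"
# ss -tln | port_listening 8080 && echo "8080: listening" || echo "8080: nothing listening"
```

Matching on the `:PORT` suffix of the local address covers IPv4, IPv6 and wildcard listeners without matching longer port numbers that merely end in the same digits.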
The logs seem to show that the JupyterHub installation did not go well.
Output of "journalctl -u puppet | grep -i jupyter" on the login node:
```bash
[admin@mc-login1 ~]$ journalctl -u puppet | grep -i jupyter
May 08 16:04:18 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub::Base/File[/opt/uv]/ensure) created
May 08 16:04:18 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub::Base/File[/opt/uv/bin]/ensure) created
May 08 16:04:19 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub::Base/Archive[jh_install_uv]/ensure) download archive from https://github.com/astral-sh/uv/releases/download/0.4.22/uv-x86_64-unknown-linux-gnu.tar.gz to /tmp/uv and extracted in /opt/uv/bin with cleanup
May 08 16:04:22 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub::Base/Exec[jupyterhub_venv]/returns) executed successfully
May 08 16:04:22 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Profile::Jupyterhub::Hub/File[/opt/jupyterhub/bin/ipa_create_user.py]/ensure) defined content as '{sha256}efa74f870040d8c0704bc95e01899ebe87f3e11458ae56ef65bc047fdc324bf0'
May 08 16:04:22 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Profile::Jupyterhub::Hub/File[/opt/jupyterhub/bin/kinit_wrapper]/ensure) defined content as '{sha256}893eec077eb5909a9093d3caecae415250842b155951ed120f30af8fa0704456'
May 08 16:04:22 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Profile::Jupyterhub::Hub::Keytab/File[/opt/jupyterhub/bin/ipa_register_service.py]/ensure) defined content as '{sha256}fb4c531d9a42bc698f5b0dab7ba287163c689010ab6d3b7caee1e6b3ab1350c1'
May 08 16:04:45 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Profile::Reverse_proxy/File[/etc/caddy/conf.d/jupyter.conf]/ensure) defined content as '{sha256}c51c445dacc6687971d25a1a05af8a28d5b20215322d3c44c06995e335bfcd25'
May 08 16:05:45 mc-login1.int.magic-castle-cluster.calculquebec.cloud groupadd[8225]: group added to /etc/group: name=jupyterhub, GID=2003
May 08 16:05:45 mc-login1.int.magic-castle-cluster.calculquebec.cloud groupadd[8225]: group added to /etc/gshadow: name=jupyterhub
May 08 16:05:45 mc-login1.int.magic-castle-cluster.calculquebec.cloud groupadd[8225]: new group: name=jupyterhub, GID=2003
May 08 16:05:45 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/Group[jupyterhub]/ensure) created
May 08 16:05:45 mc-login1.int.magic-castle-cluster.calculquebec.cloud useradd[8231]: new user: name=jupyterhub, UID=984, GID=2003, home=/run/jupyterhub, shell=/sbin/nologin, from=none
May 08 16:05:45 mc-login1.int.magic-castle-cluster.calculquebec.cloud useradd[8231]: add 'jupyterhub' to group 'jupyterhub'
May 08 16:05:45 mc-login1.int.magic-castle-cluster.calculquebec.cloud useradd[8231]: add 'jupyterhub' to shadow group 'jupyterhub'
May 08 16:05:45 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/User[jupyterhub]/ensure) created
May 08 16:05:48 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/Archive[traefik]/ensure) download archive from https://github.com/traefik/traefik/releases/download/v2.10.4/traefik_v2.10.4_linux_amd64.tar.gz to /opt/puppetlabs/puppet/cache/puppet-archive/traefik_v2.10.4_linux_amd64.tar.gz and extracted in /usr/bin with cleanup
May 08 16:05:48 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[jupyterhub.service]/ensure) defined content as '{sha256}209c829d10ec4eaf0ad4fa362de4ca275710d5b2f423d1488d0f463fed1bea17'
May 08 16:05:48 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[/etc/sudoers.d/99-jupyterhub-user]/ensure) defined content as '{sha256}bbccd8209423f14724073124d966653e0a8374603139f76d456a6d782373c2b7'
May 08 16:05:48 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[jupyterhub-auth]/ensure) defined content as '{sha256}0ccaf0d7d7a85389a6af74e476bcf5ab37e108b197ba2f737e888eacc6b5a2a0'
May 08 16:05:48 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[jupyterhub-login]/ensure) defined content as '{sha256}cd5292ef8b059ce2c12d20b415732f7ed68c26deb8f7120840e9633de21590e4'
May 08 16:05:48 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[/etc/jupyterhub]/ensure) created
May 08 16:05:48 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[/etc/jupyterhub/ssl]/ensure) created
May 08 16:05:48 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[/etc/jupyterhub/templates]/ensure) created
May 08 16:05:48 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[/run/jupyterhub]/ensure) created
May 08 16:05:48 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[/usr/lib/tmpfiles.d/jupyterhub.conf]/ensure) defined content as '{sha256}47efa6524f66accd174c753237b574a1849143423f7c30fe6b1bda3d76bf7e16'
May 08 16:05:48 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[/etc/jupyterhub/templates/page.html]/ensure) defined content as '{sha256}10378ff9543a9d1ac3c6e81f80cc88fdcbd9a3ea3ee3a11eef211f03d358f8b4'
May 08 16:05:48 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[jupyterhub_config.json]/ensure) defined content as '{sha256}0c6f00443ecdedc29158fe1798902e1a0d164c43aef6f966c43f915b35d9b622'
May 08 16:05:49 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[announcement_config.json]/ensure) defined content as '{sha256}a1f6e409f627c179e7e5cd0c8e6e0d7e647ca19ac731e6f34825244fcec3e66d'
May 08 16:05:49 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[submit.sh]/ensure) defined content as '{sha256}acce56c4ebd743857faa6e16a204790a07c54a24bd0ecd7bfa4f9e2dc4248ca4'
May 08 16:05:49 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[/opt/jupyterhub/hub-requirements.txt]/ensure) defined content as '{sha256}63c0ea663a057ad0b48cc9901aba7bb4b551f6fec82f637449d22855426d01b3'
May 08 16:05:57 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/Exec[hub_pip_install]/returns) Using Python 3.12.7 environment at /opt/jupyterhub
May 08 16:05:57 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/Exec[hub_pip_install]/returns) × No solution found when resolving dependencies:
May 08 16:05:57 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/Exec[hub_pip_install]/returns) ╰─▶ Because there is no version of slurmformspawner==2.8.0 and you require
May 08 16:05:57 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/Exec[hub_pip_install]/returns) slurmformspawner==2.8.0, we can conclude that your requirements are
May 08 16:05:57 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/Exec[hub_pip_install]/returns) unsatisfiable.
May 08 16:05:57 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/Exec[hub_pip_install]) Failed to call refresh: 'uv pip install -r /opt/jupyterhub/hub-requirements.txt' returned 1 instead of one of [0]
May 08 16:05:57 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/Exec[hub_pip_install]) 'uv pip install -r /opt/jupyterhub/hub-requirements.txt' returned 1 instead of one of [0]
May 08 16:05:58 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/Exec[create_self_signed_sslcert]/returns) executed successfully
May 08 16:05:58 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[/etc/jupyterhub/ssl/cert.pem]/mode) mode changed '0640' to '0644'
May 08 16:05:58 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[/etc/jupyterhub/ssl/key.pem]/group) group changed 'root' to 'jupyterhub'
May 08 16:05:58 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[/etc/jupyterhub/ssl/key.pem]/mode) mode changed '0600' to '0640'
May 08 16:05:58 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/Service[jupyterhub]) Dependency Exec[hub_pip_install] has failures: true
May 08 16:05:58 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/Service[jupyterhub]) Skipping because of failed dependencies
May 08 16:05:58 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Profile::Jupyterhub::Hub/File[/etc/jupyterhub/templates/login.html]/ensure) defined content as '{sha256}d4c07e484f250161498351eed0ae094d18a3b9e9dd06e876fd9ed6d5ebdd2154'
May 08 16:06:29 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Profile::Jupyterhub::Hub/Consul::Service[jupyterhub]/File[/etc/consul/service_jupyterhub.json]/ensure) defined content as '{sha256}228c12847befee4403d1ac8d276e6f1f03a3b6f0365100c418cc8900f1e66449'
May 08 16:11:18 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Profile::Jupyterhub::Hub::Keytab/Exec[jupyterhub_keytab]/returns) executed successfully
May 08 16:11:18 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Profile::Jupyterhub::Hub::Keytab/Exec[jupyterhub_keytab]) Triggered 'refresh' from 1 event
May 08 16:11:18 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Profile::Jupyterhub::Hub::Keytab/File[/etc/jupyterhub/jupyterhub.keytab]/group) group changed 'root' to 'jupyterhub'
May 08 16:11:18 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Profile::Jupyterhub::Hub::Keytab/File[/etc/jupyterhub/jupyterhub.keytab]/mode) mode changed '0600' to '0640'
May 08 16:27:20 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[9587]: (/Stage[main]/Jupyterhub/Service[jupyterhub]/ensure) ensure changed 'stopped' to 'running'
May 08 16:57:17 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[10022]: (/Stage[main]/Jupyterhub/Service[jupyterhub]/ensure) ensure changed 'stopped' to 'running' (corrective)
May 08 17:27:17 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[10771]: (/Stage[main]/Jupyterhub/Service[jupyterhub]/ensure) ensure changed 'stopped' to 'running' (corrective)
```
Here is the main.tf file I used to create the cluster (with the sensitive values replaced by placeholders):
main.tf
terraform {
  required_version = ">= 1.4.0"
}

variable "pool" {
  description = "Slurm pool of compute nodes"
  default = []
}

module "openstack" {
  source = "./openstack"
  config_git_url = "https://github.com/ComputeCanada/puppet-magic_castle.git"
  config_version = "14.2.0"

  cluster_name = "magic-castle-cluster"
  domain = "calculquebec.cloud"
  image = "Rocky-9.3-GenericCloud"

  instances = {
    mc-mgmt = {
      type = "compute-inhpc.medium",
      count = 1,
      tags = ["puppet", "mgmt", "nfs"],
      #disk_type = ceph,
      disk_size = 30
    }
    mc-login = {
      type = "compute-inhpc.large",
      count = 1,
      tags = ["login", "public", "proxy"],
      #disk_type = ceph,
      disk_size = 26
    }
    mc-node = {
      type = "compute-inhpc.large",
      count = 2,
      tags = ["node"],
      #disk_type = ceph,
      disk_size = 25
    }
  }

  # var.pool is managed by Slurm through Terraform REST API.
  # To let Slurm manage a type of nodes, add "pool" to its tag list.
  # When using Terraform CLI, this parameter is ignored.
  # Refer to Magic Castle Documentation - Enable Magic Castle Autoscaling
  pool = var.pool

  volumes = {
    nfs = {
      home = { size = 100 }
      project = { size = 30 }
      scratch = { size = 30 }
    }
  }

  public_keys = [file("~/.ssh/id_rsa_1.pub"),file("~/.ssh/id_rsa_2.pub")]

  software_stack = "eessi"

  nb_users = 5
  # Shared password, randomly chosen if blank
  guest_passwd = "testpassword"

  sudoer_username = "admin"

  os_floating_ips = {
    mc-login1 = "138.246.1.1"
    mc-login2 = "138.246.1.2"
  }
  os_ext_network = "internet_pool"
  subnet_id = "274...d5e"

  firewall_rules = {
    ssh1 = { "from_port" = 22, "to_port" = 22, tag = "login", "protocol" = "tcp", "cidr" = "10.156.48.0/22" }, # Internal IP range 1
    ssh2 = { "from_port" = 22, "to_port" = 22, tag = "login", "protocol" = "tcp", "cidr" = "129.187.1.1/32" }, # IP 1
    ssh3 = { "from_port" = 22, "to_port" = 22, tag = "login", "protocol" = "tcp", "cidr" = "10.156.84.0/24" }, # Internal IP range 2
    ssh4 = { "from_port" = 22, "to_port" = 22, tag = "login", "protocol" = "tcp", "cidr" = "10.195.6.1/32" }, # IP 2
    http1 = { "from_port" = 80, "to_port" = 80, tag = "proxy", "protocol" = "tcp", "cidr" = "10.156.48.0/22" }, # Internal IP range 1
    http2 = { "from_port" = 80, "to_port" = 80, tag = "proxy", "protocol" = "tcp", "cidr" = "129.187.1.1/22" }, # IP 1
    https1 = { "from_port" = 443, "to_port" = 443, tag = "proxy", "protocol" = "tcp", "cidr" = "10.156.48.0/22" }, # Internal IP range 1
    https2 = { "from_port" = 443, "to_port" = 443, tag = "proxy", "protocol" = "tcp", "cidr" = "129.187.1.1/22" }, # IP 1
    #globus = { "from_port" = 2811, "to_port" = 2811, tag = "dtn", "protocol" = "tcp", "cidr" = "54.237.254.192/29" },
    #myproxy = { "from_port" = 7512, "to_port" = 7512, tag = "dtn", "protocol" = "tcp", "cidr" = "0.0.0.0/0" },
    #gridftp = { "from_port" = 50000, "to_port" = 51000, tag = "dtn", "protocol" = "tcp", "cidr" = "0.0.0.0/0" }
  }
}

output "accounts" {
  value = module.openstack.accounts
}

output "public_ip" {
  value = module.openstack.public_ip
}

## Uncomment to register your domain name with CloudFlare
# module "dns" {
#   source = "./dns/cloudflare"
#   name = module.openstack.cluster_name
#   domain = module.openstack.domain
#   public_instances = module.openstack.public_instances
# }

## Uncomment to register your domain name with Google Cloud
# module "dns" {
#   source = "./dns/gcloud"
#   project = "your-project-id"
#   zone_name = "you-zone-name"
#   name = module.openstack.cluster_name
#   domain = module.openstack.domain
#   public_instances = module.openstack.public_instances
# }

# output "hostnames" {
#   value = module.dns.hostnames
# }
I have also attached the complete output of $ journalctl -u puppet and $ cat /var/log/cloud-init-output.log. The latter contains an "Invalid cloud-config provided" message, but I don't understand what it relates to or whether it is relevant.
login_cloud-init-output.log
login_journalctl-puppet.log
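To pull just the failures out of the full Puppet log, a filter along these lines may help. This is a minimal sketch: `puppet_failures` is a hypothetical helper name, and its patterns are copied from the messages in the excerpt above rather than from any Puppet reference, so failures worded differently would be missed.

```bash
#!/usr/bin/env bash
# Hypothetical helper: keeps only journal lines that report a failed command
# or a resource skipped because one of its dependencies failed.
# The three patterns come from the log excerpt in this report.
puppet_failures() {
  grep -E 'returned [0-9]+ instead of one of|has failures: true|Skipping because of failed dependencies'
}

# Typical use on the login node (left commented out so the sketch has no side effects):
# journalctl -u puppet | puppet_failures
```

Applied to the excerpt above, this would keep the two `hub_pip_install` lines reporting exit code 1 and the two `Service[jupyterhub]` lines about failed dependencies.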
Maybe I'm missing something obvious; if so, I apologise in advance.