JupyterHub not working #361

@foglienimatteo

I created a cluster with Magic Castle 14.2.1 on OpenStack. Slurm seems to work fine, but I can't find a running JupyterHub: Jupyter is not installed, nothing is listening on port 8080, and nothing is served on port 443.
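
One hedged way to confirm these symptoms on the login node is sketched below. It assumes the `ss` utility from iproute2 is installed and that the systemd unit is named `jupyterhub` (the name that appears in the Puppet log); `port_listening` is a hypothetical helper, not part of Magic Castle.

```shell
# Hypothetical helper: succeed if the `ss -tln` output read on stdin
# shows a TCP listener on the port given as the first argument.
port_listening() {
  awk -v port="$1" '
    NR > 1 {                      # skip the header line
      n = split($4, parts, ":")   # field 4 is "Local Address:Port"
      if (parts[n] == port) found = 1
    }
    END { exit !found }
  '
}

# Usage on the login node:
#   systemctl is-active jupyterhub
#   for p in 443 8080; do
#     ss -tln | port_listening "$p" && echo "$p: listening" || echo "$p: closed"
#   done
```

The helper reads from stdin so it can be tried against saved `ss -tln` output as well as on the live node.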

The logs suggest that the JupyterHub installation failed:

Output of `journalctl -u puppet | grep -i jupyter` on the login node:

```bash
[admin@mc-login1 ~]$ journalctl -u puppet | grep -i jupyter
May 08 16:04:18 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub::Base/File[/opt/uv]/ensure) created
May 08 16:04:18 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub::Base/File[/opt/uv/bin]/ensure) created
May 08 16:04:19 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub::Base/Archive[jh_install_uv]/ensure) download archive from https://github.com/astral-sh/uv/releases/download/0.4.22/uv-x86_64-unknown-linux-gnu.tar.gz to /tmp/uv and extracted in /opt/uv/bin with cleanup
May 08 16:04:22 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub::Base/Exec[jupyterhub_venv]/returns) executed successfully
May 08 16:04:22 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Profile::Jupyterhub::Hub/File[/opt/jupyterhub/bin/ipa_create_user.py]/ensure) defined content as '{sha256}efa74f870040d8c0704bc95e01899ebe87f3e11458ae56ef65bc047fdc324bf0'
May 08 16:04:22 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Profile::Jupyterhub::Hub/File[/opt/jupyterhub/bin/kinit_wrapper]/ensure) defined content as '{sha256}893eec077eb5909a9093d3caecae415250842b155951ed120f30af8fa0704456'
May 08 16:04:22 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Profile::Jupyterhub::Hub::Keytab/File[/opt/jupyterhub/bin/ipa_register_service.py]/ensure) defined content as '{sha256}fb4c531d9a42bc698f5b0dab7ba287163c689010ab6d3b7caee1e6b3ab1350c1'
May 08 16:04:45 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Profile::Reverse_proxy/File[/etc/caddy/conf.d/jupyter.conf]/ensure) defined content as '{sha256}c51c445dacc6687971d25a1a05af8a28d5b20215322d3c44c06995e335bfcd25'
May 08 16:05:45 mc-login1.int.magic-castle-cluster.calculquebec.cloud groupadd[8225]: group added to /etc/group: name=jupyterhub, GID=2003
May 08 16:05:45 mc-login1.int.magic-castle-cluster.calculquebec.cloud groupadd[8225]: group added to /etc/gshadow: name=jupyterhub
May 08 16:05:45 mc-login1.int.magic-castle-cluster.calculquebec.cloud groupadd[8225]: new group: name=jupyterhub, GID=2003
May 08 16:05:45 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/Group[jupyterhub]/ensure) created
May 08 16:05:45 mc-login1.int.magic-castle-cluster.calculquebec.cloud useradd[8231]: new user: name=jupyterhub, UID=984, GID=2003, home=/run/jupyterhub, shell=/sbin/nologin, from=none
May 08 16:05:45 mc-login1.int.magic-castle-cluster.calculquebec.cloud useradd[8231]: add 'jupyterhub' to group 'jupyterhub'
May 08 16:05:45 mc-login1.int.magic-castle-cluster.calculquebec.cloud useradd[8231]: add 'jupyterhub' to shadow group 'jupyterhub'
May 08 16:05:45 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/User[jupyterhub]/ensure) created
May 08 16:05:48 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/Archive[traefik]/ensure) download archive from https://github.com/traefik/traefik/releases/download/v2.10.4/traefik_v2.10.4_linux_amd64.tar.gz to /opt/puppetlabs/puppet/cache/puppet-archive/traefik_v2.10.4_linux_amd64.tar.gz and extracted in /usr/bin with cleanup
May 08 16:05:48 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[jupyterhub.service]/ensure) defined content as '{sha256}209c829d10ec4eaf0ad4fa362de4ca275710d5b2f423d1488d0f463fed1bea17'
May 08 16:05:48 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[/etc/sudoers.d/99-jupyterhub-user]/ensure) defined content as '{sha256}bbccd8209423f14724073124d966653e0a8374603139f76d456a6d782373c2b7'
May 08 16:05:48 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[jupyterhub-auth]/ensure) defined content as '{sha256}0ccaf0d7d7a85389a6af74e476bcf5ab37e108b197ba2f737e888eacc6b5a2a0'
May 08 16:05:48 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[jupyterhub-login]/ensure) defined content as '{sha256}cd5292ef8b059ce2c12d20b415732f7ed68c26deb8f7120840e9633de21590e4'
May 08 16:05:48 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[/etc/jupyterhub]/ensure) created
May 08 16:05:48 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[/etc/jupyterhub/ssl]/ensure) created
May 08 16:05:48 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[/etc/jupyterhub/templates]/ensure) created
May 08 16:05:48 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[/run/jupyterhub]/ensure) created
May 08 16:05:48 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[/usr/lib/tmpfiles.d/jupyterhub.conf]/ensure) defined content as '{sha256}47efa6524f66accd174c753237b574a1849143423f7c30fe6b1bda3d76bf7e16'
May 08 16:05:48 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[/etc/jupyterhub/templates/page.html]/ensure) defined content as '{sha256}10378ff9543a9d1ac3c6e81f80cc88fdcbd9a3ea3ee3a11eef211f03d358f8b4'
May 08 16:05:48 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[jupyterhub_config.json]/ensure) defined content as '{sha256}0c6f00443ecdedc29158fe1798902e1a0d164c43aef6f966c43f915b35d9b622'
May 08 16:05:49 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[announcement_config.json]/ensure) defined content as '{sha256}a1f6e409f627c179e7e5cd0c8e6e0d7e647ca19ac731e6f34825244fcec3e66d'
May 08 16:05:49 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[submit.sh]/ensure) defined content as '{sha256}acce56c4ebd743857faa6e16a204790a07c54a24bd0ecd7bfa4f9e2dc4248ca4'
May 08 16:05:49 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[/opt/jupyterhub/hub-requirements.txt]/ensure) defined content as '{sha256}63c0ea663a057ad0b48cc9901aba7bb4b551f6fec82f637449d22855426d01b3'
May 08 16:05:57 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/Exec[hub_pip_install]/returns) Using Python 3.12.7 environment at /opt/jupyterhub
May 08 16:05:57 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/Exec[hub_pip_install]/returns) × No solution found when resolving dependencies:
May 08 16:05:57 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/Exec[hub_pip_install]/returns) ╰─▶ Because there is no version of slurmformspawner==2.8.0 and you require
May 08 16:05:57 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/Exec[hub_pip_install]/returns) slurmformspawner==2.8.0, we can conclude that your requirements are
May 08 16:05:57 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/Exec[hub_pip_install]/returns) unsatisfiable.
May 08 16:05:57 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/Exec[hub_pip_install]) Failed to call refresh: 'uv pip install -r /opt/jupyterhub/hub-requirements.txt' returned 1 instead of one of [0]
May 08 16:05:57 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/Exec[hub_pip_install]) 'uv pip install -r /opt/jupyterhub/hub-requirements.txt' returned 1 instead of one of [0]
May 08 16:05:58 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/Exec[create_self_signed_sslcert]/returns) executed successfully
May 08 16:05:58 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[/etc/jupyterhub/ssl/cert.pem]/mode) mode changed '0640' to '0644'
May 08 16:05:58 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[/etc/jupyterhub/ssl/key.pem]/group) group changed 'root' to 'jupyterhub'
May 08 16:05:58 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/File[/etc/jupyterhub/ssl/key.pem]/mode) mode changed '0600' to '0640'
May 08 16:05:58 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/Service[jupyterhub]) Dependency Exec[hub_pip_install] has failures: true
May 08 16:05:58 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Jupyterhub/Service[jupyterhub]) Skipping because of failed dependencies
May 08 16:05:58 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Profile::Jupyterhub::Hub/File[/etc/jupyterhub/templates/login.html]/ensure) defined content as '{sha256}d4c07e484f250161498351eed0ae094d18a3b9e9dd06e876fd9ed6d5ebdd2154'
May 08 16:06:29 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Profile::Jupyterhub::Hub/Consul::Service[jupyterhub]/File[/etc/consul/service_jupyterhub.json]/ensure) defined content as '{sha256}228c12847befee4403d1ac8d276e6f1f03a3b6f0365100c418cc8900f1e66449'
May 08 16:11:18 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Profile::Jupyterhub::Hub::Keytab/Exec[jupyterhub_keytab]/returns) executed successfully
May 08 16:11:18 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Profile::Jupyterhub::Hub::Keytab/Exec[jupyterhub_keytab]) Triggered 'refresh' from 1 event
May 08 16:11:18 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Profile::Jupyterhub::Hub::Keytab/File[/etc/jupyterhub/jupyterhub.keytab]/group) group changed 'root' to 'jupyterhub'
May 08 16:11:18 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[919]: (/Stage[main]/Profile::Jupyterhub::Hub::Keytab/File[/etc/jupyterhub/jupyterhub.keytab]/mode) mode changed '0600' to '0640'
May 08 16:27:20 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[9587]: (/Stage[main]/Jupyterhub/Service[jupyterhub]/ensure) ensure changed 'stopped' to 'running'
May 08 16:57:17 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[10022]: (/Stage[main]/Jupyterhub/Service[jupyterhub]/ensure) ensure changed 'stopped' to 'running' (corrective)
May 08 17:27:17 mc-login1.int.magic-castle-cluster.calculquebec.cloud puppet-agent[10771]: (/Stage[main]/Jupyterhub/Service[jupyterhub]/ensure) ensure changed 'stopped' to 'running' (corrective)
```
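
Most of the output above reports resources that applied cleanly. A small filter such as the following sketch narrows it to the lines that report a failure. `puppet_failures` is a hypothetical helper name, and the patterns are taken from the messages in this log, not from an exhaustive list of Puppet or uv error formats.

```shell
# Hypothetical helper: read Puppet agent journal lines on stdin and keep
# only those reporting a failed command, a dependency resolution error,
# or a resource skipped because one of its dependencies failed.
puppet_failures() {
  grep -E 'returned [0-9]+ instead of one of|No solution found|has failures: true|Skipping because of failed dependencies'
}

# Usage on the login node:
#   journalctl -u puppet --no-pager | puppet_failures
```

Applied to this log, the filter should keep the uv resolver error, the `uv pip install ... returned 1` lines, and the two `Service[jupyterhub]` dependency lines, which together point at the unresolvable `slurmformspawner==2.8.0` requirement as the reason the service was skipped.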

Here is the main.tf file I used to create the cluster (sensitive values have been replaced with placeholders):

main.tf
terraform {
  required_version = ">= 1.4.0"
}

variable "pool" {
  description = "Slurm pool of compute nodes"
  default = []
}

module "openstack" {
  source         = "./openstack"
  config_git_url = "https://github.com/ComputeCanada/puppet-magic_castle.git"
  config_version = "14.2.0"

  cluster_name = "magic-castle-cluster"
  domain       = "calculquebec.cloud"
  image        = "Rocky-9.3-GenericCloud"

  instances = {
    mc-mgmt = {
      type      = "compute-inhpc.medium",
      count     = 1,
      tags      = ["puppet", "mgmt", "nfs"],
      #disk_type = ceph,
      disk_size = 30
    }
    mc-login = {
      type      = "compute-inhpc.large",
      count     = 1,
      tags      = ["login", "public", "proxy"],
      #disk_type = ceph,
      disk_size = 26
    }
    mc-node = {
      type      = "compute-inhpc.large",
      count     = 2,
      tags      = ["node"],
      #disk_type = ceph,
      disk_size = 25
    }
  }

  # var.pool is managed by Slurm through Terraform REST API.
  # To let Slurm manage a type of nodes, add "pool" to its tag list.
  # When using Terraform CLI, this parameter is ignored.
  # Refer to Magic Castle Documentation - Enable Magic Castle Autoscaling
  pool = var.pool

  volumes = {
    nfs = {
      home     = { size = 100 }
      project  = { size = 30 }
      scratch  = { size = 30 }
    }
  }

  public_keys = [file("~/.ssh/id_rsa_1.pub"),file("~/.ssh/id_rsa_2.pub")]
  software_stack = "eessi"

  nb_users = 5
  # Shared password, randomly chosen if blank
  guest_passwd = "testpassword"
  sudoer_username = "admin"

  os_floating_ips = {
    mc-login1 = "138.246.1.1"
    mc-login2 = "138.246.1.2"
  }
  os_ext_network = "internet_pool"
  subnet_id = "274...d5e"

  firewall_rules = {
    ssh1     = { "from_port" = 22,    "to_port" = 22,    tag = "login", "protocol" = "tcp", "cidr" = "10.156.48.0/22" },           # Internal IP range 1
    ssh2     = { "from_port" = 22,    "to_port" = 22,    tag = "login", "protocol" = "tcp", "cidr" = "129.187.1.1/32" },            # IP 1
    ssh3     = { "from_port" = 22,    "to_port" = 22,    tag = "login", "protocol" = "tcp", "cidr" = "10.156.84.0/24" },          # Internal IP range 2
    ssh4     = { "from_port" = 22,    "to_port" = 22,    tag = "login", "protocol" = "tcp", "cidr" = "10.195.6.1/32" },            # IP 2
    http1     = { "from_port" = 80,    "to_port" = 80,    tag = "proxy", "protocol" = "tcp", "cidr" = "10.156.48.0/22" },        # Internal IP range 1
    http2    = { "from_port" = 80,    "to_port" = 80,    tag = "proxy", "protocol" = "tcp", "cidr" = "129.187.1.1/22" },          # IP 1
    https1    = { "from_port" = 443,   "to_port" = 443,   tag = "proxy", "protocol" = "tcp", "cidr" = "10.156.48.0/22" },    # Internal IP range 1
    https2    = { "from_port" = 443,   "to_port" = 443,   tag = "proxy", "protocol" = "tcp", "cidr" = "129.187.1.1/22" },     # IP 1
    #globus  = { "from_port" = 2811,  "to_port" = 2811,  tag = "dtn",   "protocol" = "tcp", "cidr" = "54.237.254.192/29" },
    #myproxy = { "from_port" = 7512,  "to_port" = 7512,  tag = "dtn",   "protocol" = "tcp", "cidr" = "0.0.0.0/0" },
    #gridftp = { "from_port" = 50000, "to_port" = 51000, tag = "dtn",   "protocol" = "tcp", "cidr" = "0.0.0.0/0" }
  }
}

output "accounts" {
  value = module.openstack.accounts
}

output "public_ip" {
  value = module.openstack.public_ip
}


## Uncomment to register your domain name with CloudFlare
# module "dns" {
#   source           = "./dns/cloudflare"
#   name             = module.openstack.cluster_name
#   domain           = module.openstack.domain
#   public_instances = module.openstack.public_instances
# }

## Uncomment to register your domain name with Google Cloud
# module "dns" {
#   source           = "./dns/gcloud"
#   project          = "your-project-id"
#   zone_name        = "you-zone-name"
#   name             = module.openstack.cluster_name
#   domain           = module.openstack.domain
#   public_instances = module.openstack.public_instances
# }

# output "hostnames" {
#   value = module.dns.hostnames
# }

I have also attached the complete output of `journalctl -u puppet` and `cat /var/log/cloud-init-output.log`. The latter contains an "Invalid cloud-config provided" message, but I don't understand what it relates to or whether it is relevant.

login_cloud-init-output.log
login_journalctl-puppet.log

Maybe I'm missing something obvious; if so, I apologise in advance.
