What happened?
- Based on our communities' needs over the last few years, we have created multiple projects to manage user home directories.
- Two critical ones are prometheus-dirsize-exporter, which reports information about each user's home directory (total size, number of files, the last time they touched it, etc.), and jupyterhub-home-nfs, which limits the size of each user's home directory.
- Since prometheus-dirsize-exporter is designed to run on all sorts of home directories across cloud providers, it's very generic. It also intentionally runs slowly, since most disks that home directories live on (like Amazon EFS, EBS, GCP Filestore, Azure Files, or just a plain old hard disk) can only do a limited number of IO operations per second (IOPS), and most of those should be reserved for actual users rather than reporting.
- So on some large communities' disks, the information from prometheus-dirsize-exporter can sometimes be hours out of date! This makes life difficult for admins, as they can't fully trust what they see in Grafana about users' home directories.
- There was also no information about what each user's limit is, which made it difficult to set up alerts.
- With "Add prometheus metrics for dirsize and limits" (jupyterhub-home-nfs#76), we now export two new metrics from jupyterhub-home-nfs: total_size_bytes (total size of each user's home directory) and hard_limit_bytes (max allowed size of each user's home directory). Since these rely on the reporting features of the underlying XFS filesystem, they are practically instant. So admins will get up-to-date user home directory sizes within minutes, no matter the size of the home directories!
- We also added "Allow disabling total size metric" (prometheus-dirsize-exporter#29) to prometheus-dirsize-exporter, so it can stop collecting the duplicate total_size_bytes metric while continuing to collect its other metrics. This means that users of the upstream JupyterHub Grafana dashboards will get the same useful view of home directory usage, regardless of whether the metric comes from prometheus-dirsize-exporter or jupyterhub-home-nfs.
- This has been rolled out to all our communities with "Collect home directory metrics from jupyterhub-home-nfs" (infrastructure#7261).
- This will also help us provide community-specific alerts to hub admins when a user is near their quota (work tracked in "Setup minimal round of *community facing alerts* for user home directory usage", infrastructure#7166).
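
To illustrate why having both a size and a limit per user makes quota alerts straightforward, here is a minimal Python sketch. It is not the alerting setup tracked in infrastructure#7166. The function name, the 90% threshold, and the plain dictionaries standing in for scraped total_size_bytes and hard_limit_bytes values are all assumptions made for illustration.

```python
def users_near_quota(total_size_bytes, hard_limit_bytes, threshold=0.9):
    """Return {user: fraction_used} for users at or above `threshold` of their limit.

    Both arguments are hypothetical {username: bytes} mappings standing in for
    the values of the total_size_bytes and hard_limit_bytes metrics.
    """
    near = {}
    for user, size in total_size_bytes.items():
        limit = hard_limit_bytes.get(user)
        if not limit:
            # No limit known (or a limit of 0): nothing to compare against.
            continue
        fraction = size / limit
        if fraction >= threshold:
            near[user] = fraction
    return near


GIB = 1024**3

# Made-up example values, not real measurements.
sizes = {"alice": 9 * GIB, "bob": 2 * GIB, "carol": 5 * GIB}
limits = {"alice": 10 * GIB, "bob": 10 * GIB}

for user, fraction in users_near_quota(sizes, limits).items():
    print(f"{user} is using {fraction:.0%} of their home directory quota")
```

In this sketch a user with no recorded limit is skipped rather than flagged, since without hard_limit_bytes there is no quota to be near.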
Why should we be excited about it?
- Because A
- Because B
Where can we learn more?
- Link A
- Link B
Media and images
Home Directory Usage dashboard, with total size coming from jupyterhub-home-nfs and all other columns coming from prometheus-dirsize-exporter
Acknowledgements
To Jenny and Angus, who suggested finding ways to roll some parts of prometheus-dirsize-exporter into jupyterhub-home-nfs based on experiences with various communities
- Post published.
- Shared on socials
- Shared in the team Slack.
- (If applicable) Emailed to the partner/community member who was featured.
jnywong