NodeGroups
When administering large compute clusters or server farms, working with groups of nodes or servers is much safer and more convenient. In most cases, you already have some sources of node groups available on your cluster (e.g. a database, flat files, genders, slurm or another resource management system), but you probably have no easy way to reuse these node groups elsewhere, such as in administration scripts. As of version 1.3, the ClusterShell library defines a node groups syntax and allows you to bind these group sources to your applications: this is the Node Groups support. Once group sources are available, group provisioning is done through user-defined external shell commands, and results are cached in memory to avoid repeated calls. For advanced usage, you should be able to define your own group source directly in Python.
The ClusterShell library obtains node groups configuration options from the following system-wide configuration file:
/etc/clustershell/groups.conf
This file defines the external calls used to query node group sources whenever node group information is needed.
The configuration file format conforms to Python's built-in ConfigParser. The first section, Main, defines only the group source to use by default:
[Main]
default: genders
The rest of the file consists of one or more group source configurations, each named by its section header, in which up to four external commands (map, all, list and reverse) are defined, for example:
[genders]
map: nodeattr -n $GROUP
all: nodeattr -n ALL
list: nodeattr -l
[slurm]
map: sinfo -h -o "%N" -p $GROUP
all: sinfo -h -o "%N"
list: sinfo -h -o "%P"
reverse: sinfo -h -N -o "%P" -n $NODE
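Because the file follows the ConfigParser format, the sample above can be inspected with Python's standard library. The following is only an illustrative sketch, not ClusterShell's own parsing code: the `SAMPLE` string and the `map_command` helper are names introduced here for the example, and interpolation is disabled on the assumption that the literal `%N` and `%P` in the sinfo commands must be preserved.

```python
import configparser
from string import Template

# Sample configuration copied from the example above.
SAMPLE = '''
[Main]
default: genders

[genders]
map: nodeattr -n $GROUP
all: nodeattr -n ALL
list: nodeattr -l

[slurm]
map: sinfo -h -o "%N" -p $GROUP
all: sinfo -h -o "%N"
list: sinfo -h -o "%P"
reverse: sinfo -h -N -o "%P" -n $NODE
'''

# interpolation=None keeps the literal "%N" and "%P" of the sinfo commands.
config = configparser.ConfigParser(interpolation=None)
config.read_string(SAMPLE)

default_source = config.get("Main", "default")
sources = [name for name in config.sections() if name != "Main"]


def map_command(source, group):
    """Return the 'map' command line with $GROUP filled in (not executed)."""
    return Template(config.get(source, "map")).substitute(GROUP=group)


print(default_source)                    # genders
print(sources)                           # ['genders', 'slurm']
print(map_command("slurm", "bigmem"))    # sinfo -h -o "%N" -p bigmem
```

Note that the genders section defines no reverse command, so `config.has_option("genders", "reverse")` is false for this sample.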
map
Specify the external shell command used to resolve a group name into a nodeset, a list of nodes or a list of nodesets (separated by space characters or by carriage returns). The variable $GROUP is replaced by the group name before the command is executed.
all
Optional external shell command that should return a nodeset, a list of nodes or a list of nodesets covering all nodes of this group source. If not specified, the library will try to resolve all nodes by calling the list external command of the same group source, followed by map for each group.
list
Optional external shell command that should return the list of all groups of this group source (separated by space characters or by carriage returns).
reverse
Optional external shell command used to find the group(s) of a single node. The variable $NODE is replaced by the node name before the command is executed. If this external call is not specified, the reverse operation is computed in memory by the library from the list and map external calls. Also, if the number of nodes to reverse is greater than the number of available groups, the reverse external command is avoided automatically.
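The fallbacks described above for a missing all or reverse command can be sketched in a few lines of Python. This is a simplified illustration of the idea, not the library's implementation: the `GROUPS` data and the `list_groups` and `map_group` functions are hypothetical stand-ins for the external list and map commands.

```python
def resolve_all(list_groups, map_group):
    """Union of every group's nodes, used when no 'all' command exists."""
    nodes = set()
    for group in list_groups():
        nodes.update(map_group(group))
    return nodes


def reverse_in_memory(node, list_groups, map_group):
    """Groups containing `node`, computed from 'list' and 'map' only."""
    return sorted(g for g in list_groups() if node in map_group(g))


# Hypothetical group data standing in for the external commands.
GROUPS = {
    "mds": {"node2", "node3"},
    "oss": {"node4", "node5", "node6", "node7", "node8", "node9"},
}


def list_groups():
    """Stand-in for the 'list' external command."""
    return list(GROUPS)


def map_group(name):
    """Stand-in for the 'map' external command."""
    return GROUPS[name]


print(sorted(resolve_all(list_groups, map_group)))
print(reverse_in_memory("node4", list_groups, map_group))   # ['oss']
```

A real resolver would also cache the results of each call, since every list or map lookup may spawn an external command.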
Each external command may return a non-zero exit status when the operation cannot be performed. However, if a call returns zero for, say, a non-existent group, the user will not receive any error when trying to resolve that unknown group. The desired behavior is up to the system administrator.
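The effect of the exit status can be illustrated with a small sketch that runs a group command through a shell. The `run_upcall` helper is a name invented for this example and says nothing about how ClusterShell executes commands internally; the `echo`, `true` and `exit 3` commands merely simulate a known group, an unknown group reported with status zero, and an unknown group reported with a non-zero status. A POSIX shell is assumed.

```python
import subprocess


def run_upcall(command):
    """Run a group command in a shell and return the names it printed.

    check=True turns a non-zero exit status into CalledProcessError.
    """
    result = subprocess.run(
        command, shell=True, check=True, capture_output=True, text=True
    )
    return result.stdout.split()


print(run_upcall("echo node4 node5"))   # ['node4', 'node5']
print(run_upcall("true"))               # []: status 0 and no output, no error

try:
    run_upcall("exit 3")                # non-zero status is reported
except subprocess.CalledProcessError as error:
    print("group command failed with status", error.returncode)
```

With the second command, a caller cannot tell an empty group from a non-existent one, which is why the choice of exit status is left to the administrator.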
Generic node group representation for a group source:
@slurm:bigmem
represents the "bigmem" group of the "slurm" group source
Node group representation for the default group source:
@compute
represents the "compute" group of the default group source
Indexed node groups representation is very similar to nodesets, for example:
@chassis[0-2,5,7]
represents all the following node groups:
@chassis0
@chassis1
@chassis2
@chassis5
@chassis7
This syntax is also valid when the group source is specified, for example: @slurm:para[0-3].
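To make the indexed syntax concrete, the expansion shown above can be reproduced with a short standalone function. This is a simplified sketch under stated assumptions: `expand_group_pattern` is a hypothetical helper, it handles a single bracketed range list only, and it ignores zero-padding; the ClusterShell library performs this expansion itself.

```python
import re


def expand_group_pattern(pattern):
    """Expand '@name[0-2,5]' or '@source:name[0-2,5]' into group names."""
    match = re.fullmatch(r"(@[\w:]+)\[([\d,-]+)\]", pattern)
    if match is None:
        return [pattern]  # no index range: nothing to expand
    prefix, ranges = match.groups()
    names = []
    for part in ranges.split(","):
        # "0-2" is a range, "5" is a single index.
        start, _, end = part.partition("-")
        for index in range(int(start), int(end or start) + 1):
            names.append(f"{prefix}{index}")
    return names


print(expand_group_pattern("@chassis[0-2,5,7]"))
print(expand_group_pattern("@slurm:para[0-3]"))
```

The first call prints the five @chassis groups listed above, and the second yields @slurm:para0 through @slurm:para3.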
Some nodeset command line usage examples are shown below. Results are shown for illustration purposes only and will depend on your node groups configuration.
$ nodeset --groupsources
local(default)
clusterdb
slurm
$ nodeset -l
@oss
@mds
@io
@compute
@gpu
@all
$ nodeset --groupsource=slurm -l
@slurm:parallel
@slurm:bigmem
@slurm:amd
@slurm:test
$ nodeset -f @oss
node[4-9]
or
$ echo @oss | nodeset -f
node[4-9]
Groups union example:
$ nodeset -f @oss,@mds
node[2-9]
$ nodeset -f -a
node[2-9,12-99]
$ nodeset -e -a
node2 node3 node4 node5 node6 node7 node8 node9 node12 node13 node14 node15 node16 node17 node18 node19 node20 node21
node22 node23 node24 node25 node26 node27 node28 node29 node30 node31 node32 node33 node34 node35 node36 node37 node38
node39 node40 node41 node42 node43 node44 node45 node46 node47 node48 node49 node50 node51 node52 node53 node54 node55
node56 node57 node58 node59 node60 node61 node62 node63 node64 node65 node66 node67 node68 node69 node70 node71 node72
node73 node74 node75 node76 node77 node78 node79 node80 node81 node82 node83 node84 node85 node86 node87 node88 node89
node90 node91 node92 node93 node94 node95 node96 node97 node98 node99
$ nodeset -r node[4-9]
@oss
$ nodeset -r node[2-9]
@mds,@oss
By default, the NodeSet class of ClusterShell is able to interpret the node groups syntax and is bound to a NodeUtils.GroupResolverConfig object in order to perform external group resolution. Any script using the NodeSet class therefore supports node groups automatically (assuming groups.conf is properly configured on the node). For example, resolving node groups is as simple as:
>>> from ClusterShell.NodeSet import NodeSet
>>> print(NodeSet("@oss"))
node[4-9]
To perform reverse resolution (i.e. nodes to node groups), a new method has been added to the NodeSet class: NodeSet.regroup(). This method accepts an optional group source as parameter. For example:
>>> NodeSet("node[4-9]").regroup()
'@oss'
The NodeUtils module is a ClusterShell helper module that provides supplementary services to manage nodes in a cluster. It is primarily designed to enhance the NodeSet module by providing binding support to external node group sources in separate namespaces.
Two base classes are defined in this module:
class GroupSource(object):
    """
    GroupSource class managing external calls for nodegroup support.
    """
and
class GroupResolver(object):
    """
    Base class GroupResolver that aims to provide node/group resolution
    from multiple GroupSource's.
    """
The default GroupResolver used by NodeSet is also defined, named GroupResolverConfig. A GroupResolverConfig object is instantiated by NodeSet with /etc/clustershell/groups.conf as configuration file parameter.