Alpha Cluster Deployment
The alpha cluster is deployed under the same architecture described for the test cluster.
The alpha cluster deploys the following instances of each class:
1 master/control plane server:
- kubmaster01
2 node/worker servers:
- kubnode01
- kubnode02
1 NFS Storage server:
- kubvol01
All of the deployed alpha nodes are one of two types of instance:
- 1GB instance: (kubmaster01, kubvol01)
- 2GB instance: (kubnode01, kubnode02)
All instances were deployed in the same datacenter of the same provider in order to enable private network communication.
1GB instance:
- 1GB RAM
- 1 vCPU
- 20GB Storage
2GB instance:
- 2GB RAM
- 1 vCPU
- 30GB Storage
All three of these machines (kubmaster01, kubnode01, and kubnode02) are deployed as Fedora 25 instances. The setup script performs the following steps:
- Set the system hostname
- Apply shared cluster configurations
- Disable password logins for the root user
- Install netdata for node monitoring
- Open firewall port for netdata
- Secure public ports
- Allow private network traffic
- Switch SELinux from enforcing to permissive mode
Before copy/pasting, set the shell variable:
- host: desired machine hostname
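The setup scripts reference the variable as ${host?}, which makes the shell abort with an error when host is unset instead of silently passing an empty hostname. A minimal sketch of setting and checking the variable first. The value kubnode01 is only an example, and require_host is a hypothetical helper name. Unlike ${host?}, the helper also rejects an empty value:

```shell
# Hypothetical helper: succeeds only when a non-empty hostname has been provided.
require_host() {
    # ${1:?message} aborts when $1 is unset or empty; the subshell contains the abort
    # so that only the helper fails, not the calling shell.
    ( : "${1:?host must be set to the desired machine hostname}" ) 2>/dev/null
}

host=kubnode01   # example value; use the hostname of the machine being set up

if require_host "$host"; then
    echo "host is set to: $host"
else
    echo "host is not set; the setup script would abort" >&2
fi
```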
(
set -e
hostnamectl set-hostname ${host?}
dnf -y install git-core
git clone https://github.com/CodeForPhilly/ops.git /opt/ops
(
cd /opt/ops
# a relative symlink target is resolved from the link's own directory (.git/hooks)
ln -s ../../kubernetes/alpha-cluster/post-merge .git/hooks/post-merge
.git/hooks/post-merge
)
sed -i 's/^PermitRootLogin yes/PermitRootLogin without-password/' /etc/ssh/sshd_config
systemctl restart sshd
curl -Ss 'https://raw.githubusercontent.com/firehol/netdata-demo-site/master/install-required-packages.sh' >/tmp/kickstart.sh && bash /tmp/kickstart.sh -i netdata-all && rm -f /tmp/kickstart.sh
git clone https://github.com/firehol/netdata.git --depth=1
( cd netdata && ./netdata-installer.sh --install /opt )
firewallctl zone '' -p add port 19999/tcp
firewallctl zone '' -p remove service cockpit
firewallctl zone internal -p add source 192.168.0.0/16
firewall-cmd --permanent --zone=internal --set-target=ACCEPT # for some inexplicable reason, this version of firewallctl does not provide a way to do this
firewallctl reload
sed -i 's/SELINUX=enforcing/SELINUX=permissive/' /etc/selinux/config
setenforce 0
)
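The sed command in the script only rewrites a line that begins with exactly "PermitRootLogin yes". If the stock sshd_config has the directive commented out or set to something else, sed changes nothing and reports no error. A small sketch of a check that could be run afterwards. The helper name root_login_locked is hypothetical:

```shell
# Hypothetical helper: report whether an sshd_config file disables password logins for root.
# Usage: root_login_locked [path]   (defaults to /etc/ssh/sshd_config)
root_login_locked() {
    # Match an active (uncommented) directive. Newer OpenSSH releases spell
    # without-password as prohibit-password, so accept either.
    grep -Eq '^[[:space:]]*PermitRootLogin[[:space:]]+(without-password|prohibit-password)[[:space:]]*$' \
        "${1:-/etc/ssh/sshd_config}" 2>/dev/null
}

if root_login_locked; then
    echo "root password logins are disabled"
else
    echo "PermitRootLogin was not changed; check sshd_config by hand" >&2
fi
```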
This machine (kubvol01) is deployed as an openSUSE Leap 42.2 instance. The setup script performs the following steps:
- Set the system hostname
- Apply the shared cluster configurations
- Disable password logins for the root user
- Install the man command (don't ask why it's not there to start with, or why it depends on 30 packages)
- Install netdata for node monitoring
- Lock down public firewall
- Open firewall to private network
(
set -e
hostnamectl set-hostname ${host?}
zypper in -y git-core
git clone https://github.com/CodeForPhilly/ops.git /opt/ops
(
cd /opt/ops
# a relative symlink target is resolved from the link's own directory (.git/hooks)
ln -s ../../kubernetes/alpha-cluster/post-merge .git/hooks/post-merge
.git/hooks/post-merge
)
sed -i 's/^PermitRootLogin yes/PermitRootLogin without-password/' /etc/ssh/sshd_config
systemctl restart sshd
zypper in -y man
curl -Ss 'https://raw.githubusercontent.com/firehol/netdata-demo-site/master/install-required-packages.sh' >/tmp/kickstart.sh && bash /tmp/kickstart.sh -i netdata-all && rm -f /tmp/kickstart.sh
git clone https://github.com/firehol/netdata.git --depth=1
( cd netdata && ./netdata-installer.sh --install /opt )
zypper in -y firewalld
systemctl start firewalld
systemctl enable firewalld
firewallctl zone '' -p add interface eth0
firewallctl zone '' -p add port 19999/tcp
firewallctl zone internal -p add source 192.168.0.0/16
firewall-cmd --permanent --zone=internal --set-target=ACCEPT
firewallctl reload
)
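Once the firewall rules are in place, the permanent configuration can be read back with firewall-cmd to confirm that it matches what the script intended. The following is a sketch only. The helper name firewall_ok and the FIREWALL_CMD override variable are hypothetical, and the checks cover only the netdata port, the private source range, and the internal zone target:

```shell
# Hypothetical helper: compare the permanent firewalld configuration with the
# values the setup script is meant to produce. Run as root on the machine.
firewall_ok() (
    fw=${FIREWALL_CMD:-firewall-cmd}   # override is an assumption, handy for dry runs
    target=$("$fw" --permanent --zone=internal --get-target) || exit 1
    sources=$("$fw" --permanent --zone=internal --list-sources) || exit 1
    ports=$("$fw" --permanent --list-ports) || exit 1   # default zone

    [ "$target" = "ACCEPT" ] || { echo "internal zone target is '$target'" >&2; exit 1; }
    case " $sources " in
        *" 192.168.0.0/16 "*) ;;
        *) echo "private range missing from internal zone" >&2; exit 1 ;;
    esac
    case " $ports " in
        *" 19999/tcp "*) ;;
        *) echo "netdata port 19999/tcp is not open" >&2; exit 1 ;;
    esac
)

# Only attempt the check where firewalld's command line tool is installed.
if command -v "${FIREWALL_CMD:-firewall-cmd}" >/dev/null 2>&1; then
    if firewall_ok; then
        echo "firewall matches the expected configuration"
    fi
fi
```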
These instructions presume that the workstation from which the administrator is working has been appropriately configured with the necessary workstation resources.
These nodes are deployed using the kubernetes contrib ansible playbooks. The python environment from which ansible is run will require the python-netaddr module in order to use the playbooks.
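A quick way to confirm the netaddr requirement before starting is to try importing the module with the interpreter that ansible will use. This is a sketch, and the helper name has_netaddr is hypothetical. The interpreter defaults to python and should be adjusted if ansible runs under a different one:

```shell
# Hypothetical helper: succeed if the given Python interpreter can import netaddr.
# Usage: has_netaddr [interpreter]   (defaults to python)
has_netaddr() {
    "${1:-python}" -c 'import netaddr' >/dev/null 2>&1
}

if has_netaddr; then
    echo "netaddr is available"
else
    echo "netaddr is missing; install python-netaddr (or 'pip install netaddr') first" >&2
fi
```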
Once the dependencies are satisfied, the following steps will provision the kubernetes nodes and master:
- Apply cluster configuration data to ansible playbooks
- Run ansible playbooks
- Open API port on master for remote use of kubectl
Before copy/pasting, set shell variables:
- repo_contrib: path to kubernetes contrib repo
- repo_ops: path to the ops repo
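Because the deployment script runs under set -e, a wrong path stops it partway through. A pre-flight check along the following lines can catch that earlier. This is a sketch: check_repos is a hypothetical helper, the example paths under $HOME/src are placeholders, and the two files tested are the ones the deployment script itself references:

```shell
# Hypothetical pre-flight check: confirm both variables point at usable checkouts.
# Usage: check_repos <repo_ops> <repo_contrib>
check_repos() (
    ops=$1
    contrib=$2
    patch="$ops/kubernetes/alpha-cluster/workstation-resources/kubernetes-contrib.patch"
    deploy="$contrib/ansible/scripts/deploy-cluster.sh"
    [ -f "$patch" ]  || { echo "missing patch: $patch" >&2; exit 1; }
    [ -f "$deploy" ] || { echo "missing deploy script: $deploy" >&2; exit 1; }
)

# Example values only; substitute the real checkout locations.
repo_ops=$HOME/src/ops
repo_contrib=$HOME/src/contrib

if check_repos "$repo_ops" "$repo_contrib"; then
    echo "repositories look ready"
fi
```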
(
set -e
cp "${repo_ops?}/kubernetes/alpha-cluster/workstation-resources/kubernetes-contrib.patch" "${repo_contrib?}/kubernetes-contrib.patch"
cd "${repo_contrib?}"
git apply kubernetes-contrib.patch
cd ansible/scripts
./deploy-cluster.sh
ssh root@kubmaster01 'firewallctl zone "" -p add service https && firewallctl reload'
)
- Install ZFS and NFS
- Load ZFS kernel module
- Create ZFS pool for container volumes
- Run NFS server
- Run ZFS programs
(
set -e
zypper ar obs://filesystems filesystems
zypper in zfs-kmp-default zfs yast2-nfs-server
modprobe zfs
echo zfs > /etc/modules-load.d/zfs.conf
zpool create -f kubvols /dev/sdc
systemctl start nfs-server
systemctl enable nfs-server
systemctl start zfs.target
systemctl enable zfs.target
)
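After the storage setup has run, the pool and the NFS service can be checked with the same tools. The following is a sketch. The helper name verify_storage is hypothetical, and it checks only that the kubvols pool reports ONLINE and that nfs-server is active:

```shell
# Hypothetical helper: confirm the ZFS pool is healthy and the NFS server is running.
verify_storage() (
    # -H drops headers, -o health prints only the pool's health column.
    health=$(zpool list -H -o health kubvols) || { echo "pool kubvols not found" >&2; exit 1; }
    [ "$health" = "ONLINE" ] || { echo "pool kubvols is $health" >&2; exit 1; }
    systemctl is-active --quiet nfs-server || { echo "nfs-server is not running" >&2; exit 1; }
)

# Only attempt the check on a machine that has the ZFS tools installed.
if command -v zpool >/dev/null 2>&1; then
    if verify_storage; then
        echo "storage server looks healthy"
    fi
fi
```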
NOTE: These service packages are currently under design, and have not yet been deployed
The ansible playbook deployment includes several performance monitoring and log collection add-ons which run in the kube-system namespace. While these services give good insights into how to operate such addons throughout the cluster, they are not configured to operate in a manner particularly suited to our purposes. This section describes a set of services to be stood up in their place.
So long as the docker engine has the json-file logging driver enabled, kubelet will automatically create symlinks for each container JSON log on the host under /var/log/containers. This puts the container logs under a single directory where they can be easily collected.
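To illustrate what a collector has to work with, the sketch below splits one of these symlink names into its parts. It assumes the kubelet naming convention pod_namespace_container-containerid.log, which should be confirmed against the files actually present on a node. The helper name parse_container_log is hypothetical:

```shell
# Hypothetical helper: split a /var/log/containers file name into its parts.
# Assumes names of the form <pod>_<namespace>_<container>-<container id>.log
# Prints: namespace pod container id
parse_container_log() (
    name=$(basename "$1" .log)
    pod=${name%%_*}          # text before the first underscore
    rest=${name#*_}
    namespace=${rest%%_*}    # text between the first and second underscore
    rest=${rest#*_}
    container=${rest%-*}     # everything before the last hyphen
    id=${rest##*-}           # everything after the last hyphen
    printf '%s %s %s %s\n' "$namespace" "$pod" "$container" "$id"
)

# List whatever container logs exist on this host (prints nothing elsewhere).
for f in /var/log/containers/*.log; do
    [ -e "$f" ] || continue
    parse_container_log "$f"
done
```

Pod and namespace names cannot contain underscores, which is what makes the split on "_" safe under the assumed convention.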
The included addons use fluentd to collect these logs, which is a good option. The cluster should run a fluentd DaemonSet whose only role is to collect container logs and ship them to a fluentd aggregator service.
The fluentd aggregator service should receive shipped log data and in turn forward it to an elasticsearch instance to be persisted. For simplicity's sake, fluentd should ship to elasticsearch using the logstash format.
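With the logstash format, each day's documents land in an index named logstash-YYYY.MM.DD. The sketch below derives that name from a date so that other tooling can address the same index. It assumes the default "logstash" prefix is kept, and logstash_index is a hypothetical helper name:

```shell
# Hypothetical helper: build the logstash-style index name for a YYYY-MM-DD date.
# Assumes the default "logstash" index prefix.
logstash_index() {
    printf 'logstash-%s\n' "$(printf '%s' "$1" | sed 's/-/./g')"
}

# Name of today's index, using UTC.
logstash_index "$(date -u +%Y-%m-%d)"
```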
On the elasticsearch instance, a daily curator job should be run to clean up old indexes, and to create per-project filtered aliases against each new or existing index to provide projects access only to the documents they will need.
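A filtered alias is created by posting an "add" action with a filter to the elasticsearch _aliases endpoint. The sketch below only builds the request body for one project and one index. The field name kubernetes.namespace_name, the alias naming scheme, and the helper name alias_request are all assumptions for illustration; the real filter depends on how the shipped documents are tagged:

```shell
# Hypothetical helper: print an _aliases request body that exposes only one
# project's documents from a given index.
# Usage: alias_request <index> <project>
alias_request() {
    # The term filter field is an assumption; use whatever field identifies the project.
    printf '{"actions":[{"add":{"index":"%s","alias":"%s-%s","filter":{"term":{"kubernetes.namespace_name":"%s"}}}}]}\n' \
        "$1" "$2" "$1" "$2"
}

alias_request logstash-2017.03.01 projectx

# To apply it (es_url is a placeholder for the elasticsearch address):
#   alias_request logstash-2017.03.01 projectx |
#       curl -s -XPOST -H 'Content-Type: application/json' "$es_url/_aliases" -d @-
```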
Log data may be consumed by users through a Grafana interface, in which each filtered alias can be provided as a data source to an organization, members of which can be allowed to freely create and destroy dashboards and analyses against those sources.
Per-node metrics can be provided by running a privileged netdata container as a DaemonSet. These containers can be configured to ship metrics to the fluentd aggregator service for eventual persistence in an elasticsearch index (ideally with logstash-style index names as well). These metrics would then be viewable via dashboards in Grafana.
Per-application metrics can be provided by shipping directly from the application containers to the fluentd aggregator service for eventual persistence in an elasticsearch index. Some thought still needs to be given to whether each application would be given its own index, or whether all application metrics should be stored in the same index, with per-project access being given by way of filtered aliases. Also, should it be undesirable to give application containers the ability to connect directly to the fluentd aggregator, a separate fluentd instance or something like a statsd service could be set up as an intermediary. These metrics could be added to the per-project organization data sources in Grafana.
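If a statsd service were chosen as the intermediary, applications would emit plain-text lines of the form name:value|type, normally over UDP. The sketch below only formats such a line. The metric name is an example and statsd_line is a hypothetical helper; nothing here is part of the current design:

```shell
# Hypothetical helper: format one statsd metric line.
# Usage: statsd_line <name> <value> <type>   (c = counter, g = gauge, ms = timer)
statsd_line() {
    case "$3" in
        c|g|ms) printf '%s:%s|%s\n' "$1" "$2" "$3" ;;
        *) echo "unsupported statsd type: $3" >&2; return 1 ;;
    esac
}

statsd_line projectx.requests 1 c
```

The resulting line would then be sent to the statsd listener, which by default accepts UDP on port 8125.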