
Alpha Cluster Deployment


Architecture

The alpha cluster is deployed under the same architecture described for the test cluster

Deployed servers

The alpha cluster deploys the following instances of each server class:

1 master/control plane server:

  • kubmaster01

2 node/worker servers:

  • kubnode01
  • kubnode02

1 NFS Storage server:

  • kubvol01

Instance information

All of the deployed alpha nodes are one of two types of instance:

  • 1GB instance: (kubmaster01, kubvol01)
  • 2GB instance: (kubnode01, kubnode02)

All instances were deployed in the same datacenter of the same provider in order to enable private network communication

1GB Instance

  • 1GB RAM
  • 1 vCPU
  • 20GB Storage

2GB Instance

  • 2GB RAM
  • 1 vCPU
  • 30GB Storage

Base system deployment

kubmaster01, kubnode01, kubnode02

All three of these machines are deployed as Fedora 25 instances

Post-deployment configuration

  1. Set the system hostname
  2. Apply shared cluster configurations
  3. Disable password logins for the root user
  4. Install netdata for node monitoring
  5. Open firewall port for netdata
  6. Secure public ports
  7. Allow private network traffic
  8. Disable SELinux

Before copy/pasting, set shell variable:

  • host: desired machine hostname
(
  set -e
  hostnamectl set-hostname ${host?}
  dnf -y install git-core
  git clone https://github.com/CodeForPhilly/ops.git /opt/ops
  (
    cd /opt/ops
    ln -s kubernetes/alpha-cluster/post-merge .git/hooks/post-merge
    .git/hooks/post-merge  
  )
  sed -i 's/^PermitRootLogin yes/PermitRootLogin without-password/' /etc/ssh/sshd_config
  systemctl restart sshd
  curl -Ss 'https://raw.githubusercontent.com/firehol/netdata-demo-site/master/install-required-packages.sh' >/tmp/kickstart.sh && bash /tmp/kickstart.sh -i netdata-all && rm -f /tmp/kickstart.sh
  git clone https://github.com/firehol/netdata.git --depth=1
  ( cd netdata && ./netdata-installer.sh --install /opt )
  firewallctl zone '' -p add port 19999/tcp
  firewallctl zone '' -p remove service cockpit
  firewallctl zone internal -p add source 192.168.0.0/16
  firewall-cmd --permanent --zone=internal --set-target=ACCEPT  # for some inexplicable reason, this version of firewallctl does not provide a way to do this
  firewallctl reload
  sed -i 's/SELINUX=enforcing/SELINUX=permissive/' /etc/selinux/config
  setenforce 0
)

kubvol01

This machine is deployed as an openSUSE Leap 42.2 instance

Post-deployment configuration

  1. Set the system hostname
  2. Apply the shared cluster configurations
  3. Disable password logins for the root user
  4. Install the man command (don't ask me why it's not there to start with, or why it depends on 30 f'ing packages)
  5. Install netdata for node monitoring
  6. Lockdown public firewall
  7. Open firewall to private network
(
  set -e
  hostnamectl set-hostname ${host?}
  zypper in -y git-core
  git clone https://github.com/CodeForPhilly/ops.git /opt/ops
  (
    cd /opt/ops
    ln -s kubernetes/alpha-cluster/post-merge .git/hooks/post-merge
    .git/hooks/post-merge  
  )
  sed -i 's/^PermitRootLogin yes/PermitRootLogin without-password/' /etc/ssh/sshd_config
  systemctl restart sshd
  zypper in -y man
  curl -Ss 'https://raw.githubusercontent.com/firehol/netdata-demo-site/master/install-required-packages.sh' >/tmp/kickstart.sh && bash /tmp/kickstart.sh -i netdata-all && rm -f /tmp/kickstart.sh
  git clone https://github.com/firehol/netdata.git --depth=1
  ( cd netdata && ./netdata-installer.sh --install /opt )
  zypper in -y firewalld
  systemctl start firewalld
  systemctl enable firewalld
  firewallctl zone '' -p add interface eth0
  firewallctl zone '' -p add port 19999/tcp
  firewallctl zone internal -p add source 192.168.0.0/16
  firewall-cmd --permanent --zone=internal --set-target=ACCEPT
  firewallctl reload
)

Cluster provisioning

These instructions presume that the workstation from which the administrator is working has been configured with the necessary workstation resources.

kubmaster01, kubnode01, kubnode02

These nodes are deployed using the kubernetes contrib ansible playbooks. The python environment from which ansible is run will require the python-netaddr module in order to use the playbooks.
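
For example, depending on how the workstation's python environment is managed, the module can be installed either from distribution packages or with pip (the Fedora package name below is an assumption about the workstation OS):

  # from distribution packages (assumes a Fedora workstation) ...
  sudo dnf -y install python-netaddr
  # ... or into the python environment that ansible runs from
  pip install netaddr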

Once the dependencies are satisfied, the following steps will provision the kubernetes nodes and master:

  1. Apply cluster configuration data to ansible playbooks
  2. Run ansible playbooks
  3. Open API port on master for remote use of kubectl

Before copy/pasting, set shell variables:

  • repo_contrib: path to kubernetes contrib repo
  • repo_ops: path to the ops repo
(
  set -e
  cp "${repo_ops?}/kubernetes/alpha-cluster/workstation-resources/kubernetes-contrib.patch" "${repo_contrib?}/kubernetes-contrib.patch"
  cd "${repo_contrib?}"
  git apply kubernetes-contrib.patch
  cd ansible/scripts
  ./deploy-cluster.sh
  ssh root@kubmaster01 'firewallctl zone "" -p add service https && firewallctl reload'
)
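
Once the https service is open on the master, kubectl on the workstation can be pointed at it. A minimal sketch; the context name, credentials, and skipping TLS verification are illustrative assumptions, not values produced by the playbooks:

  kubectl config set-cluster alpha --server=https://kubmaster01 --insecure-skip-tls-verify=true
  kubectl config set-credentials alpha-admin --username=admin --password=CHANGEME  # placeholder credentials
  kubectl config set-context alpha --cluster=alpha --user=alpha-admin
  kubectl config use-context alpha
  kubectl get nodes  # should list the cluster nodes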

kubvol01

  1. Install ZFS and NFS
  2. Load ZFS kernel module
  3. Create ZFS pool for container volumes
  4. Run NFS server
  5. Start and enable ZFS services
(
  set -e
  zypper ar obs://filesystems filesystems
  zypper in zfs-kmp-default zfs yast2-nfs-server
  modprobe zfs
  echo zfs > /etc/modules-load.d/zfs.conf
  zpool create -f kubvols /dev/sdc
  systemctl start nfs-server
  systemctl enable nfs-server
  systemctl start zfs.target
  systemctl enable zfs.target
)
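
Individual container volumes can then be carved out of the pool as datasets and exported over NFS. A sketch only; the dataset name and export options below are assumptions, not part of the deployed configuration:

  # create a dataset for a single project volume (mounts at /kubvols/example-volume)
  zfs create kubvols/example-volume
  # export it to the private network and re-read the exports table
  echo '/kubvols/example-volume 192.168.0.0/16(rw,no_root_squash,sync)' >> /etc/exports
  exportfs -ra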

Cluster Services

NOTE: These service packages are currently under design, and have not yet been deployed

The ansible playbook deployment includes several performance monitoring and log collection add-ons which run in the kube-system namespace. While these services give good insights into how to operate such addons throughout the cluster, they are not configured to operate in a manner particularly suited to our purposes. This section describes a set of services to be stood up in their place.

Logging

So long as the docker engine has the json-file logging driver enabled, kubelet will automatically create symlinks for each container JSON log on the host under /var/log/containers. This puts the container logs under a single directory where they can be easily collected.
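
For example, the symlinks can be inspected directly on any node (exact file names vary per pod and container):

  ls -l /var/log/containers/
  # each entry is a symlink of roughly the form
  #   <pod>_<namespace>_<container>-<container-id>.log -> /var/lib/docker/containers/<container-id>/<container-id>-json.log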

The included addons use fluentd to collect these logs, which is a good option. The cluster should run a fluentd DaemonSet whose only role is to collect container logs and ship them to a fluentd aggregator service.
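
A minimal sketch of what that shipper's configuration could look like, using fluentd's in_tail and forward plugins; the config path and the aggregator service name are assumptions:

  cat > /etc/fluent/fluent.conf <<'EOF'
  # tail every container log that kubelet symlinks onto the host
  <source>
    @type tail
    path /var/log/containers/*.log
    pos_file /var/log/fluentd-containers.log.pos
    tag kubernetes.*
    format json
  </source>

  # forward everything to the aggregator service (name is hypothetical)
  <match kubernetes.**>
    @type forward
    <server>
      host fluentd-aggregator.kube-system.svc.cluster.local
      port 24224
    </server>
  </match>
  EOF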

The fluentd aggregator service should receive shipped log data and in turn forward it to an elasticsearch instance to be persisted. For simplicity's sake, fluentd should ship to elasticsearch using the logstash format.
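
On the aggregator side, a matching sketch would accept forwarded records and write them out in logstash format via fluent-plugin-elasticsearch; host names are again assumptions:

  cat > /etc/fluent/fluent.conf <<'EOF'
  # accept records forwarded by the shipper DaemonSet
  <source>
    @type forward
    port 24224
  </source>

  # persist them to elasticsearch using logstash-style daily indexes
  <match kubernetes.**>
    @type elasticsearch
    host elasticsearch.kube-system.svc.cluster.local
    port 9200
    logstash_format true
    logstash_prefix logstash
  </match>
  EOF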

On the elasticsearch instance, a daily curator job should be run to clean up old indexes, and to create per-project filtered aliases against each new or existing index to provide projects access only to the documents they will need.
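
Creating a per-project filtered alias is a single call against the elasticsearch aliases API. A hedged example; the index name follows the logstash daily convention, and the project field and name are hypothetical:

  curl -XPOST 'http://localhost:9200/_aliases' -H 'Content-Type: application/json' -d '{
    "actions": [
      { "add": {
          "index": "logstash-2017.03.15",
          "alias": "project-example-logs",
          "filter": { "term": { "kubernetes.namespace_name": "example" } }
      } }
    ]
  }'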

Log data may be consumed by users through a Grafana interface, in which each filtered alias can be provided as a data source to an organization, members of which can be allowed to freely create and destroy dashboards and analyses against those sources.
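
Each filtered alias could be registered as an elasticsearch data source through Grafana's HTTP API and then assigned to the appropriate organization. A sketch; the credentials, host names, and alias name are placeholders:

  curl -X POST 'http://admin:CHANGEME@grafana.example.org:3000/api/datasources' \
    -H 'Content-Type: application/json' \
    -d '{
      "name": "project-example-logs",
      "type": "elasticsearch",
      "access": "proxy",
      "url": "http://elasticsearch.kube-system.svc.cluster.local:9200",
      "database": "project-example-logs",
      "jsonData": { "timeField": "@timestamp", "esVersion": 5 }
    }'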

Metrics and Monitoring

Per-node metrics can be provided by running a privileged netdata container as a DaemonSet. These containers can be configured to ship metrics to the fluentd aggregator service for eventual persistence in an elasticsearch index (ideally using logstash-style index names as well). These metrics would then be viewable via dashboards in Grafana.
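
A rough sketch of such a DaemonSet, assuming a publicly available netdata image and the API group names current at the time of writing:

  kubectl apply -f - <<'EOF'
  apiVersion: extensions/v1beta1
  kind: DaemonSet
  metadata:
    name: netdata
    namespace: kube-system
  spec:
    template:
      metadata:
        labels:
          app: netdata
      spec:
        hostNetwork: true
        hostPID: true
        containers:
        - name: netdata
          image: titpetric/netdata   # image choice is an assumption
          securityContext:
            privileged: true
          ports:
          - containerPort: 19999
          volumeMounts:
          - name: proc
            mountPath: /host/proc
            readOnly: true
        volumes:
        - name: proc
          hostPath:
            path: /proc
  EOF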

Per-application metrics can be provided by shipping directly from the application containers to the fluentd aggregator service for eventual persistence in an elasticsearch index. Some thought still needs to be given to whether each application would be given its own index, or whether all application metrics should be stored in the same index, with per-project access being given by way of filtered aliases. Also, should it be undesirable to give application containers the ability to connect directly to the fluentd shipper, a separate fluentd instance or something like a statsd service could be set up as an intermediary. These metrics could be added to the per-project organization data sources in Grafana.
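
As an illustration of the direct-shipping option, anything that can reach the aggregator's forward port can emit a record, for example with the fluent-cat utility that ships with fluentd (the tag, host, and field names here are made up):

  echo '{"metric": "requests_served", "value": 42}' \
    | fluent-cat app.example.metrics -h fluentd-aggregator.kube-system -p 24224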
