The Artifact Discovery and Preparation Toolkit Kubernetes Cluster (NLP-ADAPT-Kube) allows researchers to scale up the tools found in the NLP-ADAPT VM, processing larger volumes of data for use in ensembling multiple annotator engines.
Scalability and performance: NLP-ADAPT-Kube will allow researchers to distribute annotator engine processing in parallel across multiple systems.
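One way to express this kind of parallel fan-out is as an Argo workflow. The sketch below is illustrative only: the image name, shard labels, and workflow structure are hypothetical, not taken from this repository. It runs one annotator container per input shard, with the steps executing in parallel:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: annotate-
spec:
  entrypoint: fan-out
  templates:
  - name: fan-out
    steps:
    - - name: annotate
        template: run-annotator
        arguments:
          parameters:
          - name: shard
            value: "{{item}}"
        # Argo expands this step into one parallel pod per item
        withItems: ["shard-0", "shard-1", "shard-2"]
  - name: run-annotator
    inputs:
      parameters:
      - name: shard
    container:
      image: nlp-adapt/annotator:latest   # hypothetical image name
      args: ["--input", "{{inputs.parameters.shard}}"]
```

Because each shard is an independent pod, the Kubernetes scheduler can place the annotator containers across all nodes in the cluster.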
Usability: We simplify the process of deploying a Kubernetes cluster by utilizing standard utilities, including:

- ansible-playbook for building local dependencies
- docker for building Docker images
- kubectl for deployment and monitoring of the Kubernetes cluster
- argo for workflow management on the Kubernetes cluster
- kubeadm for simplifying creation of a multi-node Kubernetes cluster
- minikube for localized testing of the Kubernetes cluster
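A typical local iteration with these utilities might look like the following sketch; the image, manifest, and workflow file names are hypothetical placeholders, not files from this repository:

```shell
# Start a single-node cluster for local testing
minikube start

# Point docker at minikube's daemon so built images are visible to the cluster
eval $(minikube docker-env)
docker build -t nlp-adapt/annotator:dev .   # hypothetical image name

# Deploy and monitor with kubectl (hypothetical manifest name)
kubectl apply -f annotator-deployment.yaml
kubectl get pods --watch

# Submit and follow a workflow with argo (hypothetical workflow file)
argo submit annotate-workflow.yaml --watch
```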
We use minikube for testing on a single-node Kubernetes cluster. We also successfully deployed a multi-node Kubernetes cluster with a single master node using kubeadm.
All local development can be done using minikube and then pushed to a production cluster. Our production cluster runs directly on native Linux hosts, using Calico as a virtualized overlay network to manage internal requests within the cluster.
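Bootstrapping a single-master cluster with kubeadm and Calico generally follows the standard pattern sketched below; the pod network CIDR and the Calico manifest URL vary by version, so treat these values as assumptions to verify against the current Calico documentation:

```shell
# On the master node: initialize the control plane,
# reserving a pod network CIDR for Calico
sudo kubeadm init --pod-network-cidr=192.168.0.0/16

# Configure kubectl for the current user
mkdir -p $HOME/.kube
sudo cp /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# Install Calico as the overlay network (URL may differ by release)
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml

# On each worker node: join the cluster using the token
# and hash printed by `kubeadm init`
sudo kubeadm join <master-ip>:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash>
```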
If you are interested in using this, please submit an issue. We can provide assistance with installation and configuration, as well as provide the requisite Docker images needed to run this workflow.
This project, including the Wiki, is a work in progress.
Funding for this work was provided by:
- 5U01TR002062-02