deployment: add helm charts, configuration and openshift docs

ppanero · ppanero · commit d0c3834f7b63 · 2020-03-17T16:54:19.000+01:00
diff --git a/docs/deployment/configuration.md b/docs/deployment/configuration.md
@@ -0,0 +1,93 @@
+# Configuration
+
+This section explains the configuration options currently available for the different components deployed by the Helm charts.
+
+## Global
+
+There is only one mandatory configuration option, the host name. For example:
+
+```yaml
+host: your-rdm-instance.com
+```
+
+The supporting services can be deployed alongside the application. However, for a production deployment it is recommended to deploy Elasticsearch and PostgreSQL separately.
+Therefore, by default only `redis` and `rabbitmq` are enabled. Example configuration:
+
+``` yaml
+postgresql:
+  inside_cluster: false
+
+elasticsearch:
+  inside_cluster: false
+```
+
+!!! info "inside_cluster availability"
+    Note that the `inside_cluster` variable is supported for `redis`, `rabbitmq`, `elasticsearch`, `postgresql` and `haproxy`. The rest of the components
+    are mandatory.
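+
+For a demo deployment, the same keys can presumably be flipped to bring Elasticsearch and PostgreSQL into the cluster. The following is a minimal sketch, assuming that `inside_cluster: true` is all that is required. Keep in mind that these in-cluster services are not configured with redundancy or persistence:
+
+``` yaml
+# Assumption: setting inside_cluster to true deploys the service with the chart
+postgresql:
+  inside_cluster: true
+
+elasticsearch:
+  inside_cluster: true
+```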
+
+## HAProxy
+
+You can change the number of connections allowed by HAProxy with the `maxconn` variable:
+
+``` yaml
+haproxy:
+  maxconn: 100
+```
+
+!!! warning "Only one parent element"
+    HAProxy is enabled by default (`inside_cluster` has the value `true`). If you decide to set it explicitly, specify the `haproxy` parent
+    only once in the YAML file, otherwise the last occurrence overrides the previous ones. It should look something like:
+    ``` yaml
+    haproxy:
+        inside_cluster: true
+        maxconn: 100
+    ```
+## Nginx
+
+The charts allow you to configure the number of connections per Nginx node (replica) and the number of nodes:
+
+``` yaml
+nginx:
+  max_conns: 100
+  replicas: 2
+```
+
+## Web nodes
+
+The web nodes host the WSGI application. To scale them, you can configure the number of nodes (called replicas), how many processes each node runs, and how many threads each process uses. The only mandatory parameter is the Docker image (`image`), whose value is the URL to pull the image from.
+
+In addition, you can add automatic scaling by setting the minimum and maximum number of replicas and the CPU usage threshold at which a new node is spawned. For example, a threshold of 65% means that when the average CPU utilization of the nodes reaches 65%, a new node is spawned, until the configured maximum is reached:
+
+``` yaml
+web:
+  image: your/invenio-image
+  replicas: 6
+  uwsgi:
+    processes: 6
+    threads: 4
+  autoscaler:
+    enabled: false
+    # Scale when CPU usage gets to
+    scaler_cpu_utilization: 65
+    max_web_replicas: 10
+    min_web_replicas: 2
+```
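+
+The example above leaves the autoscaler disabled. To turn it on, a minimal sketch could look like the following. It reuses the keys shown above and assumes that `enabled: true` is the only change needed to activate autoscaling:
+
+``` yaml
+web:
+  image: your/invenio-image
+  autoscaler:
+    # Assumption: this flag alone activates autoscaling
+    enabled: true
+    # Spawn a new node when average CPU utilization reaches 65%
+    scaler_cpu_utilization: 65
+    max_web_replicas: 10
+    min_web_replicas: 2
+```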
+
+## Worker nodes
+
+The worker nodes are enabled by default, but you can skip their deployment by setting `enabled` to `false`. If enabled, they require an `image`,
+in the same way as the web nodes.
+
+In addition, you can configure how many worker nodes (replicas) are deployed, which application they run, their concurrency level and their logging level.
+
+``` yaml
+worker:
+  enabled: true
+  image: your/invenio-image
+  # Invenio Celery worker application
+  app: invenio_app.celery
+  # Number of concurrent Celery workers per pod
+  concurrency: 2
+  log_level: INFO
+  replicas: 2
+```
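+
+Conversely, to skip the worker deployment, a minimal sketch based on the `enabled` flag described above would be:
+
+``` yaml
+worker:
+  # No worker nodes are deployed; the remaining worker options are then presumably ignored
+  enabled: false
+```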
diff --git a/docs/deployment/index.md b/docs/deployment/index.md
@@ -0,0 +1,60 @@
+# How can I deploy InvenioRDM?
+
+You can deploy InvenioRDM in several ways. You can install it on your [local computer](../develop/index.md) to have a more easily customizable environment, or you can deploy a [containerized environment](../preview/index.md) that demonstrates the setup of all components running in Docker containers. This section explains how to deploy it in a closer-to-production manner.
+
+!!! warning "Do not deploy as-is in production"
+    Please note the term "closer-to-production": although the architecture is designed to scale and withstand the load of a production service (it has been tested to withstand peaks of up to 180 requests/s), the security configuration might not be sufficient and you should review it. In addition, the charts can deploy the extra services (Elasticsearch and PostgreSQL) alongside the application, but these are not configured with redundancy or persistence.
+
+## Helm Charts
+
+[Helm](https://helm.sh) is the package manager for [Kubernetes](https://kubernetes.io/). This means that by using Helm charts you can deploy InvenioRDM on any cloud provider that supports Kubernetes (e.g. OpenShift clusters, Google Cloud, Amazon Web Services, IBM Cloud).
+
+**What is a Helm chart?**
+
+A Helm chart is a definition of the architecture of the system, meaning how all components interconnect with each other (similar to a `docker-compose` file).
+
+In addition, Helm allows you to **install, version, upgrade and roll back** your InvenioRDM installation in an easy way. You can find more information about Helm [here](https://helm.sh/docs/intro/quickstart/).
+
+### Charts description
+
+The current charts propose the following architecture:
+
+- HAProxy as entry point. It provides load balancing and queuing of the requests.
+- Nginx as reverse proxy. It helps HAProxy and uWSGI "talk" the same language (protocol).
+- Web application nodes, running the uWSGI application.
+- Redis and RabbitMQ come along in containers.
+- Elasticsearch and PostgreSQL can be added to the deployment, however they are not configured in-depth and therefore not suited for more than demo purposes.
+
+For more in-depth documentation see the [services description](services.md) and the configuration available [here](configuration.md).
+
+## Pre-Requirements
+
+- [Helm](https://helm.sh/docs/intro/install/) version 3.x
+- Adding the [helm-invenio](https://github.com/inveniosoftware/helm-invenio) repository
+
+``` console
+$ helm repo add helm-invenio https://inveniosoftware.github.io/helm-invenio/
+$ helm repo update
+$ helm search repo invenio --versions
+
+NAME                   	CHART VERSION	APP VERSION	DESCRIPTION
+helm-invenio/invenio	0.2.0        	1.16.0     	Open Source framework for large-scale digital repositories
+helm-invenio/invenio	0.1.0        	1.16.0     	Open Source framework for large-scale digital repositories
+```
+
+You can also install from GitHub by cloning the repository:
+
+``` console
+$ git clone https://github.com/inveniosoftware/helm-invenio.git
+$ cd helm-invenio/
+```
+
+In that case, you need to reference the `./invenio` folder rather than the chart name (`helm-invenio/invenio`).
+
+## Supported Platforms
+
+!!! warning "Only compatible with OpenShift"
+    Please note that currently these Helm charts are only compatible with OpenShift.
+
+- [OpenShift](openshift.md)
+- [Kubernetes](kubernetes.md)
diff --git a/docs/deployment/openshift.md b/docs/deployment/openshift.md
@@ -1,3 +1,162 @@
 # OpenShift
 
-Coming soon
+## Pre-Requirements
+
+- [Global deployment pre-requirements](index.md#pre-requirements)
+- [OpenShift CLI](https://docs.openshift.com/container-platform/4.3/cli_reference/openshift_cli/getting-started-cli.html#cli-installing-cli_cli-developer-commands) version 3.11+
+
+
+## Deploying InvenioRDM
+
+First, log in and select the right project in your OpenShift cluster:
+
+```console
+$ oc login <your.openshift.cluster>
+$ oc project invenio
+```
+
+### Secrets
+
+Before deploying, you need to provide the credentials so the application can access the different services:
+
+**Database secrets:**
+
+```console
+$ POSTGRESQL_PASSWORD=$(openssl rand -hex 8)
+$ POSTGRESQL_USER=invenio
+$ POSTGRESQL_HOST=db
+$ POSTGRESQL_PORT=5432
+$ POSTGRESQL_DATABASE=invenio
+$ oc create secret generic \
+  --from-literal="POSTGRESQL_PASSWORD=$POSTGRESQL_PASSWORD" \
+  --from-literal="SQLALCHEMY_DB_URI=postgresql+psycopg2://$POSTGRESQL_USER:$POSTGRESQL_PASSWORD@$POSTGRESQL_HOST:$POSTGRESQL_PORT/$POSTGRESQL_DATABASE" \
+  db-secrets
+secret "db-secrets" created
+```
+
+**RabbitMQ secrets:**
+
+```console
+$ RABBITMQ_DEFAULT_PASS=$(openssl rand -hex 8)
+$ oc create secret generic \
+  --from-literal="RABBITMQ_DEFAULT_PASS=$RABBITMQ_DEFAULT_PASS" \
+  --from-literal="CELERY_BROKER_URL=amqp://guest:$RABBITMQ_DEFAULT_PASS@mq:5672/" \
+  mq-secrets
+secret "mq-secrets" created
+```
+
+**Elasticsearch secrets:**
+
+!!! info "Elasticsearch variables"
+    Currently, and until [invenio-search#198](https://github.com/inveniosoftware/invenio-search/issues/198) has been addressed, the Elasticsearch configuration
+    has to be loaded in a single environment variable.
+
+``` console
+$ export INVENIO_SEARCH_ELASTIC_HOSTS="[{'host': 'localhost', 'timeout': 30, 'port': 9200, 'use_ssl': True, 'http_auth':('USERNAME_CHANGEME', 'PASSWORD_CHANGEME')}]"
+$ oc create secret generic \
+  --from-literal="INVENIO_SEARCH_ELASTIC_HOSTS=$INVENIO_SEARCH_ELASTIC_HOSTS" \
+  elasticsearch-secrets
+```
+
+!!! info "Extra configuration is possible"
+    Note that you might need to add extra configuration to the Elasticsearch hosts, such as certificate verification (`verify_certs`), prefixing (`url_prefix`) and more.
+
+### Install InvenioRDM
+
+Before installing, you need to configure two things. The rest of the options are optional, and you can read more about them [here](configuration.md):
+
+- Your host in a `values.yaml` file.
+- The web/worker docker images.
+
+``` yaml
+host: yourhost.localhost
+
+web:
+  image: your/invenio-image
+
+worker:
+  image: your/invenio-image
+```
+
+The next step is the installation itself, with your own configuration in the `values.yaml`. If you added the repository you can install it by using the chart name and the desired version:
+
+``` console
+$ helm install -f values.yaml invenio helm-invenio/invenio --version 0.2.0
+```
+
+If you want to install from a clone of the GitHub repository, you can do so as follows:
+
+``` console
+$ cd helm-invenio/
+$ helm install -f values.yaml invenio ./invenio [--disable-openapi-validation]
+```
+
+In both cases the output will be:
+
+``` console
+NAME: invenio
+LAST DEPLOYED: Mon Mar  9 16:25:15 2020
+NAMESPACE: default
+STATUS: deployed
+REVISION: 1
+TEST SUITE: None
+NOTES:Invenio is ready to rock :rocket:
+```
+
+!!! warning "Bypassing openapi validation"
+    We must pass `--disable-openapi-validation` as there is currently a problem with OpenShift objects and Helm when it comes to client side validation, see [issue](https://github.com/openshift/origin/issues/24060).
+
+
+### Setup the instance
+
+Once the instance has been installed, you have to set up the services. Note that this step is only needed the first time. It is not needed when you are upgrading the instance.
+
+Get a bash terminal in a web pod:
+
+```console
+$ oc get pods
+$ oc exec -it <web-pod> bash
+```
+
+Set up the instance using the `invenio` commands:
+
+``` console
+$ . scl_source enable rh-python36
+$ invenio db init # If the db does not exist already, otherwise `create` is enough
+$ invenio db create
+$ invenio index init
+$ invenio index queue init purge
+$ invenio files location --default 'default-location'  $(invenio shell --no-term-title -c "print(app.instance_path)")'/data'
+$ invenio roles create admin
+$ invenio access allow superuser-access role admin
+```
+
+#### Launching jobs
+
+**One time job**
+
+In some cases you might want to run jobs, for example to populate the instance with records.
+
+``` console
+$ oc process -f job.yml --param JOB_NAME='demo-data-1' \
+  --param JOB_COMMAND='invenio rdm-records demo' | oc create -f -
+```
+
+**Cron job**
+
+Now imagine you have some bulk record creation that needs to be indexed, or any other task that has to run periodically.
+For that you can define cron jobs:
+
+``` console
+$ oc process -f cronjob.yml --param JOB_NAME=index-run \
+  --param JOB_COMMAND='invenio index run -d' | oc create -f -
+```
+
+### Upgrade your instance
+
+If you have made changes to your instance (e.g. configuration) or you want to upgrade the version of the charts, you can do so with
+the `upgrade` command of `helm`. Note that you still need to disable the OpenAPI validation, which is only supported after version 3.1.2:
+
+``` console
+$ helm upgrade -f values.yaml invenio helm-invenio/invenio --disable-openapi-validation
+```
diff --git a/docs/deployment/services.md b/docs/deployment/services.md
@@ -0,0 +1,3 @@
+# Services
+
+Coming soon
diff --git a/docs/index.md b/docs/index.md
@@ -49,4 +49,4 @@ Ready to deploy into production? Follow these guides to put a small instance in
 production and learn the avenues to take and decisions to make to grow it.
 Whether you use Openshift or Kubernetes, we've got you covered.
 
-[> Deployment Guides](deployment/openshift.md)
+[> Deployment Guides](deployment/index.md)
diff --git a/mkdocs.yml b/mkdocs.yml
@@ -43,6 +43,9 @@ nav:
     - Add your extensions: 'extensions/custom.md'
     - S3 Storage: 'extensions/s3.md'
   - Deploy:
+    - How can I deploy it?: 'deployment/index.md'
+    - System components: 'deployment/services.md'
+    - Configuration:  'deployment/configuration.md'
     - Kubernetes: 'deployment/kubernetes.md'
     - OpenShift: 'deployment/openshift.md'