Improve how to deploy a collection #172
Merged
Changes from 14 commits
16 commits
0a3194d Improve how to deploy a collection (Ndpnt)
dd477c4 Document how to manage a custom terms type (Ndpnt)
abe1dd2 Fix weight (Ndpnt)
8033aec Improve how to deploy a collection (Ndpnt)
115e36e Apply suggestions from code review (Ndpnt)
bc04f4a Improve how to manage a custom terms type (Ndpnt)
949df7c Fix hover color for footer links (Ndpnt)
e62abd9 Increase footer menu items spacing for readability (Ndpnt)
380f03e Fix english grammar (Ndpnt)
409bffc Add shortcode to show content by query param (Ndpnt)
4dbf569 Explain terms types (Ndpnt)
3312601 Move and improve guide to manage custom terms type (Ndpnt)
532d689 Improve how content specific to ota members is handled (Ndpnt)
2a23b6d Create a reference for Deployment architecture (Ndpnt)
f6c4616 Improve showIfParam shortcode (Ndpnt)
dc53468 Improve CSS (Ndpnt)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
--- | ||
title: Take over a collection | ||
weight: 4 | ||
weight: 5 | ||
--- | ||
|
||
# How to take over a collection | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
--- | ||
title: Terminate a collection | ||
weight: 3 | ||
weight: 4 | ||
--- | ||
|
||
# How to terminate a collection | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5,141 +5,226 @@ weight: 1 | |
|
||
# How to deploy a collection | ||
|
||
This guide will help you deploy an Open Terms Archive collection to a server. | ||
This guide will help you deploy an Open Terms Archive collection to a server. The deployment is automated using [Ansible](https://docs.ansible.com/ansible/latest/index.html) and will set up the Open Terms Archive engine and configure it to track your collection's terms. | ||
|
||
## Prerequisites | ||
|
||
Before starting, ensure you have: | ||
|
||
- A basic understanding of the [deployment architecture]({{< relref "deployment/reference/architecture" >}}) | ||
- A server with admin access | ||
- All collection repositories created; if not, see the [guide to create repositories]({{< relref "collections/how-to/create-repositories" >}}) | ||
- At least one declaration added to your collection | ||
- A GitHub user account to automate actions such as committing entries in versions and snapshots repositories, reporting issues when tracking fails, publishing releases… | ||
- [Ansible](https://docs.ansible.com/ansible/latest/installation_guide/intro_installation.html) installed on your local machine | ||
|
||
## 1. Configure the server | ||
|
||
First, ensure your server provides unsupervised access: | ||
|
||
1. Check the SSH host key: | ||
1. Check the SSH host key and get the SSH fingerprint by running the following command on your local machine: | ||
|
||
```shell | ||
ssh-keyscan --type=ed25519 <server_address> | ||
ssh-keyscan -t ed25519 <server_address> | ||
``` | ||
If no Ed25519 key appears, generate one on the server: | ||
|
||
If no Ed25519 key appears, generate one by running the following commands on the server: | ||
|
||
```shell | ||
sudo ssh-keygen --type=ed25519 --file=/etc/ssh/ssh_host_ed25519_key | ||
sudo ssh-keygen -t ed25519 -f /etc/ssh/ssh_host_ed25519_key | ||
sudo systemctl restart ssh | ||
``` | ||
|
||
2. Create a non-root user if needed: | ||
> **Note**: A server fingerprint is a unique identifier for your server's SSH key. It helps verify that you're connecting to the correct server and not a malicious one. The fingerprint is a hash of the server's public key and is used to prevent man-in-the-middle attacks. You'll need this fingerprint in the next steps for secure deployment. | ||
|
||
2. Create a dedicated user account specifically for deployment purposes, by running the following commands on the server: | ||
|
||
```shell | ||
adduser <user> | ||
usermod --append --groups=sudo <user> | ||
adduser <deployment_user> | ||
usermod --append --groups=sudo <deployment_user> | ||
``` | ||
|
||
3. Grant passwordless sudo access: | ||
> **Note**: The `adduser` command might not be installed by default on your system. It can be installed with `sudo apt-get install adduser`. | ||
|
||
3. Configure passwordless sudo access for this user, by adding the following line to the `/etc/sudoers` file on the server: | ||
|
||
```shell | ||
# Add to /etc/sudoers: | ||
<user> ALL=(ALL) NOPASSWD:ALL | ||
<deployment_user> ALL=(ALL) NOPASSWD:ALL | ||
``` | ||
|
||
> **Note**: While passwordless sudo access does reduce security compared to requiring a password, it is required for full automation in deployment workflows with Ansible. The deployment process requires system-level operations (like installing packages and configuring services) that must be executed without manual intervention. To mitigate security risks, this configuration is limited to a dedicated deployment user that should only be used for deployment purposes, and the server must be properly secured with SSH key authentication. | ||
|
||
## 2. Set up the deployment configuration | ||
|
||
1. Clone the collection declarations repository: | ||
1. Clone the collection declarations repository that you want to deploy and navigate to the collection folder: | ||
|
||
```shell | ||
git clone https://github.com/OpenTermsArchive/<collection_id>-declarations.git | ||
git clone https://github.com/<organization>/<collection_id>-declarations.git | ||
cd <collection_id>-declarations | ||
``` | ||
|
||
2. Configure the inventory file `deployment/inventory.yml`: | ||
2. Configure the inventory file `deployment/inventory.yml` with your server's IP address, deployment user, server fingerprint and the repository URL: | ||
|
||
```yaml | ||
<host>: "your.server.ip" | ||
ansible_user: "your_username" | ||
ed25519_fingerprint: "your_ssh_fingerprint" | ||
<server_ip>: | ||
ansible_user: <deployment_user> | ||
ed25519_fingerprint: <server_ssh_fingerprint> | ||
ota_source_repository: https://github.com/<organization>/<collection_id>-declarations.git | ||
``` | ||
|
||
3. Add the server fingerprint to GitHub: | ||
- Go to `https://github.com/OpenTermsArchive/<collection_name>-declarations/settings/secrets/actions` | ||
3. Add the server fingerprint to GitHub, to allow the deployment workflow to uniquely identify the server: | ||
- Go to `https://github.com/<organization>/<collection_id>-declarations/settings/secrets/actions` | ||
- Create a new secret named `SERVER_FINGERPRINT` with your Ed25519 fingerprint | ||
|
||
## 3. Configure SSH deployment keys | ||
|
||
1. On the server, generate a deployment key: | ||
1. On the server, generate a deployment key, which will be used by the continuous deployment workflow to connect to the server to deploy the collection: | ||
|
||
```shell | ||
ssh-keygen --type=ed25519 --quiet --passphrase="" --file=~/.ssh/ota-deploy | ||
ssh-keygen -t ed25519 -N "" -f ~/.ssh/ota-deploy | ||
cat ~/.ssh/ota-deploy.pub >> ~/.ssh/authorized_keys | ||
``` | ||
|
||
2. Add the private key to GitHub: | ||
- Go to the repository secrets | ||
- Create `SERVER_SSH_KEY` with the private key content | ||
2. Add the private key to GitHub, to allow the deployment workflow to connect to the server: | ||
- Go to `https://github.com/<organization>/<collection_id>-declarations/settings/secrets/actions` | ||
- Create a new secret named `SERVER_SSH_KEY` with the private key content | ||
|
||
3. Back up the keys: | ||
- Store both public and private keys in the shared password database | ||
- Create an entry titled "Deployment SSH key" in the collection folder | ||
{{< showIfParam "ota" >}} | ||
3. Back up the keys in the shared password database by creating an entry titled "Deployment SSH Key" in the collection folder and storing both public and private keys in this entry | ||
{{< /showIfParam >}} | ||
|
||
## 4. Set up GitHub permissions | ||
|
||
1. Create a fine-grained GitHub token: | ||
- Log in as OTA-Bot | ||
1. Log in as the user account dedicated to bot-related actions in GitHub | ||
|
||
2. Create a fine-grained GitHub token: | ||
- Create a new token at github.com/settings/personal-access-tokens/new | ||
- Set repository access for both declarations and versions repos | ||
- Set repository access for both declarations and versions repositories | ||
- Grant "Contents" and "Issues" write permissions | ||
|
||
2. Back up the token: | ||
- Store it in the shared password database under "GitHub Token" | ||
3. If relevant, get the token approved by having an organization admin approve the token request | ||
|
||
3. Get the token approved: | ||
- Have an organization admin approve the token request | ||
4. Keep this token for the next steps | ||
|
||
## 5. Configure secrets | ||
{{< showIfParam "ota" >}} | ||
5. Back up the token in the shared password database by creating an entry titled "GitHub Token" in the collection folder and storing the token in this entry | ||
{{< /showIfParam >}} | ||
|
||
## 5. Configure and encrypt secrets | ||
|
||
This section uses [Ansible Vault](https://docs.ansible.com/ansible/latest/vault_guide/index.html), a feature of Ansible that allows you to encrypt sensitive data like passwords and keys. The encrypted files can be safely committed to version control while keeping the actual secrets secure. The vault key you'll create will be used to encrypt and decrypt these secrets. | ||
|
||
1. Generate and store a vault key: | ||
- Generate a secure password without quotes/backticks | ||
- Store it in the password database | ||
- Create `deployment/vault.key` with the password | ||
- Add it as `ANSIBLE_VAULT_KEY` in GitHub secrets | ||
- Inside the collection folder, create a file named `deployment/vault.key` and paste the generated password into it. | ||
- Go to `https://github.com/<organization>/<collection_id>-declarations/settings/secrets/actions` | ||
- Create a new secret named `ANSIBLE_VAULT_KEY` and paste the same password into it. | ||
|
||
2. Store GitHub token: | ||
``` | ||
# In deployment/.env: | ||
> **Note**: The same vault key is used in two places: | ||
> - Locally as `vault.key` to encrypt/decrypt files during development | ||
> - In GitHub Actions as `ANSIBLE_VAULT_KEY` to decrypt files during automated deployment | ||
|
||
2. Store the GitHub token, generated in the previous section, in `deployment/.env`: | ||
|
||
```shell | ||
OTA_ENGINE_GITHUB_TOKEN=your_token | ||
``` | ||
|
||
3. Encrypt the `.env` file: | ||
3. Encrypt the `.env` file by running the following command inside the `deployment` folder of the collection: | ||
|
||
```shell | ||
ansible-vault encrypt .env | ||
``` | ||
|
||
> **Note**: Running the command from the `deployment` folder will ensure that the `vault.key` file is used as vault key, since this folder contains an `ansible.cfg` file that explicitly configures this behavior. | ||
> | ||
> To decrypt an encrypted file, use: | ||
> | ||
> ```shell | ||
> ansible-vault decrypt deployment/.env | ||
> ``` | ||
> | ||
> After making changes, re-encrypt it: | ||
> | ||
> ```shell | ||
> ansible-vault encrypt deployment/.env | ||
> ``` | ||
|
||
4. Commit the changes to the repository | ||
|
||
{{< showIfParam "ota" >}} | ||
5. Back up the vault key in the shared password database by creating an entry titled "Vault Key" in the collection folder and storing the vault key in this entry | ||
{{< /showIfParam >}} | ||
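
Before committing, it can help to confirm that the secrets really are encrypted. The following is a minimal sketch, not part of the guide: it assumes that files encrypted by Ansible Vault begin with a header line starting with `$ANSIBLE_VAULT;`, and the function names `is_vault_encrypted` and `check_secrets` are hypothetical helpers introduced here for illustration.

```shell
# Sketch: refuse to continue when a secrets file is not Ansible Vault encrypted.
# Assumption: encrypted files start with a "$ANSIBLE_VAULT;" header line.

is_vault_encrypted() {
  # Succeeds when the first line of the file "$1" is an Ansible Vault header.
  [ -f "$1" ] || return 1
  first_line=$(head -n 1 "$1")
  case "$first_line" in
    '$ANSIBLE_VAULT;'*) return 0 ;;
    *) return 1 ;;
  esac
}

check_secrets() {
  # Prints one line per file; returns non-zero if any file is not encrypted.
  status=0
  for file in "$@"; do
    if is_vault_encrypted "$file"; then
      echo "encrypted: $file"
    else
      echo "NOT ENCRYPTED: $file"
      status=1
    fi
  done
  return "$status"
}
```

For example, running `check_secrets deployment/.env && git commit` from the collection folder would only commit once the `.env` file carries the vault header.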
|
||
## 6. Set up collection-specific SSH key | ||
|
||
1. Generate a new key: | ||
1. Generate a new key, which will be used by the Open Terms Archive engine to perform actions on GitHub as the bot user: | ||
|
||
```shell | ||
ssh-keygen --type=ed25519 --comment=[email protected] --passphrase="" --file=./<collection_name>-key | ||
ssh-keygen -t ed25519 -C [email protected] -N "" -f ./<collection_name>-key | ||
``` | ||
|
||
2. Encrypt and store the private key: | ||
2. Store the private key in `deployment/github-bot-private-key` | ||
|
||
3. Encrypt the private key file by running the following command inside the `deployment` folder of the collection: | ||
|
||
```shell | ||
# Copy private key to deployment/github-bot-private-key | ||
ansible-vault encrypt github-bot-private-key | ||
``` | ||
|
||
3. Add the public key to OTA-Bot's GitHub account: | ||
4. Commit the changes to the repository | ||
|
||
5. Add the public key to the bot user's GitHub account: | ||
- Go to github.com/settings/ssh/new | ||
- Add the public key with title "<collection_name> collection" | ||
|
||
{{< showIfParam "ota" >}} | ||
6. Back up the key in the shared password database by creating an entry titled "OTA-Bot GitHub SSH key" in the collection folder and storing both public and private keys in this entry | ||
{{< /showIfParam >}} | ||
|
||
## 7. Configure email notifications | ||
|
||
1. Generate SMTP credentials: | ||
- Create a new SMTP key in Brevo | ||
- Name it "<collection_name> collection" | ||
This section describes how to configure the engine to use a specific SMTP server to send email notifications when it encounters errors during the tracking process. This helps you stay informed about issues that need attention and allows you to restart the tracking process if needed. | ||
|
||
1. Get the SMTP credentials (host, username, password) from your email provider | ||
|
||
2. Update collection SMTP configuration within the `logger` key of `@opentermsarchive/engine` in the `config/production.json` file: | ||
|
||
```json | ||
"logger": { | ||
"smtp": { | ||
"host": "<smtp_host>", | ||
"username": "<smtp_username>" | ||
} | ||
}, | ||
``` | ||
|
||
3. Store the password in `deployment/.env`: | ||
|
||
2. Store the credentials: | ||
```shell | ||
# In deployment/.env: | ||
OTA_ENGINE_SMTP_PASSWORD=your_smtp_key | ||
``` | ||
|
||
3. Encrypt the `.env` file: | ||
> **Note**: If `.env` was already encrypted in a previous step, first decrypt it with `ansible-vault decrypt .env` so that the password can be added. | ||
|
||
4. Encrypt the `.env` file: | ||
|
||
```shell | ||
ansible-vault encrypt .env | ||
``` | ||
|
||
{{< showIfParam "ota" >}} | ||
5. Create a new SMTP key in Brevo and name it "<collection_name> collection" | ||
6. Back up the key in the shared password database by creating an entry titled "SMTP Key" in the collection folder and storing the credentials in this entry | ||
{{< /showIfParam >}} | ||
|
||
## 8. Test the deployment | ||
|
||
1. Via GitHub Actions: | ||
- Check that the `deploy` action completes successfully | ||
|
||
2. Via local deployment: | ||
|
||
```shell | ||
cd <collection_id>-declarations/deployment | ||
ansible-galaxy collection install --requirements-file requirements.yml | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,84 @@ | ||
--- | ||
title: Deployment architecture | ||
linkTitle: Architecture | ||
weight: 1 | ||
--- | ||
|
||
# Deployment architecture | ||
|
||
This document provides an overview of the key components and elements involved in the deployment process of a collection. | ||
|
||
## Repository structure | ||
|
||
A collection is defined by three repositories that work together to manage and track terms. | ||
|
||
The declarations repository, `<collection_name>-declarations`, serves as the primary workspace for collection maintainers, containing declarations of the terms to track along with engine and deployment configurations. | ||
|
||
This repository is complemented by two automatically managed repositories: | ||
|
||
- The versions repository, `<collection_name>-versions`, which maintains a chronological history of terms changes in their readable format | ||
- The snapshots repository, `<collection_name>-snapshots`, which maintains a chronological history of the original source document (HTML, PDF…) from which the terms will be extracted | ||
|
||
These repositories must be considered as databases and are automatically updated by the engine whenever changes are detected in the tracked terms. | ||
|
||
## Infrastructure | ||
|
||
The server is where the Open Terms Archive engine runs. | ||
|
||
The server requires administrative access to allow setting up the system in the appropriate state. | ||
|
||
It has an Ed25519 SSH host key pair, `ssh_host_ed25519_key`, which provides a unique server fingerprint, `<server_ssh_fingerprint>`, for identity verification. | ||
|
||
There is also a dedicated deployment user account, `<deployment_user>`, with passwordless sudo access to facilitate automated deployment tasks while maintaining security. | ||
|
||
Process management is handled through [PM2](https://pm2.keymetrics.io/) and ensures the Open Terms Archive engine runs continuously and reliably. | ||
|
||
The engine itself is the core application that performs the actual term tracking and repository management tasks. | ||
|
||
## Security elements | ||
|
||
### Authentication | ||
|
||
Security is maintained through multiple layers of authentication. | ||
|
||
The server's SSH host key pair, `ssh_host_ed25519_key`, generates a unique server fingerprint, `<server_ssh_fingerprint>`. This fingerprint verifies server identity and prevents man-in-the-middle attacks during deployment. | ||
|
||
The deployment process uses a dedicated SSH key pair, `ota-deploy`, for secure server connections during the continuous deployment workflow. | ||
|
||
A separate collection-specific SSH key pair, `<collection_name>-key`, enables the engine to perform GitHub actions as a bot user. | ||
|
||
Access to GitHub repositories is controlled through a fine-grained access token, `OTA_ENGINE_GITHUB_TOKEN`, that provides specific permissions for repository management. | ||
|
||
### Secret management | ||
|
||
Sensitive information is protected by the [Ansible Vault](https://docs.ansible.com/ansible/latest/vault_guide/index.html) encryption system. | ||
|
||
The vault system uses a master password, `vault.key`, to encrypt and decrypt sensitive data. This includes the environment configuration file, `.env`, and the GitHub bot's private key, `github-bot-private-key`, ensuring that sensitive credentials remain secure while still being accessible to the deployment process. | ||
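
As an illustration of how such a master password could be produced, here is a minimal sketch. It assumes a Unix-like system that provides `/dev/urandom`; the function name `generate_vault_key` and the default length of 48 characters are arbitrary choices made for this example, not values taken from the documentation. The output is restricted to letters and digits so that it contains no quotes or backticks.

```shell
# Sketch: print a random alphanumeric password suitable for a vault key file.
# Assumptions: /dev/urandom is available; the default length of 48 is arbitrary.

generate_vault_key() {
  # $1 is the desired length (defaults to 48 characters).
  length=${1:-48}
  # Read a fixed amount of random bytes, keep only letters and digits,
  # then truncate to the requested length.
  head -c 4096 /dev/urandom | LC_ALL=C tr -dc 'A-Za-z0-9' | LC_ALL=C cut -c "1-$length"
}
```

One possible use is `generate_vault_key > deployment/vault.key`, after which the same value would be pasted into the `ANSIBLE_VAULT_KEY` secret.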
|
||
## Automation tools | ||
|
||
[GitHub Actions](https://docs.github.com/en/actions) and [Ansible](https://www.ansible.com/) automate the deployment process. GitHub Actions runs the workflow while Ansible configures the server and deploys the engine. | ||
|
||
A dedicated GitHub user account is used for bot-related actions such as committing entries in versions and snapshots repositories, reporting issues when tracking fails, and publishing releases. This account is configured with specific permissions to perform these automated tasks. | ||
|
||
The engine sends email notifications to collection administrators when errors or issues occur during the tracking process, enabling prompt intervention when needed. | ||
|
||
The engine automatically creates issues in the declarations repository to notify collection maintainers when terms can no longer be tracked. These issues provide details about the tracking failure to allow maintainers to investigate and resolve the problem. | ||
|
||
## Configuration files | ||
|
||
The system's behavior is controlled through several key configuration files: | ||
|
||
- `inventory.yml`: Defines server address and deployment parameters | ||
- `production.json`: Stores application-specific settings | ||
- `vault.key`: Protects sensitive data through encryption | ||
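
A quick way to confirm these files are in place before deploying is sketched below. It assumes the paths used in the deployment guide (`deployment/inventory.yml`, `config/production.json` and `deployment/vault.key`) relative to the root of the declarations repository; the function name `check_deployment_files` is a hypothetical helper introduced for this example.

```shell
# Sketch: verify that the key configuration files exist in a declarations repository.
# Assumption: the relative paths below match those used in the deployment guide.

check_deployment_files() {
  # $1 is the root of the declarations repository (defaults to the current folder).
  root=${1:-.}
  missing=0
  for path in deployment/inventory.yml config/production.json deployment/vault.key; do
    if [ -f "$root/$path" ]; then
      echo "found:   $path"
    else
      echo "missing: $path"
      missing=1
    fi
  done
  # Non-zero exit status when at least one file is missing.
  return "$missing"
}
```

Calling `check_deployment_files <collection_id>-declarations` lists each file and returns a non-zero status if any of them is absent.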
|
||
## Maintenance | ||
|
||
The Open Terms Archive system is designed for continuous operation with minimal intervention. | ||
|
||
The engine automatically tracks changes in terms, commits updates to the appropriate repositories, reports issues and sends notifications when issues occur. | ||
|
||
System health is maintained through PM2's process management capabilities. | ||
|
||
Regular administrative maintenance involves updating collection dependencies such as the engine and deployment recipes. It also includes monitoring email notifications and reviewing application logs in case of issues or tracking interruptions. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
--- | ||
title: Server specifications | ||
weight: 1 | ||
weight: 2 | ||
--- | ||
|
||
# Server specifications | ||
|
When we use the demo template, all declarations are removed by the first-time setup process.