CogStack
diff --git a/‎README.md‎
Lines changed: 1 addition & 1 deletion b/‎README.md‎
diff --git a/‎anoncat-demo-app/README.md‎
Lines changed: 2 additions & 2 deletions b/‎anoncat-demo-app/README.md‎
diff --git a/‎anoncat-demo-app/app/frontend/src/App.vue‎
Lines changed: 1 addition & 1 deletion b/‎anoncat-demo-app/app/frontend/src/App.vue‎
diff --git a/‎medcat-trainer/README.md‎
Lines changed: 18 additions & 0 deletions b/‎medcat-trainer/README.md‎
diff --git a/‎medcat-trainer/docs/installation.md‎
Lines changed: 74 additions & 0 deletions b/‎medcat-trainer/docs/installation.md‎
diff --git a/‎medcat-trainer/docs/maintenance.md‎
Lines changed: 61 additions & 0 deletions b/‎medcat-trainer/docs/maintenance.md‎
@@ -2,7 +2,7 @@
 
 [![Build Status](https://github.com/CogStack/cogstack-nlp/actions/workflows/medcat-v2_main.yml/badge.svg?branch=main)](https://github.com/CogStack/cogstack-nlp/actions/workflows/medcat-v2_main.yml/badge.svg?branch=main)
 [![Documentation Status](https://readthedocs.org/projects/cogstack-nlp/badge/?version=latest)](https://readthedocs.org/projects/cogstack-nlp/badge/?version=latest)
-[![Latest release](https://img.shields.io/github/v/release/CogStack/MedCAT2)](https://github.com/CogStack/MedCAT2/releases/latest)
+[![Latest release](https://img.shields.io/github/v/release/CogStack/cogstack-nlp?filter=medcat/*)](https://github.com/CogStack/cogstack-nlp/releases/latest)
 <!-- [![pypi Version](https://img.shields.io/pypi/v/medcat.svg?style=flat-square&logo=pypi&logoColor=white)](https://pypi.org/project/medcat/) -->
 
 Cogstack Natural Language Processing is for analysing clinical data using AI to draw insights from text in or documents in an Electronic Health Records.
 
@@ -1,6 +1,6 @@
 # Deidentify app
 
-Demo for AnonCAT. It uses [MedCAT](https://github.com/CogStack/MedCAT), an advanced natural language processing tool, to identify and classify sensitive information, such as names, addresses, and medical terms.
+Demo for AnonCAT. It uses [MedCAT](https://github.com/CogStack/cogstack-nlp/tree/main/medcat-v1), an advanced natural language processing tool, to identify and classify sensitive information, such as names, addresses, and medical terms.
 
 ## Example
 
@@ -22,7 +22,7 @@ MODEL_NAME = '<NAME OF MODEL HERE.zip>'
 
 ### Build your own model
 
-To build your own models please follow the tutorials outlined in [MedCATtutorials](https://github.com/CogStack/MedCATtutorials)
+To build your own models please follow the tutorials outlined in [MedCATtutorials](https://github.com/CogStack/cogstack-nlp/tree/main/medcat-v1-tutorials)
 
 *__Note:__ This is currently under development*
 
 
@@ -21,7 +21,7 @@
         <br>
         <p>Please DO NOT test with any real sensitive PHI data.</p>
         <br>
-        <p>Local validation and fine-tuning available via <a href="https://github.com/CogStack/MedCATtrainer">MedCATtrainer</a>.
+        <p>Local validation and fine-tuning available via <a href="https://github.com/CogStack/cogstack-nlp/tree/main/medcat-trainer">MedCATtrainer</a>.
           Email us, <a href="mailto:[email protected]">[email protected]</a>, to discuss model access, model performance, and your use case.
         </p>
         <br>
 
@@ -0,0 +1,18 @@
+ # Medical <img src="https://github.com/CogStack/cogstack-nlp/blob/main/media/cat-logo.png?raw=true" width=45>oncept Annotation Tool Trainer
+
+[![Build Status](https://github.com/CogStack/cogstack-nlp/actions/workflows/medcat-trainer_qa.yml/badge.svg?branch=main)](https://github.com/CogStack/cogstack-nlp/actions/workflows/medcat-trainer_qa.yml?query=branch%3Amain)
+[![Build Status](https://github.com/CogStack/cogstack-nlp/actions/workflows/medcat-trainer_release.yml/badge.svg)](https://github.com/CogStack/cogstack-nlp/actions/workflows/medcat-trainer_release.yml)
+[![Documentation Status](https://readthedocs.org/projects/cogstack-nlp-medcat-trainer/badge/?version=latest)](https://readthedocs.org/projects/cogstack-nlp-medcat-trainer/badge/?version=latest)
+[![Latest release](https://img.shields.io/github/v/release/CogStack/cogstack-nlp?filter=medcat-trainer/*)](https://github.com/CogStack/cogstack-nlp/releases/latest)
+
+MedCATTrainer is an interface for building, improving and customising a given Named Entity Recognition
+and Linking (NER+L) model (MedCAT) for biomedical domain text.
+
+MedCATTrainer was presented at EMNLP/IJCNLP 2019 :tada:
+[here](https://www.aclweb.org/anthology/D19-3024.pdf)
+
+# Documentation and Discussion
+
+Official docs available [here](https://docs.cogstack.org/projects/medcat-trainer)
+
+If you have any questions, reach out on the community [Discourse forum](https://discourse.cogstack.org/).
@@ -0,0 +1,74 @@
+# Installation
+MedCATtrainer is a docker-compose packaged Django application.
+
+## Download from Dockerhub
+Clone the repo and run the default docker-compose file with the default environment variables:
+```shell
+$ git clone https://github.com/CogStack/cogstack-nlp
+$ cd cogstack-nlp/medcat-trainer
+$ docker-compose up
+```
+
+This will use the pre-built docker images available on DockerHub. If your internal firewall does not permit access to DockerHub, you can build directly from source.
+
+To check the logs of the running MedCATtrainer containers:
+```bash
+$  docker logs <containerid> | grep "\[medcattrainer\]"
+$  docker logs <containerid> | grep "\[bg-process\]"
+$  docker logs <containerid> | grep "\[db-backup\]"
+```
+
+## MedCAT v0.x models
+If you have MedCAT v0.x models and want to use the trainer, please use the following docker-compose file.
+This references the latest built image of the trainer that is still compatible with [MedCAT v0.x](https://pypi.org/project/medcat/0.4.0.6/) and earlier:
+```shell
+$ docker-compose -f docker-compose-mc0x.yml up
+```
+
+## Build images from source
+The above commands run the latest release of MedCATtrainer. If you'd prefer to build the Docker images from source, use:
+```shell
+$ docker-compose -f docker-compose-dev.yml up
+```
+
+To change environment variables, such as the exposed host ports and the language of the spaCy model, use:
+```shell
+$ cp .env-example .env
+# Set local configuration in .env
+```
+
+## Troubleshooting
+If the build fails with an error code 137, the virtual machine running the docker
+daemon does not have enough memory. Increase the allocated memory to containers in the docker daemon
+settings CLI or associated docker GUI.
+
+On Mac: https://docs.docker.com/docker-for-mac/#memory
+
+On Windows: https://docs.docker.com/docker-for-windows/#resources
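+
+To see how much memory the docker daemon currently has available (reported in bytes), one option is to query `docker info`. This is only a quick check and does not change any settings:
+```shell
+$ docker info --format '{{.MemTotal}}'
+```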
+
+### (Optional) SMTP Setup
+
+Password resets and other emailing services require the email environment variables to be set.
+
+Users can set this up with a personal email account, or you can contact the CogStack team to request email credentials.
+
+The required environment variables are listed in [Environment Variables](#optional-environment-variables).
+
+Environment variables are located in `envs/env` or `envs/env-prod`. Once those are set, change `VITE_APP_EMAIL` to 1 in `webapp/frontend/.env`.
+
+### (Optional) Environment Variables
+Environment variables are used to configure the app:
+
+|Parameter|Description|
+|---------|-----------|
+|MEDCAT_CONFIG_FILE|MedCAT config file as described [here](https://github.com/CogStack/cogstack-nlp/blob/main/medcat-v2/medcat/config/config.py)|
+|BEHIND_RP|If you're running MedCATtrainer behind a reverse proxy, use 1; otherwise this defaults to 0, i.e. False|
+|MCTRAINER_PORT|The port to run the trainer app on|
+|EMAIL_USER|Email address which will be used to send users emails regarding password resets|
+|EMAIL_PASS|The password or authentication key which will be used with the email address|
+|EMAIL_HOST|The hostname of the SMTP server which will be used to send email (default: mail.cogstack.org)|
+|EMAIL_PORT|The port that the SMTP server is listening to, common numbers are 25, 465, 587 (default: 465)|
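+
+As an illustration, an environment file fragment covering the parameters above might look like the following. The values are placeholders chosen for this sketch (only `EMAIL_HOST` and `EMAIL_PORT` repeat the defaults listed in the table), so adjust them to your deployment:
+```shell
+# Illustrative values only; replace with your own settings
+MCTRAINER_PORT=8001
+BEHIND_RP=0
+EMAIL_USER=user@example.org
+EMAIL_PASS=change-me
+EMAIL_HOST=mail.cogstack.org
+EMAIL_PORT=465
+```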
+
+Set these and re-run the docker-compose file.
+
+You'll need to `docker stop` the running containers if you have already run the install.
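+
+A minimal restart sequence, assuming the containers were started from the default docker-compose file as above (`<containerid>` is a placeholder for each ID reported by `docker ps`):
+```shell
+$ docker ps                   # list the running MedCATtrainer containers
+$ docker stop <containerid>   # stop each running container
+$ docker-compose up           # start again with the updated environment variables
+```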
@@ -0,0 +1,61 @@
+# Maintenance
+
+MedCATtrainer is actively maintained. To ensure you receive the latest
+security patches for the software and its dependencies, you should regularly
+upgrade to the latest release.
+
+The latest stable releases update the `docker-compose.yml` and `docker-compose-prod.yml` files.
+
+To update these docker compose files, either copy them directly from the [repo](https://github.com/CogStack/cogstack-nlp/tree/main/medcat-trainer)
+or update the cloned files via:
+
+```shell
+$ cd cogstack-nlp/medcat-trainer
+$ git pull
+$ docker-compose up
+# alternatively for prod releases use:
+$ docker-compose -f docker-compose-prod.yml up
+```
+
+MedCATtrainer follows [Semver](https://semver.org/), so patch and minor releases should always be backwards compatible,
+whereas major releases, e.g. v1.x vs v2.x, signify breaking changes.
+
+Necessary Django DB migrations will be applied automatically between releases, which should largely be invisible to an end admin
+or annotation user. Nevertheless, migrating ORM / DB models and then rolling back a release can cause issues if values are defaulted
+or removed in a later version.
+
+## Backup and Restore
+
+### Backup
+Before updating to a new release, a backup will be created in the `DB_BACKUP_DIR`, as configured in `envs/env`.
+A further crontab runs the same backup script at 10pm every night. This does not cause any downtime and will look like
+this in the logs:
+```shell
+medcattrainer-medcattrainer-db-backup-1  | Found backup dir location: /home/api/db-backup and DB_PATH: /home/api/db/db.sqlite3
+medcattrainer-medcattrainer-db-backup-1  | Backed up existing DB to /home/api/db-backup/db-backup-2023-09-26__23-26-01.sqlite3
+medcattrainer-medcattrainer-db-backup-1  | To restore this backup use $ ./restore.sh /home/api/db-backup/db-backup-2023-09-26__23-26-01.sqlite3
+```
+
+A backup is also performed automatically each time the service starts and any migrations are applied, in case a new release
+introduces a breaking change and corrupts the DB.
+
+### Restore
+If a DB is corrupted or needs to be restored to an existing backup, use the following commands whilst the service is running:
+
+```shell
+$ docker ps
+CONTAINER ID   IMAGE                                          COMMAND                  CREATED      STATUS      PORTS                                               NAMES
+a2489b0c681b   cogstacksystems/medcat-trainer-nginx:v2.11.2   "/docker-entrypoint.…"   4 days ago   Up 4 days   80/tcp, 0.0.0.0:8001->8000/tcp, :::8001->8000/tcp   medcattrainer-nginx-1
+20fed153d798   solr:8                                         "docker-entrypoint.s…"   4 days ago   Up 4 days   0.0.0.0:8983->8983/tcp, :::8983->8983/tcp           mct_solr
+2b250a0975fe   cogstacksystems/medcat-trainer:v2.11.2         "/home/run.sh"           4 days ago   Up 4 days                                                       medcattrainer-medcattrainer-1
+$ docker exec -it 2b250a0975fe bash
+root@2b250a0975fe:/home/api# cd ..
+$ restore_db.sh db-backup-2023-09-25__23-21-39.sqlite3  # run the restore_db.sh script
+Found backup dir location: /home/api/db-backup, found db path: home/api/db/db.sqlite3
+DB file to restore: db-backup-2023-09-25__23-21-39.sqlite3
+Found db-backup-2023-09-25__23-21-39.sqlite3 - y to confirm backup: y  # you'll need to confirm this is the correct file to restore.
+Restored db-backup-2023-09-25__23-21-39.sqlite3 to /home/db/db.sqlite3
+```
+
+The `restore_db.sh` script will automatically restore the latest db file, if no file is specified.
+