CogStack is composed of a range of adaptable, modular, interoperable tools that introduce tiered functionality, which can be used for a variety of use-technologies:
+Get started by looking at the [CogStack Overview](overview/CogStack-Documentation.md)

-Centralise and lake clinical data, including structured data (e.g. observations and results) and unstructured data (e.g. clinical narratives such as clinic letters, discharge and admission summaries, and radiology reports), in varying formats such as binary Word documents, PDFs and images.
+For any broad questions, please reach out in our community space [here](https://discourse.cogstack.org/)

-Search and visualise millions of distinct data points in near-real-time, ‘unlocking’ capabilities that would previously have taken days or months.
+Further projects in development are [here](https://github.com/orgs/CogStack/repositories)

-Natural Language Processing of clinical text to standardised clinical terminologies (SNOMED-CT) for interoperable clinical data combined with semantic context. This allows cohorting based on “find all patients with a heart attack”, regardless of how this has been referred to in the clinical text, such as “patient had myocardial infarct”, “MI”, “infarct of heart” or “cardiac infarct”, while distinguishing mentions such as “the patient’s father had a MI”.
Deep phenotyping using NLP allows accelerated NHS clinical coding, disease registry submissions and advanced cohorting for observational studies.
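The concept normalisation described above can be illustrated with a toy lookup. This is only a sketch under stated assumptions: it is not how MedCAT works, the surface-form list and the family-history cue words are invented for illustration, and the identifier `22298006` is assumed here to be the SNOMED CT concept for myocardial infarction.

```python
import re

# Assumption: one illustrative concept identifier shared by all surface forms.
MI_CONCEPT = "22298006"

# Hypothetical surface forms, all matched case-insensitively as whole words.
SURFACE_FORMS = [
    "myocardial infarction",
    "myocardial infarct",
    "infarct of heart",
    "cardiac infarct",
    "heart attack",
    "mi",
]

# Hypothetical cue words suggesting the mention is about a relative.
FAMILY_CUES = re.compile(
    r"\b(father|mother|brother|sister|family history)\b", re.IGNORECASE
)


def annotate(sentence):
    """Return (concept_id, subject) if a known surface form occurs, else None."""
    lowered = sentence.lower()
    for form in SURFACE_FORMS:
        if re.search(r"\b" + re.escape(form) + r"\b", lowered):
            subject = "family" if FAMILY_CUES.search(sentence) else "patient"
            return (MI_CONCEPT, subject)
    return None
```

A cohort query would then keep only annotations whose subject is `"patient"`, which is what separates “patient had myocardial infarct” from “the patient’s father had a MI”.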
+| Tool | Description |
+|:-----|:------------|
+| <img src="./overview/attachments/36c0d23f-a632-4fbf-9f7c-6669e88bbd39.png" width="100"/> <br/> [**CogStack-Nifi**](https://cogstack-nifi.readthedocs.io/en/latest/main.html) | Data flow orchestration using Apache NiFi |
+| <img src="./overview/attachments/09a8bb60-9864-41fa-be7b-cf9a9dc04498.png" width="100"/> <br/> [**MedCAT**](https://medcat.readthedocs.io/en/latest/) | Medical Concept Annotation Toolkit |
+| <img src="./overview/attachments/09a8bb60-9864-41fa-be7b-cf9a9dc04498.png" width="100"/> <br/> [**MedCATTrainer**](https://medcattrainer.readthedocs.io/en/latest/) | Web-based annotation and training interface for MedCAT |

-Population health dashboards for combining data from structured and text components of the electronic health record to track patient outcomes, enhance patient safety and improve patient care.
-
-Advanced analytics using generative AI for virtual trial emulation, high-dimensional patient or disease modelling and digital patient twins.
| Overview <br/> This part covers the services that can run in an example CogStack deployment. We refer to such a deployment, with many running services, as an *ecosystem* or a *platform*. Below is a high-level view of the CogStack platform and the possibilities it enables through its many components and services. <br/> []() <br/> <br/> In practice, many of the functionalities that the CogStack platform enables are implemented as separate but interconnected services working inside the ecosystem. <br/> <br/> | > [!NOTE] **On this page :** <br/> <ul class="toc-indentation"><br/><li><a href="#CogStackecosystem(v1)-Overview">Overview</a></li><br/><li><a href="#CogStackecosystem(v1)-platform-coreCoreservices">Core services</a></li><br/><li><a href="#CogStackecosystem(v1)-platform-pipelineCogStackPipeline">CogStack Pipeline</a></li><br/><li><a href="#CogStackecosystem(v1)-platform-postgres"></a></li><br/><li><a href="#CogStackecosystem(v1)-PostgreSQL">PostgreSQL</a></li><br/><li><a href="#CogStackecosystem(v1)-platform-esElasticSearch">ElasticSearch</a></li><br/><li><a href="#CogStackecosystem(v1)-platform-kibanaKibana">Kibana</a></li><br/><li><a href="#CogStackecosystem(v1)-platform-nginxNGINX">NGINX</a></li><br/><li><a href="#CogStackecosystem(v1)-platform-fluentdFluentd">Fluentd</a></li><br/></ul> <br/> |
+This part covers the services that can run in an example CogStack deployment. We refer to such a deployment, with many running services, as an *ecosystem* or a *platform*. Below is a high-level view of the CogStack platform and the possibilities it enables through its many components and services. In practice, many of the functionalities that the CogStack platform enables are implemented as separate but interconnected services working inside the ecosystem.

----
-
----
-
-# Core services
+## Core services

In most scenarios, the CogStack platform will consist of *core* services tailored to specific use-cases. Additional applications and services can be run on top of it, such as [SemEHR](../../CogStack%20General/CogStack%20Wiki/CogStack%20projects/SemEHR.md), [Patient Timeline](../../CogStack%20General/CogStack%20Wiki/CogStack%20projects/Patient%20Timeline.md), Live Alerting (through ElasticSearch plugins) or any other custom-developed applications. For ease of use, we always recommend using Docker Compose when deploying a sample CogStack platform (see: [Running CogStack](Running%20CogStack.md)).

@@ -41,7 +35,7 @@ It is essential to note that presented is a very simplified scenario, which can

---

-# CogStack Pipeline
+### CogStack Pipeline

CogStack Pipeline is the main data processing service used inside the CogStack platform. Within the ecosystem, its main responsibility is to ingest the EHR data from a specified data source, process the data (e.g. by applying text extraction methods, de-identifying records or extracting NLP annotations) and store the resulting data in the specified sink.
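The ingest, process and store stages described above can be sketched in a few lines of Python. This is an illustration of the data flow only, not CogStack Pipeline's actual implementation: the function names, the record layout and the de-identification rule (masking any 10-digit number) are all hypothetical.

```python
import re
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, object]


def ingest(source: Iterable[Record]) -> Iterator[Record]:
    """Read records from the source one at a time."""
    for record in source:
        yield dict(record)  # copy so the source rows stay untouched


def deidentify(record: Record) -> Record:
    """Hypothetical processing step: mask any 10-digit number in the text."""
    record["text"] = re.sub(r"\b\d{10}\b", "[ID]", str(record["text"]))
    return record


def run_pipeline(
    source: Iterable[Record],
    steps: List[Callable[[Record], Record]],
    sink: List[Record],
) -> int:
    """Apply every step to every record, append results to the sink, return the count."""
    count = 0
    for record in ingest(source):
        for step in steps:
            record = step(record)
        sink.append(record)
        count += 1
    return count
```

Here the source is any iterable and the sink is a plain list; in a real deployment these would be a database or file reader and a store such as ElasticSearch or PostgreSQL.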
@@ -60,9 +54,9 @@ The information about available data processing components offered by CogStack P

---

-#

-# PostgreSQL
+
+### PostgreSQL

[PostgreSQL](https://www.postgresql.org/) is a widely used object-relational database management system. In the CogStack platform it is primarily used as a job repository, storing the job execution status of running CogStack Pipeline instances. However, there may be cases where one needs to store partial results, treating the PostgreSQL database either as a data cache (see: [Examples](Examples.md)) or as an auxiliary data sink.
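To make the job repository idea concrete, here is a minimal sketch that tracks job status so that a restarted job can resume where it stopped. It makes several assumptions: the standard-library `sqlite3` module stands in for PostgreSQL so that the example is self-contained, and the `job_status` table, its columns and the function names are hypothetical rather than the schema CogStack Pipeline actually requires.

```python
import sqlite3


def open_job_repository(path=":memory:"):
    """Create the (hypothetical) job status table if it does not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS job_status ("
        " job_name TEXT PRIMARY KEY,"
        " status TEXT NOT NULL,"
        " last_processed_id INTEGER NOT NULL)"
    )
    return conn


def record_progress(conn, job_name, status, last_processed_id):
    """Insert or overwrite the single status row kept for a job."""
    conn.execute(
        "INSERT OR REPLACE INTO job_status (job_name, status, last_processed_id)"
        " VALUES (?, ?, ?)",
        (job_name, status, last_processed_id),
    )
    conn.commit()


def resume_point(conn, job_name):
    """Return the last processed record id, or 0 for a job never seen before."""
    row = conn.execute(
        "SELECT last_processed_id FROM job_status WHERE job_name = ?",
        (job_name,),
    ).fetchone()
    return row[0] if row else 0
```

Keeping one row per job keeps the resume lookup trivial; a fuller repository would typically keep a history of executions as well.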
@@ -88,7 +82,7 @@ When used as a job repository, it requires defining appropriate tables with a us

---

-# ElasticSearch
+### ElasticSearch

[ElasticSearch](https://www.elastic.co/guide/) is a popular NoSQL search engine based on the Lucene library. It provides distributed full-text search and stores data as schema-free JSON documents. Inside the CogStack platform it is usually used as the primary data store for EHR data processed by CogStack Pipeline.
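As an illustration of storing processed records as JSON documents, the sketch below serialises a list of records into the newline-delimited format accepted by ElasticSearch's bulk API (an action line followed by the document itself). The function name, the index name used in the usage note and the `doc_id` field are assumptions made for this example; sending the payload over HTTP is deliberately left out.

```python
import json


def build_bulk_payload(index_name, documents, id_field="doc_id"):
    """Serialise documents as newline-delimited JSON for a bulk index request."""
    lines = []
    for document in documents:
        # Action line: which index the document goes to and under which id.
        action = {"index": {"_index": index_name, "_id": str(document[id_field])}}
        lines.append(json.dumps(action))
        # Source line: the schema-free JSON document itself.
        lines.append(json.dumps(document))
    if not lines:
        return ""
    # The bulk format requires the payload to end with a newline.
    return "\n".join(lines) + "\n"
```

For example, `build_bulk_payload("ehr_documents", records)` yields two lines per record, which could then be posted to the cluster's `_bulk` endpoint with any HTTP client.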
@@ -151,7 +145,7 @@ Depending on the use-case, the processed EHR data is usually stored in indices a

---

-# Kibana
+### Kibana

[Kibana](https://www.elastic.co/products/kibana) is a data visualisation module for ElasticSearch that can easily be used to explore and query the data. In sample CogStack platform deployments it can be used as a ready-to-use data exploration tool.
@@ -168,7 +162,7 @@ Apart from providing exploratory data analysis functionality it also offers admi

---

-# NGINX
+### NGINX

NGINX is a popular, open-source web server that can also be used as a reverse proxy, load balancer, HTTP cache and more. In CogStack platform deployments, it can be used as a reverse proxy that provides basic access security for the exposed data stores and service endpoints. This functionality may include general user-based authentication, IP filtering and selective service access. A more detailed description of the security features offered by NGINX can be found in the [official documentation](https://docs.nginx.com/nginx/admin-guide/security-controls/).
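The IP filtering mentioned above usually follows a first-match rule list. The Python sketch below illustrates that logic only; it is not NGINX configuration, and the function name, the rule format and the default of allowing unmatched clients are assumptions made for this example.

```python
import ipaddress


def is_allowed(client_ip, rules, default=True):
    """Return the decision of the first rule whose network contains the client.

    `rules` is an ordered list of (action, network) pairs, where action is
    "allow" or "deny" and network is written in CIDR notation.
    """
    address = ipaddress.ip_address(client_ip)
    for action, network in rules:
        if address in ipaddress.ip_network(network):
            return action == "allow"
    # No rule matched: fall back to the configured default.
    return default
```

With the rules `[("allow", "10.0.0.0/8"), ("deny", "0.0.0.0/0")]`, a client at `10.1.2.3` is admitted while any other IPv4 client is refused, because the first matching rule decides.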
@@ -185,7 +179,7 @@ NGINX is a popular, open-source web server that can also be used as a reverse pr

---

-# Fluentd
+### Fluentd

[Fluentd](https://www.fluentd.org/) is an open source data collector providing a unified logging layer. In sample CogStack platform deployments it can run as a service that collects logs from all the running services, which can then be used for auditing.
docs/overview/CogStack-Documentation.md
2 additions & 2 deletions
@@ -3,7 +3,7 @@

# CogStack Documentation

-# What is CogStack?
+## What is CogStack?

CogStack is a lightweight, distributed, fault-tolerant database processing architecture and ecosystem, intended to make NLP processing and preprocessing easier in resource-constrained environments. It comprises multiple components and has been designed to provide configurable data processing pipelines for working with EHR data. For the moment it mainly uses databases and files as the primary source of EHR data, with the possibility of adding custom data connectors in the near future. It makes use of the [Apache NiFi](https://nifi.apache.org/) framework to provide a fully configurable data processing pipeline, with the goal of generating annotated JSON files in a standardised schema that can be readily indexed into [ElasticSearch](https://www.elastic.co/), stored as files or pushed back to a database.
@@ -16,7 +16,7 @@ The CogStack ecosystem has been developed as an open source project with the cod
>
> Starting from version 1.2, CogStack is preferably run as an ecosystem of different microservices deployed using [Docker Compose](https://docs.docker.com/compose/). The ready-to-use CogStack images are available to pull directly from the official Docker Hub under the [cogstacksystems](https://hub.docker.com/u/cogstacksystems/) organisation. We are also actively pursuing running the stack in a Kubernetes (K8s) cluster.

-# Why does this project exist?
+## Why does this project exist?

CogStack consists of a range of technologies designed to support modern, open source healthcare analytics within the NHS. It chiefly comprises the Elastic stack ([ElasticSearch](https://www.elastic.co/products/elasticsearch), [Kibana](https://www.elastic.co/products/kibana), etc.), [MedCAT](https://github.com/CogStack/MedCAT) (clinical natural language processing for named entity extraction and linking), clinical text [OCR](https://github.com/CogStack/ocr-service) and clinical text de-identification. Since the processed EHR data can be represented and stored in databases or ElasticSearch, CogStack is well suited as one of the solutions for integrating EHR data with other types of data, such as biomedical, -omics and wearables data.