57 commits
ff3fe68
feat: cache huggingface models
rti Feb 1, 2024
38a3bf9
fix: sentence_transformers version
rti Feb 1, 2024
3fb6fd0
chore: remove custom model based on modelfile
rti Feb 1, 2024
a4c7294
fix(frontend): do not filter by score for now TBD
rti Feb 1, 2024
d38c5f0
chore: remove debug/test code
rti Feb 1, 2024
dc4501a
fix: required sentence_transformers version was actually > 2.2.0
rti Feb 1, 2024
42cdcc5
docs: add notes about embedding models to readme
rti Feb 1, 2024
13bc12e
chore: add debug output to api.py
rti Feb 1, 2024
4933a9a
fix: question in prompt
rti Feb 1, 2024
b23833b
chore: top_k 3 results for now
rti Feb 1, 2024
da1017b
wip: embeddings cache
rti Feb 1, 2024
41ff046
feat: document splitter
rti Feb 1, 2024
4e69697
Update .dockerignore
exowanderer Feb 2, 2024
10103c6
Merge branch 'main' into integration
rti Feb 4, 2024
0ee6ed5
docs: note on how to dev locally
rti Feb 4, 2024
7a2c955
docs: add research_log.md
rti Feb 4, 2024
0a5e2be
feat: set top_k via api
rti Feb 5, 2024
332e3dc
feat: support en and de on the api to switch prompts
rti Feb 5, 2024
6225fcc
feat: cache embedding model during docker build
rti Feb 5, 2024
4877807
wip: smaller chunk size, 5 sentences for now
rti Feb 5, 2024
da9859d
chore: remove comment
rti Feb 5, 2024
291aaaf
feat: enable embeddings cache (for development)
rti Feb 9, 2024
936d83e
feat: add document cleaner
rti Feb 9, 2024
1b88437
Merge branch 'main' into integration
rti Feb 9, 2024
3e0b8f4
docs: long docker run options
rti Feb 9, 2024
edf5eb2
fix: access mode
rti Feb 9, 2024
63baf2b
fix: redraw loading animation on subsequent searches
rti Feb 9, 2024
56a7b8c
wip: workaround for runpod.io http port forwarding
rti Feb 9, 2024
8e05473
feat: switch to openchat 7b model
rti Feb 9, 2024
8276e35
Merge branch 'openchat' into integration
rti Feb 9, 2024
22b04d0
added logging via logger with Handler to api.py; PEP8 formatted api.py
exowanderer Feb 9, 2024
10f6b21
debugging use of homepage instead of hard coded endpoint values
exowanderer Feb 9, 2024
bfbd245
returning to previous to restart without errors
exowanderer Feb 9, 2024
7b6ba0a
renewed app.mount; bug fixed PEP8 changes in api.py; reformatted rag.…
exowanderer Feb 9, 2024
0428f87
returned to stablelm2 model for testing purposes. PEP8 upgrades in ap…
exowanderer Feb 9, 2024
8104dde
added OLLAMA_MODEL_NAME and OLLAMA_URL as environment variables; call…
exowanderer Feb 9, 2024
fbc4591
created logger.py to serve get_logger to all modules
exowanderer Feb 9, 2024
caecfd1
created a rag_pipeline in the rag.py based on the usage in api.py; re…
exowanderer Feb 9, 2024
5c0b4d0
Updated with PEP8 formatting in vector_store_interface.py
exowanderer Feb 9, 2024
8833af7
chore(Dockerfile): install python deps early
rti Feb 12, 2024
9ee8a32
fix(sentence-transformers): use cuda if available
rti Feb 12, 2024
b2357e3
fix(frontend): run from webserver root
rti Feb 12, 2024
b518abf
feat: store embedding cache in volume
rti Feb 12, 2024
69800b0
feat(start.sh): pull llm using ollama (if not built into container)
rti Feb 12, 2024
7803649
feat(ollama): use chat api to leverage prompt templates
rti Feb 12, 2024
ff1fcab
docs: fix run cmd
rti Feb 19, 2024
8c0a2cb
wip: postgres vecto.rs db, s/haystack/langchain
rti Feb 25, 2024
a0e5a11
fix: vars and fields
rti Feb 25, 2024
507f7e3
feat: separate data import
rti Feb 26, 2024
57ec054
auto fetch gs-wiki articles
Silvan-WMDE Feb 21, 2024
c750dbc
Merge remote-tracking branch 'origin/integration-gswiki' into postgre…
rti Feb 28, 2024
fb891cd
Merge remote-tracking branch 'origin/main' into postgres-vecto.rs
rti Feb 28, 2024
33db5ab
chore: get_llm
rti Feb 28, 2024
b6fee21
feat: use fetch_articles from the command line
rti Feb 28, 2024
284aa55
Merge remote-tracking branch 'origin/main' into postgres-vecto.rs
rti Feb 28, 2024
8f7c49e
chore: simplify docker commands
rti Feb 28, 2024
eaabe70
docs: readme fix
rti Feb 28, 2024
3 changes: 3 additions & 0 deletions .gitignore
@@ -27,3 +27,6 @@ __pycache__/

# macOS
.DS_Store

# logs
*.log
19 changes: 17 additions & 2 deletions Dockerfile
@@ -12,18 +12,31 @@ FROM $CUDA_FROM

ENV PATH="/usr/local/cuda/bin:${PATH}"

# Install packages non-interactively
ENV DEBIAN_FRONTEND=noninteractive

# Force a config for tzdata package, otherwise it will interactively ask during install
RUN ln -fs /usr/share/zoneinfo/UTC /etc/localtime

# Install essential packages from ubuntu repository
RUN apt-get update -y && \
apt-get install -y --no-install-recommends openssh-server openssh-client git git-lfs && \
apt-get install -y curl && \
apt-get install -y python3 python3-pip python3-venv && \
apt-get install -y postgresql-14 && \
apt-get install -y jq && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*


# Install vecto.rs extension to postgres
RUN curl -L -O https://github.com/tensorchord/pgvecto.rs/releases/download/v0.2.0/vectors-pg14_0.2.0_amd64.deb
RUN dpkg -i vectors-pg14_0.2.0_amd64.deb


# Install node from upstream, ubuntu packages are too old
RUN curl -sL https://deb.nodesource.com/setup_18.x | bash
RUN apt-get install -y nodejs && \
RUN curl -sL https://deb.nodesource.com/setup_18.x | bash && \
apt-get install -y nodejs && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*

@@ -53,11 +66,13 @@ ARG OLLAMA_URL=http://localhost:11434
ENV OLLAMA_MODEL_NAME=${OLLAMA_MODEL_NAME}
ENV OLLAMA_URL=${OLLAMA_URL}

# TODO: cache path
RUN ollama serve & while ! curl ${OLLAMA_URL}; do sleep 1; done; ollama pull $OLLAMA_MODEL_NAME


# Load sentence-transformers model once in order to cache it in the image
# TODO: ARG / ENV for embedder model
# TODO: SENTENCE_TRANSFORMERS_HOME for cache path
RUN echo "from haystack.components.embedders import SentenceTransformersDocumentEmbedder\nSentenceTransformersDocumentEmbedder(model='svalabs/german-gpl-adapted-covid').warm_up()" | python3
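
The two TODOs above could be addressed roughly as follows — a sketch only, not what this Dockerfile currently does. The `EMBEDDING_MODEL` variable and the cache path are assumptions; `SENTENCE_TRANSFORMERS_HOME` is the setting sentence-transformers reads for its cache location:

```python
import os

from haystack.components.embedders import SentenceTransformersDocumentEmbedder

# Hypothetical cache location and model override; the image currently hard-codes the model name.
os.environ.setdefault("SENTENCE_TRANSFORMERS_HOME", "/root/.cache/sentence_transformers")
model_name = os.environ.get("EMBEDDING_MODEL", "svalabs/german-gpl-adapted-covid")

# Downloading the model once at build time caches it in the image.
SentenceTransformersDocumentEmbedder(model=model_name).warm_up()
```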


17 changes: 15 additions & 2 deletions README.md
@@ -10,9 +10,7 @@ To build and run the container locally with hot reload on python files do:
```
DOCKER_BUILDKIT=1 docker build . -t gbnc
docker run \
--env HUGGING_FACE_HUB_TOKEN=$HUGGING_FACE_HUB_TOKEN \
--volume "$(pwd)/gswikichat":/workspace/gswikichat \
--volume gbnc_cache:/root/.cache \
--publish 8000:8000 \
--rm \
--interactive \
@@ -22,6 +20,21 @@ docker run \
```
Point your browser to http://localhost:8000/ and use the frontend.

To fetch wiki articles based on a `toc.json` fetching definition, run:
```
$ docker exec -it gbnc bash
# export WIKI_USER=<wikibotusername>
# export WIKI_PW=<yoursecretbotuserpassword>
# python3 -m gswikichat.fetch_articles toc.json > articles.json
```

To import data, run:
```
$ docker exec -it gbnc bash
# cat json_input/excellent-articles_10.json | jq 'to_entries | map({content: .value, meta: {source: .key}})' > import.json
# python3 -m gswikichat.db import.json
```
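
The `jq` filter above turns the key/value input file into the flat list of documents that `gswikichat.db` expects. For illustration, an `import.json` entry then looks like:
```
[
  {
    "content": "document content one",
    "meta": { "source": "https://source.url/one" }
  }
]
```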

### Runpod.io

The container works on [runpod.io](https://www.runpod.io/) GPU instances. A [template is available here](https://runpod.io/gsc?template=0w8z55rf19&ref=yfvyfa0s).
4 changes: 2 additions & 2 deletions frontend/src/components/field/FieldAnswer.vue
@@ -17,8 +17,8 @@
class="text-sm cursor-pointer text-light-distinct-text dark:text-dark-distinct-text"
>
<summary>
{{ $t('source') }} ({{ s.score.toFixed(1) }}/5):
<a class="link-text" :href="s.src">{{ s.src }}</a>
{{ $t('source') }} ({{ s.score.toFixed(1) }}):
<a class="link-text" :href="s.source">{{ s.source }}</a>
</summary>
<p class="pt-2 pl-4">{{ s.content }}</p>
</details>
2 changes: 1 addition & 1 deletion frontend/src/types/source.d.ts
@@ -1,6 +1,6 @@
export type Source = {
id: number
src: string
source: string
content: string
score: number
}
42 changes: 19 additions & 23 deletions gswikichat/api.py
@@ -6,61 +6,57 @@
from fastapi.staticfiles import StaticFiles
from fastapi import FastAPI, Header

from .rag import rag_pipeline

from .logger import get_logger
from .rag import rag_pipeline

# Create logger instance from base logger config in `logger.py`
logger = get_logger(__name__)

FRONTEND_STATIC_DIR = './frontend/dist'
FRONTEND_STATIC_DIR = "./frontend/dist"
API_SECRET = os.environ.get("API_SECRET")

app = FastAPI()

app.mount(
"/assets",
StaticFiles(directory=f"{FRONTEND_STATIC_DIR}/assets"),
name="frontend-assets"
name="frontend-assets",
)


@app.get("/")
async def root():
return FileResponse(f"{FRONTEND_STATIC_DIR}/index.html")


@app.get("/favicon.ico")
async def favicon():
return FileResponse(f"{FRONTEND_STATIC_DIR}/favicon.ico")


@app.get("/api")
async def api(x_api_secret: Annotated[str, Header()], query, top_k=3, lang='en'):
if not API_SECRET == x_api_secret:
raise Exception("API key is missing or incorrect")

if not lang in ['en', 'de']:
if not lang in ["en", "de"]:
raise Exception("language must be 'en' or 'de'")

logger.debug(f'{query=}') # Assuming we change the input name
logger.debug(f'{top_k=}')
logger.debug(f'{lang=}')
logger.debug(f"{query=}")
logger.debug(f"{top_k=}")
logger.debug(f"{lang=}")

answer = rag_pipeline(query=query, top_k=top_k, lang=lang)

answer = rag_pipeline(
query=query,
top_k=top_k,
lang=lang
)
if not answer:
return {}

sources = [
{
"src": d_.meta['src'],
"content": d_.content,
"score": d_.score
} for d_ in answer.documents
{"id": d_.id, "source": d_.meta["source"], "content": d_.content, "score": d_.score}
for d_ in answer.documents
]

logger.debug(f'{answer=}')
logger.debug(f"{answer.data=}")
logger.debug(f"{answer.documents=}")

return {
"answer": answer.data.content,
"sources": sources
}
return {"answer": answer.data.content, "sources": sources}
107 changes: 107 additions & 0 deletions gswikichat/db.py
@@ -0,0 +1,107 @@
import os

import torch

from langchain.text_splitter import CharacterTextSplitter
from langchain_community.document_loaders import JSONLoader
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores.pgvecto_rs import PGVecto_rs

from .logger import get_logger


SENTENCE_TRANSFORMER_MODEL = "svalabs/german-gpl-adapted-covid"

logger = get_logger(__name__)


def get_device():
device = "cpu"
if torch.cuda.is_available():
logger.info("GPU is available.")
device = "cuda"
return device


def get_embedding_model():
# https://huggingface.co/svalabs/german-gpl-adapted-covid
logger.info(f"Embedding model: {SENTENCE_TRANSFORMER_MODEL}")

return HuggingFaceEmbeddings(
model_name=SENTENCE_TRANSFORMER_MODEL,
model_kwargs={"device": get_device()},
show_progress=True,
)


def get_db():
PORT = os.getenv("DB_PORT", 5432)
HOST = os.getenv("DB_HOST", "127.0.0.1")
USER = os.getenv("DB_USER", "gbnc")
PASS = os.getenv("DB_PASS", "")
DB_NAME = os.getenv("DB_NAME", "gbnc")

URL = "postgresql+psycopg://{username}:{password}@{host}:{port}/{db_name}".format(
port=PORT,
host=HOST,
username=USER,
password=PASS,
db_name=DB_NAME,
)

return PGVecto_rs.from_collection_name(
embedding=get_embedding_model(),
db_url=URL,
collection_name="gbnc",
)


def import_data(file):
def metadata_func(record: dict, metadata: dict) -> dict:
metadata["source"] = record.get("meta", {}).get("source")
return metadata

loader = JSONLoader(
file_path=file,
jq_schema=".[]",
content_key="content",
metadata_func=metadata_func,
)

documents = loader.load()

logger.debug(f"Loaded {len(documents)} documents.")

text_splitter = CharacterTextSplitter(chunk_size=250, chunk_overlap=0)
chunks = text_splitter.split_documents(documents)
logger.debug(f"Split documents into {len(chunks)} chunks.")

logger.debug(f"Importing into database.")
get_db().add_documents(chunks)


if __name__ == "__main__":
import sys

if len(sys.argv) > 1:
file = sys.argv[1]
import_data(file)

else:
logger.error(
"""Provide JSON file with the following structure as first parameter
[
{
"content":"document content one", "meta":{
"source": "https://source.url/one"
}
},
{
"content":"document content two", "meta":{
"source": "https://source.url/two"
}
}
]
"""
)
sys.exit(1)
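
The new module only covers imports; retrieval lives in `rag.py` (not shown here). As a rough sketch — not part of this PR — the same store could be queried through the generic LangChain `VectorStore` interface to sanity-check an import:

```python
from gswikichat.db import get_db

db = get_db()

# similarity_search returns LangChain Documents; metadata["source"] is set by
# the metadata_func used during import.
for doc in db.similarity_search("Wie funktioniert die Suche?", k=3):
    print(doc.metadata["source"], doc.page_content[:80])
```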