56 commits
dbed833
cleaning up instructions
Sybrand Oct 31, 2023
248b02b
changes to allow for local (uncontainerized) development
Sybrand Nov 2, 2023
695bf29
allow druid share folder on pipeline to differ from druid folder
Sybrand Nov 3, 2023
79063e0
update templates to match harmony_demo changes w.r.t. /home/share fol…
Sybrand Nov 3, 2023
4cbf168
black re-format
Sybrand Nov 3, 2023
afa1548
rather use an email on the zenysis domain
Sybrand Nov 6, 2023
826ab14
s3 alias must be configurable
Sybrand Nov 6, 2023
8b4293c
Link to step by step instructions
Sybrand Nov 6, 2023
4da9fef
function name that makes more sense
Sybrand Nov 9, 2023
4fa5fb3
switching to not using named volumes, reduces confusion as named volum…
Sybrand Nov 14, 2023
66bbed6
fleshing out build
Sybrand Nov 16, 2023
8a8383c
just some comments for me
Sybrand Nov 16, 2023
0d444b6
pipeline steps
Sybrand Nov 16, 2023
6ec4d6e
Merge remote-tracking branch 'origin/main' into sybrand-clean-instruc…
Sybrand Jul 15, 2024
730b150
allow default override for local development
Sybrand Jul 16, 2024
043a18a
upgrade celery
Sybrand Jul 18, 2024
dff2f31
celery needs to be upgraded to work with latest pip
Sybrand Jul 18, 2024
ec3b003
upgrade to python 3.9
Sybrand Jul 18, 2024
f441244
upgrade to resolve conflicts
Sybrand Jul 18, 2024
e929706
Merge branch 'celery-upgrade' into sybrand-clean-instructions-demo
Sybrand Jul 18, 2024
38b7c1f
black re-formatting required
Sybrand Jul 18, 2024
da86502
black re-formatting required
Sybrand Jul 18, 2024
db5c0ae
black re-formatting required
Sybrand Jul 18, 2024
55ab5b9
Merge branch 'celery-upgrade' into sybrand-clean-instructions-demo
Sybrand Jul 18, 2024
b0ad91a
port 8080 is used by druid, so we move webpack to 9000
Sybrand Jul 22, 2024
76ecf07
port 8088 is used by druid, so we move hasura to 8088
Sybrand Jul 22, 2024
9ebd869
port 8080 is used by druid, so we move hasura to 8088, and webpack to…
Sybrand Jul 22, 2024
9dd4d2f
port 8088 is used by druid, so we move hasura to 9001
Sybrand Jul 22, 2024
e5f4578
map 8080 on container to 9001 on host
Sybrand Jul 22, 2024
4980000
minio runs on 9000, so move webpack to 9001
Sybrand Jul 22, 2024
d833f32
webpack runs on 9001, so move hasura to 9002
Sybrand Jul 22, 2024
577c13b
minio runs on 9000, so move webpack to 9001
Sybrand Jul 22, 2024
e1fc21a
9000 range is giving issues on wsl
Sybrand Jul 22, 2024
54b0627
trying hasura in 5003 range
Sybrand Jul 22, 2024
1bf626a
linux does not have host.docker.internal by default
Sybrand Jul 24, 2024
cf79bb9
Merge remote-tracking branch 'origin/main' into sybrand-clean-instruc…
Sybrand Jul 25, 2024
b595ab6
trying to pin setuptools to earlier version
Sybrand Jul 29, 2024
02f3dc2
trying to pin setuptools to earlier version
Sybrand Jul 29, 2024
aca534f
trying to pin setuptools to earlier version
Sybrand Jul 29, 2024
f935b75
trying to pin setuptools to earlier version
Sybrand Jul 29, 2024
94f624a
trying to pin setuptools to earlier version
Sybrand Jul 29, 2024
ffee43d
upgrade minor version of sql alchemy
Sybrand Jul 29, 2024
e119ee1
resolve build issue
Sybrand Jul 29, 2024
479ae48
resolve build issue
Sybrand Jul 29, 2024
d94e287
remove pip upgrade
Sybrand Jul 29, 2024
0f8eb23
remove pip upgrade
Sybrand Jul 29, 2024
e940113
need to be specific about build requirements
Sybrand Jul 29, 2024
cf23258
do I need the wheel in a separate step?
Sybrand Jul 29, 2024
d1bea61
does wheel have to happen outside?
Sybrand Jul 29, 2024
42e613a
adding comments re. wheel
Sybrand Jul 29, 2024
a1a93b3
adding comments re. wheel
Sybrand Jul 29, 2024
2528878
update github action
Sybrand Jul 29, 2024
5f4bb9c
Merge remote-tracking branch 'origin/setuptools-module-error' into sy…
Sybrand Jul 29, 2024
370c4e2
need build isolation because latest setuptools is not compatible
Sybrand Aug 20, 2024
993b954
Merge remote-tracking branch 'origin/main' into sybrand-clean-instruc…
Sybrand Aug 20, 2024
b20b87c
[119] - Create bash script for local backend setup (#135)
rohitkori Sep 19, 2024
138 changes: 138 additions & 0 deletions demo.sh
@@ -0,0 +1,138 @@
#!/bin/bash
set -euo pipefail

DRUID_SHARED_FOLDER=~/druid/home/share
DATA_OUTPUT_FOLDER=~/druid/data/output

#######
# MINIO
#######

# # Create an environment file if it doesn't exist.
# if [ ! -f ".env.demo.minio" ]; then
#     password=$(tr -dc 'A-Za-z0-9!' < /dev/urandom | head -c 13)

#     echo "MINIO_DATA_FOLDER=./
# MINIO_ROOT_USER=minio_demo
# MINIO_ROOT_PASSWORD=$password" > .env.demo.minio
# fi
# # Load the environment file
source .env.demo.minio
# # Start the minio server.
# docker compose --env-file .env.demo.minio -f docker-compose.minio.yaml up --detach
# # Create an alias for the minio server.
# docker exec -it harmony-minio-1 /bin/bash -c "mc alias set local http://localhost:9000 minio_demo $MINIO_ROOT_PASSWORD"
# # Create a bucket.
# docker exec -it harmony-minio-1 /bin/bash -c "mc mb /local/zenysis-harmony-demo"
# # Create a self serve folder in the bucket
# docker exec -it harmony-minio-1 /bin/bash -c "touch /tmp/delete_me && mc cp /tmp/delete_me /local/zenysis-harmony-demo/self_serve/delete_me && rm /tmp/delete_me"

#######
# DRUID
#######

# mkdir -p ~/druid/home/share
# mkdir -p ~/druid/data/output

# if [ ! -f "druid_setup/.env" ]; then
#     echo "SINGLE_SERVER_DOCKER_HOST=
# DRUID_SHARED_FOLDER=$DRUID_SHARED_FOLDER
# DATA_OUTPUT_FOLDER=$DATA_OUTPUT_FOLDER" > druid_setup/.env
# fi

# cd druid_setup
# make single_server_up
# cd ..

###############################
# Prepare Harmony Configuration
###############################

mkdir -p ./.mc
if [ ! -f ".mc/config.json" ]; then
    echo "{
  \"version\": \"10\",
  \"hosts\": {
    \"s3\": {
      \"url\": \"http://host.docker.internal:9000\",
      \"accessKey\": \"$MINIO_ROOT_USER\",
      \"secretKey\": \"$MINIO_ROOT_PASSWORD\",
      \"api\": \"S3v4\",
      \"lookup\": \"auto\"
    }
  }
}" > ./.mc/config.json
fi

if [ ! -f ".env.demo" ]; then

    POSTGRES_PASSWORD=$(tr -dc 'A-Za-z0-9!' < /dev/urandom | head -c 13)

    echo "DEFAULT_SECRET_KEY=somesecret

ZEN_ENV=harmony_demo

DRUID_HOST=http://host.docker.internal
HASURA_HOST=http://hasura:8080

DATABASE_URL='postgresql://postgres:zenysis@postgres:5432/harmony_demo-local'

POSTGRES_HOST=postgres

# You can go to https://www.mapbox.com and create an API token.
MAPBOX_ACCESS_TOKEN=some_mapbox_access_token

NOREPLY_EMAIL=noreply@zenysis.com
SUPPORT_EMAIL=support@zenysis.com

MC_CONFIG_PATH=./.mc

POSTGRES_PASSWORD=$POSTGRES_PASSWORD

DRUID_SHARED_FOLDER=$DRUID_SHARED_FOLDER
DATA_OUTPUT_FOLDER=$DATA_OUTPUT_FOLDER

DOCKER_NAMESPACE=zengineering
DOCKER_TAG=latest

# Assuming you've created a minio alias called \"local\":
OBJECT_STORAGE_ALIAS=local" > .env.demo
fi

source .env.demo

##########################
# Prepare Harmony Database
##########################

# docker compose --env-file .env.demo -f docker-compose.yaml -f docker-compose.dev.yaml convert
# docker compose --env-file .env.demo -f docker-compose.yaml -f docker-compose.dev.yaml run --rm web /bin/bash -c "source venv/bin/activate && yarn init-db harmony_demo"

##################
# Run the pipeline
##################

# The pipeline image is ${DOCKER_NAMESPACE}/harmony-etl-pipeline:${DOCKER_TAG},
# e.g. zengineering/harmony-etl-pipeline:latest.

# For the Harmony Demo, we skip the generate step.

COMMAND="./pipeline/harmony_demo/process/process_all" \
docker compose --project-name harmony-etl-process --env-file .env.demo \
-f docker-compose.pipeline.yaml up

COMMAND="./pipeline/harmony_demo/index/index_all" \
docker compose --project-name harmony-etl-index --env-file .env.demo \
-f docker-compose.pipeline.yaml up

COMMAND="./pipeline/harmony_demo/validate/validate_all" \
docker compose --project-name harmony-etl-validate --env-file .env.demo \
-f docker-compose.pipeline.yaml up
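Each pipeline stage above is launched by setting `COMMAND` only for the duration of a single `docker compose up`. A minimal sketch of the underlying shell behavior (the echoing `sh -c` is a stand-in for the real compose invocation):

```shell
#!/bin/sh
# A variable assignment prefixed to a command is exported only to that
# command, the same way COMMAND=... above scopes to one `docker compose up`.
COMMAND="./pipeline/harmony_demo/process/process_all" \
    sh -c 'echo "would run: $COMMAND"'
# prints: would run: ./pipeline/harmony_demo/process/process_all

# After the command returns, COMMAND is not set in the calling shell.
echo "afterwards: ${COMMAND:-unset}"
# prints: afterwards: unset
```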






9 changes: 1 addition & 8 deletions docker-compose.pipeline.yaml
@@ -5,7 +5,7 @@ services:
pull_policy: always
volumes:
- ${DRUID_SHARED_FOLDER:-/home/share}:/home/share
-      - ${OUTPUT_PATH:-/data/output}:/zenysis/pipeline/out
+      - ${DATA_OUTPUT_FOLDER:-/data/output}:/zenysis/pipeline/out
# Map minio config folder
# When pipeline runs, mc should pick up $USER and point to this:
- ${MC_CONFIG_PATH:-/home/zenysis/.mc}:/home/zenysis/.mc
@@ -26,10 +26,3 @@ services:
memory: 50g
user: ${PIPELINE_USER:-1000}:${PIPELINE_GROUP:-1000}
command: [ "/bin/bash", "-c", "./docker/entrypoint_pipeline.sh" ]
-volumes:
-  mc_config:
-    driver: local
-    driver_opts:
-      type: none
-      o: bind
-      device: ${MC_CONFIG_PATH:-/home/zenysis/.mc}
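The volume mappings above rely on `${VAR:-default}` substitution, which Compose borrows from the shell: the default applies when the variable is unset *or* empty. A quick sketch of that expansion:

```shell
#!/bin/sh
# Unset: the fallback is used.
unset DRUID_SHARED_FOLDER
echo "${DRUID_SHARED_FOLDER:-/home/share}"
# prints: /home/share

# Empty also falls back, because the operator is :- rather than -.
DRUID_SHARED_FOLDER=""
echo "${DRUID_SHARED_FOLDER:-/home/share}"
# prints: /home/share

# Set and non-empty: the override wins.
DRUID_SHARED_FOLDER=/custom/share
echo "${DRUID_SHARED_FOLDER:-/home/share}"
# prints: /custom/share
```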
19 changes: 7 additions & 12 deletions docker/dev/Dockerfile
@@ -122,30 +122,25 @@ COPY --chmod=755 --from=downloader /usr/local/bin/mc /usr/local/bin/mc
WORKDIR /app

# CPython
-# Update setup and create venv
-RUN /opt/python/3.9.16/bin/python3.9 -m pip install --upgrade pip setuptools \
-    && /opt/python/3.9.16/bin/python3.9 -m venv venv \
-    && . venv/bin/activate && python -m pip install --upgrade pip setuptools
+# Create venv
+RUN /opt/python/3.9.16/bin/python3.9 -m venv venv

# Install dependencies
COPY requirements.txt requirements-web.txt requirements-dev.txt requirements-pipeline.txt ./
RUN . venv/bin/activate \
-    && python -m pip install -r requirements.txt \
-    && python -m pip install -r requirements-web.txt \
-    && python -m pip install -r requirements-dev.txt \
-    && python -m pip install -r requirements-pipeline.txt
+    && python -m pip install wheel==0.43.0 \
+    && python -m pip install --no-build-isolation -r requirements.txt -r requirements-web.txt -r requirements-dev.txt -r requirements-pipeline.txt

# PyPy
# Install pypy
COPY --from=downloader /opt/pypy3.9-v7.3.11 /opt/pypy3.9-v7.3.11
# Update setup and create venv
RUN /opt/pypy3.9-v7.3.11/bin/pypy3 -m ensurepip \
&& /opt/pypy3.9-v7.3.11/bin/pypy3 -m pip install --upgrade pip setuptools \
&& /opt/pypy3.9-v7.3.11/bin/pypy3 -m venv venv_pypy3
# Install pypy dependencies
RUN . venv_pypy3/bin/activate \
-    && pypy3 -m pip install --upgrade pip setuptools \
-    && pypy3 -m pip install -r requirements.txt \
-    && pypy3 -m pip install -r requirements-pipeline.txt
+    && pypy3 -m pip install wheel==0.43.0 \
+    && pypy3 -m pip install --no-build-isolation -r requirements.txt -r requirements-pipeline.txt

# Install flow.
COPY --from=downloader /usr/local/bin/flow /usr/local/bin/flow
1 change: 1 addition & 0 deletions pipeline/harmony_demo/index/run/00_druid/00_index
@@ -5,5 +5,6 @@ set -o pipefail
--data_files="${DRUID_SHARED_FOLDER:-/home/share}/data/harmony_demo/*/current/processed_rows.*" \
--task_id_file="${PIPELINE_TMP_DIR}/task_id" \
--task_hash_dir "${DRUID_SHARED_FOLDER:-/home/share}/data/logs/druid_indexing/hash" \
--druid_server_shared_folder="${DRUID_SHARED_FOLDER:-/home/share}" \
--local_server_shared_folder "${DRUID_SHARED_FOLDER:-/home/share}" \
--min_data_date='1970-01-01'
4 changes: 3 additions & 1 deletion scripts/clean_published_pipeline.py
@@ -9,8 +9,10 @@

from util.file.directory_util import compute_dir_hash, equal_dir_content

DRUID_SHARED_FOLDER = os.environ.get('DRUID_SHARED_FOLDER', '/home/share')

# Base directory of where to begin search.
-MAIN_DIR = '/home/share/data'
+MAIN_DIR = f'{DRUID_SHARED_FOLDER}/data'

# Directories to whitelist
WHITELIST_DIRS = set(['current'])
3 changes: 2 additions & 1 deletion scripts/db/hasura/dev/start_hasura.sh
@@ -32,10 +32,11 @@ if (( CONTAINER_COUNT == 0 )) ; then
docker run \
-d \
-it \
--add-host=host.docker.internal:host-gateway \
--name "${CONTAINER_NAME}" \
-v "${HASURA_METADATA_DIR}:/hasura-metadata" \
-v "${HASURA_CONTAINER_SCRIPTS_DIR}:/zenysis:ro" \
-  -p '8088:8080' \
+  -p '5003:8080' \
--entrypoint 'sh' \
"${IMAGE_NAME}"
fi
93 changes: 93 additions & 0 deletions scripts/local_setup.sh
@@ -0,0 +1,93 @@
#!/bin/bash
set -euo pipefail

# This script runs the Harmony backend locally.
# It takes 5 arguments:
# 1. The environment file to use
# 2. The email of the user to create
# 3. The first name of the user to create
# 4. The last name of the user to create
# 5. The password of the user to create

# Function to check if all required arguments are provided
check_arguments() {
    if [ $# -ne 5 ]; then
        echo "Error: Missing arguments."
        echo "Usage: $0 <env_file> <email> <first_name> <last_name> <password>"
        exit 1
    fi
}

# Function to check whether Druid is running, using curl
check_druid() {
    if curl localhost:8888 >/dev/null 2>&1; then
        echo "Druid running"
    else
        echo "Error: Druid not running"
        exit 1
    fi
}

# Function to install and configure the virtual environment
setup_virtualenv() {
    if [ ! -d "venv" ]; then
        echo "Virtual environment not found, creating one..."
        python3.9 -m venv venv
    fi
    echo "Activating virtual environment..."
    source ./venv/bin/activate
    echo "Upgrading pip and installing dependencies..."
    python -m pip install --upgrade pip
    python -m pip install wheel==0.43.0
    python -m pip install --no-build-isolation -r requirements.txt -r requirements-dev.txt -r requirements-web.txt -r requirements-pipeline.txt
}

# Function to load environment variables
load_env_vars() {
    echo "Loading environment variables..."
    set -o allexport
    source "$ENV_FILE"
    set +o allexport
}

# Function to run pipeline processes
run_pipeline() {
    echo "Running pipeline processes..."
    ./pipeline/harmony_demo/process/process_all
    ./pipeline/harmony_demo/index/index_all
    ./pipeline/harmony_demo/validate/validate_all
}

# Function to initialize the database and populate indicators
init_db() {
    echo "Initializing the database..."
    yarn init-db harmony_demo --populate_indicators_from_config
}

# Function to create a user
create_user() {
    echo "Creating user $EMAIL..."
    ./scripts/create_user.py --username "$EMAIL" --first_name "$FIRST_NAME" --last_name "$LAST_NAME" --site_admin -o --password "$PASSWORD"
}


check_arguments "$@"

ENV_FILE=$1
EMAIL=$2
FIRST_NAME=$3
LAST_NAME=$4
PASSWORD=$5

echo "Using environment file: $ENV_FILE"

# Main execution starts here
check_druid
setup_virtualenv
load_env_vars
init_db
run_pipeline
create_user "$EMAIL" "$FIRST_NAME" "$LAST_NAME" "$PASSWORD"

echo "All tasks completed successfully!"
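A hypothetical invocation of the argument check above (placeholder values, not real credentials); in this sketch `check_arguments` returns instead of exiting so both branches can be shown:

```shell
#!/bin/sh
check_arguments() {
    if [ $# -ne 5 ]; then
        echo "Usage: local_setup.sh <env_file> <email> <first_name> <last_name> <password>"
        return 1
    fi
}

# Too few arguments: rejected.
check_arguments .env.demo admin@example.com || echo "rejected"
# prints the usage line, then: rejected

# Exactly five: accepted.
check_arguments .env.demo admin@example.com Demo Admin 's3cret' && echo "accepted"
# prints: accepted
```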

4 changes: 2 additions & 2 deletions web/server/routes/webpack_dev_proxy.py
@@ -22,11 +22,11 @@ def _proxy_request(url):
@webpack_dev_proxy.route('/build/<path:asset>')
def route_to_webpack_build(asset):
# Built files will live in webpack-dev-server's virtual `build/` directory.
-    return _proxy_request(f'http://localhost:8080/build/{asset}')
+    return _proxy_request(f'http://localhost:5001/build/{asset}')


@webpack_dev_proxy.route('/static/<path:asset>')
def route_to_webpack_static_asset(asset):
# Static assets will not be copied into webpack-dev-server's virtual
# directories but will exist in the same path on the filesystem.
-    return _proxy_request(f'http://localhost:8080/{asset}')
+    return _proxy_request(f'http://localhost:5001/{asset}')
3 changes: 2 additions & 1 deletion web/webpack.config.js
@@ -36,6 +36,7 @@ module.exports = {

// Needed for friendly errors plugin.
quiet: true,
port: 5001
},
devtool: 'source-map',
entry: {
@@ -184,5 +185,5 @@ module.exports = {
// You can now require('file') instead of require('file.jsx').
extensions: ['.js', '.json', '.jsx'],
modules: [absPath('web/client'), absPath('node_modules')],
},
-}
+};