
Commit f90894b

GitHK, Andrei Neagu, and pcrespov authored
✨Run dynamic services via dynamic-sidecar (🏗️ OPS + CI action) (ITISFoundation#1887)
* codestyle * updated openapi.json * fixed failing tests * added defaults for registry in development * fixed models-library failing tests * fixing pylint * added back missing * fix pylint * containe-http-entrypoint requires compose-spec if mising a validation error will be raised * fixed test which was causing all others to fail * removed fixture for network_name * changed application setup * added mocked swarm network * fix codeclimate * fixed weberrver integration 06 and 09 tests * fixed webserver integration 10 tests * fix failing unit tests * codeclimate will not fix * monitor needs to be shutdown while not in swarm mode * fixing timeouts for the CI * removed adminer * CI can be a pain, give it a lot of time * updated specs * removed otudated TODO * refactored, to using model instead of properties * unmeaningful comment * fixed type * added more hints for developers * removed undesired log * fixed errors after refactor * fixed makefile syntax * displaying message in terminal * updated comment and now raising error if fails * Update test_service_settings.py * @sanderegg typo fix * @sanderegg fixed type * added missing commits to previous merge * director-v2 codestyle * added missing dependencies * updated pyton version in worfklow * added missing dependency * updated types and fixed deserialization * updated types * renamed service_settings entries * renamed modes related to the service settings * fixing output * renamed and semplified constructors * node_uuid type is now NodeID * removed unsued * fixed regex * reanbled and fixed tests * changed :labels to /labels * renamed other_paths to state_paths * expanded example * fixed imports and types * test required to be in swarm mode * doubling the time for the CI * refactor naming * refactored RegistrySettings to new standard * refactored test * more registry refactor * removed unused definitions and RunningDynamicServiceDetails to avoid confusion * refactored to semplify filtering * refactor, added constructor 
moved model * updated comments * added constructor to model * removed comments * renamed moitor.core to monitor.monitor.task * simplifyng imports * moved gt_moinitor to dependencies * removed TODO and expanded comment * simplified status reported to the user * fixed dynamic-sidcar stuck boot * fixed type order * adding missing optional * - moved needs_dynamic_sidecar to property - added explanation on how model works * refactored factory method * moved hardcoded traefik version to DynamicSidecarSettings * refactored list_running_dynamic_services * update docstring * will no longer suppress but display a warning * moved where the function is used * renamed properly * renamed OverallStatus to Status * refactored make_from_http_request * refactored settings to new pattern and cleanup * pylint * updated openapi-specs * renamed to module_setup * renamed to docker_api * renamed to client_api interface and refactored interface * renamed to docker_compose_specs * renamed to docker_service_specs * renamed to docker_states * renaming to errors * renaming to errors part 2 * renamed events * renamed to abc and inherited from ABC * renaming to task * renamed task part 2 * refactor @pcrespov * refactor * review @pcrespov * review @Crespov * typing error * fixed typing * using py3.8 syntax * pylance improovment * py3.8 syntax * @Crespov refactor * pylance suggestions * more pylance suggestions * making pylance happy * minor pylance suggsestions * using a valid status * refactored to Pydantic model and moved to models * removed make from name * removed unecessary code * fixed constructor * models restructured: - fixed circular dependency - moved all models inside models/schemas/dynamic_services - transformed some cotaniner to Pydantic models * removed unncessary * pylance suggestion * py38 syntax * remove unsued import * removed unused code * added test for fetched labesl and refactored * fixed module setup * inverted parameter order * added missing changes * fixing workflow * 
fixes issues with imagename * switching to default * refactored * reverted test * trying to debug * fixed validator for CI * added todo before merging * downgrading dependency which causes issues * pylint * updated name in CI * fixes failing fixture * typing * bumping timeout to avoid test failing * refactor * renamed back to original name * moved traefik settings to env vars and pydantic models * - refactor function names - more reliable registy settings forwarding * codestyle * using better name * added reminder * typing refactor * added test to be finished in fugure PRs * fixes command output README * do not fail when removing service * log message for the user * dynamic_sidecar_env_vars no longer contains obsucured password * empty string is treated as None * fixes broken behaviour * codestyle * services are now queried for availability * avoids GC restarts upon errors * added get_services timeout * locking is not required when recoviring the state * refactoring and renaming * speeding up orphaned services removal * stopping services no longer hang on first error * stopping services will no longer raise errors * extend doc * fixing tests * reverting changes * fixed worfklow file * refactor * will codeclimate compalin? 
* enhancing performance * minor refactor * fixed test * @sanderegg review * no longer fails on boot * reverting change * removing comment * moved code to correct place and removed duplication * refactored * refactored and moved to director-v2 * renamed functions * pylint * fixing "typo master" issues * fixed comment * more typos * refactored boot mode test * renaming and updating * @sanderegg refactor to try/except * fixing test after rename * removing autouse * removed autouse * making it clear from where fucntion calls come * moved to utils * fixing typing * removing and renaming fixtures * fixed import * fixed failing test * pylint * should fix issue with boot mode * removed unused dependency * reverted requiremtns updates * giving test more time * fixing time bump * reverting to old timeout * trying to remove randomly hanging code * CI does not like localhost to reference docker * put back registry * refactor and docstrings * refactor + update docstrings * removing autoreuse * fixed too broad error capture and message logging * refactor * uses erorr handling, updated docstring * fix garbage collector * refactor common test parts * refactor tests * no longer return "null" * transfomed form system to integration test * fixing broken test * fixing test again * bumping to avoid timeout Co-authored-by: Andrei Neagu <[email protected]> Co-authored-by: Pedro Crespo-Valero <[email protected]>
1 parent 3299ea6 commit f90894b

63 files changed: +4050 -673 lines changed

Some content is hidden

Large commits have some content hidden by default.

.env-devel

Lines changed: 2 additions & 0 deletions
@@ -18,6 +18,8 @@ CATALOG_DEV_FEATURES_ENABLED=0

 DASK_SCHEDULER_HOST=dask-scheduler

+DYNAMIC_SIDECAR_IMAGE=${DOCKER_REGISTRY:-itisfoundation}/dynamic-sidecar:${DOCKER_IMAGE_TAG:-latest}
+
 DIRECTOR_REGISTRY_CACHING_TTL=900
 DIRECTOR_REGISTRY_CACHING=True

.github/workflows/ci-testing-deploy.yml

Lines changed: 82 additions & 6 deletions
@@ -1824,9 +1824,9 @@ jobs:
           name: codeclimate-${{ github.job }}-coverage
           path: codeclimate.${{ github.job }}_coverage.json

-  integration-test-director-v2:
+  integration-test-director-v2-01:
     timeout-minutes: 30 # if this timeout gets too small, then split the tests
-    name: "[int] director-v2"
+    name: "[int] director-v2 01"
     needs: [build-test-images]
     runs-on: ${{ matrix.os }}
     strategy:
@@ -1873,7 +1873,81 @@ jobs:
       - name: install
         run: ./ci/github/integration-testing/director-v2.bash install
       - name: test
-        run: ./ci/github/integration-testing/director-v2.bash test
+        run: ./ci/github/integration-testing/director-v2.bash test 01
+      - name: upload failed tests logs
+        if: failure()
+        uses: actions/upload-artifact@v2
+        with:
+          name: ${{ github.job }}_docker_logs
+          path: ./services/director-v2/test_failures
+      - name: cleanup
+        if: always()
+        run: ./ci/github/integration-testing/director-v2.bash clean_up
+      - uses: codecov/codecov-action@v1
+        with:
+          flags: integrationtests #optional
+      - name: prepare codeclimate coverage file
+        run: |
+          curl -L https://codeclimate.com/downloads/test-reporter/test-reporter-0.7.0-linux-amd64 > ./cc-test-reporter
+          chmod +x ./cc-test-reporter && ./cc-test-reporter --version
+          ./cc-test-reporter format-coverage -t coverage.py -o codeclimate.${{ github.job }}_coverage.json coverage.xml
+      - name: upload codeclimate coverage
+        uses: actions/upload-artifact@v2
+        with:
+          name: codeclimate-${{ github.job }}-coverage
+          path: codeclimate.${{ github.job }}_coverage.json
+
+  integration-test-director-v2-02:
+    timeout-minutes: 20 # if this timeout gets too small, then split the tests
+    name: "[int] director-v2 02"
+    needs: [build-test-images]
+    runs-on: ${{ matrix.os }}
+    strategy:
+      matrix:
+        python: [3.8]
+        os: [ubuntu-20.04]
+        docker_buildx: [v0.5.1]
+        docker_compose: [1.29.1]
+        include:
+          - docker_compose: 1.29.1
+            docker_compose_sha: 8097769d32e34314125847333593c8edb0dfc4a5b350e4839bef8c2fe8d09de7
+      fail-fast: false
+    steps:
+      - name: set PR default variables
+        # only pushes have access to the docker credentials, use a default
+        if: github.event_name == 'pull_request'
+        run: |
+          export TMP_DOCKER_REGISTRY=${GITHUB_REPOSITORY%/*}
+          echo "DOCKER_REGISTRY=${TMP_DOCKER_REGISTRY,,}" >> $GITHUB_ENV
+      - uses: actions/checkout@v2
+      - name: setup docker buildx
+        id: buildx
+        uses: docker/setup-buildx-action@v1
+        with:
+          version: ${{ matrix.docker_buildx }}
+          driver: docker
+
+      - name: setup docker-compose
+        run: sudo ./ci/github/helpers/setup_docker_compose.bash ${{ matrix.docker_compose }} ${{ matrix.docker_compose_sha }}
+      - name: setup python environment
+        uses: actions/setup-python@v2
+        with:
+          python-version: ${{ matrix.python }}
+      - name: show system version
+        run: ./ci/helpers/show_system_versions.bash
+      - uses: actions/cache@v2
+        name: getting cached data
+        with:
+          path: ~/.cache/pip
+          key: ${{ runner.os }}-pip-director-v2-${{ hashFiles('services/director-v2/requirements/ci.txt') }}
+          restore-keys: |
+            ${{ runner.os }}-pip-director-v2-
+            ${{ runner.os }}-pip-
+            ${{ runner.os }}-
+      - name: install
+        run: ./ci/github/integration-testing/director-v2.bash install
+      - name: test
+        run: ./ci/github/integration-testing/director-v2.bash test 02
       - name: upload failed tests logs
         if: failure()
         uses: actions/upload-artifact@v2
@@ -2044,7 +2118,7 @@ jobs:
           path: codeclimate.${{ github.job }}_coverage.json

   system-test-public-api:
-    timeout-minutes: 25 # if this timeout gets too small, then split the tests
+    timeout-minutes: 30 # if this timeout gets too small, then split the tests
     name: "[sys] public api"
     needs: [build-test-images]
     runs-on: ${{ matrix.os }}
@@ -2320,7 +2394,8 @@ jobs:
         unit-test-webserver-10,
         integration-test-webserver-01,
         integration-test-webserver-02,
-        integration-test-director-v2,
+        integration-test-director-v2-01,
+        integration-test-director-v2-02,
         integration-test-sidecar,
         integration-test-simcore-sdk,
       ]
@@ -2380,7 +2455,8 @@ jobs:
         unit-test-webserver-10,
         integration-test-webserver-01,
         integration-test-webserver-02,
-        integration-test-director-v2,
+        integration-test-director-v2-01,
+        integration-test-director-v2-02,
         integration-test-sidecar,
         integration-test-simcore-sdk,
         system-test-public-api,

Makefile

Lines changed: 10 additions & 8 deletions
@@ -195,17 +195,19 @@ define _show_endpoints
 set -o allexport; \
 source $(CURDIR)/.env; \
 set +o allexport; \
-separator=------------------------------------------;\
+separator=------------------------------------------------------------------------------------;\
 separator=$${separator}$${separator}$${separator};\
-rows="%-22s | %80s | %12s | %12s\n";\
+rows="%-22s | %90s | %12s | %12s\n";\
 TableWidth=140;\
-printf "%22s | %80s | %12s | %12s\n" Name Endpoint User Password;\
+printf "%22s | %90s | %12s | %12s\n" Name Endpoint User Password;\
 printf "%.$${TableWidth}s\n" "$$separator";\
-printf "$$rows" 'oSparc platform' "http://$(if $(IS_WSL2),$(get_my_ip),127.0.0.1):9081";\
-printf "$$rows" 'Postgres DB' "http://$(if $(IS_WSL2),$(get_my_ip),127.0.0.1):18080/?pgsql=postgres&username=$${POSTGRES_USER}&db=$${POSTGRES_DB}&ns=public" $${POSTGRES_USER} $${POSTGRES_PASSWORD};\
-printf "$$rows" Portainer "http://$(if $(IS_WSL2),$(get_my_ip),127.0.0.1):9000" admin adminadmin;\
-printf "$$rows" Redis-commander "http://$(if $(IS_WSL2),$(get_my_ip),127.0.0.1):18081";\
-printf "$$rows" "Docker Registry" "$${REGISTRY_URL}" $${REGISTRY_USER} $${REGISTRY_PW}
+printf "$$rows" 'oSparc platform' 'http://$(get_my_ip).nip.io:9081';\
+printf "$$rows" 'Postgres DB' 'http://$(get_my_ip).nip.io:18080/?pgsql=postgres&username='$${POSTGRES_USER}'&db='$${POSTGRES_DB}'&ns=public' $${POSTGRES_USER} $${POSTGRES_PASSWORD};\
+printf "$$rows" Portainer 'http://$(get_my_ip).nip.io:9000' admin adminadmin;\
+printf "$$rows" Redis 'http://$(get_my_ip).nip.io:18081';\
+printf "$$rows" 'Docker Registry' $${REGISTRY_URL} $${REGISTRY_USER} $${REGISTRY_PW};\
+echo "⚠️ if a DNS is not used (as displayed above), the interactive services started via dynamic-sidecar"
+echo "⚠️ will not be shown. The frontend accesses them via the uuid.services.YOUR_IP.nip.io:9081"
 endef

 show-endpoints:

README.md

Lines changed: 25 additions & 4 deletions
@@ -52,16 +52,17 @@ This is the common workflow to build and deploy locally:
 make info-swarm

 # open front-end in the browser
-# localhost:9081 - simcore front-end site
+# 127.0.0.1.nip.io:9081 - simcore front-end site
 #
-xdg-open http://localhost:9081/
+xdg-open http://127.0.0.1.nip.io:9081/

 # stops
 make down
 ```

-Services are deployed in two stacks:``simcore-stack`` comprises all core-services in the framework
-and ``ops-stack`` is a subset of services from [ITISFoundation/osparc-ops](https://github.com/ITISFoundation/osparc-ops) used
+Some routes can only be reached via DNS such as `UUID.services.DNS`. Since `UUID.services.127.0.0.1` is **not a valid DNS**, the solution is to use [nip.io](https://nip.io/). A service that maps ``<anything>[.-]<IP Address>.nip.io`` in "dot", "dash" or "hexadecimal" notation to the corresponding ``<IP Address>``.
+
+Services are deployed in two stacks:``simcore-stack`` comprises all core-services in the framework and ``ops-stack`` is a subset of services from [ITISFoundation/osparc-ops](https://github.com/ITISFoundation/osparc-ops) used
 for operations during development. This is a representation of ``simcore-stack``:

 ![](docs/img/.stack-simcore-version.yml.png)
@@ -101,6 +102,26 @@ In **windows**, it works under [WSL] (windows subsystem for linux). Some details

 In **MacOS**, [replacing the MacOS utilities with GNU utils](https://apple.stackexchange.com/a/69332) might be required.

+#### Upgrading services requirements
+
+Updates are upgraded using a docker container and pip-sync.
+Build and start the container:
+
+cd requirements/tools
+make build
+make shell
+
+Once inside the container navigate to the service's requirements directory.
+
+To upgrade all requirements run:
+
+make reqs
+
+To upgrade a single requirement named `fastapi`run:
+
+make reqs upgrade=fastapi
+
+
 ## Releases

 **WARNING** This application is **still under development**.

ci/github/integration-testing/director-v2.bash

Lines changed: 2 additions & 1 deletion
@@ -20,9 +20,10 @@ install() {
 }

 test() {
+  echo "testing in services/director-v2/tests/integration/$1"
   pytest --cov=simcore_service_director_v2 --durations=10 --cov-append \
     --color=yes --cov-report=term-missing --cov-report=xml --cov-config=.coveragerc \
-    -v -m "not travis" services/director-v2/tests/integration --log-level=DEBUG
+    -v -m "not travis" "services/director-v2/tests/integration/$1" --log-level=DEBUG
 }

 clean_up() {

packages/models-library/src/models_library/service_settings_labels.py

Lines changed: 3 additions & 1 deletion
@@ -144,7 +144,8 @@ def needs_dynamic_sidecar(self) -> bool:

     @validator("container_http_entry", always=True)
     @classmethod
-    def compose_spec_requires_container_http_entry(cls, v, values):
+    def compose_spec_requires_container_http_entry(cls, v, values) -> Optional[str]:
+        v = None if v == "" else v
         if v is None and values.get("compose_spec") is not None:
             raise ValueError(
                 "Field `container_http_entry` must be defined but is missing"
@@ -186,6 +187,7 @@ class SimcoreServiceLabels(DynamicSidecarServiceLabels):
     )

     class Config(_BaseConfig):
+        extra = Extra.allow
         schema_extra = {
             "examples": [
                 # legacy service
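
The validator change above treats an empty `container_http_entry` as missing before checking it against `compose_spec`. The sketch below reproduces only that rule with a standard-library dataclass so it runs without pydantic; `LabelsSketch` is a hypothetical stand-in and not the model from the commit.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class LabelsSketch:
    """Hypothetical stand-in for the labels model; mirrors one validation rule."""

    compose_spec: Optional[Dict[str, Any]] = None
    container_http_entry: Optional[str] = None

    def __post_init__(self) -> None:
        # An empty string carries no information, so it is normalized to None.
        if self.container_http_entry == "":
            self.container_http_entry = None
        # A compose spec without an HTTP entrypoint is rejected.
        if self.container_http_entry is None and self.compose_spec is not None:
            raise ValueError(
                "Field `container_http_entry` must be defined but is missing"
            )


if __name__ == "__main__":
    print(LabelsSketch(container_http_entry=""))
```

With this ordering, an empty entrypoint combined with a compose spec fails in the same way as an absent one, which appears to be the intent of the "empty string is treated as None" item in the commit message.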
Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+from typing import Optional
+
+from pydantic import Field, SecretStr
+
+from .base import BaseCustomSettings
+
+
+class RegistrySettings(BaseCustomSettings):
+
+    REGISTRY_AUTH: bool = Field(..., description="do registry authentication")
+    REGISTRY_PATH: Optional[str] = Field(
+        None, description="development mode only, in case a local registry is used"
+    )
+    REGISTRY_URL: str = Field("", description="url to the docker registry")
+
+    REGISTRY_USER: str = Field(
+        ..., description="username to access the docker registry"
+    )
+    REGISTRY_PW: SecretStr = Field(
+        ..., description="password to access the docker registry"
+    )
+    REGISTRY_SSL: bool = Field(..., description="access to registry through ssl")
+
+    @property
+    def resolved_registry_url(self) -> str:
+        return self.REGISTRY_PATH or self.REGISTRY_URL

packages/models-library/src/models_library/settings/services_common.py

Lines changed: 10 additions & 7 deletions
@@ -1,27 +1,30 @@
 from pydantic import BaseSettings, Field, PositiveInt

-_BASE_TIMEOUT_FOR_STOPPING_SERVICES = 60 * 60
+_MINUTE = 60
+_HOUR = 60 * _MINUTE


 class ServicesCommonSettings(BaseSettings):
     # set this interval to 1 hour
     director_dynamic_service_save_timeout: PositiveInt = Field(
-        _BASE_TIMEOUT_FOR_STOPPING_SERVICES,
+        _HOUR,
         description=(
             "When stopping a dynamic service, if it has "
             "big payloads it is important to have longer timeouts."
         ),
     )
     webserver_director_stop_service_timeout: PositiveInt = Field(
-        _BASE_TIMEOUT_FOR_STOPPING_SERVICES + 10,
+        _HOUR + 10,
         description=(
-            "When the webserver invokes the director API to stop "
-            "a service which has a very long timeout, it also "
-            "requires to wait that amount plus some extra padding."
+            "The below will try to help explaining what is happening: "
+            "webserver -(stop_service)-> director-v* -(save_state)-> service_x"
+            "- webserver requests stop_service and uses a 01:00:10 timeout"
+            "- director-v* requests save_state and uses a 01:00:00 timeout"
+            "The +10 seconds is used to make sure the director replies"
         ),
     )
     storage_service_upload_download_timeout: PositiveInt = Field(
-        60 * 60,
+        _HOUR,
         description=(
             "When dynamic services upload and download data from storage, "
             "sometimes very big payloads are involved. In order to handle "

packages/pytest-simcore/src/pytest_simcore/docker_registry.py

Lines changed: 51 additions & 0 deletions
@@ -225,3 +225,54 @@ def jupyter_service(docker_registry: str, node_meta_schema: Dict) -> Dict[str, s
         docker_registry,
         node_meta_schema,
     )
+
+
+DY_STATIC_FILE_SERVER_VERSION = "1.0.5"
+
+
+@pytest.fixture(scope="session")
+def dy_static_file_server_service(
+    docker_registry: str, node_meta_schema: Dict
+) -> Dict[str, str]:
+    """
+    Adds the below service in docker registry
+    itisfoundation/dy-static-file-server
+    """
+    return _pull_push_service(
+        "itisfoundation/dy-static-file-server",
+        DY_STATIC_FILE_SERVER_VERSION,
+        docker_registry,
+        node_meta_schema,
+    )
+
+
+@pytest.fixture(scope="session")
+def dy_static_file_server_dynamic_sidecar_service(
+    docker_registry: str, node_meta_schema: Dict
+) -> Dict[str, str]:
+    """
+    Adds the below service in docker registry
+    itisfoundation/dy-static-file-server-dynamic-sidecar
+    """
+    return _pull_push_service(
+        "itisfoundation/dy-static-file-server-dynamic-sidecar",
+        DY_STATIC_FILE_SERVER_VERSION,
+        docker_registry,
+        node_meta_schema,
+    )
+
+
+@pytest.fixture(scope="session")
+def dy_static_file_server_dynamic_sidecar_compose_spec_service(
+    docker_registry: str, node_meta_schema: Dict
+) -> Dict[str, str]:
+    """
+    Adds the below service in docker registry
+    itisfoundation/dy-static-file-server-dynamic-sidecar-compose-spec
+    """
+    return _pull_push_service(
+        "itisfoundation/dy-static-file-server-dynamic-sidecar-compose-spec",
+        DY_STATIC_FILE_SERVER_VERSION,
+        docker_registry,
+        node_meta_schema,
+    )

packages/settings-library/src/settings_library/docker_registry.py

Lines changed: 6 additions & 1 deletion
@@ -1,7 +1,7 @@
 from functools import cached_property
 from typing import Optional

-from pydantic import Field, SecretStr
+from pydantic import Field, SecretStr, validator

 from .base import BaseCustomSettings

@@ -23,6 +23,11 @@ class RegistrySettings(BaseCustomSettings):
     )
     REGISTRY_SSL: bool = Field(..., description="access to registry through ssl")

+    @validator("REGISTRY_PATH", pre=True)
+    @classmethod
+    def escape_none_string(cls, v) -> Optional[str]:
+        return None if v == "None" else v
+
     @cached_property
     def resolved_registry_url(self) -> str:
         return self.REGISTRY_PATH or self.REGISTRY_URL
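
The `escape_none_string` validator and `resolved_registry_url` property above interact: a `REGISTRY_PATH` that arrives as the literal string `"None"` (for example from an environment variable) would otherwise be truthy and win over `REGISTRY_URL`. The following sketch shows that interaction with plain functions instead of pydantic; the free-function form and the example hostnames are assumptions made for illustration.

```python
from typing import Optional


def escape_none_string(value: Optional[str]) -> Optional[str]:
    """Map the literal string "None" to a real None; pass anything else through."""
    return None if value == "None" else value


def resolved_registry_url(registry_path: Optional[str], registry_url: str) -> str:
    """Prefer the development-only registry path, falling back to the registry URL."""
    return registry_path or registry_url


if __name__ == "__main__":
    # Hypothetical values: a local registry path versus a remote registry URL.
    raw_path = "None"
    print(resolved_registry_url(escape_none_string(raw_path), "registry.example.com"))
    print(resolved_registry_url(raw_path, "registry.example.com"))
```

The second call shows what happens without the normalization step: the string `"None"` is returned as if it were a registry address.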
