
Commit c1e2c57

chore: Functions python refactoring (#850)
Refactored functions-python build to get packages from api and more
1 parent 2cc6d67 commit c1e2c57

99 files changed (+800, -476 lines)

.github/workflows/api-deployer.yml

Lines changed: 0 additions & 8 deletions
@@ -201,10 +201,6 @@ jobs:
           name: database_gen
           path: api/src/database_gen/
 
-      - name: Copy to db models to functions directory
-        run: |
-          cp -R api/src/database_gen/ functions-python/database_gen
-
       # api schema was generated and uploaded in api-build-test job above.
       - uses: actions/download-artifact@v4
         with:
@@ -249,10 +245,6 @@ jobs:
           name: database_gen
           path: api/src/database_gen/
 
-      - name: Copy to db models to functions directory
-        run: |
-          cp -R api/src/database_gen/ functions-python/database_gen
-
       # api schema was generated and uploaded in api-build-test job above.
       - uses: actions/download-artifact@v4
         with:

.github/workflows/datasets-batch-deployer.yml

Lines changed: 1 addition & 1 deletion
@@ -119,7 +119,7 @@ jobs:
         uses: actions/upload-artifact@v4
         with:
           name: database_gen
-          path: functions-python/database_gen/
+          path: api/src/database_gen/
 
       - name: Build python functions
         run: |

functions-python/.gitignore

Lines changed: 3 additions & 1 deletion
@@ -1 +1,3 @@
-.dist
+.dist
+shared
+test_shared

functions-python/README.md

Lines changed: 90 additions & 10 deletions
@@ -22,7 +22,8 @@ The function configuration file contains the following properties:
 - `timeout`: The timeout of the function in seconds. The default value is 60 seconds.
 - `memory`: The memory of the function in MB. The default value is 128 MB.
 - `trigger_http`: A boolean value that indicates if the function is triggered by an HTTP request. The default value is `false`.
-- `include_folders`: A list of folders to be included in the function zip. By default, the function zip will include all the files in the function folder.
+- `include_folders`: A list of folders from functions-python to be included in the function zip. By default, the function zip will include all the files in the function folder.
+- `include_api_folders`: A list of folders from the api folder to be included in the function zip. By default, the function zip will include all the files in the function folder.
 - `secret_environment_variables`: A list of objects, each representing a secret environment variable. These are securely used within the function. Each object should include:
   - `key`: The name of the environment variable as used in the function, acting as the secret's identifier.
   - `secret` [Optional]: The specific GCP secret to be used. If omitted, a default secret name is generated using the environment prefix (`DEV`, `QA`, or `PROD`) followed by an underscore and the `key` value. For example, if `key` is `api_key` in the `DEV` environment, the default secret name is `DEV_api_key`.
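The configuration defaults listed in the hunk above can be captured in a small loader. A minimal sketch, assuming a hypothetical `FunctionConfig` dataclass and `load_config` helper that are not part of the repo:

```python
import json
from dataclasses import dataclass, field

# Hypothetical loader illustrating the documented defaults:
# timeout 60 s, memory 128 MB, trigger_http false, and the two
# optional folder lists (include_folders / include_api_folders).
@dataclass
class FunctionConfig:
    name: str
    timeout: int = 60
    memory: str = "128Mi"
    trigger_http: bool = False
    include_folders: list = field(default_factory=list)
    include_api_folders: list = field(default_factory=list)

def load_config(raw: str) -> FunctionConfig:
    data = json.loads(raw)
    known = set(FunctionConfig.__dataclass_fields__)
    return FunctionConfig(**{k: v for k, v in data.items() if k in known})

config = load_config(
    '{"name": "batch_datasets", "timeout": 20,'
    ' "include_folders": ["helpers", "dataset_service"],'
    ' "include_api_folders": ["database_gen"]}'
)
```

Keys the JSON omits fall back to the documented defaults, which is why the real config files only list the properties they override.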
@@ -33,6 +34,13 @@ The function configuration file contains the following properties:
 - `available_cpu_count`: The number of CPU cores that are available to the function.
 - `available_memory`: The amount of memory available to the function.
 
+# Test configuration (test_config.json)
+Some folders in functions-python are not destined to be deployed as functions but support other functions (e.g. helpers).
+Some of these folders contain tests, and the `test_config.json` file is used to configure those tests.
+The test configuration file contains the following properties:
+- `include_folders`: A list of folders from functions-python used in the tests.
+- `include_api_folders`: A list of folders from the api folder to be included in the tests.
+
 # Local Setup
 
 ## Requirements
@@ -53,19 +61,76 @@ gcloud components install pubsub-emulator
 ```
 
 # Useful scripts
-- To locally execute a function use the following command:
+
+The following sections use `batch_datasets` as an example function. Replace `batch_datasets` with the name of the function you want to work with.
+
+## Setting up an environment for editing and testing a function
+- To set up a self-contained environment specific to a function, use the following command:
+```
+./scripts/function-python-setup.sh --function_name batch_datasets
+```
+This will create a `shared` folder in the function's folder (e.g. batch_datasets/src/shared)
+with symbolic links to the packages needed to run the function locally.
+It will also create a `test_shared` folder in the function's test folder (e.g. batch_datasets/tests/test_shared),
+e.g.:
+- functions-python/batch_datasets/src/shared
+  - database_gen -> symlink to api/database_gen
+  - dataset_service -> symlink to functions-python/dataset_service
+  - helpers -> symlink to functions-python/helpers
+- functions-python/batch_datasets/tests/test_shared
+  - test_utils -> symlink to functions-python/test_utils
+
+The Python code should refer to these shared folders to import the necessary modules,
+e.g. in `batch_datasets/src/main.py` we use the import:
+```
+from shared.database_gen.sqlacodegen_models import Gtfsfeed, Gtfsdataset
+```
+and in `batch_datasets/tests/test_batch_datasets_main.py` we use the import:
+```
+from test_shared.test_utils.database_utils import get_testing_session, default_db_url
+```
+Notice the `shared` and `test_shared` prefixes in the import paths.
+
+- You can also run the setup for all functions:
+```
+./scripts/function-python-setup.sh --all
+```
+- And remove the shared and test_shared folders by running:
+```
+./scripts/function-python-setup.sh --function_name batch_datasets --clean
+```
+or
 ```
-./scripts/function-python-run.sh --function_name tokens
+./scripts/function-python-setup.sh --all --clean
 ```
+
+## Creating a distribution zip
+
 - To locally create a distribution zip use the following command:
 ```
-./scripts/function-python-build.sh --function_name tokens
+./scripts/function-python-build.sh --function_name batch_datasets
 ```
 or
 ```
 ./scripts/function-python-build.sh --all
 ```
-- Start local and test database
+This script will create a `.dist` folder in the function's folder with the distribution zip, and a `build` folder with the necessary files,
+e.g.
+```
+functions-python/batch_datasets/.dist/batch_datasets.zip
+functions-python/batch_datasets/build/
+```
+
+## Executing a function
+- To locally execute a function use the following command:
+```
+./scripts/function-python-run.sh --function_name batch_datasets
+```
+This will create a virtual environment specific to the function (i.e. with the function-specific requirements.txt installed)
+and run the function locally using functions-framework.
+
+## Start local and test database
+
 ```
 docker compose --env-file ./config/.env.local up -d liquibase-test
 ```
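What `function-python-setup.sh` does with symlinks, as described in the hunk above, can be reproduced in a few lines of Python. A sketch using throwaway temp directories that stand in for the real repo layout (the `helpers` contents are invented for illustration):

```python
import os
import tempfile
from pathlib import Path

# Stand-in for functions-python/helpers, the single source of truth.
root = Path(tempfile.mkdtemp())
(root / "helpers").mkdir()
(root / "helpers" / "__init__.py").write_text("GREETING = 'hi'\n")

# The setup script creates <function>/src/shared/<pkg> as a symlink
# back to the shared package, so edits happen in one place only.
shared = root / "batch_datasets" / "src" / "shared"
shared.mkdir(parents=True)
(shared / "__init__.py").touch()  # make `shared` an importable package
os.symlink(root / "helpers", shared / "helpers")

# Reads through the symlink resolve to the original file.
resolved = (shared / "helpers" / "__init__.py").resolve()
```

Because `shared` is a real package directory next to the function's source, imports like `from shared.helpers...` work identically in local development and in the deployed zip.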
@@ -82,11 +147,26 @@ Make sure the testing database is running before executing the tests.
 ```
 docker compose --env-file ./config/.env.local up -d liquibase-test
 ```
-Execute all tests within the functions-python folder
+
+To run the tests:
 ```
-./scripts/api-tests.sh --folder functions-python
+scripts/api-tests.sh --folder functions-python/batch_datasets
 ```
-Execute test from a specific function
+or
 ```
-./scripts/api-tests.sh --folder functions-python/batch_datasets
-```
+scripts/api-tests.sh --folder functions-python
+```
+This will:
+- run the `function-python-setup.sh` script for the function (i.e. create the `shared` and `test_shared` folders with symlinks)
+- create a Python virtual environment in the function folder, e.g. `functions-python/batch_datasets/venv`
+- install the requirements.txt and requirements_dev.txt specific to the function
+- run the tests with coverage using the installed virtual environment
+If any requirements are missing, you should be able to catch that at this point.
+
+# Development using PyCharm
+
+- You can open a function directly in PyCharm, e.g. open `functions-python/batch_datasets` directly.
+- Set `batch_datasets/src` as the source root in PyCharm.
+- Set the Python interpreter to the one in the virtual environment created by the `function-python-build.sh` or `api-tests.sh` script, i.e. `functions-python/batch_datasets/venv`.
+- This provides an environment the same as, or similar to, the one used in deployment and allows you to catch issues early.

functions-python/batch_datasets/function_config.json

Lines changed: 2 additions & 1 deletion
@@ -5,7 +5,8 @@
   "timeout": 20,
   "memory": "256Mi",
   "trigger_http": true,
-  "include_folders": ["database_gen", "helpers", "dataset_service"],
+  "include_folders": ["helpers", "dataset_service"],
+  "include_api_folders": ["database_gen"],
   "secret_environment_variables": [
     {
       "key": "FEEDS_DATABASE_URL"
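Per the config change above, the build presumably merges `include_folders` (rooted at functions-python) and `include_api_folders` (rooted at the api sources) into one zip. A hedged sketch of that merge with a toy directory layout; the `build_zip` helper and the layout are illustrative, not the repo's actual build code:

```python
import tempfile
import zipfile
from pathlib import Path

def build_zip(fp_root: Path, api_src: Path, config: dict, out: Path) -> None:
    # Collect the function's own folder plus the folders declared in
    # include_folders (from functions-python) and include_api_folders
    # (from the api sources), keeping each folder's name in the archive.
    folders = [fp_root / config["name"]]
    folders += [fp_root / f for f in config.get("include_folders", [])]
    folders += [api_src / f for f in config.get("include_api_folders", [])]
    with zipfile.ZipFile(out, "w") as zf:
        for folder in folders:
            for path in sorted(folder.rglob("*")):
                if path.is_file():
                    zf.write(path, folder.name + "/" + str(path.relative_to(folder)))

# Toy layout standing in for the repo.
root = Path(tempfile.mkdtemp())
(root / "fp" / "batch_datasets").mkdir(parents=True)
(root / "fp" / "batch_datasets" / "main.py").write_text("print('hi')\n")
(root / "fp" / "helpers").mkdir()
(root / "fp" / "helpers" / "db.py").write_text("# helper\n")
(root / "api" / "database_gen").mkdir(parents=True)
(root / "api" / "database_gen" / "models.py").write_text("# models\n")

cfg = {"name": "batch_datasets",
       "include_folders": ["helpers"],
       "include_api_folders": ["database_gen"]}
out = root / "batch_datasets.zip"
build_zip(root / "fp", root / "api", cfg, out)
names = set(zipfile.ZipFile(out).namelist())
```

The point of the split is visible in the result: `database_gen` now comes from the api tree instead of a copy kept inside functions-python.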

functions-python/batch_datasets/main_local_debug.py

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@
 
 # Usage:
 # - python batch_datasets/main_local_debug.py
-from src.main import batch_datasets
+from main import batch_datasets
 from dotenv import load_dotenv
 
 # Load environment variables from .env.local

functions-python/batch_datasets/src/main.py

Lines changed: 3 additions & 3 deletions
@@ -25,9 +25,9 @@
 from google.cloud.pubsub_v1.futures import Future
 from sqlalchemy import or_
 from sqlalchemy.orm import Session
-from database_gen.sqlacodegen_models import Gtfsfeed, Gtfsdataset
-from dataset_service.main import BatchExecutionService, BatchExecution
-from helpers.database import Database
+from shared.database_gen.sqlacodegen_models import Gtfsfeed, Gtfsdataset
+from shared.dataset_service.main import BatchExecutionService, BatchExecution
+from shared.helpers.database import Database
 
 pubsub_topic_name = os.getenv("PUBSUB_TOPIC_NAME")
 project_id = os.getenv("PROJECT_ID")
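The import changes above work because the symlinked `shared` directory sits next to `main.py` inside `src/`, making `shared` an ordinary package on the import path. A toy reproduction with invented package contents (the `Database` stub below is not the repo's class):

```python
import sys
import tempfile
from pathlib import Path

# Recreate the layout src/shared/helpers/database.py and import it
# the same way src/main.py does after the refactor.
src = Path(tempfile.mkdtemp()) / "src"
pkg = src / "shared" / "helpers"
pkg.mkdir(parents=True)
(src / "shared" / "__init__.py").touch()
(pkg / "__init__.py").touch()
(pkg / "database.py").write_text("class Database:\n    url = 'stub'\n")

sys.path.insert(0, str(src))  # src/ is the source root, as in deployment
from shared.helpers.database import Database
```

No path manipulation is needed in the deployed function: the zip places `shared/` beside `main.py`, so the same import resolves there too.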

functions-python/batch_datasets/tests/conftest.py

Lines changed: 6 additions & 2 deletions
@@ -17,8 +17,12 @@
 from faker import Faker
 from faker.generator import random
 from datetime import datetime
-from database_gen.sqlacodegen_models import Gtfsfeed, Gtfsrealtimefeed, Gtfsdataset
-from test_utils.database_utils import clean_testing_db, get_testing_session
+from shared.database_gen.sqlacodegen_models import (
+    Gtfsfeed,
+    Gtfsrealtimefeed,
+    Gtfsdataset,
+)
+from test_shared.test_utils.database_utils import clean_testing_db, get_testing_session
 
 
 def populate_database():

functions-python/batch_datasets/tests/test_batch_datasets_main.py

Lines changed: 9 additions & 7 deletions
@@ -18,8 +18,8 @@
 from unittest import mock
 import pytest
 from unittest.mock import Mock, patch, MagicMock
-from batch_datasets.src.main import get_non_deprecated_feeds, batch_datasets
-from test_utils.database_utils import get_testing_session, default_db_url
+from main import get_non_deprecated_feeds, batch_datasets
+from test_shared.test_utils.database_utils import get_testing_session, default_db_url
 
 
 def test_get_non_deprecated_feeds():
@@ -39,17 +39,19 @@ def test_get_non_deprecated_feeds():
         "FEEDS_LIMIT": "5",
     },
 )
-@patch("batch_datasets.src.main.publish")
-@patch("batch_datasets.src.main.get_pubsub_client")
+@patch("main.publish")
+@patch("main.get_pubsub_client")
 def test_batch_datasets(mock_client, mock_publish):
     mock_client.return_value = MagicMock()
     with get_testing_session() as session:
         feeds = get_non_deprecated_feeds(session)
         with patch(
-            "dataset_service.main.BatchExecutionService.__init__", return_value=None
+            "shared.dataset_service.main.BatchExecutionService.__init__",
+            return_value=None,
         ):
             with patch(
-                "dataset_service.main.BatchExecutionService.save", return_value=None
+                "shared.dataset_service.main.BatchExecutionService.save",
+                return_value=None,
             ):
                 batch_datasets(Mock())
                 assert mock_publish.call_count == 5
@@ -64,7 +66,7 @@ def test_batch_datasets(mock_client, mock_publish):
     ]
 
 
-@patch("batch_datasets.src.main.Database")
+@patch("main.Database")
 def test_batch_datasets_exception(database_mock):
     exception_message = "Failure occurred"
     mock_session = MagicMock()
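The patch-target edits above follow `unittest.mock.patch`'s rule: patch the name where it is looked up, which changed once the tests import `main` directly instead of `batch_datasets.src.main`. A minimal standalone illustration with modules built on the fly (`svc` and `mainmod` are invented names, not the repo's):

```python
import sys
import types
from unittest.mock import patch

# Tiny module graph: `mainmod` does the equivalent of `from svc import publish`.
svc = types.ModuleType("svc")
svc.publish = lambda: "real"
sys.modules["svc"] = svc

mainmod = types.ModuleType("mainmod")
mainmod.publish = svc.publish           # the imported binding lives in mainmod
mainmod.run = lambda: mainmod.publish()
sys.modules["mainmod"] = mainmod

# Patching "mainmod.publish" -- the name the code under test looks up --
# is what takes effect; patching "svc.publish" would not touch mainmod.run.
with patch("mainmod.publish", return_value="mocked"):
    patched = mainmod.run()
unpatched = mainmod.run()
```

This is why the decorators became `@patch("main.publish")` once the test module imports `from main import ...`: the target string must name the module whose attribute the code under test reads at call time.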

functions-python/batch_process_dataset/function_config.json

Lines changed: 2 additions & 1 deletion
@@ -5,7 +5,8 @@
   "timeout": 540,
   "memory": "2Gi",
   "trigger_http": true,
-  "include_folders": ["database_gen", "helpers", "dataset_service"],
+  "include_folders": ["helpers", "dataset_service"],
+  "include_api_folders": ["database_gen"],
   "secret_environment_variables": [
     {
       "key": "FEEDS_DATABASE_URL"

0 commit comments