docs/GettingStarted/Environment.md (27 additions & 6 deletions)
@@ -7,16 +7,19 @@ description: How to set up environment variables for DevLake
This document explains how to set environment variables for Apache DevLake and what environment variables can be set.
## Environment Variables
### ENABLE_SUBTASKS_BY_DEFAULT
This environment variable is used to enable or disable the execution of subtasks.
#### How to set
The value is a comma-separated list of `plugin_name:subtask_name:enabled_value` entries: `plugin_name1:subtask_name1:enabled_value,plugin_name2:subtask_name2:enabled_value,plugin_name3:subtask_name3:enabled_value`

Guidance on locating the [plugin_name and subtask_name](https://github.com/apache/incubator-devlake/blob/release-v1.0/backend/plugins/jira/tasks/issue_changelog_collector.go#L41):
- plugin_name: Represents the plugin's name, such as 'jira' for the Jira plugin.
- subtask_name: Denotes the subtask's name, like 'collectIssueChangelogs' for the Jira plugin.

Example 1: Enable some subtasks that are disabled by default
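A minimal sketch of such a setting, assuming `true` is an accepted `enabled_value` and that `extractIssueChangelogs` is a Jira subtask that is disabled by default. Both are assumptions for illustration; only `jira` and `collectIssueChangelogs` are named above. The helper `is_valid_subtask_list` is hypothetical and checks only the `plugin_name:subtask_name:enabled_value` shape, not whether the names exist.

```shell
# Assumptions: 'true' as enabled_value and the subtask name
# 'extractIssueChangelogs' are illustrative, not confirmed by the text above.
ENABLE_SUBTASKS_BY_DEFAULT="jira:collectIssueChangelogs:true,jira:extractIssueChangelogs:true"
export ENABLE_SUBTASKS_BY_DEFAULT

# Hypothetical helper: succeeds only if every comma-separated entry has
# exactly three non-empty, colon-separated fields.
is_valid_subtask_list() {
  printf '%s\n' "$1" |
    grep -Eq '^[^:,]+:[^:,]+:[^:,]+(,[^:,]+:[^:,]+:[^:,]+)*$'
}

if is_valid_subtask_list "$ENABLE_SUBTASKS_BY_DEFAULT"; then
  echo "format ok"
else
  echo "format error" >&2
fi
```

Checking the shape before restarting the service may catch a missing field or a stray colon early.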
| GITHUB_GRAPHQL_JOB_COLLECTION_MODE | Specifies the mode of job collection. Possible values are `BATCHING` and `PAGINATING`. | `BATCHING` |
| GITHUB_GRAPHQL_JOB_BATCHING_INPUT_STEP | Defines the step size for batching mode. | `10` |
| GITHUB_GRAPHQL_JOB_BATCHING_PAGE_SIZE | Defines the limit of jobs to collect in a batch for each run. | `20` |
| GITHUB_GRAPHQL_JOB_PAGINATING_PAGE_SIZE | Defines the page size for paginating mode. | `50` |
#### When to Use
These environment variables are particularly useful for large repositories with a significant number of job runs. Adjusting them lets you tune data collection to your needs and infrastructure capabilities, and can help avoid timeouts on the GitHub GraphQL API caused by overly large requests.
- Use `BATCHING` for `GITHUB_GRAPHQL_JOB_COLLECTION_MODE` when your workflow runs typically have fewer than 20 jobs and you want to minimize the number of API calls to GitHub.
- Adjust `GITHUB_GRAPHQL_JOB_BATCHING_INPUT_STEP` and `GITHUB_GRAPHQL_JOB_BATCHING_PAGE_SIZE` to control how many jobs are collected in each batch. **NOTE:** Increasing these values can lead to timeouts if the requests become too large.
- Use `PAGINATING` for `GITHUB_GRAPHQL_JOB_COLLECTION_MODE` when your workflow runs have a large number of jobs (e.g., more than 50). This mode queries only one workflow run at a time and paginates through its jobs, reducing the risk of timeouts.
- Adjust `GITHUB_GRAPHQL_JOB_PAGINATING_PAGE_SIZE` to control how many jobs are fetched per page. A smaller page size can help avoid timeouts but may increase the total number of API calls.

TL;DR: `BATCHING` is more efficient for smaller workflows, while `PAGINATING` guarantees complete collection of jobs for larger workflows.
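The guidance above can be sketched as a small shell helper. This is an illustration under stated assumptions, not DevLake's own logic: `pick_job_collection_mode` is a hypothetical function, and the text only recommends `BATCHING` below 20 jobs per run and `PAGINATING` for large runs (e.g., more than 50), so treating everything from 20 jobs upward as `PAGINATING` is an assumption.

```shell
# Hypothetical helper: choose a collection mode from the typical number of
# jobs per workflow run. The cut-off at 20 follows the BATCHING guidance;
# applying PAGINATING to the whole 20+ range is an assumption.
pick_job_collection_mode() {
  if [ "$1" -lt 20 ]; then
    echo BATCHING
  else
    echo PAGINATING
  fi
}

# Example: runs with about 80 jobs each.
GITHUB_GRAPHQL_JOB_COLLECTION_MODE=$(pick_job_collection_mode 80)
export GITHUB_GRAPHQL_JOB_COLLECTION_MODE

# Page size for paginating mode, using the value from the table above.
export GITHUB_GRAPHQL_JOB_PAGINATING_PAGE_SIZE=50

echo "$GITHUB_GRAPHQL_JOB_COLLECTION_MODE"
```

How these variables reach the DevLake process depends on your deployment; exporting them in a shell is only one way to express the values.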
## How to make changes take effect
After setting an environment variable, restart the DevLake service for the change to take effect.
- For Docker Compose, run `docker-compose down` and `docker-compose up -d`.
- For Helm, run `helm upgrade devlake devlake/devlake --recreate-pods`.