Commit 5b69e3a

[DevOps] ADR005: Development Lifecycle (#192)
* [DevOps] ADR005: Development Lifecycle
* Add notes on spec files
1 parent 243c877 commit 5b69e3a

1 file changed

+102
-0
lines changed

docs/adrs/005-release-strategy.md

Lines changed: 102 additions & 0 deletions
@@ -0,0 +1,102 @@
1+
# Development Lifecycle
2+
3+
Status: In-Progress.
4+
5+
## Background
6+
7+
The AI SDKs largely depend on the AI Core service, which currently is released roughly every 2 weeks.
8+
The release process itself (from the first integration landscape to the last production landscape) also takes around 2 weeks.
9+
10+
This raises the question of when to release new versions of the SDKs.
11+
12+
## Development
13+
14+
For any AI Core functionality not yet publicly released, the following options can be considered for the AI SDKs:
15+
16+
- Allow no code
17+
- Allow test code
18+
- Allow (released) internal code
19+
- Allow (released) public API
20+
21+
Note that for JavaScript and Python, code can be marked as internal and thus be considered _non-public_.
22+
For Java, the options are more limited: manually written code can be marked package-private, but generated code cannot.
23+
Since most new features will rely on generated code, the options for Java are limited.
24+
25+
## Testing
26+
27+
Testing can be performed against multiple landscapes and can vary based on the type of tests.
28+
29+
- Unit tests may use test data from any landscape (including development landscapes), obtained by manual testing with a tool such as Bruno.
30+
- E2E tests may use canary or production landscapes.
31+
In case of multiple landscapes:
32+
- GitHub matrix builds can be used to easily test against multiple landscapes
33+
- For any differences between landscapes, test toggles need to be considered (e.g. `@EnabledIfSystemProperty`)
34+
- Such toggles come with a bit of additional maintenance cost, as they need to be removed once the feature is released to all landscapes
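
As an illustration of such a test toggle, the following is a minimal Python sketch, roughly analogous to the Java `@EnabledIfSystemProperty` annotation mentioned above. The environment variable `AICORE_LANDSCAPE`, the landscape identifiers, and the feature name are hypothetical and only serve the example; a matrix job would set the variable once per landscape.

```python
import os
import unittest

# Hypothetical environment variable naming the landscape under test.
LANDSCAPE_ENV = "AICORE_LANDSCAPE"

# Hypothetical toggle table: feature name -> landscapes where it is already live.
# Entries are removed once the feature has reached all landscapes.
FEATURE_LANDSCAPES = {
    "new-feature": {"canary-eu12"},
}


def feature_enabled(feature, landscape=None):
    """Return True if `feature` is expected to work on the given landscape.

    Falls back to the landscape named in the environment, then to "prod-eu10".
    """
    if landscape is None:
        landscape = os.environ.get(LANDSCAPE_ENV, "prod-eu10")
    return landscape in FEATURE_LANDSCAPES.get(feature, set())


class NewFeatureE2ETest(unittest.TestCase):
    def setUp(self):
        # Evaluate the toggle at run time so the same test class can be
        # executed against every landscape of a matrix build.
        if not feature_enabled("new-feature"):
            self.skipTest("new-feature is not released on this landscape yet")

    def test_new_feature(self):
        # Placeholder: a real E2E test would call the service here.
        self.assertTrue(feature_enabled("new-feature"))
```

Keeping all toggles in one table is a design choice for the sketch: it makes the cleanup described above a matter of deleting a single entry.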
35+
36+
## Release
37+
38+
For releasing new AI SDK versions, several points during the AI Core release process can be considered:
39+
40+
- Release to any canary landscape
41+
- Release to all canary landscapes
42+
- Release to any production landscape
43+
- Release to all production landscapes
44+
45+
Furthermore, the following boundary conditions need to be considered:
46+
47+
- Releases to canary are usually deployed to production after 1-2 weeks
48+
- Releases to canary are not guaranteed to reach production
49+
- Different landscapes (both canary and prod) may host different versions of AI Core or different feature-toggle configurations
50+
- Historically, production EU10 is the most feature-complete landscape
51+
52+
Finally, please note that with daily delivery the term _release_ still applies, but it typically refers to the enablement of a feature toggle rather than to a physical deployment.
53+
In that case, there would be more freedom in deciding when a feature toggle is enabled for which landscape, and the timelines can be much shorter than weeks.
54+
55+
## Decision
56+
57+
We decide as follows:
58+
59+
> 1. New AI SDK versions are released roughly every 2 weeks, shortly after new AI Core versions have been released to _production EU10_.
60+
> 2. Any _publicly available release_ of the AI SDKs must only contain _public API_ for AI Core features available in _production EU10_ under the service plan _extended_.
61+
> 3. E2E tests run against canary EU12 and production EU10 using test toggles.
62+
63+
Further explanations and notes:
64+
65+
- Any features released exclusively under the `sap-internal` plan are not supported.
66+
  Similarly, any features released only to specific landscapes (other than prod EU10) are not supported.
67+
- There will be no releases of the SDKs to the internal Artifactory.
68+
- Please note that the following is allowed:
69+
  - Public API in an unreleased SDK version for unreleased AI Core features.
70+
    Notably, this will **block** the release of the SDK until the AI Core feature is released publicly.
71+
  - Internal code in a released SDK version for unreleased AI Core features.
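
The release rule and its allowed exceptions can be sketched as a small check. This is only an illustration under stated assumptions: the `SdkChange` type, its fields, and the feature names are hypothetical and not part of any SDK.

```python
from dataclasses import dataclass
from enum import Enum


class Visibility(Enum):
    INTERNAL = "internal"
    PUBLIC = "public"


@dataclass(frozen=True)
class SdkChange:
    """One SDK change and the state of the AI Core feature it depends on (hypothetical model)."""
    name: str
    visibility: Visibility
    in_prod_eu10: bool      # feature is live in production EU10
    in_extended_plan: bool  # feature is offered under the `extended` service plan


def blocking_changes(changes):
    """Return the changes that block a public SDK release.

    Public API blocks the release unless its feature is available in
    production EU10 under the `extended` plan; internal code never blocks.
    """
    return [
        c for c in changes
        if c.visibility is Visibility.PUBLIC
        and not (c.in_prod_eu10 and c.in_extended_plan)
    ]


def can_release(changes):
    """A public SDK release is possible only if no change is blocking."""
    return not blocking_changes(changes)


# Usage: internal code for an unreleased feature is fine, public API is not.
released = SdkChange("feature-a", Visibility.PUBLIC, True, True)
internal_only = SdkChange("feature-b", Visibility.INTERNAL, False, False)
too_early = SdkChange("feature-c", Visibility.PUBLIC, False, False)
print(can_release([released, internal_only]))
print([c.name for c in blocking_changes([released, internal_only, too_early])])
```

A feature that is only available under the `sap-internal` plan would be modelled here with `in_extended_plan=False`, so public API for it blocks the release as well.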
72+
73+
### An Example Development Lifecycle Iteration
74+
75+
The following depicts a development flow where the AI SDK development steps are performed as soon as possible.
76+
77+
1. A new AI Core feature is being developed.
78+
2. A PR is raised on the AI SDK with a corresponding implementation that has not yet been E2E-tested.
79+
   - Potentially aided by unit tests based on test data manually copied from e.g. Bruno.
80+
   - Generated code is created from a development version of the relevant spec file.
81+
3. (2 weeks later) The feature is released to the EU12 canary landscape.
82+
4. The AI SDK PR is enhanced:
83+
   - With an updated spec file.
84+
   - With an E2E test against canary.
85+
5. The AI SDK PR is merged.
86+
6. (1 week later) The feature is released to the EU10 production landscape.
87+
7. The E2E test is updated to now also run against production.
88+
8. The AI SDK is released publicly.
89+
90+
In case of delays in the release process of AI Core:
91+
92+
- If step (3) is delayed, the open PR may be closed and re-opened later
93+
- If step (5) is delayed, we consider 3 options:
94+
  1. The PR is reverted, together with potentially other related PRs
95+
  2. The AI SDK release is delayed equally
96+
  3. The AI SDK is released anyway with exceptional PO approval
97+
98+
## Further Links
99+
100+
- The single source of truth for all landscapes is in [mlf-gitops](https://github.tools.sap/MLF-prod/mlf-gitops-prod)
101+
- In particular, we care about the [version of orchestration in Prod EU10](https://github.tools.sap/MLF-prod/mlf-gitops-prod/blob/aws.eu-central-1.prod-eu/current/services/llm-orchestration/source/Chart.yaml)
102+
- [This JIRA ticket](https://jira.tools.sap/browse/AI-44024) tracks releases
