| 1 | +# Pull Request FAQ |
| 2 | + |
| 3 | +The `pullrequest` pipeline is a single public definition that handles all `pull request` changes to an `azure-sdk-for-X` repository. This document is intended to answer some common questions that users may have about the `pullrequest` definition. |
| 4 | + |
| 5 | +## Can I get a bit more context first? |
| 6 | + |
| 7 | +Let's get some basic repo structure discussion out of the way. The `azure-sdk` team maintains a consistent repo structure for all packages that ship to package managers (npm, NuGet, PyPI, Maven, etc.). |
| 8 | + |
| 9 | +```jsonc |
| 10 | +sdk/ |
| 11 | + storage |
| 12 | + Azure.Storage.Blobs |
| 13 | + Azure.Storage.Queues |
| 14 | + ... |
| 15 | + <service> |
| 16 | + <service-package-1> |
| 17 | + .. |
| 18 | + <service-package-N> |
| 19 | + // the ci.yml is what AZDO build defs are based upon |
| 20 | + ci.yml |
| 21 | +``` |
| 22 | + |
| 23 | +This structure necessitates that release definitions exist on the [internal](https://dev.azure.com/azure-sdk/internal/) Azure DevOps project for each service in a repository. However, each build definition can only build and ship packages **within the service it was created for**. |
| 24 | + |
| 25 | +This service-directory structure also applied to the `public` build definitions that triggered on [pull requests](https://github.com/Azure/azure-sdk-for-python/pulls) in our repos. Because of this, large changesets that touched multiple service directories would incur a build for _every service directory that was touched_. The `azure-sdk` EngSys team calls this situation a `build storm`. |
| 26 | + |
| 27 | +The `<language> - pullrequest` definition entirely replaces the service-specific build definitions. It expands and contracts the set of packages targeted for build according to a **git diff** of the actual changes made. Because of this, any repo that has cut over to `pullrequest` no longer incurs build storms on large cross-cutting changes. While an individual build run may be very long running and batch up tests across many agents, it _will_ eventually complete. It will also be _impossible_ to exhaust GitHub or Azure DevOps token utilization, given that a single definition triggers the checks. |
| 28 | + |
| 29 | +The `pullrequest` pipeline is currently deployed in the following `azure-sdk` repositories: |
| 30 | + |
| 31 | +| Pipeline Def | Completed? | |
| 32 | +|---|---| |
| 33 | +| [Java](https://dev.azure.com/azure-sdk/public/_build?definitionId=7413) |❌| |
| 34 | +| [JS](https://dev.azure.com/azure-sdk/public/_build?definitionId=7140) |✅| |
| 35 | +| [.NET](https://dev.azure.com/azure-sdk/public/_build?definitionId=7327) |❌| |
| 36 | +| [Python](https://dev.azure.com/azure-sdk/public/_build?definitionId=7050) |✅| |
| 37 | +| [Rust](https://dev.azure.com/azure-sdk/public/_build?definitionId=7126) |✅| |
| 38 | + |
| 39 | +Only repos that appear in the above list are enabled with a single unified `pullrequest` pipeline. All other `azure-sdk` shipping repositories handle both releases and pull requests using one build definition per service directory. |
| 40 | + |
| 41 | +## Pullrequest pipeline order of operations |
| 42 | + |
| 43 | +- Generate a PR diff |
| 44 | +- Save Package Properties using the `diff` |
| 45 | +- Run `build` and `analyze` steps only against artifacts that come out of the package-properties folder |
| 46 | + - The primary change between service build and the pullrequest build is the scoping mechanism. For a service build, a specific service directory is examined for packages. For a pullrequest build, the entire repository is considered before being scoped down to only packages that were actually changed. |
| 47 | +- Tests are run against `indirect`ly and `direct`ly changed packages separately in batches. |
| 48 | + |
| 49 | +## What is a `direct` vs `indirect` change? |
| 50 | + |
| 51 | +- A `direct`ly changed package is one whose actual package code has changed. |
| 52 | + - An `indirect`ly changed package is one that has been added to verify a change to code that does not live within the package itself. |
| 53 | + - For example, in `java`, when the `eng/` package is changed, we trigger `azure-core` indirectly. |
| 54 | + |
| 55 | +## Why do I see jobs with `bX` or `ibY` suffixes? |
| 56 | + |
| 57 | +As mentioned above, direct and indirect packages are batched separately. Batching is best explained by the following pseudocode: |
| 58 | + |
| 59 | +``` |
| 60 | +batchSize = configurable number of packages in each test batch, defaults to 10 |
| 61 | +directPackages = the list of packages with directly changed code in the PR |
| 62 | + |
| 63 | +group the direct packages by matrix configuration |
| 64 | + - for each matrix configuration |
| 65 | + - split its packages into batches of batchSize |
| 66 | + - assign the matrix to each full batch |
| 67 | + - if multiple batches exist, add a batch suffix |
| 68 | +``` |
| 69 | + |
| 70 | +Notice that packages are grouped initially by _the matrix associated with their ci.yml_. In the `pullrequest` pipeline, the service directory of a package no longer matters, only what matrix it belongs to. |
| 71 | + |
| 72 | +`indirect` batching works the same way, but doesn't use the _full_ test matrix by default. It instead deterministically selects a single item from the resolved test matrix and assigns the batch of packages to it. |
| 73 | + |
| 74 | +The suffixes `b1` or `ib1` are added automatically as needed by the pull request job [matrix creation](https://github.com/Azure/azure-sdk-tools/blob/main/eng/common/scripts/job-matrix/Create-PrJobMatrix.ps1) script. |
| 75 | + |
| 76 | +## Can I disable this matrix batching? |
| 77 | + |
| 78 | +Yes! Users can entirely disable batching for a specific matrix by setting `PRBatching` to `false` in the matrix configuration. |
| 79 | + |
| 80 | +Example: |
| 81 | + |
| 82 | +```yml |
| 83 | +MatrixConfigs: |
| 84 | + - Name: version_overrides_tests |
| 85 | + Path: sdk/core/version-overrides-matrix.json |
| 86 | + Selection: all |
| 87 | + PRBatching: false # the new key |
| 88 | + GenerateVMJobs: true |
| 89 | +``` |
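
For readers who prefer runnable code, the batching pseudocode from the "Why do I see jobs with `bX` or `ibY` suffixes?" section can be sketched in Python. This is only an illustration under stated assumptions, not the logic of `Create-PrJobMatrix.ps1`: the function name `batch_packages`, the `(package, matrix)` tuple shape, and the rule that a suffix is omitted when a matrix has a single batch are all hypothetical. The sketch also does not model how an indirect batch is assigned to a single item of the resolved matrix.

```python
from collections import defaultdict


def batch_packages(packages, batch_size=10, indirect=False):
    """Group (package_name, matrix_name) pairs into per-matrix test batches.

    Returns a list of (matrix_name, suffix, package_names) tuples.
    Assumption: the suffix is empty when a matrix needs only one batch.
    """
    # Group by matrix configuration; the service directory plays no role.
    by_matrix = defaultdict(list)
    for name, matrix in packages:
        by_matrix[matrix].append(name)

    # Hypothetical naming: "b" for direct batches, "ib" for indirect ones.
    prefix = "ib" if indirect else "b"
    jobs = []
    for matrix, names in by_matrix.items():
        # Split this matrix's packages into chunks of at most batch_size.
        batches = [
            names[i:i + batch_size] for i in range(0, len(names), batch_size)
        ]
        for index, batch in enumerate(batches, start=1):
            suffix = f"{prefix}{index}" if len(batches) > 1 else ""
            jobs.append((matrix, suffix, batch))
    return jobs
```

Under these assumptions, twelve directly changed packages that share one matrix would produce two jobs, suffixed `b1` (ten packages) and `b2` (two packages), while a lone package on another matrix would get a single job with no suffix.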