Commit 38216a5

jaimergp, beckermr, and CJ-Wright committed
add blog/2020-07-11-R-4.md
Co-authored-by: beckermr <[email protected]>
Co-authored-by: cj-wright <[email protected]>
1 parent 535f206 commit 38216a5

1 file changed: +114 -0 lines changed

blog/2020-07-11-R-4.md

Lines changed: 114 additions & 0 deletions
@@ -0,0 +1,114 @@
---
authors:
- cj-wright
- beckermr
tags: [scipy]
---

# R 4.0 Migration Retrospective

While the R 4.0 migration has been functionally complete for quite a
while, the recent migration of `r-java` and its dependents gives a good
opportunity to write a retrospective on the technical issues with
large-scale migrations in `conda-forge` and how we solved them.

The R 4.0 migration rebuilt every package in `conda-forge` that had
`r-base` as a requirement, which amounted to more than 2200 feedstocks.
A migration of this size in `conda-forge` faces several hurdles. First,
since every feedstock is a separate GitHub repository, one needs to
merge more than 2200 pull requests (PRs). Second, `conda-forge`'s
packages on `anaconda.org` are behind a CDN (content delivery network).
This service reduces web hosting costs for Anaconda Inc. but introduces
a delay of approximately 30 minutes between when a package is uploaded
to `anaconda.org` and when it appears as available using `conda` from
the command line. Thus, even if the dependencies of a package have been
built, we have to wait until they appear on the CDN before we can
successfully issue the next PR and have it build correctly. Finally, the
existing bot and `conda` infrastructure limited the throughput of the
migrations, due in part to the speed of the `conda` solver.

Given the size of the R 4.0 migration, we took this opportunity to try
out a bunch of new technology to speed up large-scale migrations. The
main enhancements were using GitHub Actions to automerge PRs, using
`mamba` to quickly check for solvability of package environments, and
enabling long-running migration jobs for the autotick bot. All told, the
bulk of the feedstocks for R 4.0 were rebuilt in less than a week, with
many PRs being merged within 30 minutes of being issued. These
enhancements to the autotick bot and `conda-forge` infrastructure can be
used to speed up future migrations (e.g., Python 3.9) and reduce
maintenance burdens for feedstocks.

## Automerging conda-forge PRs

In a typical migration on `conda-forge`, we issue a PR to a feedstock
and then ask the feedstock maintainers to make sure it passes and merge
it. In the case of the R 4.0 migration, the maintainers of R packages on
`conda-forge` use a maintenance team (i.e., `@conda-forge/r`) on the
vast majority of feedstocks. This team is small, so merging over 2000
PRs by hand is a big undertaking. Thus, with their permission, we added
the `conda-forge` automerge functionality to all R feedstocks that they
maintain. The automerge bot, which relies on GitHub Actions, is able to
automatically merge any PR from the autotick bot that passes the recipe
linter, passes the continuous integration services, and has the special
`[bot automerge]` slug in the PR title. This feature removed the
bottleneck of waiting for maintainers to merge PRs and reduced the
maintenance burden on the R maintenance team.
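
The merge conditions listed above can be sketched as a single predicate.
This is only an illustration, not the automerge bot's actual code: the
`PullRequest` class, its fields, and the `BOT_LOGIN` value are
assumptions made for the example, and a real implementation would read
this information from the GitHub API.

```python
from dataclasses import dataclass, field
from typing import Dict

AUTOMERGE_SLUG = "[bot automerge]"
BOT_LOGIN = "regro-cf-autotick-bot"  # assumed login of the autotick bot


@dataclass
class PullRequest:
    """Hypothetical, simplified view of a PR (not the GitHub API schema)."""

    title: str
    author: str
    linter_passed: bool
    # Maps a CI service name to whether its run passed.
    ci_results: Dict[str, bool] = field(default_factory=dict)


def should_automerge(pr: PullRequest) -> bool:
    """Return True only if every condition named in the text holds."""
    return (
        pr.author == BOT_LOGIN
        and AUTOMERGE_SLUG in pr.title
        and pr.linter_passed
        and bool(pr.ci_results)  # require at least one reported CI result
        and all(pr.ci_results.values())
    )


pr = PullRequest(
    title="[bot automerge] Rebuild for R 4.0",
    author=BOT_LOGIN,
    linter_passed=True,
    ci_results={"azure": True, "travis": True},
)
print(should_automerge(pr))
```

Requiring at least one CI result is a choice made for this sketch, so
that a PR whose checks have not reported yet is not merged by accident.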

## Checking Solvability with mamba

While being able to automatically merge PRs removed much of the work of
performing the R 4.0 migration, it relied on the PR building correctly
the first time it was issued. Due to the CDN delays and the build times
of a package's dependencies, the dependencies of a package may not be
immediately available after all of their migration PRs are merged. If
the bot issued the package's migration PR before its dependencies were
available, the PR would fail with an unsolvable environment and have to
be restarted manually. This failure would negate the benefits of using
automerge in the first place.

To control for this edge case, we employed the `mamba` package to check
for the solvability of a PR's environments before the PR was issued.
`mamba` is a fast alternative to `conda` that produces solutions for
environments orders of magnitude more quickly. Since we have to perform
our checks of PR environments many times, an extremely fast solver was
essential for making the code efficient enough to run as part of the
autotick bot. We ended up using `mamba` to try to install the
dependencies for every variant produced by the feedstock to be migrated.
With this check in place, the autotick bot was able to issue migration
PRs that passed on the first try and were thus automatically merged,
many within 30 minutes.
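
A rough sketch of such a per-variant check is shown below. It is not
the autotick bot's implementation: it shells out to the `mamba` command
line with `create --dry-run` and assumes that a non-zero exit code means
the environment could not be solved. The function names, the
`solvability-check` environment name, and the variant mapping are all
hypothetical.

```python
import subprocess
from typing import Callable, List, Mapping, Sequence


def build_dry_run_command(
    specs: Sequence[str], channels: Sequence[str] = ("conda-forge",)
) -> List[str]:
    """Build a `mamba create --dry-run` command for the given specs."""
    cmd = ["mamba", "create", "--dry-run", "--name", "solvability-check"]
    for channel in channels:
        cmd += ["--channel", channel]
    return cmd + list(specs)


def is_solvable(specs: Sequence[str], run: Callable = subprocess.run) -> bool:
    """Assume exit code 0 means the solver found a solution."""
    result = run(build_dry_run_command(specs), capture_output=True, text=True)
    return result.returncode == 0


def all_variants_solvable(
    variants: Mapping[str, Sequence[str]], run: Callable = subprocess.run
) -> bool:
    """Issue the PR only if every variant's requirements can be solved.

    `variants` maps a hypothetical variant name (e.g. "linux_64") to the
    list of requirement specs for that variant.
    """
    return bool(variants) and all(
        is_solvable(specs, run) for specs in variants.values()
    )
```

A caller would hold back the migration PR whenever
`all_variants_solvable` returns `False` and retry on a later run, once
the missing dependencies have appeared on the CDN. The `run` parameter
exists only so the check can be exercised without `mamba` installed.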

## Improving the Autotick Bot's Efficiency

Finally, we made several upgrades to the autotick bot infrastructure to
increase the uptime of the bot and its efficiency. First, we moved from
an hourly cron job to a set of chained CI jobs. This change eliminated
downtime between the runs of the bot. Second, we started to refactor the
autotick bot from one monolithic piece of code into a distributed set of
microservices which perform various independent tasks in parallel. These
independent tasks, used for things like checking the statuses of
previously issued PRs, are run separately, allowing the bot to spend
more time issuing PRs. Finally, we optimized the internal prioritization
of the PRs to make sure the bot was spending more time on larger
migrations where there is more work to do. More work on the autotick bot
infrastructure, including work done by Vinicius Cerutti as part of the
Google Summer of Code program, will further streamline the bot's
operation.
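
One way to picture the prioritization idea is to split a fixed number of
PRs per bot run across the active migrations in proportion to how many
feedstocks each still has to rebuild. The sketch below is purely
illustrative: the function, the per-run budget, and the migration names
are assumptions, and the bot's real prioritization logic may differ.

```python
from typing import Dict


def allocate_pr_budget(remaining: Dict[str, int], budget: int) -> Dict[str, int]:
    """Split `budget` PRs across migrations in proportion to remaining work.

    `remaining` maps a migration name to its number of unmigrated
    feedstocks. No migration receives more PRs than it has feedstocks left.
    """
    total = sum(remaining.values())
    if total == 0 or budget <= 0:
        return {name: 0 for name in remaining}

    # Proportional share, rounded down and capped at the remaining work.
    shares = {
        name: min(count, budget * count // total)
        for name, count in remaining.items()
    }

    # Hand out what rounding left over, largest outstanding workload first.
    leftover = budget - sum(shares.values())
    by_outstanding = sorted(
        remaining, key=lambda name: remaining[name] - shares[name], reverse=True
    )
    for name in by_outstanding:
        if leftover <= 0:
            break
        if shares[name] < remaining[name]:
            shares[name] += 1
            leftover -= 1
    return shares


print(allocate_pr_budget({"r40": 2200, "python39": 300, "small": 10}, 50))
# {'r40': 44, 'python39': 6, 'small': 0}
```

A purely proportional split can starve very small migrations, as the
`small` entry shows, so a real scheduler would likely add a minimum
share per migration.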

Despite some initial hiccups with the bot infrastructure, the migration
ran quite smoothly for an endeavor of its size. The vast majority of
migration PRs were completed within a week from when we started, which
is a first for a migration of this size on `conda-forge`. The largest
issue was solved recently, with the fixing of the `openjdk` recipe and
the removal of `aarch64` and `ppc64le` builds from `r-java`, enabling
the last large piece of the R ecosystem to be updated.

Looking forward, the improvements we made for the R 4.0 migration seem
broadly applicable to other migration tasks, including the yearly Python
minor version bump. These kinds of large-scale migrations are
particularly suitable for this approach, since they usually involve few
changes to the feedstock itself and usually fail on CI when a broken
package would be produced. Faster migrations will help to provide the
latest features to downstream users and keep transition times to a
minimum, helping to foster greater stability of the ecosystem and the
seamless experience users have come to expect from `conda-forge`.
