Skip to content

Commit 99590a9

Browse files
authored
First draft
1 parent 5cb6eed commit 99590a9

File tree

1 file changed

+85
-0
lines changed

1 file changed

+85
-0
lines changed
Lines changed: 85 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,85 @@
1+
### Title
2+
Diagnose and improve performance for the Astropy package
3+
4+
### Project Team
5+
Manodeep Sinha, ...
6+
7+
### Project Description
8+
9+
There are currently 63 open issues tagged with the [Performance label](https://github.com/astropy/astropy/issues?q=is%3Aissue+is%3Aopen+label%3APerformance),
10+
with the oldest issue nearly 8 years old - highlighting that dedicated effort might be necessary to reduce the
11+
number of open issues. Even within the [Astropy roadmap](https://github.com/astropy/astropy-project/blob/main/roadmap/roadmap.md),
12+
`Hardware and Performance` are tagged yellow and red, once again showing that dedicated effort is required to improve
13+
how Astropy performs. This project aims to identify and mitigate high-impact performance issues within the Astropy codebase.
14+
15+
### Project / Work
16+
17+
The proposed project will consist of these two aspects:
18+
19+
- Performance within python code:
20+
There are odd things that can happen with performance. For example, during PyAstro 2018,
21+
we found out that left-multiplying by units was significantly faster than (the usual convention)
22+
right-multiply, i.e., `u.rad * x` was faster than `x * u.rad`. This would involve identifying
23+
alternate python code that speed up existing python code while remaining 100% backwards compatible.
24+
Micro-benchmarks would need to be added to the [benchmarking CI](https://github.com/astropy/astropy/pull/15779)
25+
to catch regressions in the future.
26+
27+
- C code review:
28+
The astropy source repo has about 30% C code. A quick look shows that most of
29+
the C code is written to be portable and not necessarily for performance. While this was a
30+
good choice previously, the compilers and the supported OS's have improved sufficiently that
31+
C code can now be written to be both *portable* and *performant*. For example, [this line](https://github.com/astropy/astropy/blob/47a8dac8f1dab6ce592be6048f70140adead4e1d/astropy/stats/src/compute_bounds.c#L23)
32+
could easily be vectorised by adding a `#pragma omp simd reduction(+:mean)`, and
33+
assuming that the compiler is not auto-vectorising magically, that change might lead to
34+
a dramatic speed boost, certainly for large N. There might be easy performance gains like
35+
these throughout the C code base. In the ~5 mins that I spent looking at the C code, I have
36+
noticed another performance issue that I have reported as a new [issue](https://github.com/astropy/astropy/issues/16136.
37+
38+
As an ancillary benefit, I am also happy to be tagged as a C code reviewer, particularly
39+
for code changes that (might) affect performance. I will note that such performance improvements
40+
works nicely with the [CI Benchmarks](https://github.com/astropy/astropy/pull/15779) PR
41+
from @MridulS and the performance improvements from @neutrinoceros.
42+
43+
I am saying performance optimisation in the broad sense, and that includes reducing memory requirements,
44+
e.g., issues like [this reported memory exhaustion](https://github.com/astropy/astropy/issues/14002).
45+
46+
##### What I am unsure about
47+
How to go about identifying which performance issues, when solved, will make the biggest impact
48+
for the Astropy end-users. For example, this could be identifying ones from the existing issues
49+
tagged with the performance label. Or are there potential issues that are as yet unknown (convolution,
50+
sky matching queries etc) that can have a major impact?
51+
52+
In the second case, it may be that even creating micro-benchmarks for computationally expensive
53+
functions like convolution/match-coords are useful additions to the benchmark CI jobs.
54+
55+
### Approximate Budget
56+
I have guess-timated 24 hours as the time required to solve one
57+
performance issue, broken down as 16 hours (i.e., 2 work-days) for finding
58+
potential solutions, 6 hours for response to feedback + iterating + getting
59+
to optimal solution by consensus, and finally 2 hours to get PR approved and
60+
merged.
61+
62+
I have tentatively estimated 20 hrs/yr to review existing (15 hrs) and new (5 hrs)
63+
C code for performance issues.
64+
65+
```
66+
Proposed budget: Two known performance issues + two unknown
67+
Salary (performance): 4 issues/yr * 24 hrs/issue * USD 150/hr = USD 14.4k/yr
68+
Salary (C code review): 20 hrs/yr * USD 150/hr = USD 3k/yr
69+
Travel to USA: International airfare + lodging + incidentals for 1 week = USD 5k/yr
70+
Partial salary during travel: 1 week/yr * 20 hrs/week * USD 150/hr = USD 3k/yr
71+
Total: USD 25.4k/yr
72+
```
73+
74+
```
75+
Minimum feasible budget: One known performance issue + one unknown
76+
Salary (performance issues): 2 issues/yr * 24 hrs/issue * USD 150/hr = USD 7.2k/yr
77+
Salary (C code review): 20 hrs/yr * USD 150/hr = USD 3k/yr
78+
Total: USD 10.2k/yr
79+
```
80+
81+
I am happy to work through and iterate on the budget as deemed necessary.
82+
83+
### Period of Performance
84+
85+
This project is proposed for 3 years (subject to funding availability), starting late-Apr/early-May, 2024.

0 commit comments

Comments
 (0)