Commit ebf3b44

Merge pull request #22 from JuliaAstro/multithreading
Initial implementation for OhMyThreads extension.
2 parents 8f0e14a + 2fc1712 commit ebf3b44

19 files changed: +1006 -18 lines changed

Project.toml

Lines changed: 9 additions & 6 deletions
@@ -1,7 +1,10 @@
 name = "SolarPosition"
 uuid = "5b9d1343-a731-5a90-8730-7bf8d89bf3eb"
-authors = ["Stefan de Lange"]
 version = "0.1.0"
+authors = ["Stefan de Lange"]
+
+[workspace]
+projects = ["test", "docs"]
 
 [deps]
 Dates = "ade2ca70-3891-5945-98fb-dc099432e06a"
@@ -13,23 +16,26 @@ TimeZones = "f269a46b-ccf7-5d73-abea-4c690281aa53"
 [weakdeps]
 Makie = "ee78f7c6-11fb-53f2-987a-cfe4a2b5a57a"
 ModelingToolkit = "961ee093-0014-501f-94e3-6117800e7a78"
+OhMyThreads = "67456a42-1dca-4109-a031-0a68de7e3ad5"
 Symbolics = "0c5d862f-8b57-4792-8d23-62f2024744c7"
 
 [extensions]
 SolarPositionMakieExt = "Makie"
 SolarPositionModelingToolkitExt = ["ModelingToolkit", "Symbolics"]
+SolarPositionOhMyThreadsExt = "OhMyThreads"
 
 [compat]
 Aqua = "0.8"
 Dates = "1"
 DocStringExtensions = "0.8, 0.9"
 Makie = "0.24"
 ModelingToolkit = "10.3.0 - 10.26.1"
-StructArrays = "0.6, 0.7"
+OhMyThreads = "0.8"
+StructArrays = "0.7"
+Symbolics = "6,7"
 Tables = "1"
 Test = "1"
 TimeZones = "1.22.0"
-Symbolics = "6,7"
 julia = "1.10"
 
 [extras]
@@ -38,6 +44,3 @@ Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"
 
 [targets]
 test = ["Aqua", "Test"]
-
-[workspace]
-projects = ["test", "docs"]
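
The `[weakdeps]` and `[extensions]` entries in this file are what make the OhMyThreads integration a package extension: Julia loads `SolarPositionOhMyThreadsExt` only once its trigger package is loaded. Below is a minimal sketch of how each extension maps to its triggers, using only the `TOML` standard library and an excerpt of the tables from this diff. The helper `triggers_of` and the variable names are illustrative, not part of the commit.

```julia
using TOML

# Excerpt of the [weakdeps] and [extensions] tables from the diff above.
project = TOML.parse("""
[weakdeps]
Makie = "ee78f7c6-11fb-53f2-987a-cfe4a2b5a57a"
ModelingToolkit = "961ee093-0014-501f-94e3-6117800e7a78"
OhMyThreads = "67456a42-1dca-4109-a031-0a68de7e3ad5"
Symbolics = "0c5d862f-8b57-4792-8d23-62f2024744c7"

[extensions]
SolarPositionMakieExt = "Makie"
SolarPositionModelingToolkitExt = ["ModelingToolkit", "Symbolics"]
SolarPositionOhMyThreadsExt = "OhMyThreads"
""")

# An extension's trigger is either a single package name or a list of names.
triggers_of(entry) = entry isa AbstractString ? [String(entry)] : collect(String, entry)

extension_triggers = Dict(name => triggers_of(entry) for (name, entry) in project["extensions"])

for name in sort(collect(keys(extension_triggers)))
    println(name, " loads with: ", join(extension_triggers[name], ", "))
end

# Every trigger package must also be declared as a weak dependency.
all_declared = all(t -> haskey(project["weakdeps"], t),
                   Iterators.flatten(values(extension_triggers)))
println("all triggers declared in [weakdeps]: ", all_declared)
```

In a session where both packages are installed, `Base.get_extension(SolarPosition, :SolarPositionOhMyThreadsExt)` should return the extension module after `using OhMyThreads`, and `nothing` before.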

docs/Project.toml

Lines changed: 5 additions & 0 deletions
@@ -7,10 +7,15 @@ Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
 DocumenterCitations = "daee34ce-89f3-4625-b898-19384cb65244"
 LiveServer = "16fef848-5104-11e9-1b77-fb7a48bbb589"
 ModelingToolkit = "961ee093-0014-501f-94e3-6117800e7a78"
+OhMyThreads = "67456a42-1dca-4109-a031-0a68de7e3ad5"
 OrdinaryDiffEq = "1dea7af3-3e70-54e6-95c3-0bf5283fa5ed"
 SolarPosition = "5b9d1343-a731-5a90-8730-7bf8d89bf3eb"
+StructArrays = "09ab397b-f2b6-538f-b94a-2f83cf4a842a"
 TimeZones = "f269a46b-ccf7-5d73-abea-4c690281aa53"
 
+[sources]
+SolarPosition = {path = ".."}
+
 [compat]
 Documenter = "1"
 DocumenterCitations = "1"

docs/make.jl

Lines changed: 1 addition & 0 deletions
@@ -37,6 +37,7 @@ makedocs(;
         "Examples" => [
             "examples/getting-started.md",
             "examples/plotting.md",
+            "examples/parallel.md",
             "examples/modelingtoolkit.md",
         ],
         "reference.md",

docs/src/examples/parallel.md

Lines changed: 269 additions & 0 deletions
@@ -0,0 +1,269 @@
+# Parallel Computing with OhMyThreads.jl
+
+SolarPosition.jl provides a parallel computing extension using [`OhMyThreads.jl`](https://github.com/JuliaFolds2/OhMyThreads.jl)
+for efficient multithreaded solar position calculations across large time series. This
+extension is particularly useful when processing thousands of timestamps, where
+parallelization can provide significant speedups.
+
+## Installation
+
+The OhMyThreads extension is loaded automatically when both [`SolarPosition.jl`](https://github.com/JuliaAstro/SolarPosition.jl) and [`OhMyThreads.jl`](https://github.com/JuliaFolds2/OhMyThreads.jl)
+are loaded:
+
+```julia
+using SolarPosition
+using OhMyThreads
+```
+
+!!! note "Thread Configuration"
+    Julia must be started with multiple threads to benefit from parallelization. Use
+    `julia --threads=auto` or set the `JULIA_NUM_THREADS` environment variable. Check
+    the number of available threads with `Threads.nthreads()`.
+
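
The thread-configuration note can be turned into an early check at the top of a script. This is a minimal Base-only sketch; the message wording is illustrative.

```julia
# Number of threads available to this Julia session.
nthreads = Threads.nthreads()

if nthreads == 1
    # A single-threaded session runs the scheduler-based methods without any speedup.
    @warn "Julia is running with one thread; start it with `julia --threads=auto` or set JULIA_NUM_THREADS."
else
    println("Running with $nthreads threads")
end
```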
+## Quick Start
+
+The extension adds new methods to [`solar_position`](@ref) and [`solar_position!`](@ref)
+that accept an `OhMyThreads.Scheduler` as the last argument. These methods automatically
+parallelize computations across the provided timestamp vector.
+
+```@example parallel
+using SolarPosition
+using OhMyThreads
+using Dates
+using StructArrays
+
+# Create observer location
+obs = Observer(51.5, -0.18, 15.0) # London
+
+# Generate a year of minute timestamps
+times = collect(DateTime(2024, 1, 1):Minute(1):DateTime(2025, 1, 1))
+
+# Parallel computation with DynamicScheduler
+t0 = time()
+positions = solar_position(obs, times, PSA(), NoRefraction(), DynamicScheduler())
+dt_parallel = time() - t0
+println("Time taken (parallel): $(round(dt_parallel, digits=5)) seconds")
+```
+
+Now we compare this to the serial version:
+
+```@example parallel
+# Serial computation (no scheduler argument)
+t0 = time()
+positions_serial = solar_position(obs, times, PSA(), NoRefraction())
+dt_serial = time() - t0
+println("Time taken (serial): $(round(dt_serial, digits=5)) seconds")
+```
+
+We observe a speedup of:
+
+```@example parallel
+speedup = dt_serial / dt_parallel
+println("Speedup: $(round(speedup, digits=2))×")
+```
+
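
One caveat when reading wall-clock timings like these: the first call to a Julia method includes just-in-time compilation, so whichever variant is measured first tends to look slower than it is. The sketch below warms up before measuring, using only Base. `work` is a hypothetical stand-in workload, not a SolarPosition.jl function.

```julia
# Hypothetical stand-in workload; substitute the solar_position call of interest.
work(xs) = sum(sin, xs)

xs = rand(1_000_000)

t_first = @elapsed work(xs)  # first call: compilation plus execution
t_warm = @elapsed work(xs)   # later call: execution only

println("first call: $(round(t_first, digits=5)) s")
println("warm call:  $(round(t_warm, digits=5)) s")
```

For more reliable numbers, BenchmarkTools' `@benchmark` (used in the Performance Comparison section) runs the expression repeatedly and reports summary statistics.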
+### Simplified Syntax
+
+You can also use the simplified syntax with the scheduler as the third argument, which
+uses the default algorithm (PSA) and no refraction correction:
+
+```@example parallel
+# Simplified syntax with default algorithm
+positions = solar_position(obs, times, DynamicScheduler())
+@show first(positions, 3)
+```
+
+## Available Schedulers
+
+OhMyThreads.jl provides different scheduling strategies optimized for various workload
+characteristics:
+
+### DynamicScheduler
+
+The [`DynamicScheduler`](https://juliafolds2.github.io/OhMyThreads.jl/stable/refs/api/#OhMyThreads.DynamicScheduler)
+is the default and recommended scheduler for most workloads. It dynamically balances
+tasks among threads, making it suitable for non-uniform workloads where computation
+times may vary. Please visit the `OhMyThreads.jl` documentation for more details.
+
+```@example parallel
+# Dynamic scheduling (recommended)
+positions = solar_position(obs, times, PSA(), NoRefraction(), DynamicScheduler());
+nothing # hide
+```
+
+### StaticScheduler
+
+The [`StaticScheduler`](https://juliafolds2.github.io/OhMyThreads.jl/stable/refs/api/#OhMyThreads.StaticScheduler)
+partitions work statically among threads. This can be more efficient for uniform
+workloads where all computations take approximately the same time.
+
+```@example parallel
+# Static scheduling for uniform workloads
+positions = solar_position(obs, times, PSA(), NoRefraction(), StaticScheduler())
+nothing # hide
+```
+
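
To make static partitioning concrete, here is a Base-only sketch that splits the minute timestamps of 2024 into contiguous, near-equal index ranges, one per task. It illustrates the concept only and is not how OhMyThreads.jl implements `StaticScheduler`. The value `ntasks = 32` is an assumed thread count.

```julia
using Dates

# Same time range as the Quick Start example: one year at one-minute resolution.
n = length(DateTime(2024, 1, 1):Minute(1):DateTime(2025, 1, 1))
ntasks = 32  # assumed thread count, for illustration only

# Contiguous index ranges of (almost) equal length, one per task.
chunks = collect(Iterators.partition(1:n, cld(n, ntasks)))

println("timestamps:  ", n)
println("chunks:      ", length(chunks))
println("first chunk: ", first(chunks))
println("last chunk:  ", last(chunks))
```

Because every timestamp costs roughly the same to evaluate, fixed ranges like these keep all tasks about equally busy. Dynamic scheduling matters more when some elements are much more expensive than others.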
+## In-Place Computation
+
+For maximum performance and minimal allocations, use the in-place version
+[`solar_position!`](@ref) with a pre-allocated [`StructVector`](https://github.com/JuliaArrays/StructArrays.jl):
+
+```@example parallel
+using StructArrays
+
+# Pre-allocate output array
+positions = StructVector{SolPos{Float64}}(undef, length(times))
+
+# Compute in-place
+solar_position!(positions, obs, times, PSA(), NoRefraction(), DynamicScheduler())
+nothing # hide
+```
+
+The in-place version avoids allocating the output array and minimizes intermediate
+allocations, making it ideal for repeated computations or memory-constrained
+environments.
+
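
The benefit of pre-allocating can be illustrated with Base alone. This sketch compares the bytes allocated by `map`, which creates a new output array on every call, with `map!` writing into a reused buffer. `kernel` is a hypothetical stand-in for a per-timestamp computation, not SolarPosition.jl code.

```julia
kernel(x) = sin(x) * cos(x)  # hypothetical per-element work

xs = rand(100_000)
out = similar(xs)  # buffer allocated once and reused

# Warm up both paths so compilation is not counted.
map(kernel, xs)
map!(kernel, out, xs)

bytes_alloc = @allocated map(kernel, xs)            # allocates a fresh result array
bytes_inplace = @allocated map!(kernel, out, xs)    # writes into `out`

println("map:  ", bytes_alloc, " bytes")
println("map!: ", bytes_inplace, " bytes")
```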
+## Performance Comparison
+
+Here's a typical performance comparison between serial and parallel execution:
+
+```julia
+using BenchmarkTools
+
+### Serial execution (no scheduler argument)
+@benchmark solar_position($obs, $times, PSA(), NoRefraction())
+# BenchmarkTools.Trial: 57 samples with 1 evaluation per sample.
+# Range (min … max): 83.994 ms … 98.110 ms ┊ GC (min … max): 0.00% … 12.50%
+# Time (median): 87.907 ms ┊ GC (median): 0.66%
+# Time (mean ± σ): 88.194 ms ± 2.478 ms ┊ GC (mean ± σ): 1.39% ± 2.23%
+
+# ▁ █
+# ▆▁▄▁▁▁▇▆▆▁▁▇▄█▄▁▇▆▇▇▄▄▁▄▄▆▆█▆▆▄▁▆▁▁▆▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄ ▁
+# 84 ms Histogram: frequency by time 95.8 ms <
+
+# Memory estimate: 12.06 MiB, allocs estimate: 9.
+
+### Parallel execution with DynamicScheduler
+@benchmark solar_position($obs, $times, PSA(), NoRefraction(), DynamicScheduler())
+# BenchmarkTools.Trial: 312 samples with 1 evaluation per sample.
+# Range (min … max): 7.588 ms … 35.575 ms ┊ GC (min … max): 0.00% … 74.79%
+# Time (median): 14.718 ms ┊ GC (median): 6.16%
+# Time (mean ± σ): 16.026 ms ± 6.387 ms ┊ GC (mean ± σ): 23.51% ± 19.37%
+
+# ▆▆█▁▃▂▅ ▁▁▁▄▁▃▅▁ ▁▄ ▁
+# █▇███████▄████████▇▄██▆█▇▆▇█▅▆▅▆▄▄▁▅▄▁▁▄▃▄▅▄▃▃▄▃▄▆▃▃▁▄▄▁▃▁▃ ▄
+# 7.59 ms Histogram: frequency by time 34.4 ms <
+
+# Memory estimate: 66.59 MiB, allocs estimate: 468.
+
+### In-place parallel execution
+pos = StructVector{SolPos{Float64}}(undef, length(times))
+@benchmark solar_position!($pos, $obs, $times, PSA(), NoRefraction(), DynamicScheduler())
+# BenchmarkTools.Trial: 908 samples with 1 evaluation per sample.
+# Range (min … max): 4.061 ms … 7.846 ms ┊ GC (min … max): 0.00% … 0.00%
+# Time (median): 5.532 ms ┊ GC (median): 0.00%
+# Time (mean ± σ): 5.501 ms ± 644.881 μs ┊ GC (mean ± σ): 0.00% ± 0.00%
+
+# ▃▅█▄▂▂ ▁
+# ▃▄▅▃▄▃▃▅▃▄▃▁▃▂▃▄▂▃▂▂▃▂▂▃▂▅▃▃████████▆▅▅▄▆▅██▅▅▆▄▅▄▃▂▃▃▃▃▃▂▂ ▃
+# 4.06 ms Histogram: frequency by time 6.72 ms <
+
+# Memory estimate: 20.47 KiB, allocs estimate: 284.
+
+### In-place parallel execution with StaticScheduler
+@benchmark solar_position!($pos, $obs, $times, PSA(), NoRefraction(), StaticScheduler())
+# BenchmarkTools.Trial: 902 samples with 1 evaluation per sample.
+# Range (min … max): 4.027 ms … 7.228 ms ┊ GC (min … max): 0.00% … 0.00%
+# Time (median): 5.842 ms ┊ GC (median): 0.00%
+# Time (mean ± σ): 5.537 ms ± 802.636 μs ┊ GC (mean ± σ): 0.00% ± 0.00%
+
+# ▃▁ ▁▄▃ ▁ ▁▅▆█▂▄▇▇▅▁▃ ▁
+# ██▃▃▅███▅▆▆▄▄▃▃▃▃▃▂▄▅▂▁▃▆▃▃▂▅▃▅█▇▆▂▃▅▅██████████████▇█▆▇▇▂▂ ▄
+# 4.03 ms Histogram: frequency by time 6.72 ms <
+
+# Memory estimate: 15.97 KiB, allocs estimate: 220.
+```
+
+On a system with 32 threads processing 527,041 timestamps (one year, minutely):
+
+| Method | Time | Speedup | Allocations |
+|--------|------|---------|-------------|
+| Serial | 87.9 ms | 1.0× | 12.06 MiB |
+| Parallel (DynamicScheduler) | 14.7 ms | **6.0×** | 66.59 MiB |
+| In-place (DynamicScheduler) | 5.53 ms | **15.9×** | 20.47 KiB |
+| In-place (StaticScheduler) | 5.84 ms | **15.0×** | 15.97 KiB |
+
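
The Speedup column follows from the median times in the benchmark output: each value is the serial median divided by that method's median. A short sketch of the arithmetic, with the medians copied from the output above:

```julia
# Median times in milliseconds, copied from the benchmark output above.
median_ms = [
    "Serial" => 87.907,
    "Parallel (DynamicScheduler)" => 14.718,
    "In-place (DynamicScheduler)" => 5.532,
    "In-place (StaticScheduler)" => 5.842,
]
nthreads = 32  # thread count stated for the benchmark system

serial = last(first(median_ms))
speedups = [name => serial / t for (name, t) in median_ms]

for (name, s) in speedups
    println(rpad(name, 30), round(s, digits=1), "×")
end

# Parallel efficiency: best speedup relative to the number of threads.
best = maximum(last, speedups)
println("best speedup per thread: ", round(100 * best / nthreads, digits=1), "%")
```

The best speedup is well below the thread count, which is consistent with a short, memory-bound workload where threading overhead is not negligible.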
+!!! tip "Performance Tips"
+    For the best performance:
+    - Use [`solar_position!`](@ref) with pre-allocated output for minimal allocations
+    - Use `DynamicScheduler()` for most workloads
+    - Ensure Julia is running with multiple threads (e.g., `--threads=auto`)
+    - Process larger batches of timestamps to amortize threading overhead
+
+## Working with Different Time Types
+
+The parallel methods work with both [`DateTime`](https://docs.julialang.org/en/v1/stdlib/Dates/#Dates.DateTime) and [`ZonedDateTime`](https://juliatime.github.io/TimeZones.jl/stable/types/#TimeZones.ZonedDateTime):
+
+```@example parallel
+using TimeZones
+
+# Using ZonedDateTime (avoiding DST transitions)
+tz = tz"Europe/London"
+# Use a subset of times to avoid DST transition issues in documentation
+summer_times = collect(DateTime(2024, 6, 1):Hour(1):DateTime(2024, 7, 1))
+zoned_times = ZonedDateTime.(summer_times, tz)
+
+# Parallel computation with time zone aware timestamps
+zoned_positions = solar_position(obs, zoned_times, PSA(), NoRefraction(), DynamicScheduler())
+
+println("Computed $(length(zoned_positions)) positions with time zone awareness")
+```
+
+## Algorithm Comparison
+
+The parallel interface works with all solar position algorithms:
+
+```@example parallel
+# Test different algorithms in parallel
+algorithms = [PSA(), NOAA(), SPA()]
+
+for alg in algorithms
+    pos = solar_position(obs, times[1:100], alg, NoRefraction(), DynamicScheduler())
+    println("$(typeof(alg).name.name): azimuth=$(round(pos.azimuth[50], digits=5))°")
+end
+```
+
+## Refraction Correction
+
+Atmospheric refraction corrections can be applied in parallel computations:
+
+```@example parallel
+# Parallel computation with Bennett refraction correction
+positions_refracted = solar_position(
+    obs,
+    times,
+    PSA(),
+    BENNETT(),
+    DynamicScheduler()
+)
+
+println("First position with refraction:")
+println(" Apparent elevation: $(round(positions_refracted.apparent_elevation[1], digits=2))°")
+```
+
+## Implementation Details
+
+The extension uses OhMyThreads' [`tmap`](https://juliafolds2.github.io/OhMyThreads.jl/stable/refs/api/#OhMyThreads.tmap)
+and [`tmap!`](https://juliafolds2.github.io/OhMyThreads.jl/stable/refs/api/#OhMyThreads.tmap!)
+for task-based parallelism. Each timestamp is processed independently, making the
+computation embarrassingly parallel with no inter-thread communication required.
+
+The results from `tmap` are automatically converted to a [`StructVector`](https://github.com/JuliaArrays/StructArrays.jl)
+for efficient columnar storage compatible with the rest of SolarPosition.jl's API.
+
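
The embarrassingly parallel pattern described here can be sketched with Base threads alone. The code below is not the extension's implementation, which per the text relies on `tmap` and `tmap!`. It is a hypothetical `chunked_map!` that splits the indices into contiguous chunks, spawns one task per chunk, and lets each task write only to its own output slots. `kernel` stands in for a single solar position evaluation.

```julia
# Hypothetical per-element kernel standing in for one solar position evaluation.
kernel(x) = sin(x)^2 + cos(x)^2

"""
Apply `f` to every element of `xs`, writing into `out`, with one task per chunk.
Illustrative only; not part of SolarPosition.jl or OhMyThreads.jl.
"""
function chunked_map!(out, f, xs; ntasks = Threads.nthreads())
    length(out) == length(xs) || throw(DimensionMismatch("out and xs differ in length"))
    isempty(xs) && return out

    # Contiguous index ranges, at most `ntasks` of them.
    chunks = Iterators.partition(eachindex(xs), cld(length(xs), ntasks))

    # Each task touches a disjoint range of `out`, so no locking is needed.
    tasks = map(chunks) do idxs
        Threads.@spawn for i in idxs
            out[i] = f(xs[i])
        end
    end
    foreach(wait, tasks)
    return out
end

xs = collect(range(0.0, 10.0; length = 1001))
out = similar(xs)
chunked_map!(out, kernel, xs)
println("all results ≈ 1: ", all(y -> isapprox(y, 1.0), out))
```

Since no element depends on another, the result is the same for any chunking or thread count. That independence is what lets the scheduler choice affect performance without affecting the output.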
+## See Also
+
+- [Solar Positioning](@ref solar-positioning-algorithms) - Available positioning algorithms
+- [Refraction Correction](@ref refraction-correction) - Atmospheric refraction methods
+- [OhMyThreads.jl Documentation](https://juliafolds2.github.io/OhMyThreads.jl/stable/) - Task-based parallelism framework
+- [Julia Threading Documentation](https://docs.julialang.org/en/v1/manual/multi-threading/) - Julia's threading capabilities

docs/src/positioning.md

Lines changed: 5 additions & 0 deletions
@@ -21,6 +21,11 @@ Typically solar position algorithms can take the following set of inputs:
 - Date and time: in UTC or local time with timezone information
 - Optional atmospheric parameters: pressure and temperature (for refraction correction)
 
+## Example: Solar Path Plotting
+
+Solar positions can be calculated using [`solar_position`](@ref solar_position)
+and the in-place version [`solar_position!`](@ref solar_position!) functions.
+
 As an example, we plot the longest day of the year solar path for an observer located
 at the Van Gogh museum in Amsterdam (52.35888°N, 4.88185°E) on June 21, 2023:
 

examples/Project.toml

Lines changed: 6 additions & 0 deletions
@@ -1,6 +1,12 @@
 [deps]
+BenchmarkTools = "6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf"
 Dates = "ade2ca70-3891-5945-98fb-dc099432e06a"
 ModelingToolkit = "961ee093-0014-501f-94e3-6117800e7a78"
+OhMyThreads = "67456a42-1dca-4109-a031-0a68de7e3ad5"
 OrdinaryDiffEq = "1dea7af3-3e70-54e6-95c3-0bf5283fa5ed"
 Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80"
 SolarPosition = "5b9d1343-a731-5a90-8730-7bf8d89bf3eb"
+StructArrays = "09ab397b-f2b6-538f-b94a-2f83cf4a842a"
+
+[sources]
+SolarPosition = { path = ".." }
