
Commit 0771e97

Fix up docs
Parent: c97b6fe

File tree: 2 files changed (+34, -177 lines)

docs/src/tutorials/autotune.md

Lines changed: 33 additions & 176 deletions
@@ -2,6 +2,12 @@
 
 LinearSolve.jl includes an automatic tuning system that benchmarks all available linear algebra algorithms on your specific hardware and automatically selects optimal algorithms for different problem sizes and data types. This tutorial will show you how to use the `LinearSolveAutotune` sublibrary to optimize your linear solve performance.
 
+!!! warn
+
+    This is still in development. At this point the tuning will not result in different settings,
+    but it will run the benchmarking and create plots of the performance of the algorithms. A
+    future version will use the results to set preferences for the algorithms.
+
 ## Quick Start
 
 The simplest way to use the autotuner is to run it with default settings:
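A minimal end-to-end sketch of that default call, assuming LinearSolve.jl and the LinearSolveAutotune sublibrary are installed (the call itself matches the context shown in the next hunk):

```julia
using LinearSolve, LinearSolveAutotune

# Run the full benchmark suite with default settings
results = autotune_setup()
```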
@@ -16,7 +22,7 @@ results = autotune_setup()
 
 This will:
 - Benchmark 4 element types: `Float32`, `Float64`, `ComplexF32`, `ComplexF64`
-- Test matrix sizes from small (4×4) to medium (500×500)
+- Test matrix sizes from small (4×4) through medium (500×500) up to large (10,000×10,000)
 - Create performance plots for each element type
 - Set preferences for optimal algorithm selection
 - Share results with the community (if desired)
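The keyword arguments used elsewhere in this diff (`eltypes`, `samples`, `seconds`, `large_matrices`) can narrow that default sweep. A hedged sketch combining them, with values borrowed from the example sections removed further down:

```julia
using LinearSolve, LinearSolveAutotune

# Benchmark only the element types you actually use
results = autotune_setup(
    eltypes = (Float64, ComplexF64),  # skip single precision
    samples = 5,                      # repetitions per matrix size
    seconds = 1.0,                    # time budget per benchmark
    large_matrices = true             # include the large size range
)
```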
@@ -81,7 +87,21 @@ results = autotune_setup(samples = 10, seconds = 2.0)
 
 ### Privacy and Telemetry
 
-Control data sharing:
+!!! warn
+
+    Telemetry implementation is still in development.
+
+The telemetry feature of LinearSolveAutotune allows sharing performance results
+with the community to improve algorithm selection. Minimal data is collected, including:
+
+- System information (OS, CPU, core count)
+- Algorithm performance results
+
+and shared via public GitHub. This helps the community understand performance across
+different hardware configurations, further improving the default algorithm selection
+and supporting research into improved algorithms.
+
+However, if your system has privacy concerns or you prefer not to share data, you can disable telemetry:
 
 ```julia
 # Disable telemetry (no data shared)
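For reference, the complete call (shown verbatim in the "Disabling Telemetry" section later in this diff) is:

```julia
# Disable telemetry (no data shared)
results = autotune_setup(telemetry = false)
```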
@@ -96,7 +116,12 @@ results = autotune_setup(make_plot = false)
 
 ### Missing Algorithm Handling
 
-By default, autotune is assertive about finding all expected algorithms:
+By default, autotune is assertive about finding all expected algorithms. This is because
+we want to ensure that all possible algorithms on a given piece of hardware are tested, in order for
+the autotuning history/telemetry to be as complete as possible. However, in some cases
+you may want to allow missing algorithms, such as when running on a system where the
+hardware may not have support due to driver versions or other issues. If that's the case,
+you can set `skip_missing_algs = true` to allow missing algorithms without failing the autotune setup:
 
 ```julia
 # Default behavior: error if expected algorithms are missing
@@ -106,31 +131,6 @@ results = autotune_setup() # Will error if RFLUFactorization missing
 results = autotune_setup(skip_missing_algs = true) # Will warn instead of error
 ```
 
-**When algorithms might be missing:**
-- RFLUFactorization should always be available (hard dependency)
-- GPU algorithms require CUDA.jl or Metal.jl to be loaded
-- Apple Accelerate should work on macOS systems
-- MKL algorithms require MKL.jl package
-
-## Understanding Algorithm Compatibility
-
-The autotuner automatically detects which algorithms work with which element types:
-
-### Standard Types (Float32, Float64, ComplexF32, ComplexF64)
-- **LUFactorization**: Fast BLAS-based LU decomposition
-- **MKLLUFactorization**: Intel MKL optimized (if available)
-- **AppleAccelerateLUFactorization**: Apple Accelerate optimized (on macOS)
-- **RFLUFactorization**: Recursive factorization (cache-friendly)
-- **GenericLUFactorization**: Pure Julia implementation
-- **SimpleLUFactorization**: Simple pure Julia LU
-
-### Arbitrary Precision Types (BigFloat, Rational, etc.)
-Only pure Julia algorithms work:
-- **GenericLUFactorization**: ✅ Compatible
-- **RFLUFactorization**: ✅ Compatible
-- **SimpleLUFactorization**: ✅ Compatible
-- **LUFactorization**: ❌ Excluded (requires BLAS)
-
 ## GPU Systems
 
 On systems with CUDA or Metal GPU support, the autotuner will automatically detect and benchmark GPU algorithms:
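As the removed troubleshooting section further down also notes, GPU algorithms are only detected when the corresponding GPU package is loaded first. A sketch:

```julia
using CUDA            # or `using Metal` on Apple Silicon
using LinearSolve, LinearSolveAutotune

# With the GPU stack loaded, GPU algorithms join the benchmark set
results = autotune_setup()
```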
@@ -176,6 +176,7 @@ results, plots = autotune_setup()
 for (eltype, plot) in plots
     println("Plot for $eltype available")
     # Plots are automatically saved as PNG and PDF files
+    display(plot)
 end
 ```
 
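Beyond the automatic PNG/PDF output noted in the loop above, a plot can also be written to a custom file. A sketch assuming the returned plots are Plots.jl objects (`savefig` is Plots.jl API, not part of LinearSolveAutotune):

```julia
using Plots
using LinearSolve, LinearSolveAutotune

results, plots = autotune_setup()
for (eltype, plot) in plots
    # Write each per-eltype performance plot under a custom name
    savefig(plot, "autotune_$(eltype).png")
end
```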
@@ -200,61 +201,11 @@ custom_categories = Dict(
 LinearSolveAutotune.set_algorithm_preferences(custom_categories)
 ```
 
-## Real-World Examples
-
-### High-Performance Computing
-
-```julia
-# For HPC clusters with large problems
-results = autotune_setup(
-    large_matrices = true,
-    samples = 5,
-    seconds = 1.0,
-    eltypes = (Float64, ComplexF64),
-    telemetry = false # Privacy on shared systems
-)
-```
-
-### Workstation with GPU
-
-```julia
-# Comprehensive benchmark including GPU algorithms
-results = autotune_setup(
-    large_matrices = true,
-    samples = 3,
-    seconds = 0.5,
-    eltypes = (Float32, Float64, ComplexF32, ComplexF64)
-)
-```
-
-### Research with Arbitrary Precision
+## How Preferences Affect LinearSolve.jl
 
-```julia
-# Testing arbitrary precision arithmetic
-results = autotune_setup(
-    eltypes = (Float64, BigFloat),
-    samples = 2,
-    seconds = 0.2, # BigFloat is slow
-    telemetry = false,
-    large_matrices = false
-)
-```
+!!! warn
 
-### Quick Development Testing
-
-```julia
-# Fast benchmark for development/testing
-results = autotune_setup(
-    samples = 1,
-    seconds = 0.05,
-    eltypes = (Float64,),
-    make_plot = false,
-    telemetry = false,
-    set_preferences = false
-)
-```
-
-## How Preferences Affect LinearSolve.jl
+    Usage of autotune preferences is still in development.
 
 After running autotune, LinearSolve.jl will automatically use the optimal algorithms:
 
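The hunk below shows only the large-size half of the example; a hypothetical small-size counterpart (dimensions chosen here for illustration, not taken from the elided lines) would look like:

```julia
using LinearSolve

A_small = rand(50, 50)
b_small = rand(50)
prob_small = LinearProblem(A_small, b_small)
sol_small = solve(prob_small)  # uses the algorithm tuned for small sizes
```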
@@ -272,98 +223,4 @@ A_large = rand(300, 300) # Different size range
 b_large = rand(300)
 prob_large = LinearProblem(A_large, b_large)
 sol_large = solve(prob_large) # May use different algorithm
-```
-
-## Best Practices
-
-1. **Run autotune once per system**: Results are system-specific and should be rerun when hardware changes.
-
-2. **Use appropriate matrix sizes**: Set `large_matrices=true` only if you regularly solve large systems.
-
-3. **Consider element types**: Only benchmark the types you actually use to save time.
-
-4. **Benchmark thoroughly for production**: Use higher `samples` and `seconds` values for production systems.
-
-5. **Respect privacy**: Disable telemetry on sensitive or proprietary systems.
-
-6. **Save results**: The DataFrame returned contains valuable performance data for analysis.
-
-## Troubleshooting
-
-### No Algorithms Available
-If you get "No algorithms found", ensure LinearSolve.jl is properly installed:
-```julia
-using Pkg
-Pkg.test("LinearSolve")
-```
-
-### GPU Algorithms Missing
-GPU algorithms require additional packages:
-```julia
-# For CUDA
-using CUDA, LinearSolve
-
-# For Metal (Apple Silicon)
-using Metal, LinearSolve
-```
-
-### Preferences Not Applied
-Restart Julia after running autotune for preferences to take effect, or check:
-```julia
-LinearSolveAutotune.show_current_preferences()
-```
-
-### Slow BigFloat Performance
-This is expected - arbitrary precision arithmetic is much slower than hardware floating point. Consider using `DoubleFloats.jl` or `MultiFloats.jl` for better performance if extreme precision isn't required.
-
-## Community and Telemetry
-
-By default, autotune results are shared with the LinearSolve.jl community via public GitHub gists to help improve algorithm selection for everyone. The shared data includes:
-
-- System information (OS, CPU, core count, etc.)
-- Algorithm performance results
-- NO personal information or sensitive data
-
-Results are uploaded as public gists that can be easily searched and viewed by the community.
-
-### GitHub Authentication for Telemetry
-
-When telemetry is enabled, the system will prompt you to set up GitHub authentication if not already configured:
-
-```julia
-# This will prompt for GitHub token setup if GITHUB_TOKEN not found
-results = autotune_setup(telemetry = true)
-```
-
-The system will wait for you to create and paste a GitHub token. This helps the community by sharing performance data across different hardware configurations via easily discoverable GitHub gists.
-
-**Interactive Setup:**
-The autotune process will show step-by-step instructions and wait for you to:
-1. Create a GitHub token at the provided link
-2. Paste the token when prompted
-3. Proceed with benchmarking and automatic result sharing
-
-**Alternative - Pre-setup Environment Variable**:
-```bash
-export GITHUB_TOKEN=your_token_here
-julia
-```
-
-**Creating the GitHub Token:**
-1. Open [https://github.com/settings/tokens?type=beta](https://github.com/settings/tokens?type=beta)
-2. Click "Generate new token"
-3. Set name: "LinearSolve Autotune"
-4. Set expiration: 90 days
-5. Repository access: "Public Repositories (read-only)"
-6. Generate and copy the token
-
-### Disabling Telemetry
-
-You can disable telemetry completely:
-
-```julia
-# No authentication required
-results = autotune_setup(telemetry = false)
-```
-
-This helps the community understand performance across different hardware configurations and improves the default algorithm selection for future users, but participation is entirely optional.
+```
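For verifying what autotune has stored, the removed troubleshooting section pointed to a preferences inspector; note it also advised that preferences may require a Julia restart to take effect:

```julia
using LinearSolveAutotune

# Inspect the algorithm preferences currently stored by autotune
LinearSolveAutotune.show_current_preferences()
```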

lib/LinearSolveAutotune/Project.toml

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 name = "LinearSolveAutotune"
 uuid = "67398393-80e8-4254-b7e4-1b9a36a3c5b6"
 authors = ["SciML"]
-version = "0.1.0"
+version = "1.0.0"
 
 [deps]
 LinearSolve = "7ed4a6bd-45f5-4d41-b270-4a48e9bafcae"
