content/learning-paths/mobile-graphics-and-gaming/android_halide/android.md
1 addition & 36 deletions
@@ -346,42 +346,7 @@ The code defines three utility methods:
2. extractGrayScaleBytes - converts a Bitmap into a grayscale byte array suitable for native processing.
3. createBitmapFromGrayBytes - converts a grayscale byte array back into a Bitmap for display purposes.

-Note that performing the grayscale conversion in Halide allows us to exploit operator fusion, further improving performance by avoiding intermediate memory accesses. This could be done as follows:
-// Convert RGB to grayscale directly in Halide pipeline
-Halide::Func grayscale("grayscale");
-grayscale(x, y) = Halide::cast<uint8_t>(
-    0.299f * inputBuffer(x, y, 0) +
-    0.587f * inputBuffer(x, y, 1) +
-    0.114f * inputBuffer(x, y, 2)
-);
-
-// Continue pipeline: Gaussian blur (example)
-Halide::Func blur("blur");
-Halide::RDom r(-1, 3, -1, 3);
-Halide::Expr kernel[3][3] = {
-    {1, 2, 1},
-    {2, 4, 2},
-    {1, 2, 1}
-};
-
-Halide::Expr blurSum = 0;
-for (int i = 0; i < 3; ++i) {
-    for (int j = 0; j < 3; ++j) {
-        blurSum += grayscale(x + r.x, y + r.y) * kernel[i][j];
-    }
-}
-blur(x, y) = Halide::cast<uint8_t>(blurSum / 16);
-
-// Fuse grayscale and blur operations
-grayscale.compute_at(blur, x);
-```
+Note that performing the grayscale conversion in Halide allows us to exploit operator fusion, further improving performance by avoiding intermediate memory accesses. This could be done as in our examples before (processing-workflow).

The JNI integration occurs through an external method declaration, blurThresholdImage, loaded via the companion object at app startup. The native library (armhalideandroiddemo) containing this function is compiled separately and integrated into the application (native-lib.cpp).
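
As an aside on the operator-fusion note in this hunk, the following is a minimal plain C++ sketch (not Halide, and not code from the learning path) of what "avoiding intermediate memory accesses" can mean for a grayscale conversion followed by a 3x3 blur. The names `blurMaterialized`, `blurFused`, `grayAt`, and `clampCoord`, the interleaved RGB layout, and the clamped borders are all assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Image = std::vector<std::uint8_t>;

// 3x3 binomial weights; they sum to 16.
static const int kWeights[3][3] = {{1, 2, 1}, {2, 4, 2}, {1, 2, 1}};

// Hypothetical helper: clamp a coordinate into [0, limit - 1].
static int clampCoord(int v, int limit) {
    return v < 0 ? 0 : (v >= limit ? limit - 1 : v);
}

// Hypothetical helper: grayscale value of pixel (x, y) in an interleaved RGB buffer.
static std::uint8_t grayAt(const Image& rgb, int w, int x, int y) {
    const std::size_t i = (static_cast<std::size_t>(y) * w + x) * 3;
    return static_cast<std::uint8_t>(
        0.299f * rgb[i] + 0.587f * rgb[i + 1] + 0.114f * rgb[i + 2]);
}

// Unfused: the grayscale stage is written to its own buffer, then read back by the blur.
Image blurMaterialized(const Image& rgb, int w, int h) {
    assert(rgb.size() == static_cast<std::size_t>(w) * h * 3);
    Image gray(static_cast<std::size_t>(w) * h);
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            gray[static_cast<std::size_t>(y) * w + x] = grayAt(rgb, w, x, y);

    Image out(gray.size());
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            int sum = 0;
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    const int sx = clampCoord(x + dx, w);
                    const int sy = clampCoord(y + dy, h);
                    sum += kWeights[dy + 1][dx + 1] *
                           gray[static_cast<std::size_t>(sy) * w + sx];
                }
            }
            out[static_cast<std::size_t>(y) * w + x] =
                static_cast<std::uint8_t>(sum / 16);
        }
    }
    return out;
}

// Fused: grayscale is computed inline inside the blur loop; no intermediate buffer exists.
Image blurFused(const Image& rgb, int w, int h) {
    assert(rgb.size() == static_cast<std::size_t>(w) * h * 3);
    Image out(static_cast<std::size_t>(w) * h);
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            int sum = 0;
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    const int sx = clampCoord(x + dx, w);
                    const int sy = clampCoord(y + dy, h);
                    sum += kWeights[dy + 1][dx + 1] * grayAt(rgb, w, sx, sy);
                }
            }
            out[static_cast<std::size_t>(y) * w + x] =
                static_cast<std::uint8_t>(sum / 16);
        }
    }
    return out;
}
```

Both functions return the same pixels. The fused variant skips the intermediate `gray` buffer but recomputes each grayscale value for every blur window that touches it, so it trades extra arithmetic for less memory traffic. Whether that trade pays off depends on the target, which is why it is worth measuring rather than assuming.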
content/learning-paths/mobile-graphics-and-gaming/android_halide/fusion.md
1 addition & 1 deletion
@@ -528,4 +528,4 @@ Fusion isn’t always best. You’ll want to materialize an intermediate (comput
The fastest way to check whether fusion helps is to measure it. Our demo prints timing and throughput per frame, but Halide also includes a built-in profiler that reports per-stage runtimes. To learn how to enable and interpret the profiler, see the official [Halide profiling tutorial](https://halide-lang.org/tutorials/tutorial_lesson_21_auto_scheduler_generate.html#profiling).
## Summary
-In this lesson, we learned about operation fusion in Halide, a powerful technique to reduce memory bandwidth and improve computational efficiency. We explored why fusion matters, identified scenarios where fusion is most effective, and demonstrated how Halide’s scheduling constructs (compute_at, store_at, fuse) enable you to apply fusion easily and effectively. By fusing the Gaussian blur and thresholding stages, we improved the performance of our real-time image processing pipeline.
+In this lesson, we learned about operator fusion in Halide—a powerful technique for reducing memory bandwidth and improving computational efficiency. We explored why fusion matters, looked at scenarios where it is most effective, and saw how Halide’s scheduling constructs such as compute_root() and compute_at() let us control whether stages are fused or materialized. By experimenting with different schedules, including fusing the Gaussian blur and thresholding stages, we observed how fusion can significantly improve the performance of a real-time image processing pipeline