2 files changed: +13 -5 lines changed
1313
1414## Encoding/decoding is slow in first iteration
1515
16- Correct. GPUJPEG was intended to provide when running many times (ideally with
17- a equal-sized pictures, like a video). But using for few or even single image
18- will not pay off, because there is an initialization burden (let say 230 ms for
19- a 33 Mpix image).
16+ Correct. The first iteration is slow because it includes the initialization
17+ of GPUJPEG internal structures and CUDA buffers, the setup of the GPU
18+ execution pipeline, and kernel compilation for the actual device capability.
19+ The last point can be eliminated by generating code for the particular device
20+ when building the library:
21+
22+ cmake -DCMAKE_CUDA_ARCHITECTURES=native -DCMAKE_BUILD_TYPE=Release ...
23+
24+ (`all-major` or `all` will also work but the compilation will take longer)
25+
26+ The ideal use case for GPUJPEG is processing many images (ideally equal-sized),
26+ so that the one-time initialization cost is spread over all of them.
2027
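The `cmake` invocation in the answer above relies on the special values `native`, `all-major` and `all` for `CMAKE_CUDA_ARCHITECTURES`. As far as I know, `all` and `all-major` were introduced in CMake 3.23 and `native` in CMake 3.24, so an older CMake would reject them. The following is only a sketch of how a build script might choose a value; `pick_cuda_arch` is a hypothetical helper, not part of GPUJPEG, and the version thresholds are the assumption stated here.

```shell
#!/bin/sh
# Hypothetical helper (not part of GPUJPEG): choose a value for
# CMAKE_CUDA_ARCHITECTURES from a CMake version string such as "3.24.1".
# Assumption: "native" needs CMake >= 3.24, "all-major" needs CMake >= 3.23.
pick_cuda_arch() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # text before the next dot (or all of it)
    if [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 24 ]; }; then
        echo native
    elif [ "$major" -eq 3 ] && [ "$minor" -ge 23 ]; then
        echo all-major
    else
        # Older CMake: print nothing and fail, so the caller must set an
        # explicit architecture number for the target GPU.
        return 1
    fi
}

# Possible usage (commented out so the snippet has no side effects):
#   ver=$(cmake --version | sed -n '1s/[^0-9]*\([0-9.]*\).*/\1/p')
#   arch=$(pick_cuda_arch "$ver") &&
#       cmake -DCMAKE_CUDA_ARCHITECTURES="$arch" -DCMAKE_BUILD_TYPE=Release ...
```

With `native` the kernels are compiled only for the GPU present on the build machine, so a binary moved to a machine with a different GPU may again pay the compilation cost on first use (or fail to run); `all-major` trades a longer build for wider coverage.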
2128## What is a restart interval
2229
@@ -12,7 +12,8 @@ for high-performance image encoding and decoding. The software runs also on
1212[ZLUDA.md](ZLUDA.md)).
1313
1414This document provides an introduction to the library and how to use it. You
15- can also look to [FAQ.md](FAQ.md) for additional information. To see _latest changes_
15+ can also look to [FAQ.md](FAQ.md) for _performance tuning_
16+ and additional information. To see _latest changes_
1617you can display file [NEWS.md](NEWS.md).
1718
1819Table of contents