+<h3><a class="DLtitleLink" title="Full Citation in the ACM Digital Library" referrerpolicy="no-referrer-when-downgrade" href="https://dl.acm.org/doi/10.1145/3720555.3721989">From OpenACC to OpenMP5 GPU Offloading: Performance Evaluation on NAS Parallel Benchmarks</a></h3><ul class="DLauthors"><li class="nameList">Yehonatan Fridman</li><li class="nameList">Yosef Goren</li><li class="nameList Last">Gal Oren</li></ul><div class="DLabstract"><div style="display:inline"><p>The NAS Parallel Benchmarks (NPB) are widely used to evaluate parallel programming models, yet lack a native OpenMP offloading implementation for GPUs. This gap is significant given OpenMP’s emergence as a versatile standard for heterogeneous systems, offering broad compatibility with both current and future GPU architectures. Existing solutions, such as those that directly translate OpenACC to a binary executable, are limited by OpenACC’s stagnation and vendor-specific constraints, while not exposing OpenMP, which is used internally as an intermediate representation.</p><p>This work addresses this limitation by developing a source-level translation of OpenACC-based NPB benchmarks into OpenMP5 offloading code. This translation employs a combination of an automated source-to-source tool and manual optimization to ensure efficient execution across various GPU architectures. Performance evaluations indicate that the translated OpenMP versions deliver results comparable to the original OpenACC implementations, validating their reliability for GPU-based computations. Additionally, comparisons between GPU-accelerated OpenMP implementations and traditional CPU-based benchmarks reveal significant performance gains, especially in computationally intensive workloads. These findings highlight OpenMP’s potential as a unified programming model, offering superior portability and optimization capabilities across diverse hardware platforms.</p><p>The sources of this work are available at our repository.</p></div></div>