Commit 7ff2eed

Merge pull request #1479 from yangxile/main
Add new LP article on how to Run vvenc (H.266 encoder) on Arm servers
2 parents c72ab29 + a390a81 commit 7ff2eed


4 files changed

+277
-0
lines changed

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
1+
---
2+
armips:
3+
- Neoverse
4+
author_primary: Willen Yang
5+
layout: learningpathall
6+
learning_objectives:
7+
- Build the vvenc (H.266 encoder) project on an Arm server
8+
- Run vvenc on an Arm server to encode a real 1080p video file and measure the performance
9+
learning_path_main_page: 'yes'
10+
minutes_to_complete: 10
11+
operatingsystems:
12+
- Linux
13+
prerequisites:
14+
- An [Arm based instance](/learning-paths/servers-and-cloud-computing/csp/) from an appropriate
15+
cloud service provider. This Learning Path has been verified on an Arm Neoverse N2 based Alibaba Cloud ECS instance (g8y) running Ubuntu Linux 22.04.
16+
skilllevels: Introductory
17+
subjects: Libraries
18+
test_images:
19+
- ubuntu:22.04
20+
test_link: null
21+
test_maintenance: true
22+
test_status:
23+
- passed
24+
title: Run vvenc (H.266 encoder) on Arm servers
25+
tools_software_languages:
26+
- vvenc
27+
weight: 1
28+
who_is_this_for: This is an introductory topic for software developers who want to
29+
build and run the vvenc (H.266 encoder) project on Arm servers and measure its performance.
30+
---
Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
1+
---
2+
# ================================================================================
3+
# Edit
4+
# ================================================================================
5+
6+
next_step_guidance: >
7+
You can continue learning about porting cloud applications to the Arm architecture for increased performance and cost savings. The Learning Path on MongoDB is a great next step.
8+
# 1-3 sentence recommendation outlining how the reader can generally keep learning about these topics, and a specific explanation of why the next step is being recommended.
9+
10+
recommended_path: "/learning-paths/servers-and-cloud-computing/mongodb/"
11+
# Link to the next learning path being recommended.
12+
13+
14+
# further_reading links to references related to this path. Can be:
15+
# Manuals for a tool / software mentioned (type: documentation)
16+
# Blog about related topics (type: blog)
17+
# General online references (type: website)
18+
19+
further_reading:
20+
- resource:
21+
title: vvenc Documentation
22+
link: https://github.com/fraunhoferhhi/vvenc/wiki/Usage
23+
type: documentation
24+
- resource:
25+
title: Delivering the best H.265 video experience on Arm Neoverse N2 Platform
26+
link: https://community.arm.com/arm-community-blogs/b/infrastructure-solutions-blog/posts/h265-video-on-neoverse-n2
27+
type: blog
28+
- resource:
29+
title: Optimized Video Encoding with FFmpeg on AWS Graviton Processors
30+
link: https://aws.amazon.com/blogs/opensource/optimized-video-encoding-with-ffmpeg-on-aws-graviton-processors/
31+
type: blog
32+
- resource:
33+
title: OCI Ampere A1 Compute instances can significantly reduce video encoding costs versus modern CPUs
34+
link: https://community.arm.com/arm-community-blogs/b/operating-systems-blog/posts/oracle-cloud-infrastructure-arm-based-a1
35+
type: blog
36+
37+
# ================================================================================
38+
# FIXED, DO NOT MODIFY
39+
# ================================================================================
40+
weight: 21 # set to always be larger than the content in this path, and one more than 'review'
41+
title: "Next Steps" # Always the same
42+
layout: "learningpathall" # All files under learning paths have this same wrapper
43+
---
Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
1+
---
2+
# ================================================================================
3+
# Edit
4+
# ================================================================================
5+
6+
# Always 3 questions. Should try to test the reader's knowledge, and reinforce the key points you want them to remember.
7+
# question: A one sentence question
8+
# answers: The correct answers (from 2-4 answer options only). Should be surrounded by quotes.
9+
# correct_answer: An integer indicating what answer is correct (index starts from 0)
10+
# explanation: A short (1-3 sentence) explanation of why the correct answer is correct. Can add additional context if desired
11+
12+
13+
review:
14+
- questions:
15+
question: >
16+
Does vvenc (H.266 encoder) run on Arm servers?
17+
answers:
18+
- "Yes"
19+
- "No"
20+
correct_answer: 1
21+
explanation: >
22+
H.266 codec is fully supported on 64-bit Arm servers running Linux.
23+
24+
- questions:
25+
question: >
26+
Does varying the preset setting impact the encoding performance?
27+
answers:
28+
- "Yes"
29+
- "No"
30+
correct_answer: 1
31+
explanation: >
32+
You can vary the preset setting on videos of different resolutions and measure the impact on performance.
33+
34+
35+
# ================================================================================
36+
# FIXED, DO NOT MODIFY
37+
# ================================================================================
38+
title: "Review" # Always the same title
39+
weight: 20 # Set to always be larger than the content in this path
40+
layout: "learningpathall" # All files under learning paths have this same wrapper
41+
---
Lines changed: 163 additions & 0 deletions
@@ -0,0 +1,163 @@
1+
---
2+
layout: learningpathall
3+
title: Build and run vvenc (H.266 encoder) on Arm servers
4+
weight: 2
5+
---
6+
7+
## Install necessary software packages
8+
9+
`vvenc` is an open-source H.266/VVC encoder that offers very high compression efficiency and performance. Significant work is ongoing to optimize this open-source H.266 encoder for Arm Neoverse platforms, which support Neon and SVE/SVE2 instructions. The optimized code is available on [GitHub](https://github.com/fraunhoferhhi/vvenc).
10+
11+
Install `CMake` and other dependencies:
12+
```bash
13+
sudo apt install git wget cmake p7zip-full numactl -y
14+
```
15+
Install the LLVM 18 toolchain (Clang) to compile the C++ code:
16+
```bash
17+
wget https://apt.llvm.org/llvm.sh
18+
chmod +x llvm.sh
19+
sudo ./llvm.sh 18 all
20+
```
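
Before configuring the build, it can help to confirm that the CPU reports the Neon and SVE/SVE2 features mentioned above. The following is a minimal sketch, assuming a Linux system where `/proc/cpuinfo` has a `Features` line (as on 64-bit Arm kernels, which list `asimd` for Neon, plus `sve` and `sve2`). The helper name `has_feature` is hypothetical and is not part of vvenc.

```bash
# has_feature FLAG "FLAG LIST": succeeds if FLAG is one of the
# space-separated entries in the list (hypothetical helper).
has_feature() {
  local flag="$1" flags="$2"
  case " $flags " in
    *" $flag "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Read the flag list of the first CPU; the result is empty on systems
# without a "Features" line.
flags="$(grep -m1 '^Features' /proc/cpuinfo 2>/dev/null | cut -d: -f2)"

for f in asimd sve sve2; do
  if has_feature "$f" "$flags"; then
    echo "$f: supported"
  else
    echo "$f: not reported"
  fi
done
```

If `sve` or `sve2` is not reported, the corresponding `-DVVENC_ENABLE_ARM_SIMD_SVE` or `-DVVENC_ENABLE_ARM_SIMD_SVE2` option in the next step is unlikely to help on that machine.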
21+
22+
## Download and build vvenc source
23+
24+
```bash
25+
git clone https://github.com/fraunhoferhhi/vvenc.git
26+
cd vvenc
27+
CXX=clang++-18 CC=clang-18 cmake -S . -B build/release-static -DVVENC_ENABLE_ARM_SIMD_SVE=1 -DVVENC_ENABLE_ARM_SIMD_SVE2=1
28+
```
29+
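Check the configuration output to make sure that SVE and SVE2 have been enabled. Look for the lines `AArch64 SVE intrinsics enabled` and `AArch64 SVE2 intrinsics enabled`: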
30+
```output
31+
root@iZuf61ixurqifmpxuji4viZ:~/vvenc-1.13.0# CXX=clang++-18 CC=clang-18 cmake -S . -B build/release-static -DVVENC_ENABLE_ARM_SIMD_SVE=1 -DVVENC_ENABLE_ARM_SIMD_SVE2=1
32+
-- The C compiler identification is Clang 18.1.8
33+
-- The CXX compiler identification is Clang 18.1.8
34+
-- Detecting C compiler ABI info
35+
-- Detecting C compiler ABI info - done
36+
-- Check for working C compiler: /usr/bin/clang-18 - skipped
37+
-- Detecting C compile features
38+
-- Detecting C compile features - done
39+
-- Detecting CXX compiler ABI info
40+
-- Detecting CXX compiler ABI info - done
41+
-- Check for working CXX compiler: /usr/bin/clang++-18 - skipped
42+
-- Detecting CXX compile features
43+
-- Detecting CXX compile features - done
44+
-- CMAKE_MODULE_PATH: updating module path to: /root/vvenc-1.13.0/cmake/modules
45+
-- normalized target architecture: AARCH64
46+
-- Performing Test SUPPORTED_Werror_unused_command_line_argument
47+
-- Performing Test SUPPORTED_Werror_unused_command_line_argument - Success
48+
-- Performing Test SUPPORTED_march=armv8_2_a+sve
49+
-- Performing Test SUPPORTED_march=armv8_2_a+sve - Success
50+
-- Performing Test SVE_COMPILATION_C_TEST_COMPILED
51+
-- Performing Test SVE_COMPILATION_C_TEST_COMPILED - Success
52+
-- Performing Test SVE_COMPILATION_CXX_TEST_COMPILED
53+
-- Performing Test SVE_COMPILATION_CXX_TEST_COMPILED - Success
54+
-- Performing Test SVE_HEADER_C_TEST_COMPILED
55+
-- Performing Test SVE_HEADER_C_TEST_COMPILED - Success
56+
-- Performing Test SVE_HEADER_CXX_TEST_COMPILED
57+
-- Performing Test SVE_HEADER_CXX_TEST_COMPILED - Success
58+
-- Performing Test SUPPORTED_march=armv9_a+sve2
59+
-- Performing Test SUPPORTED_march=armv9_a+sve2 - Success
60+
-- Performing Test SUPPORTED_msse4_1
61+
-- Performing Test SUPPORTED_msse4_1 - Failed
62+
-- Performing Test SUPPORTED_mavx
63+
-- Performing Test SUPPORTED_mavx - Failed
64+
-- Performing Test HAVE_INTRIN_mm_storeu_si16
65+
-- Performing Test HAVE_INTRIN_mm_storeu_si16 - Success
66+
-- Performing Test HAVE_INTRIN_mm_storeu_si32
67+
-- Performing Test HAVE_INTRIN_mm_storeu_si32 - Success
68+
-- Performing Test HAVE_INTRIN_mm_storeu_si64
69+
-- Performing Test HAVE_INTRIN_mm_storeu_si64 - Success
70+
-- Performing Test HAVE_INTRIN_mm_loadu_si32
71+
-- Performing Test HAVE_INTRIN_mm_loadu_si32 - Success
72+
-- Performing Test HAVE_INTRIN_mm_loadu_si64
73+
-- Performing Test HAVE_INTRIN_mm_loadu_si64 - Success
74+
-- Performing Test HAVE_INTRIN_mm_cvtsi128_si64
75+
-- Performing Test HAVE_INTRIN_mm_cvtsi128_si64 - Success
76+
-- Performing Test HAVE_INTRIN_mm_cvtsi64_si128
77+
-- Performing Test HAVE_INTRIN_mm_cvtsi64_si128 - Success
78+
-- Performing Test HAVE_INTRIN_mm_extract_epi64
79+
-- Performing Test HAVE_INTRIN_mm_extract_epi64 - Success
80+
-- Performing Test HAVE_INTRIN_mm256_zeroupper
81+
-- Performing Test HAVE_INTRIN_mm256_zeroupper - Failed
82+
-- Performing Test HAVE_INTRIN_mm256_loadu2_m128i
83+
-- Performing Test HAVE_INTRIN_mm256_loadu2_m128i - Success
84+
-- Performing Test HAVE_INTRIN_mm256_set_m128i
85+
-- Performing Test HAVE_INTRIN_mm256_set_m128i - Success
86+
-- x86 SIMD intrinsics enabled (using SIMDE for non-x86 targets)
87+
-- AArch64 Neon intrinsics enabled
88+
-- AArch64 SVE intrinsics enabled
89+
-- AArch64 SVE2 intrinsics enabled
90+
-- Looking for pthread.h
91+
-- Looking for pthread.h - found
92+
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
93+
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
94+
-- Found Threads: TRUE
95+
-- Performing Test SUPPORTED_mxsave
96+
-- Performing Test SUPPORTED_mxsave - Failed
97+
-- Performing Test SUPPORTED_msse4_2
98+
-- Performing Test SUPPORTED_msse4_2 - Failed
99+
-- Performing Test SUPPORTED_mavx2
100+
-- Performing Test SUPPORTED_mavx2 - Failed
101+
-- Configuring done
102+
-- Generating done
103+
-- Build files have been written to: /root/vvenc-1.13.0/build/release-static
104+
```
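
Rather than scanning the configuration output by eye, you could search a saved copy of it for the three `intrinsics enabled` lines shown above. This is a sketch under assumptions: the file name `configure.log` and the helper `check_simd_enabled` are hypothetical, and the log is assumed to have been captured by appending `2>&1 | tee configure.log` to the `cmake -S . -B ...` command.

```bash
# check_simd_enabled LOGFILE: report which AArch64 SIMD back ends the
# CMake configure step enabled; returns non-zero if any is missing.
check_simd_enabled() {
  local log="$1" missing=0 isa
  for isa in Neon SVE SVE2; do
    # -F matches the text literally; -e is needed because it starts with "--".
    if grep -qF -e "-- AArch64 ${isa} intrinsics enabled" "$log"; then
      echo "${isa}: enabled"
    else
      echo "${isa}: NOT enabled"
      missing=1
    fi
  done
  return "$missing"
}

# Only run the check if a captured log is present.
if [ -f configure.log ]; then
  check_simd_enabled configure.log || echo "Some SIMD back ends are missing"
fi
```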
105+
Start the build process:
106+
```bash
107+
cmake --build build/release-static -j
108+
```
109+
110+
## Download video streams to run vvenc on and measure the performance
111+
112+
To benchmark the compression efficiency and performance of `vvenc`, you will need a set of video streams to run the codec on.
113+
114+
Download and extract the 1080p video file:
115+
```bash
116+
mkdir -p ~/video && cd ~/video
117+
wget http://ultravideo.cs.tut.fi/video/Bosphorus_1920x1080_120fps_420_8bit_YUV_Y4M.7z
118+
7z x Bosphorus_1920x1080_120fps_420_8bit_YUV_Y4M.7z
119+
```
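
To confirm that the archive extracted to a usable Y4M file, you can inspect its stream header. A YUV4MPEG2 file starts with one text line beginning with `YUV4MPEG2`, followed by space-separated parameters such as `W` (width), `H` (height), `F` (frame rate), and `C` (chroma format). The sketch below assumes that layout; `y4m_header` is a hypothetical helper name.

```bash
# y4m_header FILE: print the main parameters from the first (header) line
# of a YUV4MPEG2 file. Tokens it does not recognize are ignored.
y4m_header() {
  head -n 1 "$1" | tr ' ' '\n' | while read -r tok; do
    case "$tok" in
      YUV4MPEG2) echo "format: Y4M" ;;
      W*) echo "width: ${tok#W}" ;;
      H*) echo "height: ${tok#H}" ;;
      F*) echo "fps: ${tok#F}" ;;
      C*) echo "chroma: ${tok#C}" ;;
    esac
  done
}

# Inspect the downloaded clip if it is present.
clip="$HOME/video/Bosphorus_1920x1080_120fps_420_8bit_YUV.y4m"
if [ -f "$clip" ]; then
  y4m_header "$clip"
fi
```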
120+
121+
## Run vvenc on the sample video files
122+
123+
To benchmark the performance of `vvenc` over 100 frames of the 1080p video file, return to the `vvenc` source directory and run the following command, which pins the encoder to cores 0-3:
124+
```console
125+
numactl -C 0-3 bin/release-static/vvencFFapp --preset faster --BitstreamFile stream.266 --Threads 4 --InputFile ~/video/Bosphorus_1920x1080_120fps_420_8bit_YUV.y4m --InputBitDepth 8 --InputChromaFormat 420 --fps 30 --FramesToBeEncoded 100 --SourceWidth 1920 --SourceHeight 1080 --Qp 22 --IntraPeriod 256 --NumPasses 1 --InternalBitDepth 10 --stats 1 --Verbosity 3 --pools ','
126+
```
127+
128+
You can vary the `--preset` value (`faster`, `fast`, `medium`, `slow`, or `slower`) and measure the impact on encoding speed and bitrate.
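
A preset comparison can be scripted as a simple loop. The sketch below is illustrative only: it reuses a reduced set of the options from the command above, the `ENCODER` and `INPUT` variables and the `encode_cmd` helper are hypothetical names, and it assumes the paths contain no spaces. If the encoder binary or the input file is missing, it prints the commands instead of running them.

```bash
# Paths are assumptions; override them in the environment if yours differ.
ENCODER="${ENCODER:-bin/release-static/vvencFFapp}"
INPUT="${INPUT:-$HOME/video/Bosphorus_1920x1080_120fps_420_8bit_YUV.y4m}"

# encode_cmd PRESET: print the encoder command line for one preset.
encode_cmd() {
  local preset="$1"
  echo "$ENCODER --preset $preset --BitstreamFile stream_${preset}.266" \
       "--Threads 4 --InputFile $INPUT --FramesToBeEncoded 100 --Qp 22" \
       "--stats 1 --Verbosity 3"
}

for preset in faster fast medium; do
  cmd="$(encode_cmd "$preset")"
  if [ -x "$ENCODER" ] && [ -f "$INPUT" ]; then
    # Word splitting of $cmd is intentional; keep one log per preset.
    $cmd 2>&1 | tee "vvenc_${preset}.log"
  else
    echo "would run: $cmd"
  fi
done
```

Slower presets can take much longer per frame, so you may want to start with a small `--FramesToBeEncoded` value.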
129+
130+
131+
## View results
132+
133+
The encoding frame rate (frames per second) is reported at the end of each run.
134+
135+
Shown below is example output from running the vvenc H.266 encoder on the 1080p sample video file:
136+
137+
```output
138+
vvencFFapp: VVenC, the Fraunhofer H.266/VVC Encoder, version 1.13.0 [Linux][clang 18.1.8][64 bit][SIMD=SVE2]
139+
vvencFFapp [info]: started @ Wed Dec 25 16:06:03 2024
140+
vvenc [info]: Input File : /root/video/Bosphorus_1920x1080_120fps_420_8bit_YUV.y4m
141+
vvenc [info]: Bitstream File : stream.266
142+
vvenc [info]: Real Format : 1920x1080 yuv420p 30 Hz SDR 600 frames
143+
vvenc [info]: Frames : encode 100 frames
144+
vvenc [info]: Internal format : 1920x1080 30 Hz SDR
145+
vvenc [info]: Threads : 4 (parallel frames: 4)
146+
vvenc [info]: Rate control : QP 22
147+
vvenc [info]: Perceptual optimization : Disabled
148+
vvenc [info]: Intra period (keyframe) : 256
149+
vvenc [info]: Decoding refresh type : CRA
150+
151+
vvenc [info]: stats: 30.0% frame= 30/100 fps= 4.7 avg_fps= 4.7 bitrate= 3174.97 kbps avg_bitrate= 3174.97 kbps elapsed= 00h:00m:07s left= 00h:00m
vvenc [info]: stats: 60.0% frame= 60/100 fps= 7.2 avg_fps= 5.7 bitrate= 2405.72 kbps avg_bitrate= 2790.34 kbps elapsed= 00h:00m:11s left= 00h:00m
vvenc [info]: stats: 90.0% frame= 90/100 fps= 7.2 avg_fps= 6.1 bitrate= 2445.71 kbps avg_bitrate= 2675.47 kbps elapsed= 00h:00m:15s left= 00h:00m
vvenc [info]: stats: 100.0% frame= 100/100 fps= 6.6 avg_fps= 6.6 bitrate= 2490.95 kbps avg_bitrate= 2490.95 kbps elapsed= 00h:00m:16s left= 00h:00m:00s
152+
vvenc [info]: stats summary: frame= 100/100 avg_fps= 6.6 avg_bitrate= 2490.95 kbps
153+
vvenc [info]: stats summary: frame I: 1, kbps: 32874.48, AvgQP: 17.00
154+
vvenc [info]: stats summary: frame P: 0, kbps: nan, AvgQP: nan
155+
vvenc [info]: stats summary: frame B: 99, kbps: 2184.04, AvgQP: 27.42
156+
157+
158+
vvenc [info]: Total Frames | Bitrate Y-PSNR U-PSNR V-PSNR YUV-PSNR
159+
vvenc [info]: 100 a 2490.9480 43.7378 48.7096 47.9283 44.7472
160+
161+
vvencFFapp [info]: finished @ Wed Dec 25 16:06:19 2024
162+
vvencFFapp [info]: Total Time: 58.962 sec. [user] 15.209 sec. [elapsed]
163+
```
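
If you save the encoder output to a log, the average frame rate can be pulled out of the `stats summary` line and cross-checked against the timing line. This is a sketch that assumes the log format shown above, with `avg_fps=` followed by a space and the value; `extract_avg_fps` is a hypothetical helper name.

```bash
# extract_avg_fps [FILE...]: print the avg_fps value from the first
# "stats summary" line that contains it (reads stdin if no file is given).
extract_avg_fps() {
  awk '/stats summary:/ && /avg_fps=/ {
         for (i = 1; i < NF; i++)
           if ($i == "avg_fps=") { print $(i + 1); exit }
       }' "$@"
}

# Example using the summary line from the output above.
summary='vvenc [info]: stats summary: frame= 100/100 avg_fps= 6.6 avg_bitrate= 2490.95 kbps'
fps="$(printf '%s\n' "$summary" | extract_avg_fps)"
echo "avg_fps reported by vvenc: $fps"

# Cross-check with the timing line: 100 frames, 15.209 s elapsed,
# 58.962 s of user CPU time.
awk 'BEGIN {
  printf "frames / elapsed: %.1f fps\n", 100 / 15.209
  printf "user / elapsed:   %.1f cores busy on average\n", 58.962 / 15.209
}'
```

For the run shown above, both calculations give 6.6 fps, and the user-to-elapsed ratio of about 3.9 suggests that the four encoder threads kept the four pinned cores almost fully busy.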
