diff --git a/docs/LatencyMeasurement.md b/docs/LatencyMeasurement.md
new file mode 100644
index 000000000..fbbace298
--- /dev/null
+++ b/docs/LatencyMeasurement.md
@@ -0,0 +1,297 @@
+# End-to-End Latency Measurement — Media Communications Mesh
+
+This document describes a simple solution for measuring end-to-end latency in Media Communications Mesh.
+
+## Overview
+
+The solution is based on FFmpeg's ability to print the current timestamp on the sender side (Tx) and the receiver side (Rx), combined with Optical Character Recognition (OCR) to read the timestamps from each received video frame and calculate the delta. OCR was chosen because text can be recognized reliably even if the picture is affected by a lossy video compression algorithm somewhere in the transmission path in the Mesh. To achieve proper measurement accuracy, the Tx and Rx host machines should be synchronized using Precision Time Protocol (PTP).
+
+> Only video payload is supported.
+
+```mermaid
+flowchart LR
+    tx-file((Input
+    video file))
+    tx-ffmpeg(Tx
+    FFmpeg)
+    subgraph mesh [Media Communications Mesh]
+        direction LR
+        proxy1(Proxy1)
+        proxy2a(. . .)
+        proxy2b(. . .)
+        proxy2c(. . .)
+        proxy3(ProxyN)
+        proxy1 --> proxy2a --> proxy3
+        proxy1 --> proxy2b --> proxy3
+        proxy1 --> proxy2c --> proxy3
+    end
+    rx-ffmpeg(Rx
+    FFmpeg)
+    rx-file((Output
+    video file))
+
+    tx-file --> tx-ffmpeg --> mesh --> rx-ffmpeg --> rx-file
+```
+
+## How it works
+
+1. Tx side – The user starts FFmpeg with a special configuration to stream video via the Mesh.
+1. Rx side – The user starts FFmpeg with a special configuration to receive the video stream from the Mesh.
+1. Tx side – FFmpeg prints the current timestamp as large text at the top of each video frame and transmits the frame via the Mesh.
+1. Rx side – FFmpeg prints the current timestamp as large text at the bottom of each video frame received from the Mesh and saves the result to disk.
+1. 
After transmission is done, there is a resulting MPEG video file on the disk on the Rx side.
+1. The user runs the solution script against the MPEG file; the script recognizes the Tx and Rx timestamps in each frame and calculates the average latency from the difference between them. Additionally, the script generates a latency diagram and stores it in JPEG format on the disk.
+
+## Sample latency diagram
+
+
+
+## Important notice on latency measurement results
+
+> Please note that the calculated average latency is highly dependent on the hardware configuration and CPU background load, and cannot be treated as an absolute value. The provided solution can only be used for comparing the latency of different Mesh configurations and video streaming parameters, as well as for latency stability checks.
+
+
+## Build and install steps
+
+> It is assumed that Media Communications Mesh is installed on the Tx and Rx host machines according to the [Setup Guide](SetupGuide.md).
+
+If the [FFmpeg Plugin](FFmpegPlugin.md) was installed earlier, remove its directory before proceeding with the following steps.
+
+1. Clone the FFmpeg 7.0 repository and apply patches.
+
+   ```bash
+   ./clone-and-patch-ffmpeg.sh
+   ```
+
+1. Run the FFmpeg configuration tool with the required features enabled.
+
+   ```bash
+   ./configure-ffmpeg.sh 7.0 --enable-libfreetype --enable-libharfbuzz --enable-libfontconfig
+   ```
+
+1. Build and install FFmpeg with the Media Communications Mesh FFmpeg plugin.
+
+   ```bash
+   ./build-ffmpeg.sh
+   ```
+
+1. Install Tesseract OCR.
+   ```bash
+   apt install tesseract-ocr
+   ```
+1. Install the required Python packages.
+   ```bash
+   pip install opencv-python~=4.11.0 pytesseract~=0.3.13 matplotlib~=3.10.3
+   ```
+
+1. Set up time synchronization on the host machines.
+
+   > Make sure `network_interface_1` and `network_interface_2` are connected to the same network.
+
+   * __host-1 Controller clock__
+     ```bash
+     sudo ptp4l -i <network_interface_1> -m 2
+     sudo phc2sys -a -r -r -m
+     ```
+
+   * __host-2 Worker clock__
+     ```bash
+     sudo ptp4l -i <network_interface_2> -m 2 -s
+     sudo phc2sys -a -r
+     ```
+
+## Example – Measuring transmission latency between two FFmpeg instances on the same host
+
+This example demonstrates sending a video file from the 1st FFmpeg instance to the 2nd FFmpeg instance via Media Communications Mesh on the same host, and then calculating the transmission latency from the recorded video.
+
+
+1. Run the Mesh Agent.
+   ```bash
+   mesh-agent
+   ```
+
+1. Run Media Proxy.
+
+   ```bash
+   sudo media_proxy \
+        -d 0000:32:01.1 \
+        -i 192.168.96.11 \
+        -r 192.168.97.11 \
+        -p 9200-9299 \
+        -t 8002
+   ```
+
+1. Start the Receiver side FFmpeg instance.
+
+   ```bash
+   sudo MCM_MEDIA_PROXY_PORT=8002 ffmpeg \
+        -f mcm \
+        -conn_type multipoint-group \
+        -frame_rate 60 \
+        -video_size 1920x1080 \
+        -pixel_format yuv422p10le \
+        -i - \
+        -vf \
+        "drawtext=fontsize=40: \
+        text='Rx timestamp %{localtime\\:%H\\\\\:%M\\\\\:%S\\\\\:%3N}': \
+        x=10: y=70: fontcolor=white: box=1: boxcolor=black: boxborderw=10" \
+        -vcodec mpeg4 -qscale:v 3 recv.mp4
+   ```
+1. Start the Sender side FFmpeg instance.
+
+   ```bash
+   sudo MCM_MEDIA_PROXY_PORT=8002 ffmpeg -i <input_video_file> \
+        -vf \
+        "drawtext=fontsize=40: \
+        text='Tx timestamp %{localtime\\:%H\\\\\:%M\\\\\:%S\\\\\:%3N}': \
+        x=10: y=10: fontcolor=white: box=1: boxcolor=black: boxborderw=10" \
+        -f mcm \
+        -conn_type multipoint-group \
+        -frame_rate 60 \
+        -video_size 1920x1080 \
+        -pixel_format yuv422p10le -
+   ```
+
+   When sending a raw video file, e.g. in the YUV format, you have to explicitly specify the file format `-f rawvideo`, the pixel format `-pix_fmt`, and the video resolution `-s WxH`:
+
+   ```bash
+   ffmpeg -f rawvideo -pix_fmt yuv422p10le -s 1920x1080 -i ...
+   ```
+
+   It is also recommended to provide the read rate `-readrate` at which FFmpeg will read frames from the file:
+
+   ```bash
+   ffmpeg -f rawvideo -readrate 2.4 -pix_fmt yuv422p10le -s 1920x1080 -i ...
+   ```
+
+   The `-readrate` value is calculated from the `-frame_rate` parameter value using the following equation: $readrate = framerate \div 25$. Use the pre-calculated values from the table below.
+
+   | frame_rate | readrate      |
+   |------------|---------------|
+   | 25         | 25 / 25 = 1   |
+   | 50         | 50 / 25 = 2   |
+   | 60         | 60 / 25 = 2.4 |
+
+1. Run the script against the recorded MPEG file. The first argument is the input video file path. The second argument is the optional path of the latency diagram JPEG file to be generated.
+
+   ```bash
+   python text-detection.py recv.mp4 recv-latency.jpg
+   ```
+
+   Console output:
+   ```bash
+   ...
+   Processing Frame: 235
+   Processing Frame: 236
+   Processing Frame: 237
+   Processing Frame: 238
+   Processing Frame: 239
+   Processing Frame: 240
+   Saving the latency chart to: recv-latency.jpg
+   File: recv.mp4 | Last modified: 2025-06-02 13:49:54 UTC
+   Resolution: 640x360 | FPS: 25.00
+   Average End-to-End Latency: 564.61 ms
+   ```
+
+   See the [Sample latency diagram](#sample-latency-diagram).
+
+
+## Example – Measuring transmission latency between two FFmpeg instances on different hosts
+
+This example demonstrates sending a video file from the 1st FFmpeg instance on one host to the 2nd FFmpeg instance on another host via Media Communications Mesh, and then calculating the transmission latency from the recorded video.
+
+1. Run the Mesh Agent.
+   ```bash
+   mesh-agent
+   ```
+
+1. Start Media Proxy on the Receiver host machine.
+
+   ```bash
+   sudo media_proxy \
+        -d 0000:32:01.1 \
+        -i 192.168.96.11 \
+        -r 192.168.97.11 \
+        -p 9200-9299 \
+        -t 8002
+   ```
+
+1. 
Start the Receiver side FFmpeg instance.
+
+   ```bash
+   sudo MCM_MEDIA_PROXY_PORT=8002 ffmpeg \
+        -f mcm \
+        -conn_type st2110 \
+        -transport st2110-20 \
+        -ip_addr 192.168.96.10 \
+        -port 9001 \
+        -frame_rate 60 \
+        -video_size 1920x1080 \
+        -pixel_format yuv422p10le \
+        -i - \
+        -vf \
+        "drawtext=fontsize=40: \
+        text='Rx timestamp %{localtime\\:%H\\\\\:%M\\\\\:%S\\\\\:%3N}': \
+        x=10: y=70: fontcolor=white: box=1: boxcolor=black: boxborderw=10" \
+        -vcodec mpeg4 -qscale:v 3 recv.mp4
+   ```
+
+1. Start Media Proxy on the Sender host machine.
+
+   ```bash
+   sudo media_proxy \
+        -d 0000:32:01.0 \
+        -i 192.168.96.10 \
+        -r 192.168.97.10 \
+        -p 9100-9199 \
+        -t 8001
+   ```
+
+1. Start the Sender side FFmpeg instance.
+
+   ```bash
+   sudo MCM_MEDIA_PROXY_PORT=8001 ffmpeg -i <input_video_file> \
+        -vf \
+        "drawtext=fontsize=40: \
+        text='Tx timestamp %{localtime\\:%H\\\\\:%M\\\\\:%S\\\\\:%3N}': \
+        x=10: y=10: fontcolor=white: box=1: boxcolor=black: boxborderw=10" \
+        -f mcm \
+        -conn_type st2110 \
+        -transport st2110-20 \
+        -ip_addr 192.168.96.11 \
+        -port 9001 \
+        -frame_rate 60 \
+        -video_size 1920x1080 \
+        -pixel_format yuv422p10le -
+   ```
+
+1. Run the script against the recorded MPEG file. The first argument is the input video file path. The second argument is the optional path of the latency diagram JPEG file to be generated.
+
+   ```bash
+   python text-detection.py recv.mp4 recv-latency.jpg
+   ```
+
+   Console output:
+   ```bash
+   ...
+   Processing Frame: 235
+   Processing Frame: 236
+   Processing Frame: 237
+   Processing Frame: 238
+   Processing Frame: 239
+   Processing Frame: 240
+   Saving the latency chart to: recv-latency.jpg
+   File: recv.mp4 | Last modified: 2025-06-02 13:49:54 UTC
+   Resolution: 640x360 | FPS: 25.00
+   Average End-to-End Latency: 564.61 ms
+   ```
+
+   See the [Sample latency diagram](#sample-latency-diagram).
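Under the hood, the per-frame latency is just the difference between the Rx and Rx-recognized Tx timestamps burned into the picture. A minimal sketch of that calculation, assuming the `HH:MM:SS:mmm` format produced by the `drawtext` filters above (the helper name `latency_ms` is illustrative, not part of the solution script):

```python
from datetime import datetime

def latency_ms(tx: str, rx: str) -> float:
    """Difference between two 'HH:MM:SS:mmm' timestamps, in milliseconds."""
    fmt = "%H:%M:%S:%f"  # strptime's %f accepts the 3-digit milliseconds field
    delta = datetime.strptime(rx, fmt) - datetime.strptime(tx, fmt)
    return delta.total_seconds() * 1000.0

print(f"{latency_ms('13:49:54:100', '13:49:54:665'):.2f} ms")
```

Note that the calculation assumes both timestamps fall on the same day; a transmission that crosses midnight would yield a negative delta, which the solution script treats as an error.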
+
+## Customization
+
+If you change the parameters of the `drawtext` filter in the FFmpeg commands, especially `fontsize`, `x`, `y`, or `text`, you have to adjust the Python script __text-detection.py__ accordingly. Please refer to the function `extract_text_from_region(image, x, y, font_size, length)`.
+
+
+
+[license-img]: https://img.shields.io/badge/License-BSD_3--Clause-blue.svg
+[license]: https://opensource.org/license/bsd-3-clause
diff --git a/docs/_static/ffmpeg-based-latency-solution-diagram.jpg b/docs/_static/ffmpeg-based-latency-solution-diagram.jpg
new file mode 100644
index 000000000..ccee5a89d
Binary files /dev/null and b/docs/_static/ffmpeg-based-latency-solution-diagram.jpg differ
diff --git a/scripts/text-detection.py b/scripts/text-detection.py
new file mode 100644
index 000000000..e8460a402
--- /dev/null
+++ b/scripts/text-detection.py
@@ -0,0 +1,165 @@
+import sys
+import pytesseract
+import cv2 as cv
+import numpy as np
+from datetime import datetime
+import re
+import matplotlib.pyplot as plt
+from concurrent.futures import ThreadPoolExecutor
+import os
+
+def is_display_attached():
+    # Check if the DISPLAY environment variable is set
+    return 'DISPLAY' in os.environ
+
+def extract_text_from_region(image, x, y, font_size, length):
+    """
+    Extracts text from a specific region of the image.
+    :param image: The image to extract text from.
+    :param x: The x-coordinate of the top-left corner of the region.
+    :param y: The y-coordinate of the top-left corner of the region.
+    :param font_size: The font size of the text.
+    :param length: The length of the text to extract.
+    :return: The extracted text.
+    """
+    margin = 5
+    y_adjusted = max(0, y - margin)
+    x_adjusted = max(0, x - margin)
+    height = y + font_size + margin
+    width = x + length + margin
+    # Define the region of interest (ROI) for text extraction
+    roi = image[y_adjusted:height, x_adjusted:width]
+
+    # Use Tesseract to extract text from the ROI
+    return pytesseract.image_to_string(roi, lang='eng')
+
+def process_frame(frame_idx, frame):
+    print("Processing Frame: ", frame_idx)
+
+    timestamp_format = "%H:%M:%S:%f"
+    timestamp_pattern = r'\b\d{2}:\d{2}:\d{2}:\d{3}\b'
+
+    # Convert frame to grayscale for better OCR performance
+    frame = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)
+
+    line_1 = extract_text_from_region(frame, 10, 10, 50, 700)
+    line_2 = extract_text_from_region(frame, 10, 70, 50, 700)
+
+    # Find the timestamps (type: string) in the extracted text using regex
+    tx_time = re.search(timestamp_pattern, line_1)
+    rx_time = re.search(timestamp_pattern, line_2)
+
+    if tx_time is None or rx_time is None:
+        print("Error: Timestamp not found in the expected format.")
+        return 0
+
+    # Convert the timestamps (type: string) to time (type: datetime)
+    tx_time = datetime.strptime(tx_time.group(), timestamp_format)
+    rx_time = datetime.strptime(rx_time.group(), timestamp_format)
+
+    if tx_time > rx_time:
+        print("Error: Transmit time is greater than receive time.")
+        return 0
+
+    time_difference = rx_time - tx_time
+    time_difference_ms = time_difference.total_seconds() * 1000
+    return time_difference_ms
+
+def main():
+    if len(sys.argv) < 2:
+        print("Usage: python text-detection.py <input_video_file> [latency_diagram.jpg]")
+        sys.exit(1)
+
+    input_video_file = sys.argv[1]
+    cap = cv.VideoCapture(input_video_file)
+    if not cap.isOpened():
+        print("Fatal: Could not open video file.")
+        sys.exit(1)
+
+    frame_idx = 0
+    time_differences = []
+
+    with ThreadPoolExecutor(max_workers=40) as executor:
+        futures = []
+        while True:
+            ret, frame = 
cap.read()
+            if not ret:
+                break
+
+            futures.append(executor.submit(process_frame, frame_idx, frame))
+            frame_idx += 1
+
+        for future in futures:
+            time_differences.append(future.result())
+
+    # Filter out zero values (frames with unrecognized timestamps)
+    non_zero_time_differences = [td for td in time_differences if td != 0]
+
+    # Calculate the average latency excluding zero values
+    if non_zero_time_differences:
+        average_latency = np.mean(non_zero_time_differences)
+
+        # Filter out anomaly peaks that differ more than 25% from the average
+        filtered_time_differences = [
+            td for td in non_zero_time_differences if abs(td - average_latency) <= 0.25 * average_latency
+        ]
+
+        # Calculate the average latency using the filtered data
+        filtered_average_latency = np.mean(filtered_time_differences)
+    else:
+        print("Fatal: No timestamps recognized in the video. No data for calculating latency.")
+        sys.exit(1)
+
+    # Plot the non-zero data
+    plt.plot(non_zero_time_differences, marker='o')
+    plt.title('End-to-End Latency — Media Communications Mesh')
+    plt.xlabel('Frame Index')
+    plt.ylabel('Latency, ms')
+    plt.grid(True)
+
+    # Adjust the layout to create more space for the text
+    plt.subplots_adjust(bottom=0.5)
+
+    # Prepare text for display and stdout
+    average_latency_text = f'Average End-to-End Latency: {filtered_average_latency:.2f} ms'
+    file_name = os.path.basename(input_video_file)
+    # Use UTC to match the 'UTC' label printed below
+    file_mod_time = datetime.utcfromtimestamp(os.path.getmtime(input_video_file)).strftime('%Y-%m-%d %H:%M:%S')
+    file_info_text = f'File: {file_name} | Last modified: {file_mod_time} UTC'
+    width = int(cap.get(cv.CAP_PROP_FRAME_WIDTH))
+    height = int(cap.get(cv.CAP_PROP_FRAME_HEIGHT))
+    fps = cap.get(cv.CAP_PROP_FPS)
+    video_properties_text = f'Resolution: {width}x{height} | FPS: {fps:.2f}'
+
+    cap.release()
+
+    # Display text on the plot
+    plt.text(0.5, -0.55, average_latency_text,
+             horizontalalignment='center', verticalalignment='center',
+             transform=plt.gca().transAxes)
+    
plt.text(0.5, -0.85, file_info_text,
+             horizontalalignment='center', verticalalignment='center',
+             transform=plt.gca().transAxes)
+    plt.text(0.5, -1, video_properties_text,
+             horizontalalignment='center', verticalalignment='center',
+             transform=plt.gca().transAxes)
+
+    # Save the chart before plt.show(), which may clear the current figure
+    if len(sys.argv) == 3:
+        filename = sys.argv[2]
+        if not filename.endswith('.jpg'):
+            filename += '.jpg'
+        print("Saving the latency chart to: ", filename)
+        plt.savefig(filename, format='jpg', dpi=300)
+
+    if is_display_attached():
+        plt.show()
+
+    # Print text to stdout
+    print(file_info_text)
+    print(video_properties_text)
+    print(average_latency_text)
+
+if __name__ == "__main__":
+    main()
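The averaging logic in `main()` above first drops zero entries (frames where OCR failed) and then rejects samples deviating more than 25% from the raw mean. That step can be exercised in isolation; the following sketch (the `filtered_average` name is illustrative) mirrors the same logic:

```python
import numpy as np

def filtered_average(latencies_ms):
    """Average latency, ignoring zeros (OCR misses) and samples deviating >25% from the raw mean."""
    non_zero = [t for t in latencies_ms if t != 0]
    if not non_zero:
        return None  # no recognized timestamps at all
    avg = np.mean(non_zero)
    kept = [t for t in non_zero if abs(t - avg) <= 0.25 * avg]
    return float(np.mean(kept))

# A single 1200 ms spike and two OCR misses (0) are excluded from the average
print(filtered_average([0, 560.0, 570.0, 565.0, 1200.0, 0]))
```

The 25% threshold matches the script's anomaly filter; widen or narrow it if your latency distribution is noisier.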