Configuration Considerations

Streaming Over UDP and TCP

The AMD Streaming Transport can work over either UDP or TCP. While both UDP and TCP run over the IP protocol, there are a number of important differences between them that have practical implications for streaming:

  • TCP is a stream-oriented protocol. While TCP guarantees that the transmitted data is delivered in the same order as it was sent, this reliability comes at a cost: data packets sent over a TCP connection must be acknowledged by the recipient, and lost packets are retransmitted before later data is delivered. This acknowledgement and retransmission overhead causes higher transmission latency.
  • Conversely, UDP is a datagram-oriented protocol, which guarantees neither delivery of all the data sent nor that data packets will be received in the order in which they were sent. Because no acknowledgements are required, data packets can be sent immediately rather than only after the acknowledgement of the previous packet has been received, which allows for lower transmission latency. The AMD Streaming Transport splits larger messages into a series of fragments that fit within a UDP datagram before sending them and rearranges the received fragments to restore the original message, regardless of the order in which the fragments arrive (see the sketch after this list). This is advantageous on slower but reliable networks, as it makes it possible to achieve a lower overall latency. However, it can become a liability when streaming over a network with high packet loss. The AMD Streaming Transport protocol uses negative acknowledgements to request retransmission of packets that have not arrived when expected; however, this mechanism is not bullet-proof, as the retransmission requests can also be lost.
  • UDP traffic is often subject to more aggressive quality-of-service optimizations by routers: smaller datagrams are prioritized over larger ones, which are more likely to be dropped. The AMD Streaming Transport constantly monitors the traffic, collects extensive statistics and adjusts the datagram size accordingly. While this mechanism greatly improves transmission reliability, it cannot achieve zero packet loss, as packets can still be dropped by routers along the way. Packet loss causes visible corruption of the video and audible distortion of the audio. Streaming SDK utilizes various techniques to speed up recovery after packet loss; however, none of these techniques are 100% reliable.
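
The following is a minimal sketch of how splitting a message into datagram-sized fragments and restoring it from out-of-order fragments can work. It is not the AMD Streaming Transport implementation; the FragmentHeader layout, the Fragment() function and the Reassembler class are illustrative names chosen for this example, and a real transport would additionally track missing fragments and issue negative acknowledgements.

    // Illustrative fragmentation/reassembly of a message sent over UDP.
    // NOT the AMD Streaming Transport implementation; names and sizes are
    // chosen for illustration only.
    #include <algorithm>
    #include <cstdint>
    #include <cstring>
    #include <map>
    #include <vector>

    struct FragmentHeader {            // hypothetical wire header
        uint32_t messageId;            // identifies the original message
        uint16_t fragmentIndex;        // position of this fragment
        uint16_t fragmentCount;        // total number of fragments in the message
    };

    // Split a message into fragments small enough to fit into one datagram.
    std::vector<std::vector<uint8_t>> Fragment(uint32_t messageId,
                                               const std::vector<uint8_t>& message,
                                               size_t maxPayload)
    {
        std::vector<std::vector<uint8_t>> datagrams;
        const uint16_t count =
            static_cast<uint16_t>((message.size() + maxPayload - 1) / maxPayload);
        for (uint16_t i = 0; i < count; ++i) {
            const size_t offset = static_cast<size_t>(i) * maxPayload;
            const size_t length = std::min(maxPayload, message.size() - offset);
            FragmentHeader hdr{messageId, i, count};
            std::vector<uint8_t> datagram(sizeof(hdr) + length);
            std::memcpy(datagram.data(), &hdr, sizeof(hdr));
            std::memcpy(datagram.data() + sizeof(hdr), message.data() + offset, length);
            datagrams.push_back(std::move(datagram));
        }
        return datagrams;
    }

    // Reassembler that tolerates out-of-order arrival of fragments.
    class Reassembler {
    public:
        // Returns true when the full message has been restored into 'message'.
        bool AddFragment(const std::vector<uint8_t>& datagram, std::vector<uint8_t>& message)
        {
            FragmentHeader hdr;
            std::memcpy(&hdr, datagram.data(), sizeof(hdr));
            auto& pending = m_pending[hdr.messageId];
            pending[hdr.fragmentIndex] =
                std::vector<uint8_t>(datagram.begin() + sizeof(hdr), datagram.end());
            if (pending.size() < hdr.fragmentCount) {
                return false;                      // still waiting for more fragments
            }
            message.clear();
            for (auto& entry : pending) {          // std::map keeps fragments ordered by index
                message.insert(message.end(), entry.second.begin(), entry.second.end());
            }
            m_pending.erase(hdr.messageId);
            return true;
        }
    private:
        // messageId -> (fragmentIndex -> payload)
        std::map<uint32_t, std::map<uint16_t, std::vector<uint8_t>>> m_pending;
    };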

Choose UDP when streaming over a generally reliable network, such as Ethernet, newer versions of WiFi (WiFi-5/6, aka 802.11ac and 802.11ax) or fiber, in use cases where low latency is important, such as game streaming. Choose TCP when streaming over less reliable networks, such as DOCSIS (cable) or 2.4GHz WiFi (802.11n), or when slightly higher latency is acceptable, as in VDI use cases. Reliability of cellular connections (LTE and 5G) can vary greatly depending on the location, the distance to the tower and the overall network load. It is therefore best to offer both TCP and UDP on the server in use cases where network reliability is unknown or can vary over time.

When streaming over UDP, keep in mind the following:

  • Smaller UDP datagrams introduce more overhead, but often offer better reliability. Try adjusting the datagram size anywhere between 508 and 65507 bytes to balance speed and latency against reliability (a small helper illustrating this range is sketched after this list).
  • Avoid mesh networks where low latency is important. If a mesh network cannot be avoided, connect its nodes with Ethernet cables rather than relying on wireless links between them.
  • Whenever possible, choose a 5GHz access point over 2.4GHz.
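
As a small illustration of the tuning range mentioned above, the helper below clamps a requested datagram size to the 508–65507 byte interval. The function and constant names are hypothetical and not part of the Streaming SDK API.

    // Keep a requested UDP datagram size within the range recommended above.
    // Illustrative helper only; not a Streaming SDK API.
    #include <algorithm>
    #include <cstddef>

    constexpr size_t kMinDatagramSize = 508;    // lower bound of the recommended tuning range
    constexpr size_t kMaxDatagramSize = 65507;  // largest payload of a single UDP datagram

    size_t ClampDatagramSize(size_t requested)
    {
        return std::clamp(requested, kMinDatagramSize, kMaxDatagramSize);
    }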

Selecting Optimal Video Bitrate

Streaming SDK pipelines configure the video encoders' rate control to Variable Bitrate (VBR). This means that the encoder will produce a video stream with the specified average bitrate. The actual peak bitrate at any given moment may significantly exceed the specified target bitrate. To achieve reliable transmission, the available throughput of the network must be at least double the sum of the target video and audio bitrates.
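
The snippet below is a back-of-the-envelope check of this rule: plan for at least twice the combined target video and audio bitrate. The bitrate values are arbitrary example inputs, not recommended settings.

    // Required throughput = 2 x (target video bitrate + target audio bitrate).
    #include <cstdio>

    int main()
    {
        const double videoBitrateMbps = 20.0;  // example target video bitrate
        const double audioBitrateMbps = 0.256; // example target audio bitrate
        const double requiredThroughputMbps = 2.0 * (videoBitrateMbps + audioBitrateMbps);
        std::printf("Plan for at least %.1f Mbps of available throughput\n",
                    requiredThroughputMbps);   // prints 40.5 Mbps for these inputs
    }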

Streaming SDK offers a Quality of Service (QoS) module, which adjusts video bitrate dynamically depending on the available bandwidth of the communication channel. QoS gradually lowers video bitrate when network quality deteriorates and gradually increases it up to the set maximum when conditions improve. Keep in mind, however, that QoS is designed to adapt the video stream to temporary changes in network quality rather than to act as an automatic network profiler. You can experiment with QoS configuration parameters, such as the bitrate adjustment step and the averaging interval, to better suit your needs.
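
To make the behaviour concrete, here is a conceptual sketch of the kind of adjustment a QoS module performs: the bitrate is stepped down when the averaged network quality deteriorates and stepped back up toward the configured maximum when it recovers. This is not the Streaming SDK QoS implementation; the class, parameters and the 1% loss threshold are assumptions made for illustration.

    // Conceptual QoS-style bitrate adjustment; illustrative only.
    #include <algorithm>
    #include <cstdint>

    class SimpleQos {
    public:
        SimpleQos(int64_t minBitrate, int64_t maxBitrate, int64_t step)
            : m_min(minBitrate), m_max(maxBitrate), m_step(step), m_current(maxBitrate) {}

        // 'averagedLossRatio' is assumed to be packet loss averaged over the
        // configured averaging interval (0.0 = perfect, 1.0 = everything lost).
        int64_t Update(double averagedLossRatio)
        {
            if (averagedLossRatio > 0.01) {
                m_current = std::max(m_min, m_current - m_step);   // back off gradually
            } else {
                m_current = std::min(m_max, m_current + m_step);   // recover gradually, up to the maximum
            }
            return m_current;   // feed this back into the encoder's target bitrate
        }

    private:
        int64_t m_min, m_max, m_step, m_current;
    };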

Selecting Optimal Video Resolution

Typically, the images comprising the transmitted video stream are captured or rendered in the RGB color space, while most video codecs operate in the YUV color space; therefore, color space conversion from RGB to YUV is required prior to encoding a video frame. Video encoders in AMD GPUs can perform color space conversion from certain pixel formats using a dedicated hardware accelerator. This conversion is optimal since it is performed by a dedicated hardware block and therefore does not interfere with other applications using the GPU at the same time. In game streaming scenarios this can be a critical advantage: games tend to be graphically intensive and might not otherwise leave enough room for the additional tasks needed to facilitate streaming, causing the streaming solution to interfere with the content being streamed.

It is, however, important to remember that the color space converter embedded in the video encoder cannot perform any kind of scaling. Therefore, it can only be enabled when the resolution of the content being streamed matches the resolution of the encoded stream. In all other cases color space conversion is performed by AMF's AMFVideoConverter component. It is always recommended to match the content resolution to the stream resolution, or vice versa, to reduce any potential interference between the streaming server and the application being streamed.
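
The decision described above can be summarized as follows. This is a simplified illustration, not Streaming SDK code; the structures and function names are hypothetical.

    // Choose between the encoder's built-in color space converter and
    // AMFVideoConverter, following the rule described above. Illustrative only.
    struct VideoFormat {
        int width;
        int height;
        bool encoderAcceptsInput;   // pixel format / transfer characteristic compatible with encoder input
    };

    enum class ConversionPath {
        EncoderBuiltInCsc,          // dedicated hardware block inside the video encoder
        AmfVideoConverter           // shader-based scaling and color space conversion
    };

    ConversionPath SelectConversionPath(const VideoFormat& content, const VideoFormat& stream)
    {
        const bool sameResolution =
            content.width == stream.width && content.height == stream.height;
        if (sameResolution && content.encoderAcceptsInput) {
            return ConversionPath::EncoderBuiltInCsc;   // no scaling required
        }
        return ConversionPath::AmfVideoConverter;       // scaling and/or conversion via shader
    }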

On Windows, when DirectX 12 is used for display capture, scaling, color space conversion and encoding, AMD GPUs can run the AMFVideoConverter component's tasks on a separate Compute queue, thus minimizing interference with the content being streamed.

Copy On Capture

When capturing the display image with AMD Direct Capture, the surface returned is associated with the actual frame buffer in the display's swap chain. In a typical server video pipeline this surface is then passed as input to the next component of the pipeline. The display capture component must not write to this surface while the next component in the pipeline is reading from it, to avoid data corruption. In most cases this can be enforced through various synchronization mechanisms in the graphics driver; however, there is a limitation which prevents proper synchronization between display capture and the video encoder on DX12, resulting in visible image corruption. The captured surface is passed directly to the video encoder only when the resolution of the display being captured matches the resolution of the encoded video stream and the pixel format and the transfer characteristic are compatible with the video encoder input. In all other cases the server video pipeline includes the AMFVideoConverter component, which performs scaling and color space conversion using a shader running on the 3D or Compute queue, which do not suffer from the aforementioned limitation. The output of the AMFVideoConverter component is a new surface containing the scaled/converted video frame, which is then passed to the video encoder as input, so the synchronization limitation described above is avoided.

To work around this limitation, when the AMFVideoConverter component is not required, the Direct Display Capture component creates a copy of the captured frame buffer, and it is this copy that is passed to the video encoder component. This results in a small additional overhead, typically adding less than 1ms to the overall round-trip latency, but avoids visible artifacts in the captured video stream. The need for this copy must be re-evaluated every time the display resolution, pixel format or color characteristic changes. The logic that configures the Display Capture component is implemented in the AVStreamer::CaptureVideo() method in the samples/RemoteDesktopServer/AVStreamer.cpp file. Refer to the in-line comments for more details.
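
A simplified sketch of the copy-on-capture decision is shown below. The actual logic lives in AVStreamer::CaptureVideo() in samples/RemoteDesktopServer/AVStreamer.cpp; the types and function here are illustrative only and do not mirror the sample code.

    // Decide whether the captured frame buffer must be copied before it is
    // handed to the video encoder. Illustrative only.
    struct CaptureConfig {
        bool converterRequired;   // AMFVideoConverter present (scaling/CSC needed)
        bool forceCaptureCopy;    // e.g. set via the -CaptureCopy command line switch
    };

    bool NeedCaptureCopy(const CaptureConfig& cfg)
    {
        if (cfg.forceCaptureCopy) {
            return true;    // explicit override for debugging
        }
        // When the converter is in the pipeline, its output is already a new
        // surface, so no extra copy is needed; otherwise copy the frame buffer
        // to avoid the DX12 capture/encoder synchronization limitation.
        return !cfg.converterRequired;
    }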

For debugging purposes, you can force this copy to always be made by adding -CaptureCopy true to the RemoteDesktopServer.exe command line.
