|
1 | 1 | ---
|
2 |
| -title: Call flows in Azure Communication Services |
3 |
| -titleSuffix: An Azure Communication Services concept document |
4 |
| -description: Learn about call flows in Azure Communication Services. |
| 2 | +title: Call networking internals |
| 3 | +titleSuffix: An Azure Communication Services article |
| 4 | +description: This article describes call flows for different Azure Communication Services call types. |
5 | 5 | author: tophpalmer
|
6 | 6 | manager: chpalm
|
7 | 7 | services: azure-communication-services
|
8 | 8 | ms.author: chpalm
|
9 |
| -ms.date: 06/30/2021 |
| 9 | +ms.date: 06/20/2025 |
10 | 10 | ms.topic: conceptual
|
11 | 11 | ms.service: azure-communication-services
|
12 | 12 | ms.subservice: calling
|
13 | 13 | ---
|
14 |
| -# Call flow basics |
15 | 14 |
|
16 |
| -The section below gives an overview of the call flows in Azure Communication Services. Signaling and media flows depend on the types of calls your users are making. Examples of call types include one-to-one VoIP, one-to-one PSTN, and group calls containing a combination of VoIP and PSTN-connected participants. Review [Call types](./voice-video-calling/about-call-types.md). |
| 15 | +# Call networking internals |
17 | 16 |
|
18 |
| -## About signaling and media protocols |
| 17 | +This article describes the call flows in Azure Communication Services. Signaling and media flows depend on the types of calls your users are making. Examples of call types include one-to-one VoIP, one-to-one public switched telephone network (PSTN), and group calls containing a combination of VoIP and PSTN-connected participants. For more information, see [Call types](./voice-video-calling/about-call-types.md). |
19 | 18 |
|
20 |
| -When you establish a peer-to-peer or group call, two protocols are used behind the scenes - HTTPS (REST) for signaling and SRTP for media. |
| 19 | +## Signaling and media protocols |
21 | 20 |
|
22 |
| -Signaling between the SDKs or between SDKs and Communication Services Signaling Controllers is handled with HTTPS REST (TLS). Azure Communication Services uses TLS 1.2. For Real-Time Media Traffic (RTP), the User Datagram Protocol (UDP) is preferred. If the use of UDP is prevented by your firewall, the SDK will use the Transmission Control Protocol (TCP) for media. |
| 21 | +When you establish a peer-to-peer or group call, two protocols are used behind the scenes: HTTPS (REST) for signaling and Secure Real-time Transport Protocol (SRTP) for media. |
| 22 | + |
| 23 | +Signaling between the SDKs or between SDKs and Communication Services signaling controllers is handled with HTTPS REST (TLS). Azure Communication Services uses TLS 1.2. For real-time media traffic (RTP), user datagram protocol (UDP) is preferred. If your firewall blocks UDP, the SDK uses the transmission control protocol (TCP) for media. |
23 | 24 |
|
24 | 25 | Let's review the signaling and media protocols in various scenarios.
|
25 | 26 |
|
26 | 27 | ## Call flow cases
|
27 | 28 |
|
28 |
| -### Case 1: VoIP where a direct connection between two devices is possible |
| 29 | +### Case 1: VoIP with a direct connection between two devices |
29 | 30 |
|
30 |
| -In one-to-one VoIP or video calls, traffic prefers the most direct path. "Direct path" means that if two SDKs can reach each other directly, they'll establish a direct connection. This is usually possible when two SDKs are in the same subnet (for example, in a subnet 192.168.1.0/24) or two when the devices each live in subnets that can see each other (SDKs in subnet 10.10.0.0/16 and 192.168.1.0/24 can reach out each other). |
| 31 | +In one-to-one VoIP or video calls, traffic prefers the most direct path. *Direct path* means that if two SDKs can reach each other directly, they establish a direct connection. A direct path is usually possible when two SDKs are in the same subnet (such as 192.168.1.0/24) or when the two devices are in subnets that can see each other (for example, SDKs in subnets 10.10.0.0/16 and 192.168.1.0/24 that can reach each other). |
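
To make the same-subnet condition concrete, the following is a minimal sketch that uses Python's standard `ipaddress` module. The helper name `same_subnet` and the sample addresses are hypothetical and aren't part of any Azure Communication Services SDK. The check covers only subnet membership. Whether two different subnets can reach each other depends on routing and firewall rules that this sketch doesn't model.

```python
import ipaddress


def same_subnet(addr_a: str, addr_b: str, subnet: str) -> bool:
    """Return True if both addresses belong to the given subnet.

    Hypothetical helper for illustration only; the Calling SDK
    performs its own connectivity checks internally.
    """
    network = ipaddress.ip_network(subnet)
    return (
        ipaddress.ip_address(addr_a) in network
        and ipaddress.ip_address(addr_b) in network
    )


# Both devices sit in 192.168.1.0/24, so a direct path is plausible.
print(same_subnet("192.168.1.10", "192.168.1.20", "192.168.1.0/24"))

# The second device sits in 10.10.0.0/16; reachability then depends on routing.
print(same_subnet("192.168.1.10", "10.10.3.4", "192.168.1.0/24"))
```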
31 | 32 |
|
32 | 33 | :::image type="content" source="./media/call-flows/about-voice-case-1.png" alt-text="Diagram showing a Direct VOIP call between users and Communication Services.":::
|
33 | 34 |
|
34 |
| -### Case 2: VoIP where a direct connection between devices is not possible, but where connection between NAT devices is possible |
| 35 | +### Case 2: VoIP in which a direct connection between devices isn't possible, but a connection between NAT devices is possible |
35 | 36 |
|
36 |
| -If two devices are located in subnets that can't reach each other (for example, Alice works from a coffee shop and Bob works from his home office) but the connection between the NAT devices is possible, the client side SDKs will establish connectivity via NAT devices. |
| 37 | +If two devices are located in subnets that can't reach each other but the connection between the network address translation (NAT) devices is possible, the client-side SDKs establish connectivity via the NAT devices. For example, Alice works from a coffee shop and Bob works from a home office. |
37 | 38 |
|
38 |
| -For Alice it will be the NAT of the coffee shop and for Bob it will be the NAT of the home office. Alice's device will send the external address of her NAT and Bob's will do the same. The SDKs learn the external addresses from a STUN (Session Traversal Utilities for NAT) service that Azure Communication Services provides free of charge. The logic that handles the handshake between Alice and Bob is embedded within the Azure Communication Services provided SDKs. (You don't need any additional configuration) |
| 39 | +For Alice, the NAT device is the coffee shop's NAT, and for Bob it's the NAT of the home office. Alice's device sends the external address of her NAT and Bob's device does the same. The SDKs learn the external addresses from a session traversal utilities for NAT (STUN) service that Azure Communication Services provides free of charge. The logic that handles the handshake between Alice and Bob is embedded in the SDKs that Azure Communication Services provides. You don't need any additional configuration. |
39 | 40 |
|
40 |
| -:::image type="content" source="./media/call-flows/about-voice-case-2.png" alt-text="Diagram showing a VOIP call which utilizes a STUN connection."::: |
| 41 | +:::image type="content" source="./media/call-flows/about-voice-case-2.png" alt-text="Diagram showing a VOIP call, using a session traversal utilities for NAT (STUN) connection."::: |
41 | 42 |
|
42 |
| -### Case 3: VoIP where neither a direct nor NAT connection is possible |
| 43 | +### Case 3: VoIP in which neither a direct nor a NAT connection is possible |
43 | 44 |
|
44 |
| -If one or both client devices are behind a symmetric NAT, a separate cloud service to relay the media between the two SDKs is required. This service is called TURN (Traversal Using Relays around NAT) and is also provided by the Communication Services. The Communication Services Calling SDK automatically uses TURN services based on detected network conditions. TURN charges are included in the price of the call. |
| 45 | +If one or both client devices are behind a symmetric NAT, a separate cloud service is required to relay the media between the two SDKs. This service is called traversal using relays around NAT (TURN) and is also provided by Azure Communication Services. The Communication Services Calling SDK automatically uses TURN services based on detected network conditions. TURN charges are included in the price of the call. |
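
Cases 1 through 3 amount to a preference order: direct path first, then NAT-to-NAT connectivity by using addresses learned from STUN, then a TURN relay. The following minimal sketch expresses that order in Python. The names `MediaPath` and `choose_media_path` and the two Boolean inputs are hypothetical and exist only for illustration. The Calling SDK makes this decision automatically based on detected network conditions, and its actual logic isn't described here.

```python
from enum import Enum


class MediaPath(Enum):
    """Illustrative labels for the media paths in cases 1 through 3."""
    DIRECT = "direct"
    NAT_VIA_STUN = "nat-via-stun"
    TURN_RELAY = "turn-relay"


def choose_media_path(direct_reachable: bool, nat_to_nat_possible: bool) -> MediaPath:
    """Pick the most direct path available (hypothetical sketch).

    direct_reachable: the two SDKs can reach each other directly (case 1).
    nat_to_nat_possible: the NAT devices can connect to each other (case 2).
    If neither holds, for example behind a symmetric NAT, media is relayed (case 3).
    """
    if direct_reachable:
        return MediaPath.DIRECT
    if nat_to_nat_possible:
        return MediaPath.NAT_VIA_STUN
    return MediaPath.TURN_RELAY


# Alice in a coffee shop and Bob in a home office, with NATs that can connect.
print(choose_media_path(direct_reachable=False, nat_to_nat_possible=True).value)
```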
45 | 46 |
|
46 |
| -:::image type="content" source="./media/call-flows/about-voice-case-3.png" alt-text="Diagram showing a VOIP call which utilizes a TURN connection."::: |
| 47 | +:::image type="content" source="./media/call-flows/about-voice-case-3.png" alt-text="Diagram showing a VOIP call over a traversal using relays around NAT (TURN) connection."::: |
47 | 48 |
|
48 | 49 | ### Case 4: Group calls with PSTN
|
49 | 50 |
|
50 | 51 | Both signaling and media for PSTN Calls use the Azure Communication Services telephony resource. This resource is interconnected with other carriers.
|
51 | 52 |
|
52 |
| -PSTN media traffic flows through a component called Media Processor. |
| 53 | +PSTN media traffic flows through a media processor component. |
53 | 54 |
|
54 | 55 | :::image type="content" source="./media/call-flows/about-voice-pstn.png" alt-text="Diagram showing a PSTN Group Call with Communication Services.":::
|
55 | 56 |
|
56 | 57 | > [!NOTE]
|
57 |
| -> For those familiar with media processing, our Media Processor is also a Back to Back User Agent, as defined in [RFC 3261 SIP: Session Initiation Protocol](https://tools.ietf.org/html/rfc3261), meaning it can translate codecs when handling calls between Microsoft and Carrier networks. The Azure Communication Services Signaling Controller is Microsoft's implementation of an SIP Proxy per the same RFC. |
| 58 | +> The media processor is also a back-to-back user agent, as defined in [RFC 3261 SIP: Session Initiation Protocol](https://tools.ietf.org/html/rfc3261), meaning it can translate codecs when handling calls between Microsoft and carrier networks. The Azure Communication Services signaling controller is Microsoft's implementation of a SIP proxy per the same RFC. |
58 | 59 |
|
59 |
| -For group calls, media and signaling always flow via the Azure Communication Services backend. The audio and/or video from all participants is mixed in the Media Processor component. All members of a group call send their audio and/or video streams to the media processor, which returns mixed media streams. |
| 60 | +For group calls, media and signaling always flow via the Azure Communication Services backend. The audio and/or video from all participants is mixed in the media processor. All members of a group call send their audio and video streams to the media processor, which returns mixed media streams. |
60 | 61 |
|
61 |
| -The default real-time protocol (RTP) for group calls is User Datagram Protocol (UDP). |
| 62 | +By default, real-time transport protocol (RTP) media for group calls uses user datagram protocol (UDP). |
62 | 63 |
|
63 | 64 | > [!NOTE]
|
64 |
| -> The Media Processor can act as a Multipoint Control Unit (MCU) or Selective Forwarding Unit (SFU) |
| 65 | +> The media processor can act as a multipoint control unit (MCU) or selective forwarding unit (SFU). |
65 | 66 |
|
66 | 67 | :::image type="content" source="./media/call-flows/about-voice-group-calls.png" alt-text="Diagram showing UDP media process flow within Communication Services.":::
|
67 | 68 |
|
68 |
| -If the SDK can't use UDP for media due to firewall restrictions, an attempt will be made to use the Transmission Control Protocol (TCP). Note that the Media Processor component requires UDP, so when this happens, the Communication Services TURN service will be added to the group call to translate TCP to UDP. TURN charges are included in the price of the call. |
| 69 | +If the SDK can't use UDP for media due to firewall restrictions, it attempts to use the transmission control protocol (TCP). The media processor component requires UDP, so in this case, the Communication Services TURN service is added to the group call to translate TCP to UDP. TURN charges are included in the price of the call. |
69 | 70 |
|
70 | 71 | :::image type="content" source="./media/call-flows/about-voice-group-calls-2.png" alt-text="Diagram showing TCP media process flow within Communication Services.":::
|
71 | 72 |
|
72 | 73 | ### Case 5: Communication Services SDK and Microsoft Teams in a scheduled Teams meeting
|
73 | 74 |
|
74 |
| -Signaling flows through the signaling controller. Media flows through the Media Processor. The signaling controller and Media Processor are shared between Communication Services and Microsoft Teams. |
| 75 | +Signaling flows through the signaling controller. Media flows through the media processor. The signaling controller and media processor are shared between Communication Services and Microsoft Teams. |
75 | 76 |
|
76 | 77 | :::image type="content" source="./media/call-flows/teams-communication-services-meeting.png" alt-text="Diagram showing Communication Services SDK and Teams Client in a scheduled Teams meeting.":::
|
77 | 78 |
|
78 | 79 | ### Case 6: Early media
|
79 | 80 |
|
80 |
| -Refers to media (e.g., audio and video) that is exchanged before a particular session is accepted by the called user. If there is early media flow, the SBC must latch to the first endpoint that starts streaming media; media flow can start before candidates are nominated. The SBC should have support for sending DTMF during this phase to enable IVR/voicemail scenarios. The SBC should use the highest priority path on which it has received checks if nominations have not completed. |
| 81 | +Early media refers to media, such as audio and video, that is exchanged before the callee accepts the session. For early media flow, the session border controller (SBC) must latch to the first endpoint that starts streaming media; media flow can start before candidates are nominated. The SBC should support sending dual tone multi-frequency (DTMF) during this phase to enable interactive voice response (IVR) and voicemail scenarios. If nominations aren't complete, the SBC should use the highest priority path on which it received checks. |
81 | 82 |
|
82 | 83 | ## Next steps
|
83 | 84 |
|
84 | 85 | > [!div class="nextstepaction"]
|
85 | 86 | > [Get started with calling](../quickstarts/voice-video-calling/getting-started-with-calling.md)
|
86 | 87 |
|
87 |
| -The following documents may be interesting to you: |
| 88 | +## Related articles |
88 | 89 |
|
89 | 90 | - Learn more about [call types](../concepts/voice-video-calling/about-call-types.md)
|
90 | 91 | - Learn about [Client-server architecture](./client-and-server-architecture.md)
|
|