This architecture demonstrates the connectivity and traffic flows for migrating data with Azure Data Factory (ADF) using a self-hosted integration runtime (IR) and private endpoints. With a self-hosted IR, the compute infrastructure is provisioned in an Azure VNET and can use public and private endpoints to connect securely to the target resources or data stores.
Download Multi-tab Visio and PDF
- Private endpoints for the source (Azure SQL Server) and the sink (Azure Blob Storage)
- A private endpoint for the Azure Data Factory PaaS service, used for the command-and-control connection (tcp/443) between the self-hosted IR and ADF. Traffic between the self-hosted IR and the Azure Data Factory service goes through Private Link.
- Integration Runtime: Self Hosted IR in Azure VNET
- IP Routing between source and sink using Azure VNET.
- When using private endpoints, the DNS infrastructure needs to be set up correctly. The spoke VNET is configured with a custom DNS server at 10.10.1.4 (the DNS server in the hub VNET). That DNS server has a server-level forwarder to Azure-provided DNS (168.63.129.16).
- Self Hosted Integration Runtime
- Private Endpoint for ADF
- DNS Configuration with Private Endpoints
- [Secure Communication between Self Hosted IR and ADF](https://docs.microsoft.com/en-us/azure/data-factory/data-factory-private-link#secure-communication-between-customer-networks-and-azure-data-factory)
- Azure Data Factory terminology
- Supported Data Stores and Formats
- Integration Runtime Concepts
- Linked Services
- Compute for the self-hosted IR runs in an Azure VNET.
- Supports connecting to targets or resources through private endpoints, so there is no need to allow IP addresses in the firewall or to allow Azure services.
- Better suited to greenfield environments; it requires prior network infrastructure planning for the private endpoint subnet and routing.
- Note: If you want to perform data integration securely in a private network environment that has no direct line of sight from the public cloud environment, you can install a self-hosted IR on premises behind your corporate firewall, or inside a virtual private network. The self-hosted integration runtime makes only outbound HTTP-based connections to the open internet.
- Running a copy activity between a cloud data store (public endpoint) and a data store in a private network (private endpoint)
- Security consideration: Using private endpoints with a self-hosted IR helps protect against data exfiltration.
- DNS considerations: When using private endpoints, the DNS infrastructure needs to be set up correctly. The spoke VNET is configured with a custom DNS server at 10.10.1.4 (the DNS server in the hub VNET). That DNS server has a server-level forwarder to Azure-provided DNS (168.63.129.16).
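
One way to sanity-check the DNS setup described above is to resolve each endpoint name from a VM in the spoke VNET (for example the self-hosted IR machine) and confirm that the answer is a private address rather than a public one. The following is a minimal sketch, not a definitive tool: the `10.10.0.0/16` range and the function names are assumptions made for illustration, since the text only states that the DNS server lives at 10.10.1.4.

```python
import ipaddress
import socket
import sys

# Hypothetical address space for the hub and spoke VNETs. The text only
# states that the DNS server is at 10.10.1.4, so adjust this to your ranges.
ASSUMED_VNET_RANGES = ("10.10.0.0/16",)


def is_inside_ranges(ip, ranges=ASSUMED_VNET_RANGES):
    """Return True if ip belongs to at least one of the CIDR ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in ranges)


def resolve_and_classify(fqdn, ranges=ASSUMED_VNET_RANGES):
    """Resolve fqdn with the OS resolver (IPv4 only) and pair each address
    with a flag saying whether it falls inside the expected ranges."""
    infos = socket.getaddrinfo(
        fqdn, 443, family=socket.AF_INET, proto=socket.IPPROTO_TCP
    )
    addresses = sorted({info[4][0] for info in infos})
    return [(ip, is_inside_ranges(ip, ranges)) for ip in addresses]


if __name__ == "__main__":
    # Pass the endpoint host names to check as command-line arguments.
    for name in sys.argv[1:]:
        for ip, inside in resolve_and_classify(name):
            status = "inside expected ranges" if inside else "NOT inside expected ranges"
            print(f"{name} -> {ip} ({status})")
```

Run it on the IR machine with the storage account, SQL server, and data factory host names as arguments. An address outside the expected ranges suggests the query is not reaching the private DNS zone, for instance because the forwarder to 168.63.129.16 is missing.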
From Azure Documentation link here
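
The command-and-control connection between the self-hosted IR and ADF uses tcp/443, as noted earlier. A quick reachability probe from the IR machine could look like the sketch below. The helper name `can_connect` is hypothetical, and the host names are supplied by the caller because the text does not give them. A successful connection only shows that the port is open; it does not by itself prove that the traffic travels over Private Link, so it is best combined with a check that the name resolves to a private address.

```python
import socket
import sys


def can_connect(host, port=443, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


if __name__ == "__main__":
    # Pass host names as arguments, for example the data factory's FQDN.
    for host in sys.argv[1:]:
        state = "reachable" if can_connect(host) else "unreachable"
        print(f"{host}:443 is {state}")
```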

