articles/aks/concepts-network.md (4 additions & 0 deletions)
@@ -108,6 +108,10 @@ For more information, see [Configure kubenet networking for an AKS cluster][aks-
With Azure CNI, every pod gets an IP address from the subnet and can be accessed directly. These IP addresses must be planned in advance and be unique across your network space. Each node has a configuration parameter for the maximum number of pods it supports, and the equivalent number of IP addresses per node is reserved up front. Without proper planning, this approach can lead to IP address exhaustion or to rebuilding clusters in a larger subnet as application demand grows. To avoid these planning challenges, you can enable [Azure CNI networking for dynamic allocation of IPs and enhanced subnet support][configure-azure-cni-dynamic-ip-allocation].
+
+> [!NOTE]
+> Due to Kubernetes limitations, the Resource Group name, the Virtual Network name and the subnet name must be 63 characters or less.
+
Unlike kubenet, traffic to endpoints in the same virtual network isn't NAT'd to the node's primary IP. The source address for traffic inside the virtual network is the pod IP. Traffic that's external to the virtual network still NATs to the node's primary IP.
Nodes use the [Azure CNI][cni-networking] Kubernetes plugin.
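
The IP-planning rule in the hunk above lends itself to a quick capacity check. Here is a minimal sketch, assuming the documented Azure CNI behavior of reserving one IP per node plus one per potential pod, and keeping a spare node's worth of addresses for rolling upgrades; the node and pod counts are illustrative placeholders, not recommendations.

```powershell
# Rough subnet-sizing check for Azure CNI (illustrative numbers only).
$nodeCount      = 50   # planned node count
$maxPodsPerNode = 30   # per-node maximum-pods setting
$upgradeBuffer  = 1    # spare node surfaced during rolling upgrades

# Each node consumes one IP for itself plus one reserved IP per potential pod.
$requiredIPs = ($nodeCount + $upgradeBuffer) * (1 + $maxPodsPerNode)
Write-Output "Plan for at least $requiredIPs IP addresses in the subnet."
# 51 * 31 = 1,581 addresses, so a /21 subnet would leave comfortable headroom.
```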
@@ -514,7 +514,7 @@ For more information, see the [source transformation](data-flow-source.md) and [
To use a Microsoft Fabric Lakehouse Files dataset as a source or sink in mapping data flow, see the following sections for the detailed configurations.
-#### Microsoft Fabric Lakehouse Files as a sink type
+#### Microsoft Fabric Lakehouse Files as a source or sink type
Microsoft Fabric Lakehouse connector supports the following file formats. Refer to each article for format-based settings.
@@ -528,6 +528,12 @@ Microsoft Fabric Lakehouse connector supports the following file formats. Refer
To use a Microsoft Fabric Lakehouse Table dataset as a source or sink in mapping data flow, see the following sections for the detailed configurations.
+#### Microsoft Fabric Lakehouse Table as a source type
+
+There are no configurable properties under source options.
+
+> [!NOTE]
+> CDC support for Lakehouse table source is currently not available.
#### Microsoft Fabric Lakehouse Table as a sink type
The following properties are supported in the Mapping Data Flows **sink** section:
articles/hdinsight/hdinsight-apps-install-applications.md (8 additions & 8 deletions)
@@ -4,7 +4,7 @@ description: Learn how to install third-party Apache Hadoop applications on Azur
ms.service: hdinsight
ms.custom: hdinsightactive
ms.topic: how-to
-ms.date: 11/17/2022
+ms.date: 12/08/2023
---
# Install third-party Apache Hadoop applications on Azure HDInsight
@@ -20,14 +20,14 @@ The following list shows the published applications:
|[AtScale Intelligence Platform](https://aws.amazon.com/marketplace/pp/AtScale-AtScale-Intelligence-Platform/B07BWWHH18)|Hadoop |AtScale turns your HDInsight cluster into a scale-out OLAP server, allowing you to query billions of rows of data interactively using the BI tools you already know, own, and love – from Microsoft Excel, Power BI, Tableau Software to QlikView. |
|[Datameer](https://azuremarketplace.microsoft.com/marketplace/apps/datameer.datameer)|Hadoop |Datameer's self-service scalable platform for preparing, exploring, and governing your data for analytics accelerates turning complex multisource data into valuable business-ready information, delivering faster, smarter insights at an enterprise-scale. |
|[Dataiku DSS on HDInsight](https://azuremarketplace.microsoft.com/marketplace/apps/dataiku.dataiku-data-science-studio)|Hadoop, Spark |Dataiku DSS is an enterprise data science platform that lets data scientists and data analysts collaborate to design and run new data products and services more efficiently, turning raw data into impactful predictions. |
-|[WANdisco Fusion HDI App](https://community.wandisco.com/s/article/Use-WANdisco-Fusion-for-parallel-operation-of-ADLS-Gen1-and-Gen2)|Hadoop, Spark,HBase,Kafka |Keeping data consistent in a distributed environment is a massive data operations challenge. WANdisco Fusion, an enterprise-class software platform, solves this problem by enabling unstructured data consistency across any environment. |
+|[WANdisco Fusion HDI App](https://community.wandisco.com/s/article/Use-WANdisco-Fusion-for-parallel-operation-of-ADLS-Gen1-and-Gen2)|Hadoop, Spark,HBase,Kafka |Keeping data consistent in a distributed environment is a massive data operations challenge. WANdisco Fusion, an enterprise-class software platform, solves this problem by enabling unstructured data consistency across any environment. |
|H2O SparklingWater for HDInsight |Spark |H2O Sparkling Water supports the following distributed algorithms: GLM, Naïve Bayes, Distributed Random Forest, Gradient Boosting Machine, Deep Neural Networks, Deep learning, K-means, PCA, Generalized Low Rank Models, Anomaly Detection, Autoencoders. |
-|[Striim for Real-Time Data Integration to HDInsight](https://azuremarketplace.microsoft.com/marketplace/apps/striim.striimbyol)|Hadoop,HBase,Spark,Kafka |Striim (pronounced "stream") is an end-to-end streaming data integration + intelligence platform, enabling continuous ingestion, processing, and analytics of disparate data streams. |
+|[Striim for Real-Time Data Integration to HDInsight](https://azuremarketplace.microsoft.com/marketplace/apps/striim.striimbyol)|Hadoop,HBase,Spark,Kafka |Striim (pronounced "stream") is an end-to-end streaming data integration + intelligence platform, enabling continuous ingestion, processing, and analytics of disparate data streams. |
|[Jumbune Enterprise-Accelerating BigData Analytics](https://azuremarketplace.microsoft.com/marketplace/apps/impetus-infotech-india-pvt-ltd.impetus_jumbune)|Hadoop, Spark |At a high level, Jumbune assists enterprises by: 1. accelerating Tez, MapReduce, and Spark engine-based Hive, Java, and Scala workload performance, 2. proactively monitoring the Hadoop cluster, and 3. establishing data quality management on the distributed file system. |
-|[Kyligence Enterprise](https://azuremarketplace.microsoft.com/marketplace/apps/kyligence.kyligence-cloud-saas)|Hadoop,HBase,Spark |Powered by Apache Kylin, Kyligence Enterprise Enables BI on Big Data. As an enterprise OLAP engine on Hadoop, Kyligence Enterprise empowers business analyst to architect BI on Hadoop with industry-standard data warehouse and BI methodology. |
-|[StreamSets Data Collector for HDInsight Cloud](https://azuremarketplace.microsoft.com/marketplace/apps/streamsets.streamsets-data-collector-hdinsight)|Hadoop,HBase,Spark,Kafka |StreamSets Data Collector is a lightweight, powerful engine that streams data in real time. Use Data Collector to route and process data in your data streams. It comes with a 30 day trial license. |
-|[Trifacta Wrangler Enterprise](https://azuremarketplace.microsoft.com/marketplace/apps/trifactainc1587522950142.trifactaazure)|Hadoop, Spark,HBase |Trifacta Wrangler Enterprise for HDInsight supports enterprise-wide data wrangling for any scale of data. The cost of running Trifacta on Azure is a combination of Trifacta subscription costs plus the Azure infrastructure costs for the virtual machines. |
-|[Unifi Data Platform](https://www.crunchbase.com/organization/unifi-software)|Hadoop,HBase,Spark |The Unifi Data Platform is a seamlessly integrated suite of self-service data tools designed to empower the business user to tackle data challenges that drive incremental revenue, reduce costs or operational complexity. |
+|[Kyligence Enterprise](https://azuremarketplace.microsoft.com/marketplace/apps/kyligence.kyligence-cloud-saas)|Hadoop,HBase,Spark |Powered by Apache `Kylin`, Kyligence Enterprise Enables BI on Big Data. As an enterprise OLAP engine on Hadoop, Kyligence Enterprise empowers business analyst to architect BI on Hadoop with industry-standard data warehouse and BI methodology. |
+|[StreamSets Data Collector for HDInsight Cloud](https://azuremarketplace.microsoft.com/marketplace/apps/streamsets.streamsets-data-collector-hdinsight)|Hadoop,HBase,Spark,Kafka |StreamSets Data Collector is a lightweight, powerful engine that streams data in real time. Use Data Collector to route and process data in your data streams. It comes with a 30 day trial license. |
+|[Trifacta Wrangler Enterprise](https://azuremarketplace.microsoft.com/marketplace/apps/trifactainc1587522950142.trifactaazure)|Hadoop, Spark,HBase |Trifacta Wrangler Enterprise for HDInsight supports enterprise-wide data wrangling for any scale of data. The cost of running Trifacta on Azure is a combination of Trifacta subscription costs plus the Azure infrastructure costs for the virtual machines. |
+|[Unifi Data Platform](https://www.crunchbase.com/organization/unifi-software)|Hadoop,HBase,Spark |The `Unifi Data Platform` is a seamlessly integrated suite of self-service data tools designed to empower the business user to tackle data challenges that drive incremental revenue, reduce costs or operational complexity. |
The instructions provided in this article use the Azure portal. You can also export the Azure Resource Manager template from the portal, or obtain a copy of the Resource Manager template from vendors, and then use Azure PowerShell or Azure Classic CLI to deploy the template. See [Create Apache Hadoop clusters on HDInsight using Resource Manager templates](hdinsight-hadoop-create-linux-clusters-arm-templates.md).
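
As a minimal sketch of the PowerShell route just mentioned: deploying an exported template typically looks like the following, where the resource group name and file paths are placeholders rather than values from this article.

```powershell
# Deploy an exported Resource Manager template with Azure PowerShell.
Connect-AzAccount

New-AzResourceGroupDeployment `
    -ResourceGroupName "my-hdinsight-rg" `
    -TemplateFile ".\template.json" `
    -TemplateParameterFile ".\parameters.json"
```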
@@ -84,7 +84,7 @@ The portal shows a list of the installed HDInsight applications for a cluster, a
## Connect to the edge node
You can connect to the edge node using HTTP and SSH. The endpoint information can be found in the [portal](#list-installed-hdinsight-apps-and-properties). For more information, see [Use SSH with HDInsight](hdinsight-hadoop-linux-use-ssh-unix.md).
-The HTTP endpoint credentials are the HTTP user credentials that you have configured for the HDInsight cluster; the SSH endpoint credentials are the SSH credentials that you have configured for the HDInsight cluster.
+The HTTP endpoint credentials are the HTTP user credentials configured for the HDInsight cluster. The SSH endpoint credentials are the SSH credentials configured for the HDInsight cluster.
## Troubleshoot
See [Troubleshoot the installation](hdinsight-apps-install-custom-applications.md#troubleshoot-the-installation).
@@ -35,18 +35,18 @@ For example, using these programmatic methods, you can configure options in thes
* yarn-site.xml
* server.properties (kafka-broker configuration)
-For information on installing additional components on HDInsight cluster during the creation time, see [Customize HDInsight clusters using Script Action (Linux)](hdinsight-hadoop-customize-cluster-linux.md).
+For information on installing more components on an HDInsight cluster during creation, see [Customize HDInsight clusters using Script Action (Linux)](hdinsight-hadoop-customize-cluster-linux.md).
## Prerequisites
-* If using PowerShell, you'll need the [Az Module](/powershell/azure/).
+* If using PowerShell, you need the [Az Module](/powershell/azure/).
## Use Azure PowerShell
The following PowerShell code customizes an [Apache Hive](https://hive.apache.org/) configuration:
> [!IMPORTANT]
-> The parameter `Spark2Defaults` may need to be used with [Add-AzHDInsightConfigValue](/powershell/module/az.hdinsight/add-azhdinsightconfigvalue). You can pass empty values to the parameter as shown in the code example below.
+> The parameter `Spark2Defaults` may need to be used with [Add-AzHDInsightConfigValue](/powershell/module/az.hdinsight/add-azhdinsightconfigvalue). You can pass empty values to the parameter as shown in the following code example.
```powershell
# hive-site.xml configuration
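# A minimal continuation sketch under stated assumptions: it relies on the
# Az.HDInsight cmdlets New-AzHDInsightClusterConfig and
# Add-AzHDInsightConfigValue, and the timeout value is illustrative.
$hiveConfigValues = @{ "hive.metastore.client.socket.timeout" = "90s" }

# Pass an empty hashtable to -Spark2Defaults, per the IMPORTANT note above.
$config = New-AzHDInsightClusterConfig |
    Add-AzHDInsightConfigValue `
        -HiveSite $hiveConfigValues `
        -Spark2Defaults @{}
```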
@@ -117,7 +117,7 @@ You can use bootstrap in Resource Manager template:
description: Learn the resources available when you create an HDInsight cluster in an Azure Virtual Network.
ms.service: hdinsight
ms.topic: conceptual
-ms.date: 11/17/2022
+ms.date: 12/05/2023
---
# Azure HDInsight virtual network architecture
-This article explains the resources that are present when you deploy an HDInsight cluster into a custom Azure Virtual Network. This information will help you to connect on-premises resources to your HDInsight cluster in Azure. For more information on Azure Virtual Networks, see [What is Azure Virtual Network?](../virtual-network/virtual-networks-overview.md).
+This article explains the resources that are present when you deploy an HDInsight cluster into a custom Azure Virtual Network. This information helps you connect on-premises resources to your HDInsight cluster in Azure. For more information on Azure Virtual Networks, see [What is Azure Virtual Network?](../virtual-network/virtual-networks-overview.md).
## Resource types in Azure HDInsight clusters
@@ -26,7 +26,7 @@ Use Fully Qualified Domain Names (FQDNs) when addressing nodes in your cluster.
These FQDNs will be of the form `<node-type-prefix><instance-number>-<abbreviated-clustername>.<unique-identifier>.cx.internal.cloudapp.net`.
-The `<node-type-prefix>` will be *hn* for headnodes, *wn* for worker nodes and *zn* for zookeeper nodes.
+The `<node-type-prefix>` will be `hn` for headnodes, `wn` for worker nodes and `zn` for zookeeper nodes.
If you need just the host name, use only the first part of the FQDN: `<node-type-prefix><instance-number>-<abbreviated-clustername>`
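
To make the naming pattern concrete, here is a small sketch; the cluster name and unique identifier below are made-up placeholders that follow the pattern above.

```powershell
# Pull the host name and node-type prefix out of an internal FQDN.
$fqdn = "wn2-mycluster.abc123def456ghi7.cx.internal.cloudapp.net"

$hostName = ($fqdn -split '\.')[0]    # "wn2-mycluster"
$prefix   = $hostName.Substring(0, 2) # "wn", so this is a worker node
Write-Output "$hostName is a '$prefix' node"
```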
@@ -61,7 +61,7 @@ You can access your HDInsight cluster in three ways:
- An HTTPS endpoint outside of the virtual network at `CLUSTERNAME.azurehdinsight.net`.
- An SSH endpoint for directly connecting to the headnode at `CLUSTERNAME-ssh.azurehdinsight.net`.
-- An HTTPS endpoint within the virtual network `CLUSTERNAME-int.azurehdinsight.net`. Notice the "`-int`" in this URL. This endpoint will resolve to a private IP in that virtual network and isn't accessible from the public internet.
+- An HTTPS endpoint within the virtual network `CLUSTERNAME-int.azurehdinsight.net`. Notice the "`-int`" in this URL. This endpoint resolves to a private IP in that virtual network and isn't accessible from the public internet.
These three endpoints are each assigned a load balancer.
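
One way to check the internal endpoint's behavior, as a sketch: the cluster name below is a placeholder, and `Resolve-DnsName` requires the Windows DnsClient module.

```powershell
# From a VM inside the virtual network, the -int name should resolve
# to a private IP; from the public internet it isn't reachable.
Resolve-DnsName "mycluster-int.azurehdinsight.net"
```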
0 commit comments