articles/hdinsight/domain-joined/apache-domain-joined-introduction.md
2 additions & 3 deletions
@@ -18,8 +18,7 @@ You can create an HDInsight cluster with Enterprise Security Package (ESP) that'
The enterprise admin can configure role-based access control (RBAC) for Apache Hive security by using [Apache Ranger](https://ranger.apache.org/). Configuring RBAC restricts data access to only what's needed. Finally, the admin can audit the data access by employees and any changes done to access control policies. The admin can then achieve a high degree of governance of their corporate resources.
- > [!NOTE]
- > Apache Oozie is now enabled on ESP clusters. To access the Oozie web UI, users should enable [tunneling](../hdinsight-linux-ambari-ssh-tunnel.md).
+ Apache Oozie is now enabled on ESP clusters. To access the Oozie web UI, users should enable [tunneling](../hdinsight-linux-ambari-ssh-tunnel.md).
Enterprise security contains four major pillars: perimeter security, authentication, authorization, and encryption.
@@ -50,7 +49,7 @@ A HDInsight cluster with ESP uses the familiar Apache Ranger UI to search audit
## Encryption
Protecting data is important for meeting organizational security and compliance requirements. Along with restricting access to data from unauthorized employees, you should encrypt it.
- Both data stores for HDInsight clusters--Azure Blob storage and Azure Data Lake Storage Gen1/Gen2--support transparent server-side [encryption of data](../../storage/common/storage-service-encryption.md) at rest. Secure HDInsight clusters will seamlessly work with this capability of server-side encryption of data at rest.
+ Both data stores for HDInsight clusters, Azure Blob storage and Azure Data Lake Storage Gen1/Gen2, support transparent server-side [encryption of data](../../storage/common/storage-service-encryption.md) at rest. Secure HDInsight clusters will seamlessly work with this capability of server-side encryption of data at rest.
articles/hdinsight/hbase/apache-hbase-overview.md
0 additions & 2 deletions
@@ -16,8 +16,6 @@ ms.author: hrasheed
From user perspective, HBase is similar to a database. Data is stored in the rows and columns of a table, and data within a row is grouped by column family. HBase is a schemaless database in the sense that neither the columns nor the type of data stored in them need to be defined before using them. The open-source code scales linearly to handle petabytes of data on thousands of nodes. It can rely on data redundancy, batch processing, and other features that are provided by distributed applications in the Hadoop ecosystem.
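The row and column-family layout described above can be pictured as nested maps. The following is a minimal in-memory sketch, not the HBase client API: the class `RowModelSketch`, its methods, and the sample keys are all hypothetical names chosen for illustration. It also simplifies real HBase, which stores values as bytes, keeps timestamped versions of each cell, and expects column families (though not individual columns) to be declared when a table is created.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative in-memory model only; it does not use the HBase client API.
final class RowModelSketch {
    // row key -> column family -> column name -> value
    private final Map<String, Map<String, Map<String, String>>> rows = new TreeMap<>();

    // Stores a value; the column is created on first use, with no schema declared up front.
    void put(String rowKey, String family, String column, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>())
            .computeIfAbsent(family, k -> new TreeMap<>())
            .put(column, value);
    }

    // Returns the stored value, or null when the row, family, or column is absent.
    String get(String rowKey, String family, String column) {
        Map<String, Map<String, String>> families = rows.get(rowKey);
        if (families == null) {
            return null;
        }
        Map<String, String> columns = families.get(family);
        return columns == null ? null : columns.get(column);
    }

    public static void main(String[] args) {
        RowModelSketch table = new RowModelSketch();
        // Two rows in the same family can hold different columns.
        table.put("sensor-001", "reading", "temperature", "21.5");
        table.put("sensor-001", "reading", "humidity", "40");
        table.put("sensor-002", "reading", "pressure", "1013");
        System.out.println(table.get("sensor-001", "reading", "temperature"));
        System.out.println(table.get("sensor-002", "reading", "temperature"));
    }
}
```

In this model, a row that never wrote a given column simply has no entry for it, which is the sense in which columns do not need to be defined in advance.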
## How is Apache HBase implemented in Azure HDInsight?
HDInsight HBase is offered as a managed cluster that is integrated into the Azure environment. The clusters are configured to store data directly in [Azure Storage](./../hdinsight-hadoop-use-blob-storage.md) which provides low latency and increased elasticity in performance and cost choices. This enables customers to build interactive websites that work with large datasets, to build services that store sensor and telemetry data from millions of end points, and to analyze this data with Hadoop jobs. HBase and Hadoop are good starting points for big data project in Azure; in particular, they can enable real-time applications to work with large datasets.
articles/hdinsight/r-server/r-server-overview.md
0 additions & 2 deletions
@@ -14,8 +14,6 @@ ms.date: 06/12/2019
Microsoft Machine Learning Server is available as a deployment option when you create HDInsight clusters in Azure. The cluster type that provides this option is called **ML Services**. This capability provides data scientists, statisticians, and R programmers with on-demand access to scalable, distributed methods of analytics on HDInsight.
ML Services on HDInsight provides the latest capabilities for R-based analytics on datasets of virtually any size, loaded to either Azure Blob or Data Lake storage. Since ML Services cluster is built on open-source R, the R-based applications you build can leverage any of the 8000+ open-source R packages. The routines in ScaleR, Microsoft’s big data analytics package are also available.
The edge node of a cluster provides a convenient place to connect to the cluster and to run your R scripts. With an edge node, you have the option of running the parallelized distributed functions of ScaleR across the cores of the edge node server. You can also run them across the nodes of the cluster by using ScaleR’s Hadoop Map Reduce or Apache Spark compute contexts.
articles/hdinsight/storm/apache-storm-overview.md
4 additions & 5 deletions
@@ -16,8 +16,6 @@ ms.author: hrasheed
[Apache Storm](https://storm.apache.org/) is a distributed, fault-tolerant, open-source computation system. You can use Storm to process streams of data in real time with [Apache Hadoop](https://hadoop.apache.org/). Storm solutions can also provide guaranteed processing of data, with the ability to replay data that was not successfully processed the first time.
Storm on HDInsight provides the following features:
@@ -34,8 +32,7 @@ Storm on HDInsight provides the following features:
* **Dynamic scaling**: You can add or remove worker nodes with no impact to running Storm topologies.
- > [!NOTE]
- > You must deactivate and reactivate running topologies to take advantage of new nodes added through scaling operations.
+ * You must deactivate and reactivate running topologies to take advantage of new nodes added through scaling operations.
* **Create streaming pipelines using multiple Azure services**: Storm on HDInsight integrates with other Azure services such as Event Hubs, SQL Database, Azure Storage, and Azure Data Lake Storage.
@@ -145,7 +142,9 @@ How data streams are joined varies between applications. For example, you can jo
In the following Java example, fieldsGrouping is used to route tuples that originate from components "1", "2", and "3" to the MyJoiner bolt:
- builder.setBolt("join", new MyJoiner(), parallelism) .fieldsGrouping("1", new Fields("joinfield1", "joinfield2")) .fieldsGrouping("2", new Fields("joinfield1", "joinfield2")) .fieldsGrouping("3", new Fields("joinfield1", "joinfield2"));
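To see why grouping on the same fields lets a single bolt task join tuples from several components, the routing can be modeled as hashing the grouping field values onto a fixed number of tasks. The sketch below is a simplified illustration and an assumption about the general idea, not Storm's actual implementation or API: the class `FieldsGroupingSketch`, the method `chooseTask`, and the sample values are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Simplified model of fields grouping; not Storm's actual implementation.
final class FieldsGroupingSketch {

    // Maps the values of the grouping fields to a task index in [0, numTasks).
    static int chooseTask(List<?> groupingValues, int numTasks) {
        // floorMod keeps the index non-negative even when hashCode() is negative.
        return Math.floorMod(groupingValues.hashCode(), numTasks);
    }

    public static void main(String[] args) {
        int parallelism = 4; // hypothetical number of MyJoiner tasks

        // Values of ("joinfield1", "joinfield2") carried by tuples from two components.
        List<String> fromComponent1 = Arrays.asList("device-7", "2019-06-12");
        List<String> fromComponent2 = Arrays.asList("device-7", "2019-06-12");
        List<String> otherKey = Arrays.asList("device-9", "2019-06-12");

        // Equal field values always select the same task, so matching tuples meet in one place.
        System.out.println("component 1 -> task " + chooseTask(fromComponent1, parallelism));
        System.out.println("component 2 -> task " + chooseTask(fromComponent2, parallelism));
        // A different key may select a different task, or the same one by collision.
        System.out.println("other key   -> task " + chooseTask(otherKey, parallelism));
    }
}
```

Because the task is chosen only from the field values, tuples from components "1", "2", and "3" that share `joinfield1` and `joinfield2` arrive at the same `MyJoiner` instance regardless of which component emitted them.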