
Commit cbb1255

Add descriptions (#580)
1 parent b6a95ff commit cbb1255

File tree

14 files changed: +69 -36 lines changed

docs/modules/hdfs/pages/getting_started/first_steps.adoc

Lines changed: 17 additions & 8 deletions

@@ -1,6 +1,8 @@
 = First steps
+:description: Deploy and verify an HDFS cluster with Stackable by setting up Zookeeper and HDFS components, then test file operations using WebHDFS API.
 
-Once you have followed the steps in the xref:getting_started/installation.adoc[] section to install the operator and its dependencies, you will now deploy an HDFS cluster and its dependencies. Afterward, you can <<_verify_that_it_works, verify that it works>> by creating, verifying and deleting a test file in HDFS.
+Once you have followed the steps in the xref:getting_started/installation.adoc[] section to install the operator and its dependencies, you will now deploy an HDFS cluster and its dependencies.
+Afterward, you can <<_verify_that_it_works, verify that it works>> by creating, verifying and deleting a test file in HDFS.
 
 == Setup
 
@@ -11,7 +13,8 @@ To deploy a Zookeeper cluster create one file called `zk.yaml`:
 [source,yaml]
 include::example$getting_started/zk.yaml[]
 
-We also need to define a ZNode that will be used by the HDFS cluster to reference Zookeeper. Create another file called `znode.yaml`:
+We also need to define a ZNode that will be used by the HDFS cluster to reference Zookeeper.
+Create another file called `znode.yaml`:
 
 [source,yaml]
 include::example$getting_started/znode.yaml[]
@@ -28,7 +31,8 @@ include::example$getting_started/getting_started.sh[tag=watch-zk-rollout]
 
 === HDFS
 
-An HDFS cluster has three components: the `namenode`, the `datanode` and the `journalnode`. Create a file named `hdfs.yaml` defining 2 `namenodes` and one `datanode` and `journalnode` each:
+An HDFS cluster has three components: the `namenode`, the `datanode` and the `journalnode`.
+Create a file named `hdfs.yaml` defining 2 `namenodes` and one `datanode` and `journalnode` each:
 
 [source,yaml]
 ----
@@ -37,10 +41,12 @@ include::example$getting_started/hdfs.yaml[]
 
 Where:
 
-- `metadata.name` contains the name of the HDFS cluster
-- the HDFS version in the Docker image provided by Stackable must be set in `spec.image.productVersion`
+* `metadata.name` contains the name of the HDFS cluster
+* the HDFS version in the Docker image provided by Stackable must be set in `spec.image.productVersion`
 
-NOTE: Please note that the version you need to specify for `spec.image.productVersion` is the desired version of Apache HDFS. You can optionally specify the `spec.image.stackableVersion` to a certain release like `23.11.0` but it is recommended to leave it out and use the default provided by the operator. For a list of available versions please check our https://repo.stackable.tech/#browse/browse:docker:v2%2Fstackable%2Fhadoop%2Ftags[image registry].
+NOTE: Please note that the version you need to specify for `spec.image.productVersion` is the desired version of Apache HDFS.
+You can optionally specify the `spec.image.stackableVersion` to a certain release like `24.7.0` but it is recommended to leave it out and use the default provided by the operator.
+For a list of available versions please check our https://repo.stackable.tech/#browse/browse:docker:v2%2Fstackable%2Fhadoop%2Ftags[image registry].
 It should generally be safe to simply use the latest image version that is available.
 
 Create the actual HDFS cluster by applying the file:
@@ -57,7 +63,9 @@ include::example$getting_started/getting_started.sh[tag=watch-hdfs-rollout]
 
 == Verify that it works
 
-To test the cluster you can create a new file, check its status and then delete it. We will execute these actions from within a helper pod. Create a file called `webhdfs.yaml`:
+To test the cluster operation, create a new file, check its status and then delete it.
+You can execute these actions from within a helper Pod.
+Create a file called `webhdfs.yaml`:
 
 [source,yaml]
 ----
@@ -75,7 +83,8 @@ To begin with the cluster should be empty: this can be verified by listing all
 [source]
 include::example$getting_started/getting_started.sh[tag=file-status]
 
-Creating a file in HDFS using the https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/WebHDFS.html#Create_and_Write_to_a_File[Webhdfs API] requires a two-step `PUT` (the reason for having a two-step create/append is to prevent clients from sending out data before the redirect). First, create a file with some text in it called `testdata.txt` and copy it to the `tmp` directory on the helper pod:
+Creating a file in HDFS using the https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/WebHDFS.html#Create_and_Write_to_a_File[Webhdfs API] requires a two-step `PUT` (the reason for having a two-step create/append is to prevent clients from sending out data before the redirect).
+First, create a file with some text in it called `testdata.txt` and copy it to the `tmp` directory on the helper pod:
 
 [source]
 include::example$getting_started/getting_started.sh[tag=copy-file]
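
For orientation only (not part of this commit), here is a minimal sketch of what the `hdfs.yaml` described above could look like, assuming the `hdfs.stackable.tech/v1alpha1` HdfsCluster resource layout and a ZNode discovery ConfigMap named `simple-hdfs-znode`; the authoritative manifest is the included example$getting_started/hdfs.yaml:

[source,yaml]
----
# Sketch only: resource names and the product version are illustrative assumptions.
apiVersion: hdfs.stackable.tech/v1alpha1
kind: HdfsCluster
metadata:
  name: simple-hdfs                              # metadata.name names the HDFS cluster
spec:
  image:
    productVersion: 3.4.0                        # desired Apache HDFS version (see NOTE above)
  clusterConfig:
    zookeeperConfigMapName: simple-hdfs-znode    # discovery ConfigMap of the ZNode from znode.yaml
  nameNodes:
    roleGroups:
      default:
        replicas: 2                              # two namenodes
  dataNodes:
    roleGroups:
      default:
        replicas: 1                              # one datanode
  journalNodes:
    roleGroups:
      default:
        replicas: 1                              # one journalnode
----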

docs/modules/hdfs/pages/getting_started/index.adoc

Lines changed: 3 additions & 1 deletion

@@ -1,6 +1,8 @@
 = Getting started
+:description: Start with HDFS using the Stackable Operator. Install the Operator, set up your HDFS cluster, and verify its operation with this guide.
 
-This guide will get you started with HDFS using the Stackable Operator. It will guide you through the installation of the Operator and its dependencies, setting up your first HDFS cluster and verifying its operation.
+This guide will get you started with HDFS using the Stackable Operator.
+It will guide you through the installation of the Operator and its dependencies, setting up your first HDFS cluster and verifying its operation.
 
 == Prerequisites

docs/modules/hdfs/pages/getting_started/installation.adoc

Lines changed: 1 addition & 0 deletions

@@ -1,4 +1,5 @@
 = Installation
+:description: Install the Stackable HDFS operator and dependencies using stackablectl or Helm. Follow steps for setup and verification in Kubernetes.
 
 On this page you will install the Stackable HDFS operator and its dependency, the Zookeeper operator, as well as the
 commons, secret and listener operators which are required by all Stackable operators.

docs/modules/hdfs/pages/index.adoc

Lines changed: 1 addition & 1 deletion

@@ -1,5 +1,5 @@
 = Stackable Operator for Apache HDFS
-:description: The Stackable Operator for Apache HDFS is a Kubernetes operator that can manage Apache HDFS clusters. Learn about its features, resources, dependencies and demos, and see the list of supported HDFS versions.
+:description: Manage Apache HDFS with the Stackable Operator for Kubernetes. Set up clusters, configure roles, and explore demos and supported versions.
 :keywords: Stackable Operator, Hadoop, Apache HDFS, Kubernetes, k8s, operator, big data, metadata, storage, cluster, distributed storage
 :hdfs-docs: https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html
 :github: https://github.com/stackabletech/hdfs-operator/

docs/modules/hdfs/pages/usage-guide/configuration-environment-overrides.adoc

Lines changed: 20 additions & 14 deletions

@@ -1,21 +1,22 @@
-
 = Configuration & Environment Overrides
+:description: Override HDFS config properties and environment variables per role or role group. Manage settings like DNS cache and environment variables efficiently.
+:java-security-overview: https://docs.oracle.com/en/java/javase/11/security/java-security-overview1.html
 
 The cluster definition also supports overriding configuration properties and environment variables, either per role or per role group, where the more specific override (role group) has precedence over the less specific one (role).
 
-IMPORTANT: Overriding certain properties can lead to faulty clusters. In general this means, do not change ports, hostnames or properties related to data dirs, high-availability or security.
+IMPORTANT: Overriding certain properties can lead to faulty clusters.
+In general this means, do not change ports, hostnames or properties related to data dirs, high-availability or security.
 
 == Configuration Properties
 
 For a role or role group, at the same level of `config`, you can specify `configOverrides` for the following files:
 
-- `hdfs-site.xml`
-- `core-site.xml`
-- `hadoop-policy.xml`
-- `ssl-server.xml`
-- `ssl-client.xml`
-- `security.properties`
-
+* `hdfs-site.xml`
+* `core-site.xml`
+* `hadoop-policy.xml`
+* `ssl-server.xml`
+* `ssl-client.xml`
+* `security.properties`
 
 For example, if you want to set additional properties on the namenode servers, adapt the `nameNodes` section of the cluster resource like so:
 
@@ -51,13 +52,17 @@ nameNodes:
 
 All override property values must be strings. The properties will be formatted and escaped correctly into the XML file.
 
-For a full list of configuration options we refer to the Apache Hdfs documentation for https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml[hdfs-site.xml] and https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/core-default.xml[core-site.xml]
+For a full list of configuration options we refer to the Apache Hdfs documentation for https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml[hdfs-site.xml] and https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/core-default.xml[core-site.xml].
 
 === The security.properties file
 
-The `security.properties` file is used to configure JVM security properties. It is very seldom that users need to tweak any of these, but there is one use-case that stands out, and that users need to be aware of: the JVM DNS cache.
+The `security.properties` file is used to configure JVM security properties.
+It is very seldom that users need to tweak any of these, but there is one use-case that stands out, and that users need to be aware of: the JVM DNS cache.
 
-The JVM manages it's own cache of successfully resolved host names as well as a cache of host names that cannot be resolved. Some products of the Stackable platform are very sensible to the contents of these caches and their performance is heavily affected by them. As of version 3.3.4 HDFS performs poorly if the positive cache is disabled. To cache resolved host names, and thus speeding up Hbase queries you can configure the TTL of entries in the positive cache like this:
+The JVM manages it's own cache of successfully resolved host names as well as a cache of host names that cannot be resolved.
+Some products of the Stackable platform are very sensible to the contents of these caches and their performance is heavily affected by them.
+As of version 3.3.4 HDFS performs poorly if the positive cache is disabled.
+To cache resolved host names, and thus speeding up Hbase queries you can configure the TTL of entries in the positive cache like this:
 
 [source,yaml]
 ----
@@ -80,12 +85,13 @@ The JVM manages it's own cache of successfully resolved host names as well as a
 
 NOTE: The operator configures DNS caching by default as shown in the example above.
 
-For details on the JVM security see https://docs.oracle.com/en/java/javase/11/security/java-security-overview1.html
+For details on the JVM security consult the {java-security-overview}[Java Security overview documentation].
 
 
 == Environment Variables
 
-In a similar fashion, environment variables can be (over)written. For example per role group:
+In a similar fashion, environment variables can be (over)written.
+For example per role group:
 
 [source,yaml]
 ----
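
As a point of reference (not part of this commit), a hedged sketch of what a `configOverrides` block on the `nameNodes` role could look like; the overridden property and its value are illustrative only, and the elided example referenced in the hunk above remains the documented one:

[source,yaml]
----
# Sketch only: property name, value and role-group layout are illustrative assumptions.
nameNodes:
  configOverrides:
    hdfs-site.xml:
      dfs.namenode.handler.count: "40"   # override values must be strings
  roleGroups:
    default:
      replicas: 2
----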

docs/modules/hdfs/pages/usage-guide/fuse.adoc

Lines changed: 3 additions & 2 deletions

@@ -1,14 +1,15 @@
 = FUSE
+:description: Use HDFS FUSE driver to mount HDFS filesystems into Linux environments via a Kubernetes Pod with necessary privileges and configurations.
 
 Our images of Apache Hadoop do contain the necessary binaries and libraries to use the HDFS FUSE driver.
 
 FUSE is short for _Filesystem in Userspace_ and allows a user to export a filesystem into the Linux kernel, which can then be mounted.
 HDFS contains a native FUSE driver/application, which means that an existing HDFS filesystem can be mounted into a Linux environment.
 
 To use the FUSE driver you can either copy the required files out of the image and run it on a host outside of Kubernetes or you can run it in a Pod.
-This pod, however, will need some extra capabilities.
+This Pod, however, will need some extra capabilities.
 
-This is an example pod that will work _as long as the host system that is running the kubelet does support FUSE_:
+This is an example Pod that will work _as long as the host system that is running the kubelet does support FUSE_:
 
 [source,yaml]
 ----
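
Not part of this commit, but to give a rough idea of the "extra capabilities" mentioned above, a hedged Pod sketch; the image reference and the use of a privileged security context are assumptions, not the documented example (which is included from the adoc source below this hunk):

[source,yaml]
----
# Sketch only: image and security settings are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: hdfs-fuse
spec:
  containers:
    - name: fuse
      image: docker.stackable.tech/stackable/hadoop:3.3.4-stackable24.7.0  # assumed image tag
      command: ["sleep", "infinity"]
      securityContext:
        privileged: true   # lets the container reach /dev/fuse on the host running the kubelet
----
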
Lines changed: 4 additions & 1 deletion

@@ -1,4 +1,7 @@
 = Usage guide
+:description: Learn to configure and use the Stackable Operator for Apache HDFS. Ensure basic setup knowledge from the Getting Started guide before proceeding.
 :page-aliases: ROOT:usage.adoc
 
-This Section will help you to use and configure the Stackable Operator for Apache HDFS in various ways. You should already be familiar with how to set up a basic instance. Follow the xref:getting_started/index.adoc[] guide to learn how to set up a basic instance with all the required dependencies (for example ZooKeeper).
+This Section will help you to use and configure the Stackable Operator for Apache HDFS in various ways.
+You should already be familiar with how to set up a basic instance.
+Follow the xref:getting_started/index.adoc[] guide to learn how to set up a basic instance with all the required dependencies (for example ZooKeeper).

docs/modules/hdfs/pages/usage-guide/listenerclass.adoc

Lines changed: 3 additions & 1 deletion

@@ -1,6 +1,8 @@
 = Service exposition with ListenerClasses
+:description: Configure HDFS service exposure using ListenerClasses to control internal and external access for DataNodes and NameNodes.
 
-The operator deploys a xref:listener-operator:listener.adoc[Listener] for each DataNode and NameNode pod. They both default to only being accessible from within the Kubernetes cluster, but this can be changed by setting `.spec.{data,name}Nodes.config.listenerClass`.
+The operator deploys a xref:listener-operator:listener.adoc[Listener] for each DataNode and NameNode pod.
+They both default to only being accessible from within the Kubernetes cluster, but this can be changed by setting `.spec.{data,name}Nodes.config.listenerClass`.
 
 Note that JournalNodes are not accessible from outside the Kubernetes cluster.
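
Not part of this commit, but to illustrate the `.spec.{data,name}Nodes.config.listenerClass` field mentioned above, a hedged fragment of the HdfsCluster spec; the `external-unstable` class name is assumed from the listener-operator's preset classes:

[source,yaml]
----
# Sketch only: the ListenerClass name is an illustrative assumption.
spec:
  nameNodes:
    config:
      listenerClass: external-unstable   # expose NameNodes outside the Kubernetes cluster
  dataNodes:
    config:
      listenerClass: external-unstable   # expose DataNodes outside the Kubernetes cluster
  # JournalNodes have no listenerClass; they stay cluster-internal
----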

docs/modules/hdfs/pages/usage-guide/logging-log-aggregation.adoc

Lines changed: 2 additions & 2 deletions

@@ -1,7 +1,7 @@
 = Logging & log aggregation
+:description: The logs can be forwarded to a Vector log aggregator by providing a discovery ConfigMap for the aggregator and by enabling the log agent.
 
-The logs can be forwarded to a Vector log aggregator by providing a discovery
-ConfigMap for the aggregator and by enabling the log agent:
+The logs can be forwarded to a Vector log aggregator by providing a discovery ConfigMap for the aggregator and by enabling the log agent:
 
 [source,yaml]
 ----
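
Not part of this commit; a hedged sketch of the pattern the new description summarizes, assuming the usual Stackable fields `vectorAggregatorConfigMapName` and `logging.enableVectorAgent` (the documented snippet is included from the adoc source):

[source,yaml]
----
# Sketch only: the ConfigMap name is an illustrative assumption.
spec:
  clusterConfig:
    vectorAggregatorConfigMapName: vector-aggregator-discovery   # discovery ConfigMap of the aggregator
  nameNodes:
    config:
      logging:
        enableVectorAgent: true   # enable the log agent for this role
----
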
Lines changed: 4 additions & 2 deletions

@@ -1,9 +1,11 @@
 = Monitoring
+:description: The HDFS cluster can be monitored with Prometheus from inside or outside the K8S cluster.
 
 The cluster can be monitored with Prometheus from inside or outside the K8S cluster.
 
-All services (with the exception of the Zookeeper daemon on the node names) run with the JMX exporter agent enabled and expose metrics on the `metrics` port. This port is available from the container level up to the NodePort services.
+All services (with the exception of the Zookeeper daemon on the node names) run with the JMX exporter agent enabled and expose metrics on the `metrics` port.
+This port is available from the container level up to the NodePort services.
 
-The metrics endpoints are also used as liveliness probes by K8S.
+The metrics endpoints are also used as liveliness probes by Kubernetes.
 
 See xref:operators:monitoring.adoc[] for more details.
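
Not part of this commit; one hedged way a self-managed Prometheus could pick up those `metrics` endpoints, assuming the services carry a `prometheus.io/scrape: "true"` annotation (see the linked monitoring page for the supported setup):

[source,yaml]
----
# Sketch only: a generic Prometheus scrape job keyed on the scrape annotation.
scrape_configs:
  - job_name: hdfs
    kubernetes_sd_configs:
      - role: endpoints
    relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      - source_labels: [__meta_kubernetes_endpoint_port_name]
        action: keep
        regex: metrics   # only the JMX exporter's `metrics` port
----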
