
Commit 69766d0

replace asciidocalypse elasticsearch-hadoop links
1 parent: dbeda00

7 files changed (+21, −20 lines changed)

deploy-manage/deploy/elastic-cloud/differences-from-other-elasticsearch-offerings.md

Lines changed: 1 addition & 1 deletion
@@ -153,7 +153,7 @@ The following features are planned for future support in all {{serverless-full}}
 The following features are not available in {{es-serverless}} and are not planned for future support:
 
 * [Custom plugins and bundles](/deploy-manage/deploy/elastic-cloud/upload-custom-plugins-bundles.md)
-* [{{es}} for Apache Hadoop](asciidocalypse://docs/elasticsearch-hadoop/docs/reference/elasticsearch-for-apache-hadoop.md)
+* [{{es}} for Apache Hadoop](elasticsearch-hadoop://reference/index.md)
 * [Scripted metric aggregations](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-scripted-metric-aggregation.md)
 * Managed web crawler: You can use the [self-managed web crawler](https://github.com/elastic/crawler) instead.
 * Managed Search connectors: You can use [self-managed Search connectors](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/self-managed-connectors.md) instead.

deploy-manage/security/secure-clients-integrations.md

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@ You will need to update the configuration for several [clients](httprest-clients
 
 The {{es}} {{security-features}} enable you to secure your {{es}} cluster. But {{es}} itself is only one product within the {{stack}}. It is often the case that other products in the {{stack}} are connected to the cluster and therefore need to be secured as well, or at least communicate with the cluster in a secured way:
 
-* [Apache Hadoop](asciidocalypse://docs/elasticsearch-hadoop/docs/reference/security.md)
+* [Apache Hadoop](elasticsearch-hadoop://reference/security.md)
 * [Auditbeat](asciidocalypse://docs/beats/docs/reference/auditbeat/securing-auditbeat.md)
 * [Filebeat](asciidocalypse://docs/beats/docs/reference/filebeat/securing-filebeat.md)
 * [{{fleet}} & {{agent}}](asciidocalypse://docs/docs-content/docs/reference/ingestion-tools/fleet/secure.md)

docset.yml

Lines changed: 1 addition & 0 deletions
@@ -27,6 +27,7 @@ cross_links:
 - eland
 - elastic-serverless-forwarder
 - elasticsearch
+- elasticsearch-hadoop
 - elasticsearch-java
 - elasticsearch-js
 - elasticsearch-net

raw-migrated-files/docs-content/serverless/elasticsearch-differences.md

Lines changed: 1 addition & 1 deletion
@@ -147,7 +147,7 @@ The following features are planned for future support in all {{serverless-full}}
 The following features are not available in {{es-serverless}} and are not planned for future support:
 
 * [Custom plugins and bundles](/deploy-manage/deploy/elastic-cloud/upload-custom-plugins-bundles.md)
-* [{{es}} for Apache Hadoop](asciidocalypse://docs/elasticsearch-hadoop/docs/reference/elasticsearch-for-apache-hadoop.md)
+* [{{es}} for Apache Hadoop](elasticsearch-hadoop://reference/index.md)
 * [Scripted metric aggregations](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-scripted-metric-aggregation.md)
 * Managed web crawler: You can use the [self-managed web crawler](https://github.com/elastic/crawler) instead.
 * Managed Search connectors: You can use [self-managed Search connectors](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/self-managed-connectors.md) instead.

raw-migrated-files/stack-docs/elastic-stack/overview.md

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ The products in the [{{stack}}](https://www.elastic.co/products) are designed to
 * [Beats master](asciidocalypse://docs/beats/docs/reference/index.md)
 * [APM master](https://www.elastic.co/guide/en/apm/guide/current/index.html)
 * [Elasticsearch master](/get-started/index.md)
-* [Elasticsearch Hadoop master](asciidocalypse://docs/elasticsearch-hadoop/docs/reference/preface.md)
+* [Elasticsearch Hadoop master](elasticsearch-hadoop://reference/index.md)
 * [Kibana master](/get-started/the-stack.md)
 * [Logstash master](asciidocalypse://docs/logstash/docs/reference/index.md)
 

raw-migrated-files/stack-docs/elastic-stack/upgrading-elastic-stack-on-prem.md

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ Once you are [prepared to upgrade](../../../deploy-manage/upgrade/deployment-or-
 1. Consider closing {{ml}} jobs before you start the upgrade process. While {{ml}} jobs can continue to run during a rolling upgrade, it increases the overhead on the cluster during the upgrade process.
 2. Upgrade the components of your Elastic Stack in the following order:
 
-1. {{es}} Hadoop: [install instructions](asciidocalypse://docs/elasticsearch-hadoop/docs/reference/installation.md)
+1. {{es}} Hadoop: [install instructions](elasticsearch-hadoop://reference/installation.md)
 2. {{es}}: [upgrade instructions](../../../deploy-manage/upgrade/deployment-or-cluster.md)
 3. Kibana: [upgrade instructions](../../../deploy-manage/upgrade/deployment-or-cluster.md)
 4. Java API Client: [dependency configuration](asciidocalypse://docs/elasticsearch-java/docs/reference/installation.md#maven)

troubleshoot/elasticsearch/elasticsearch-hadoop/elasticsearch-for-apache-hadoop.md

Lines changed: 15 additions & 15 deletions
@@ -9,59 +9,59 @@ mapped_pages:
 Unfortunately, sometimes things do not go as expected and your elasticsearch-hadoop job execution might go awry: incorrect data might be read or written, the job might take significantly longer than expected or you might face some exception. This section tries to provide help and tips for doing your own diagnostics, identifying the problem and hopefully fixing it.
 
 
-### `EsHadoopNoNodesLeftException` [_eshadoopnonodesleftexception]
+### `EsHadoopNoNodesLeftException` [_eshadoopnonodesleftexception]
 
-Test that {{es}} is reacheable from the Spark/Hadoop cluster where the job is running. Your machine might reach it but that is not where the actual code will be running. If ES is accessible, minimize the number of tasks and their bulk size; if {{es}} is overloaded, it will keep falling behind, GC will kick in and eventually its nodes will become unresponsive causing clients to think the machines have died. See the [*Performance considerations*](asciidocalypse://docs/elasticsearch-hadoop/docs/reference/performance-considerations.md) section for more details.
+Test that {{es}} is reacheable from the Spark/Hadoop cluster where the job is running. Your machine might reach it but that is not where the actual code will be running. If ES is accessible, minimize the number of tasks and their bulk size; if {{es}} is overloaded, it will keep falling behind, GC will kick in and eventually its nodes will become unresponsive causing clients to think the machines have died. See the [*Performance considerations*](elasticsearch-hadoop://reference/performance-considerations.md) section for more details.
 
 
-### Test your network [_test_your_network]
+### Test your network [_test_your_network]
 
 Way too many times, folks use their local, development settings in a production environment. Double check that {{es}} is accessible from your production environments, check the host address and port and that the machines where the Hadoop/Spark job is running can access {{es}} (use `curl`, `telnet` or whatever tool you have available).
 
 Using `localhost` (aka the default) in a production environment is simply a misconfiguration.
 
 
-### Triple check the classpath [_triple_check_the_classpath]
+### Triple check the classpath [_triple_check_the_classpath]
 
 Make sure to use only one version of elasticsearch-hadoop in your classpath. While it might not be obvious, the classpath in Hadoop/Spark is assembled from multiple folders; furthermore, there are no guarantees what version is going to be picked up first by the JVM. To avoid obscure issues, double check your classpath and make sure there is only one version of the library in there, the one you are interested in.
 
 
-### Isolate the issue [_isolate_the_issue]
+### Isolate the issue [_isolate_the_issue]
 
 When encountering a problem, do your best to isolate it. This can be quite tricky and many times, it is the hardest part so take your time with it. Take baby steps and try to eliminate unnecessary code or settings in small chunks until you end up with a small, tiny example that exposes your problem.
 
 
-### Use a speedy, local environment [_use_a_speedy_local_environment]
+### Use a speedy, local environment [_use_a_speedy_local_environment]
 
 A lot of Hadoop jobs are batch in nature which means they take a long time to execute. To track down the issue faster, use whatever means possible to speed-up the feedback loop: use a small/tiny dataset (no need to load millions of records, some dozens will do) and use a local/pseudo-distributed Hadoop cluster alongside an Elasticsearch node running on your development machine.
 
 
-### Check your settings [_check_your_settings]
+### Check your settings [_check_your_settings]
 
 Double check your settings and use constants or replicate configurations wherever possible. It is easy to make typos so try to reduce manual configuration by using properties files or constant interfaces/classes. If you are not sure what a setting is doing, remove it or change its value and see whether it affects your job output.
 
 
-### Verify the input and output [_verify_the_input_and_output]
+### Verify the input and output [_verify_the_input_and_output]
 
 Take a close eye at your input and output; this is typically easier to do with Elasticsearch (the service out-lives the job/script, is real-time and can be accessed right away in a flexible meaner, including the command-line). If your data is not persisted (either in Hadoop or Elasticsearch), consider doing that temporarily to validate each step of your work-flow.
 
 
-### Monitor [_monitor]
+### Monitor [_monitor]
 
 While logging helps with bugs and errors, for runtime behavior we strongly recommend doing proper monitoring of your Hadoop and {{es}} cluster. Both are outside the scope of this chapter however there are several popular, free solutions out there that are worth investigating. For {{es}}, we recommend [Marvel](https://www.elastic.co/products/marvel), a free monitoring tool (for development) created by the team behind {{es}}. Monitoring gives insight into how the cluster is actually behaving and helps you correlate behavior. If a monitoring solution is not possible, use the metrics provided by Hadoop, {{es}} and elasticsearch-hadoop to evaluate the runtime behavior.
 
 
-### Increase logging [_increase_logging]
+### Increase logging [_increase_logging]
 
-Logging gives you a lot of insight into what is going on. Hadoop, Spark and {{es}} have extensive logging mechanisms as [does](asciidocalypse://docs/elasticsearch-hadoop/docs/reference/logging.md) elasticsearch-hadoop however use that judiciously: too much logging can hide the actual issue so again, do it in small increments.
+Logging gives you a lot of insight into what is going on. Hadoop, Spark and {{es}} have extensive logging mechanisms as [does](elasticsearch-hadoop://reference/logging.md) elasticsearch-hadoop however use that judiciously: too much logging can hide the actual issue so again, do it in small increments.
 
 
-### Measure, do not assume [_measure_do_not_assume]
+### Measure, do not assume [_measure_do_not_assume]
 
 When encountering a performance issue, do some benchmarking first, in as much isolation as possible. Do not simply assume a certain component is slow; make sure/prove it actually is. Otherwise, more often than not, one might find herself fixing the wrong problem (and typically creating a new one).
 
 
-### Find a baseline [_find_a_baseline]
+### Find a baseline [_find_a_baseline]
 
 Indexing performance depends *heavily* on the type of data being targeted and its mapping. Same goes for searching but add the query definition to the mix. As mentioned before, experiment and measure the various parts of your dataset to find the sweet-spot of your environment before importing/searching big amounts of data.
 
@@ -77,7 +77,7 @@ If something is not working, there are two possibilities:
 Whichever it is, a **clear** description of the problem will help other users to help you. The more complete your report is, the quickest you will receive help from users!
 
 
-### What information is useful? [_what_information_is_useful]
+### What information is useful? [_what_information_is_useful]
 
 * OS & JVM version
 * Hadoop / Spark version / distribution
@@ -91,7 +91,7 @@ Whichever it is, a **clear** description of the problem will help other users to
 If you don’t provide all of the information, then it may be difficult for others to figure out where the issue is.
 
 
-### Where do I post my information? [_where_do_i_post_my_information]
+### Where do I post my information? [_where_do_i_post_my_information]
 
 Please don’t paste long lines of code in the mailing list or the IRC – it is difficult to read, and people will be less likely to take the time to help.
 
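The "Test your network" advice in the troubleshooting file touched by this commit (use `curl`, `telnet` or similar to confirm {{es}} is reachable from the machines running the job) can also be scripted. A minimal sketch in Python; the host name below is a hypothetical placeholder, not anything from the commit:

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        # create_connection handles DNS resolution and the connect timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


if __name__ == "__main__":
    # Hypothetical coordinates: substitute your {{es}} host and HTTP port.
    print(is_reachable("es-node.example.internal", 9200))
```

Run this from the Hadoop/Spark worker nodes themselves, not your workstation, since that is where the connector code actually executes.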
