Commit 0a951f2

SOLR-17958 Deprecate TikaLanguageIdentifierUpdateProcessor (#3776) (#3783)
(cherry picked from commit 6b33f38)
1 parent 00a390c commit 0a951f2

6 files changed: 14 additions, 0 deletions

solr/CHANGES.txt

Lines changed: 2 additions & 0 deletions
@@ -64,6 +64,8 @@ Other Changes
 * SOLR-17541: Deprecate `CloudHttp2SolrClient.Builder#withHttpClient` in favor of `CloudHttp2SolrClient.Builder#withInternalClientBuilder`.
   Deprecate `LBHttp2SolrClient.Builder#withListenerFactory` in favor of `LBHttp2SolrClient.Builder#withListenerFactories`. (James Dyer)

+* SOLR-17958: The Tika Language Identifier is deprecated. Use one of the other detectors instead. (Jan Høydahl)
+
 * SOLR-17952: Stream decorator test refactoring - use underscore rather than dot in aliases (Andy Webb)

 * SOLR-17956: XLSXResponseWriter has been deprecated and will be removed in a future release. (Jan Høydahl)

solr/modules/langid/src/java/org/apache/solr/update/processor/TikaLanguageIdentifierUpdateProcessor.java

Lines changed: 2 additions & 0 deletions
@@ -34,7 +34,9 @@
  *     href="https://solr.apache.org/guide/solr/latest/indexing-guide/language-detection.html#configuring-tika-language-detection">https://solr.apache.org/guide/solr/latest/indexing-guide/language-detection.html#configuring-tika-language-detection</a>
  *
  * @since 3.5
+ * @deprecated Since 9.10, use {@link OpenNLPLangDetectUpdateProcessor} instead.
  */
+@Deprecated(since = "9.10")
 public class TikaLanguageIdentifierUpdateProcessor extends LanguageIdentifierUpdateProcessor {

   private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());

solr/modules/langid/src/java/org/apache/solr/update/processor/TikaLanguageIdentifierUpdateProcessorFactory.java

Lines changed: 2 additions & 0 deletions
@@ -43,7 +43,9 @@
  *     href="https://solr.apache.org/guide/solr/latest/indexing-guide/language-detection.html#configuring-tika-language-detection">https://solr.apache.org/guide/solr/latest/indexing-guide/language-detection.html#configuring-tika-language-detection</a>
  *
  * @since 3.5
+ * @deprecated Since 9.10, use {@link OpenNLPLangDetectUpdateProcessorFactory} instead.
  */
+@Deprecated(since = "9.10")
 public class TikaLanguageIdentifierUpdateProcessorFactory extends UpdateRequestProcessorFactory
     implements SolrCoreAware, LangIdParams {

solr/modules/langid/src/test/org/apache/solr/update/processor/TikaLanguageIdentifierUpdateProcessorFactoryTest.java

Lines changed: 1 addition & 0 deletions
@@ -20,6 +20,7 @@
 import org.apache.solr.common.params.ModifiableSolrParams;
 import org.junit.Test;

+@SuppressWarnings("deprecation")
 public class TikaLanguageIdentifierUpdateProcessorFactoryTest
     extends LanguageIdentifierUpdateProcessorFactoryTestCase {
   @Override
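
Note on the change above: the test class gains `@SuppressWarnings("deprecation")` because it still references the newly deprecated factory, which would otherwise make `javac` report a deprecation warning. The sketch below illustrates that interaction in isolation. It is not Solr code: `LegacyDetector` and `DeprecationSketch` are hypothetical names, and the only assumption carried over from the commit is the `since = "9.10"` value.

```java
/** Hypothetical stand-in for a deprecated class; not part of Solr. */
@Deprecated(since = "9.10")
class LegacyDetector {
    String detect(String text) {
        // Placeholder result; a real detector would inspect the text.
        return "en";
    }
}

// Class-level suppression, mirroring what the commit does for the test class.
// Without it, javac flags each reference to LegacyDetector from this class,
// because the use and the declaration are in different top-level classes.
@SuppressWarnings("deprecation")
public class DeprecationSketch {

    /** Calls the deprecated class without triggering a compiler warning. */
    public static String useLegacy(String text) {
        return new LegacyDetector().detect(text);
    }

    /** Returns the "since" value of @Deprecated on the type, or null if absent. */
    public static String deprecatedSince(Class<?> type) {
        // @Deprecated has runtime retention, so it is visible via reflection.
        Deprecated marker = type.getAnnotation(Deprecated.class);
        return marker == null ? null : marker.since();
    }

    public static void main(String[] args) {
        System.out.println(useLegacy("hello"));
        System.out.println(deprecatedSince(LegacyDetector.class));
    }
}
```

Suppressing at the class level keeps the existing test coverage for the deprecated factory intact until the class is actually removed.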

solr/solr-ref-guide/modules/indexing-guide/pages/language-detection.adoc

Lines changed: 5 additions & 0 deletions
@@ -55,6 +55,11 @@ Here is an example of a minimal Tika `langid` configuration in `solrconfig.xml`:
 </processor>
 ----

+[IMPORTANT]
+====
+This detector is deprecated and may be removed in a future version.
+====
+
 === Configuring LangDetect Language Detection

 Here is an example of a minimal LangDetect `langid` configuration in `solrconfig.xml`:

solr/solr-ref-guide/modules/upgrade-notes/pages/major-changes-in-solr-9.adoc

Lines changed: 2 additions & 0 deletions
@@ -90,6 +90,8 @@ Java has removed support for the Security Manager starting with Java 24; therefo

 The `XLSXResponseWriter` is now deprecated.

+The Tika Language Identifier is deprecated. Use one of the other detectors instead.
+
 The Extraction module can now extract documents using an external Tika Server.
 The local in-process Tika 1.x extractor backend is deprecated and will go away in 10.0.
