Commit d8546d4
janhoy and epugh authored
SOLR-17961 Remove deprecated Tika Extraction Backend (#3784)
Co-authored-by: Eric Pugh <[email protected]>
1 parent 9d0a652 commit d8546d4

210 files changed

+297
-11475
lines changed

NOTICE.txt

Lines changed: 8 additions & 28 deletions
@@ -99,6 +99,10 @@ This project includes the Malihu Custom Scrollbar Plugin
 Copyright (c) Manos Malihutsakis, http://manos.malihu.gr/
 License: MIT https://github.com/malihu/malihu-custom-scrollbar-plugin/blob/master/LICENSE.txt
 
+This project includes encryption software "bouncy castle".
+Copyright (c) 2000-2006 The Legion Of The Bouncy Castle
+(http://www.bouncycastle.org)
+
 =========================================================================
 == Antlr2 Notice ==
 =========================================================================
@@ -403,39 +407,15 @@ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
 WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 
 =========================================================================
-== Apache Tika Notices ==
+== Extraction Module Notices ==
 =========================================================================
 
 The following notices apply to modules/extraction:
 
-This product includes software developed by the following copyright owners:
-
-Copyright (c) 2000-2006 The Legion Of The Bouncy Castle
-(http://www.bouncycastle.org)
-
-Copyright (c) 2003-2005, www.pdfbox.org
-
-Copyright (c) 2003-2005, www.fontbox.org
-
-Copyright (c) 1995-2005 International Business Machines Corporation and others
-
-Copyright 2001-2005 (C) MetaStuff, Ltd. All Rights Reserved.
-
-Copyright 2004 Sun Microsystems, Inc. (Rome JAR)
-
-Copyright 2002-2008 by John Cowan (TagSoup -- http://ccil.org/~cowan/XML/tagsoup/)
-
-Copyright (C) 1994-2007 by the Xiph.org Foundation, http://www.xiph.org/ (OggVorbis)
-
-Copyright 2012 Kohei Taketa juniversalchardet (http://code.google.com/p/juniversalchardet/)
-
-Lasse Collin and others, XZ for Java (http://tukaani.org/xz/java.html)
-
-java-libpst is a pure java library for the reading of Outlook PST and OST files.
-https://github.com/rjohnsondev/java-libpst
+This product includes Apache Tika Core.
 
-JMatIO is a JAVA library to read/write/manipulate with Matlab binary MAT-files.
-http://www.sourceforge.net/projects/jmatio
+Apache Tika
+Copyright 2007-2024 The Apache Software Foundation
 
 =========================================================================
 == Language Detection Notices ==

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+title: Removed LocalTikaExtractionBackend from the extraction module (SolrCell). Extraction
+  using a remote Tika Server is now the only and default option. Tika-core is upgraded
+  to v3.2.3 and still used for some SAX parsing
+type: removed
+authors:
+  - name: Jan Høydahl
+links:
+  - name: SOLR-17961
+    url: https://issues.apache.org/jira/browse/SOLR-17961
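
The changelog entry above says extraction now always goes through a remote Tika Server. As a rough illustration of what talking to such a server can look like from plain Java, here is a minimal sketch that builds an HTTP PUT for Tika Server's `/tika` endpoint, which returns the extracted text of the uploaded document. This is not the code Solr uses: the class name `TikaServerRequestSketch`, the helper `buildExtractRequest`, and the base URL `http://localhost:9998` (Tika Server's default port) are assumptions made for the example.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration only; not part of Solr or of this commit. */
public class TikaServerRequestSketch {

  /** Builds a PUT request for the Tika Server "/tika" endpoint, asking for plain text back. */
  public static HttpRequest buildExtractRequest(URI tikaBaseUrl, byte[] document, String contentType) {
    String base = tikaBaseUrl.toString();
    // Append the endpoint path without producing a double slash.
    URI endpoint = URI.create(base.endsWith("/") ? base + "tika" : base + "/tika");
    return HttpRequest.newBuilder(endpoint)
        .header("Content-Type", contentType) // tells the server what kind of document is uploaded
        .header("Accept", "text/plain")      // request extracted text rather than XHTML
        .PUT(HttpRequest.BodyPublishers.ofByteArray(document))
        .build();
  }

  /** Sends the request and returns the response body; requires a running Tika Server. */
  public static String extractText(HttpClient client, HttpRequest request) throws Exception {
    HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
    return response.body();
  }

  public static void main(String[] args) {
    // Assumed deployment: a Tika Server listening on its default port on this machine.
    URI base = URI.create("http://localhost:9998");
    byte[] doc = "hello".getBytes(StandardCharsets.UTF_8);
    HttpRequest request = buildExtractRequest(base, doc, "text/plain");
    // Only the request is built here; call extractText(...) once a server is reachable.
    System.out.println(request.method() + " " + request.uri());
  }
}
```

Building the request separately from sending it keeps the sketch checkable without a server; how Solr's extraction module actually configures and calls Tika Server is defined by the code in this commit, not by this example.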

gradle/libs.versions.toml

Lines changed: 1 addition & 7 deletions
@@ -39,9 +39,8 @@ apache-kafka = "3.9.1"
 apache-log4j = "2.21.0"
 apache-lucene = "10.3.1"
 apache-opennlp = "2.5.6"
-apache-poi = "5.2.2"
 apache-rat = "0.15"
-apache-tika = "1.28.5"
+apache-tika = "3.2.3"
 apache-tomcat = "6.0.53"
 apache-zookeeper = "3.9.4"
 # @keep for version alignment
@@ -207,7 +206,6 @@ thetaphi-forbiddenapis = "3.10"
 thisptr-jacksonjq = "0.0.13"
 threeten-bp = "1.6.8"
 undercouch-download = "5.6.0"
-xerces = "2.12.2"
 xerial-snappy = "1.1.10.8"
 
 [plugins]
@@ -298,11 +296,8 @@ apache-lucene-suggest = { module = "org.apache.lucene:lucene-suggest", version.r
 apache-lucene-testframework = { module = "org.apache.lucene:lucene-test-framework", version.ref = "apache-lucene" }
 apache-opennlp-dl = { module = "org.apache.opennlp:opennlp-dl", version.ref = "apache-opennlp" }
 apache-opennlp-tools = { module = "org.apache.opennlp:opennlp-tools", version.ref = "apache-opennlp" }
-apache-poi-ooxml = { module = "org.apache.poi:poi-ooxml", version.ref = "apache-poi" }
-apache-poi-poi = { module = "org.apache.poi:poi", version.ref = "apache-poi" }
 apache-rat-rat = { module = "org.apache.rat:apache-rat", version.ref = "apache-rat" }
 apache-tika-core = { module = "org.apache.tika:tika-core", version.ref = "apache-tika" }
-apache-tika-parsers = { module = "org.apache.tika:tika-parsers", version.ref = "apache-tika" }
 apache-tomcat-annotationsapi = { module = "org.apache.tomcat:annotations-api", version.ref = "apache-tomcat" }
 apache-zookeeper-jute = { module = "org.apache.zookeeper:zookeeper-jute", version.ref = "apache-zookeeper" }
 apache-zookeeper-zookeeper = { module = "org.apache.zookeeper:zookeeper", version.ref = "apache-zookeeper" }
@@ -528,5 +523,4 @@ tdunning-tdigest = { module = "com.tdunning:t-digest", version.ref = "tdunning-t
 testcontainers = { module = "org.testcontainers:testcontainers", version.ref = "testcontainers" }
 thisptr-jacksonjq = { module = "net.thisptr:jackson-jq", version.ref = "thisptr-jacksonjq" }
 threeten-bp = { module = "org.threeten:threetenbp", version.ref = "threeten-bp" }
-xerces-impl = { module = "xerces:xercesImpl", version.ref = "xerces" }
 xerial-snappy-java = { module = "org.xerial.snappy:snappy-java", version.ref = "xerial-snappy" }

solr/licenses/SparseBitSet-1.2.jar.sha1

Lines changed: 0 additions & 1 deletion
This file was deleted.
