@@ -54,7 +54,7 @@ Testing it out
 
 Parser Interface (backwards compat prior to REST)
 -------------------------------------------------
-```
+```python
 #!/usr/bin/env python
 import tika
 tika.initVM()
@@ -75,7 +75,7 @@
 The parser interface needs the following environment variable set on the console for printing of the extracted content.
 ```export PYTHONIOENCODING=utf8```
 
-```
+```python
 #!/usr/bin/env python
 import tika
 from tika import parser
@@ -87,7 +87,7 @@ print(parsed["content"])
 Optionally, you can pass the Tika server URL along with the call,
 which is useful for multi-instance execution or when Tika is dockerized/linked.
 
-```
+```python
 parsed = parser.from_file('/path/to/file', 'http://tika:9998/tika')
 string_parsed = parser.from_buffer('Good evening, Dave', 'http://tika:9998/tika')
 ```
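
For the multi-instance case mentioned above, one possible approach is to rotate requests across several server endpoints. The following is a minimal sketch, not part of tika-python: the `TIKA_SERVERS` list and the `next_server` helper are hypothetical names, and the hostnames are placeholders. It relies only on the second positional argument of `parser.from_file` shown above.

```python
from itertools import cycle

# Hypothetical endpoints; replace with the URLs of your own Tika servers.
TIKA_SERVERS = [
    'http://tika-1:9998/tika',
    'http://tika-2:9998/tika',
]

# cycle() yields the endpoints in order and starts over after the last one.
_server_pool = cycle(TIKA_SERVERS)


def next_server():
    """Return the next endpoint in round-robin order (illustrative helper)."""
    return next(_server_pool)


# Usage (requires tika-python and reachable servers, so it is left commented out):
# from tika import parser
# parsed = parser.from_file('/path/to/file', next_server())
```

Round-robin is only one way to spread the load; it assumes every server is configured identically.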
@@ -101,7 +101,7 @@ Note:
 The parser interface needs the following environment variable set on the console for printing of the extracted content.
 ```export PYTHONIOENCODING=utf8```
 
-```
+```python
 #!/usr/bin/env python
 import tika
 from tika import parser
@@ -118,7 +118,7 @@ The unpack interface handles both metadata and text extraction in a single
 call and returns a tarball of metadata and text entries that
 is unpacked internally, reducing the wire load for extraction.
 
-```
+```python
 #!/usr/bin/env python
 import tika
 from tika import unpack
@@ -130,7 +130,7 @@ Detect Interface
 The detect interface provides an IANA MIME type classification for the
 provided file.
 
-```
+```python
 #!/usr/bin/env python
 import tika
 from tika import detector
@@ -143,7 +143,7 @@ The config interface allows you to inspect the Tika Server environment's
 configuration including what parsers, mime types, and detectors the
 server has been configured with.
 
-```
+```python
 #!/usr/bin/env python
 import tika
 from tika import config
@@ -157,7 +157,7 @@ Language Detection Interface
 The language detection interface provides a 2-character language
 code detected from the text in the provided file.
 
-```
+```python
 #!/usr/bin/env python
 from tika import language
 print(language.from_file('/path/to/file'))
@@ -168,7 +168,7 @@ Translate Interface
 The translate interface translates the text automatically extracted
 by Tika from the source language to the destination language.
 
-```
+```python
 #!/usr/bin/env python
 from tika import translate
 print(translate.from_file('/path/to/spanish', 'es', 'en'))
@@ -225,7 +225,7 @@ Customizing the Tika Server Request
 ---------------------------
 You may customize the outgoing HTTP request to Tika server by setting `requestOptions` on the `.from_file` and `.from_buffer` methods (Parser, Unpack, Detect, Config, Language, Translate). It should be a dictionary of arguments that will be passed to the request method. The [request method documentation](https://requests.kennethreitz.org/en/master/api/#requests.request) specifies valid arguments. This will override any defaults except for `url` and `params`/`data`.
 
-```
+```python
 from tika import parser
 parsed = parser.from_file('/path/to/file', requestOptions={'timeout': 120})
 ```
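
Because `requestOptions` is passed through to the request method, values accepted by `requests` should work here as well. As an illustrative sketch (the `build_request_options` helper is hypothetical, not part of tika-python), a timeout can be given as a `(connect, read)` tuple, which `requests` accepts in place of a single number:

```python
def build_request_options(connect_timeout=5, read_timeout=120):
    """Build a requestOptions dictionary (hypothetical helper).

    requests accepts the timeout either as a single number or as a
    (connect, read) tuple; this sketch uses the tuple form.
    """
    return {'timeout': (connect_timeout, read_timeout)}


options = build_request_options(read_timeout=300)

# Usage (requires tika-python and a running Tika server, so it is commented out):
# from tika import parser
# parsed = parser.from_file('/path/to/file', requestOptions=options)
```

Separating the two values lets a slow extraction run to completion while an unreachable server still fails quickly.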
@@ -240,7 +240,7 @@ The options and help for the command line tool can be seen by typing
 `tika-python` without any arguments. This will also download a copy of
 the tika-server jar and start it if you haven't done so already.
 
-```
+```bash
 tika.py [-v] [-o <outputDir>] [--server <TikaServerEndpoint>] [--install <UrlToTikaServerJar>] [--port <portNumber>] <command> <option> <urlOrPathToFile>
 
 tika.py parse all test.pdf test2.pdf (write output JSON metadata files for test1.pdf_meta.json and test2.pdf_meta.json)