Commit 119e99f

rgaudin authored
Merge pull request #124 from openzim/contrib
Contributors-friendly README
2 parents 5186278 + 159089e commit 119e99f

5 files changed: +148 -123 lines changed

.gitignore

Lines changed: 5 additions & 0 deletions
@@ -32,3 +32,8 @@ libzim/libzim_api.h
 Pipfile
 .dev
 .env
+
+libzim.so
+libzim.so.*
+libzim.dylib
+libzim.*.dylib
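
A quick way to see which library files the new ignore patterns cover is a small sketch with Python's `fnmatch`. It only approximates gitignore matching (real gitignore has extra rules for slashes and negation), and the helper name `is_ignored` is hypothetical. `libzim.so.7` and `libzim.7.dylib` come from MANIFEST.in and `libzim_api.h` from the hunk header; the loop just prints the result for each.

```python
from fnmatch import fnmatchcase

# Patterns added to .gitignore in this commit.
PATTERNS = ["libzim.so", "libzim.so.*", "libzim.dylib", "libzim.*.dylib"]


def is_ignored(filename: str) -> bool:
    """Approximate gitignore matching for a bare file name (no directories)."""
    return any(fnmatchcase(filename, pattern) for pattern in PATTERNS)


# File names taken from elsewhere in this commit, used for illustration.
for name in ["libzim.so", "libzim.so.7", "libzim.7.dylib", "libzim_api.h"]:
    print(name, is_ignored(name))
```

The versioned patterns matter because the downloaded binaries carry the major version in their names, so a bare `libzim.so` pattern would not cover them.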

MANIFEST.in

Lines changed: 3 additions & 0 deletions
@@ -2,6 +2,9 @@ include LICENSE
 include README.md
 include tests/*.py
 include pyproject.toml
+include setup.cfg
+include requirements-dev.txt
+include tasks.py
 
 include libzim/libzim.7.dylib
 include libzim/libzim.so.7

README.md

Lines changed: 91 additions & 35 deletions
@@ -1,16 +1,10 @@
 # python-libzim
 
-The Python-libzim package allows you to read/write [ZIM
-files](https://openzim.org) in Python. It provides a shallow Python
-interface on top of the [`libzim`](https://github.com/openzim/libzim)
-C++ library.
+`libzim` module allows you to read and write [ZIM
+files](https://openzim.org) in Python. It provides a shallow python
+interface on top of the [C++ `libzim` library](https://github.com/openzim/libzim).
 
-It is primarily used in openZIM scrapers like for example
-[`Sotoki`](https://github.com/openzim/sotoki) or
-[`Youtube2zim`](https://github.com/openzim/youtube).
-
-Read [CONTRIBUTING.md](./CONTRIBUTING.md) to know more about
-Python-libzim development.
+It is primarily used in [openZIM](https://github.com/openzim/) scrapers like [`sotoki`](https://github.com/openzim/sotoki) or [`youtube2zim`](https://github.com/openzim/youtube).
 
 [![Build Status](https://github.com/openzim/python-libzim/workflows/test/badge.svg?query=branch%3Amaster)](https://github.com/openzim/python-libzim/actions?query=branch%3Amaster)
 [![CodeFactor](https://www.codefactor.io/repository/github/openzim/python-libzim/badge)](https://www.codefactor.io/repository/github/openzim/python-libzim)
@@ -20,52 +14,114 @@ Python-libzim development.
 
 ## Installation
 
-The [PyPI package](https://pypi.org/project/libzim/) is bundled with a
-recent version of the libzim for macOS and GNU/Linux (x86_64
-architecture). For other OSes, the latest libzim has to be
-compiled manually, See [Setup hints](#setup-hints) to know more.
+```sh
+pip install libzim
+```
+
+The [PyPI package](https://pypi.org/project/libzim/) is available for x86_64 macOS and GNU/Linux only. It bundles a [recent release](http://download.openzim.org/release/libzim/) of the C++ libzim.
+
+On other platforms, you'd have to [compile C++ libzim from
+source](https://github.com/openzim/libzim) first then build this one, adjusting `LD_LIBRARY_PATH`.
 
-```bash
-pip3 install libzim
+## Contributions
+
+``` sh
+git clone git@github.com:openzim/python-libzim.git && cd python-libzim
+# python -m venv env && source env/bin/activate
+pip install -U setuptools invoke
+invoke download-libzim install-dev build-ext test
+# invoke --list for available development helpers
 ```
 
-## Quickstart
+See [CONTRIBUTING.md](./CONTRIBUTING.md) for additional details then [Open a ticket](https://github.com/openzim/python-libzim/issues/new) or submit a Pull Request on Github 🤗!
+
+## Usage
 
-### Read a ZIM
+### Read a ZIM file
 
 ```python
 from libzim.reader import Archive
+from libzim.search import Query, Searcher
+from libzim.suggestion import SuggestionSearcher
 
 zim = Archive("test.zim")
 print(f"Main entry is at {zim.main_entry.get_item().path}")
-entry = zim.get_entry_by_path("path/to/my-article")
-print(f"Entry {entry.title} at {entry.path} is {entry.get_item().size}b:")
+entry = zim.get_entry_by_path("home/fr")
+print(f"Entry {entry.title} at {entry.path} is {entry.get_item().size}b.")
 print(bytes(entry.get_item().content).decode("UTF-8"))
+
+# searching using full-text index
+search_string = "Welcome"
+query = Query().set_query(search_string)
+searcher = Searcher(zim)
+search = searcher.search(query)
+search_count = search.getEstimatedMatches()
+print(f"there are {search_count} matches for {search_string}")
+print(list(search.getResults(0, search_count)))
+
+# accessing suggestions
+search_string = "kiwix"
+suggestion_searcher = SuggestionSearcher(zim)
+suggestion = suggestion_searcher.suggest(search_string)
+suggestion_count = suggestion.getEstimatedMatches()
+print(f"there are {suggestion_count} matches for {search_string}")
+print(list(suggestion.getResults(0, suggestion_count)))
 ```
 
-### Write a ZIM
+### Write a ZIM file
 
-See [example](examples/basic_writer.py) for a basic usage of the
-writer API.
+```py
+from libzim.writer import Creator, Item, StringProvider, FileProvider, Hint
 
-## Setup hints
 
-### Installing the `libzim` dylib and headers manually
+class MyItem(Item):
+    def __init__(self, title, path, content = "", fpath = None):
+        super().__init__()
+        self.path = path
+        self.title = title
+        self.content = content
+        self.fpath = fpath
 
-If you have to install the libzim manually, you will have to [compile
-`libzim` from
-source](https://github.com/openzim/libzim). This binding has been designed
-to work with the latest version of the libzim, we only recommend to
-use it with latest released version.
+    def get_path(self):
+        return self.path
 
-If you have not installed libzim in standard directory, you will have
-to set `LD_LIBRARY_PATH` to allow python to find the library. Assuming
-you have extracted (or installed) the library if LIBZIM_DIR:
+    def get_title(self):
+        return self.title
 
-```bash
-export LD_LIBRARY_PATH="${LIBZIM_DIR}/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH"
+    def get_mimetype(self):
+        return "text/html"
+
+    def get_contentprovider(self):
+        if self.fpath is not None:
+            return FileProvider(self.fpath)
+        return StringProvider(self.content)
+
+    def get_hints(self):
+        return {Hint.FRONT_ARTICLE: True}
+
+
+content = """<html><head><meta charset="UTF-8"><title>Web Page Title</title></head>
+<body><h1>Welcome to this ZIM</h1><p>Kiwix</p></body></html>"""
+
+item = MyItem("Hello Kiwix", "home", content)
+item2 = MyItem("Bonjour Kiwix", "home/fr", None, "home-fr.html")
+
+with Creator("test.zim").config_indexing(True, "eng") as creator:
+    creator.set_mainpath("home")
+    creator.add_item(item)
+    creator.add_item(item2)
+    for name, value in {
+        "creator": "python-libzim",
+        "description": "Created in python",
+        "name": "my-zim",
+        "publisher": "You",
+        "title": "Test ZIM",
+    }.items():
+
+        creator.add_metadata(name.title(), value)
 ```
 
+
 ## License
 
 [GPLv3](https://www.gnu.org/licenses/gpl-3.0) or later, see
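
The writer example in this README hunk passes `name.title()` to `add_metadata`, so the lower-case dictionary keys are capitalised before being stored. The sketch below isolates that transformation without calling libzim at all; the helper name `titled` is hypothetical and exists only for this illustration.

```python
# Metadata names as written in the README example (lower-case keys).
metadata = {
    "creator": "python-libzim",
    "description": "Created in python",
    "name": "my-zim",
    "publisher": "You",
    "title": "Test ZIM",
}


def titled(pairs: dict) -> dict:
    """Return a copy whose keys went through str.title(), as in the README loop."""
    return {name.title(): value for name, value in pairs.items()}


titled_metadata = titled(metadata)
for name, value in titled_metadata.items():
    print(f"{name}: {value}")
```

Note that `str.title()` capitalises every word, so a multi-word key would have each of its words capitalised, not only the first one.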

examples/basic_writer.py

Lines changed: 0 additions & 88 deletions
This file was deleted.

tasks.py

Lines changed: 49 additions & 0 deletions
@@ -5,9 +5,58 @@
 A description file for invoke (https://www.pyinvoke.org/)
 """
 
+import pathlib
+import platform
+import re
+import urllib.request
+
 from invoke import task
 
 
+@task
+def download_libzim(c, version="7.0.0"):
+    """download C++ libzim binary"""
+
+    if platform.machine() != "x86_64" or platform.system() not in ("Linux", "Darwin"):
+        raise NotImplementedError(f"Platform {platform.platform()} not supported")
+
+    is_nightly = re.match(r"^\d{4}-\d{2}-\d{2}$", version)
+
+    if not is_nightly and not re.match(r"^\d\.\d\.\d$", version):
+        raise ValueError(
+            f"Unrecognised version {version}. "
+            "Must be either a x.x.x release or a Y-M-D date to use a nightly"
+        )
+
+    fname = pathlib.Path(
+        "libzim_{os}-x86_64-{version}.tar.gz".format(
+            os={"Linux": "linux", "Darwin": "macos"}.get(platform.system()),
+            version=version,
+        )
+    )
+    url = (
+        f"https://download.openzim.org/nightly/{version}/{fname.name}"
+        if is_nightly
+        else f"https://download.openzim.org/release/libzim/{fname.name}"
+    )
+    print("Downloading from", url)
+
+    with urllib.request.urlopen(url) as response, open(fname, "wb") as fh:  # nosec
+        fh.write(response.read())
+    c.run(f"tar -xvf {fname.name}")
+    c.run(f"rm -vf {fname.name}")
+
+    dname = fname.with_suffix("").stem
+    c.run(f"mv -v {dname}/include/* ./include/")
+    c.run(f"mv -v {dname}/lib/* ./lib/")
+    c.run(f"rmdir {dname}/lib {dname}/include/ {dname}")
+
+    if platform.system() == "Darwin":
+        c.run(f"ln -svf ./lib/libzim.{version[0]}.dylib ./")
+    else:
+        c.run(f"ln -svf ./lib/libzim.so.{version[0]} ./")
+
+
 @task
 def build_ext(c):
     c.run("PROFILE=1 python setup.py build_ext -i")
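
The version check and URL selection in `download_libzim` can be read in isolation as a pure function. The following is a minimal sketch that mirrors the logic of the task without any network access: the function name `libzim_archive_url`, the two URL constants, and the sample nightly date are assumptions made for illustration, not part of the commit.

```python
import re

# Base URLs as they appear in the task (constant names are hypothetical).
RELEASE_URL = "https://download.openzim.org/release/libzim"
NIGHTLY_URL = "https://download.openzim.org/nightly"


def libzim_archive_url(version: str, system: str) -> str:
    """Build the archive URL the way download_libzim does.

    `system` is expected to be a platform.system() value: "Linux" or "Darwin".
    """
    os_name = {"Linux": "linux", "Darwin": "macos"}.get(system)
    if os_name is None:
        raise NotImplementedError(f"System {system} not supported")

    # A Y-M-D date selects a nightly build; x.x.x selects a release.
    is_nightly = re.match(r"^\d{4}-\d{2}-\d{2}$", version) is not None
    if not is_nightly and not re.match(r"^\d\.\d\.\d$", version):
        raise ValueError(f"Unrecognised version {version}")

    fname = f"libzim_{os_name}-x86_64-{version}.tar.gz"
    if is_nightly:
        return f"{NIGHTLY_URL}/{version}/{fname}"
    return f"{RELEASE_URL}/{fname}"


print(libzim_archive_url("7.0.0", "Linux"))
print(libzim_archive_url("2021-05-01", "Darwin"))  # sample date, illustrative only
```

Two details of the task follow from this logic: the release pattern accepts single-digit components only, so a version such as `7.10.0` would be rejected, and the symlink step uses `version[0]` as the library major version, which for a nightly date is the first digit of the year.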
