Commit e56db90

Merge pull request #147 from pycompression/release_1.2.0
Release 1.2.0
2 parents de1b55d + 49ddf44 commit e56db90

15 files changed

+302
-112
lines changed

.github/workflows/ci.yml

Lines changed: 30 additions & 16 deletions
@@ -61,7 +61,7 @@ jobs:
6161
- "3.8"
6262
- "3.9"
6363
- "3.10"
64-
- "3.11-dev"
64+
- "3.11"
6565
- "pypy-3.7"
6666
- "pypy-3.8"
6767
- "pypy-3.9"
@@ -87,7 +87,7 @@ jobs:
8787
- name: Install build dependencies (Macos)
8888
# Install yasm because nasm does not work when building wheels.
8989
# Probably because of nasm-filter.sh not filtering all flags that can not be used.
90-
run: brew install nasm automake autoconf
90+
run: brew install nasm
9191
if: runner.os == 'macOS'
9292
- name: Set MSVC developer prompt
9393
uses: ilammy/[email protected]
@@ -105,22 +105,21 @@ jobs:
105105
runs-on: "ubuntu-latest"
106106
strategy:
107107
matrix:
108-
distro: [ "ubuntu_latest" ]
109-
arch: ["aarch64"]
108+
python_version:
109+
- "3.7"
110110
steps:
111111
- uses: actions/[email protected]
112112
with:
113113
submodules: recursive
114-
- uses: uraimo/run-on-arch-action@v2.2.0
114+
- uses: uraimo/run-on-arch-action@v2.5.0
115115
name: Build & run test
116116
with:
117-
arch: ${{ matrix.arch }}
118-
distro: ${{ matrix.distro }}
119-
install: |
120-
apt-get update -q -y
121-
apt-get install -q -y python3 python3-pip gcc binutils automake autoconf libtool tox
122-
run: |
123-
tox
117+
arch: none
118+
distro: none
119+
base_image: "--platform=linux/arm64 quay.io/pypa/manylinux2014_aarch64"
120+
run: |-
121+
CFLAGS="-DNDEBUG -g0" python${{matrix.python_version}} -m pip install . pytest
122+
python${{matrix.python_version}} -m pytest tests
124123
125124
# Test if the python-isal conda package can be build. Which is linked
126125
# dynamically to the conda isa-l package.
@@ -195,7 +194,7 @@ jobs:
195194
- name: Install cibuildwheel twine wheel
196195
run: python -m pip install cibuildwheel twine wheel
197196
- name: Install build dependencies (Macos)
198-
run: brew install nasm automake autoconf
197+
run: brew install nasm
199198
if: runner.os == 'macOS'
200199
- name: Set MSVC developer prompt
201200
uses: ilammy/[email protected]
@@ -216,11 +215,26 @@ jobs:
216215
CIBW_BEFORE_ALL_LINUX: ${{ matrix.cibw_before_all_linux }}
217216
# Fully test the build wheels again.
218217
CIBW_TEST_REQUIRES: "pytest"
219-
# Simple test that requires the project to be build correctly
220-
CIBW_TEST_COMMAND: >-
218+
# Simple tests that requires the project to be build correctly
219+
# Skip extensive compatibility testing which is slow.
220+
CIBW_TEST_COMMAND_LINUX: >-
221+
pytest -v {project}/tests/test_igzip.py
222+
{project}/tests/test_gzip_compliance.py
223+
{project}/tests/test_zlib_compliance.py
224+
{project}/tests/test_igzip_lib.py
225+
-k 'not test_compress_decompress'
226+
CIBW_TEST_COMMAND_MACOS: >-
227+
pytest -v {project}/tests/test_igzip.py
228+
{project}/tests/test_gzip_compliance.py
229+
{project}/tests/test_zlib_compliance.py
230+
{project}/tests/test_igzip_lib.py
231+
-k 'not test_compress_decompress'
232+
# Windows does not have the test module apparently. Do more expensive
233+
# tests to verify build.
234+
CIBW_TEST_COMMAND_WINDOWS: >-
221235
pytest {project}/tests/test_igzip.py
222-
{project}/tests/test_compat.py
223236
{project}/tests/test_igzip_lib.py
237+
{project}/tests/test_compat.py
224238
CIBW_ENVIRONMENT_LINUX: >-
225239
PYTHON_ISAL_BUILD_CACHE=True
226240
PYTHON_ISAL_BUILD_CACHE_FILE=/tmp/build_cache

CHANGELOG.rst

Lines changed: 18 additions & 0 deletions
@@ -7,6 +7,24 @@ Changelog
77
.. This document is user facing. Please word the changes in such a way
88
.. that users understand how the changes affect the new version.
99
10+
version 1.2.0
11+
-----------------
12+
+ Bgzip files are now detected and a smaller reading buffer is used to
13+
accommodate the fact that bgzip blocks are typically less than 64K. (Unlike
14+
normal gzip files that consist of one block that spans the entire file.)
15+
This has reduced decompression time for bgzip files by roughly 12%.
16+
+ Speed-up source build by using ISA-L Unix-specific makefile rather than the
17+
autotools build.
18+
+ Simplify build setup. ISA-L release flags are now used and not
19+
overwritten with python release flags when building the included static
20+
library.
21+
+ Fix bug where zdict's could not be set for ``isal_zlib.decompressobj`` and
22+
``igzip_lib.IgzipDecompressor``.
23+
+ Escape GIL when calling inflate, deflate, crc32 and adler32 functions just
24+
like in CPython. This allows for utilising more CPU cores in combination
25+
with the threading module. This comes with a very slight cost in efficiency
26+
for strict single-threaded applications.
27+
1028
version 1.1.0
1129
-----------------
1230
+ Added tests and support for Python 3.11.
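
The GIL entry in the 1.2.0 changelog can be illustrated with a small thread-pool sketch. It uses the standard library's `zlib.crc32` as a stand-in, since the changelog states the ISA-L functions now behave "just like in CPython". The helper name `checksum_chunks` and the chunk sizes are made up for this illustration. Passing `isal_zlib.crc32` as the `crc32` argument assumes python-isal is installed.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def checksum_chunks(chunks, crc32=zlib.crc32, workers=4):
    """Compute one CRC32 per chunk on a thread pool (hypothetical helper).

    Threads can only run these calls in parallel if the checksum
    function releases the GIL while it works.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the input order of the chunks.
        return list(pool.map(crc32, chunks))


# Eight 1 MiB chunks with different contents (illustrative sizes).
chunks = [bytes([i]) * (1024 * 1024) for i in range(8)]
parallel = checksum_chunks(chunks)
serial = [zlib.crc32(chunk) for chunk in chunks]
assert parallel == serial
```

Whether this is faster than the serial loop depends on chunk size and core count. The changelog itself notes a very slight cost for strictly single-threaded use.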

README.rst

Lines changed: 2 additions & 2 deletions
@@ -111,12 +111,12 @@ your project please list a python-isal dependency as follows.
111111
``setup.cfg``::
112112

113113
install_requires =
114-
isal; platform.machine == "x86_64" or platform.machine == "AMD64"
114+
isal; platform.machine == "x86_64" or platform.machine == "AMD64" or platform.machine == "aarch64"
115115

116116
``setup.py``::
117117

118118
extras_require={
119-
":platform.machine == 'x86_64' or platform.machine == 'AMD64'": ['isal']
119+
":platform.machine == 'x86_64' or platform.machine == 'AMD64' or platform.machine == 'aarch64'": ['isal']
120120
},
121121

122122
.. dependency end
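
Because the marker installs isal only on the listed machines, code that depends on it conditionally usually needs a fallback at import time. The following is a minimal sketch of that pattern, not something taken from the README. The names `ISAL_MACHINES`, `marker_matches` and `gzip_impl` are hypothetical. It relies on `isal.igzip` exposing `compress` and `decompress` like the standard `gzip` module.

```python
import platform

# Machine names copied from the updated environment marker.
ISAL_MACHINES = {"x86_64", "AMD64", "aarch64"}


def marker_matches(machine=None):
    """Return True if the marker would select isal for this machine."""
    if machine is None:
        machine = platform.machine()
    return machine in ISAL_MACHINES


try:
    # Available when the marker matched at install time.
    from isal import igzip as gzip_impl
except ImportError:
    # Standard-library fallback on all other platforms.
    import gzip as gzip_impl

data = b"example " * 1000
assert gzip_impl.decompress(gzip_impl.compress(data)) == data
```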

docs/index.rst

Lines changed: 1 addition & 2 deletions
@@ -82,8 +82,6 @@ python-isal is available on conda-forge and can be installed with::
8282
This will automatically install the ISA-L library dependency as well, since
8383
it is available on conda-forge.
8484

85-
.. _differences-with-zlib-and-gzip-modules:
86-
8785
===========================================
8886
python-isal as a dependency in your project
8987
===========================================
@@ -92,6 +90,7 @@ python-isal as a dependency in your project
9290
:start-after: .. dependency start
9391
:end-before: .. dependency end
9492

93+
.. _differences-with-zlib-and-gzip-modules:
9594

9695
======================================
9796
Differences with zlib and gzip modules

setup.py

Lines changed: 22 additions & 48 deletions
@@ -6,9 +6,9 @@
66
# This file is part of python-isal which is distributed under the
77
# PYTHON SOFTWARE FOUNDATION LICENSE VERSION 2.
88

9-
import copy
109
import functools
1110
import os
11+
import platform
1212
import shutil
1313
import subprocess
1414
import sys
@@ -72,32 +72,17 @@ def build_extension(self, ext):
7272
raise NotImplementedError(
7373
f"Unsupported platform: {sys.platform}")
7474
else:
75-
if self.compiler.compiler_type == "msvc":
76-
compiler = copy.deepcopy(self.compiler)
77-
if not compiler.initialized:
78-
compiler.initialize()
79-
compiler_command = f'"{compiler.cc}"'
80-
compiler_args = compiler.compile_options
81-
elif self.compiler.compiler_type == "unix":
82-
compiler_command = self.compiler.compiler[0]
83-
compiler_args = self.compiler.compiler[1:]
84-
else:
85-
raise NotImplementedError("Unknown compiler")
86-
isa_l_prefix_dir = build_isa_l(compiler_command,
87-
" ".join(compiler_args))
75+
isa_l_build_dir = build_isa_l()
8876
if SYSTEM_IS_UNIX:
8977
ext.extra_objects = [
90-
os.path.join(isa_l_prefix_dir, "lib", "libisal.a")]
78+
os.path.join(isa_l_build_dir, "bin", "isa-l.a")]
9179
elif SYSTEM_IS_WINDOWS:
9280
ext.extra_objects = [
93-
os.path.join(isa_l_prefix_dir, "isa-l_static.lib")]
81+
os.path.join(isa_l_build_dir, "isa-l_static.lib")]
9482
else:
9583
raise NotImplementedError(
9684
f"Unsupported platform: {sys.platform}")
97-
ext.include_dirs = [os.path.join(isa_l_prefix_dir,
98-
"include")]
99-
# -fPIC needed for proper static linking
100-
ext.extra_compile_args = ["-fPIC"]
85+
ext.include_dirs = [isa_l_build_dir]
10186
super().build_extension(ext)
10287

10388

@@ -106,62 +91,51 @@ def build_extension(self, ext):
10691
# 'cache' is only available from python 3.9 onwards.
10792
# see: https://docs.python.org/3/library/functools.html#functools.cache
10893
@functools.lru_cache(maxsize=None)
109-
def build_isa_l(compiler_command: str, compiler_options: str):
94+
def build_isa_l():
11095
# Check for cache
11196
if BUILD_CACHE:
11297
if BUILD_CACHE_FILE.exists():
11398
cache_path = Path(BUILD_CACHE_FILE.read_text())
114-
if (cache_path / "include" / "isa-l").exists():
99+
if (cache_path / "isa-l.h").exists():
115100
return str(cache_path)
116101

117102
# Creating temporary directories
118103
build_dir = tempfile.mktemp()
119-
temp_prefix = tempfile.mkdtemp()
120104
shutil.copytree(ISA_L_SOURCE, build_dir)
121105

122106
# Build environment is a copy of OS environment to allow user to influence
123107
# it.
124108
build_env = os.environ.copy()
125-
# Add -fPIC flag to allow static compilation
126-
build_env["CC"] = compiler_command
127109
if SYSTEM_IS_UNIX:
128-
build_env["CFLAGS"] = compiler_options + " -fPIC"
129-
elif SYSTEM_IS_WINDOWS:
130-
# The nmake file has CLFAGS_REL for all the compiler options.
131-
# This is added to CFLAGS with all the necessary include options.
132-
build_env["CFLAGS_REL"] = compiler_options
110+
build_env["CFLAGS"] = build_env.get("CFLAGS", "") + " -fPIC"
133111
if hasattr(os, "sched_getaffinity"):
134112
cpu_count = len(os.sched_getaffinity(0))
135113
else: # sched_getaffinity not available on all platforms
136114
cpu_count = os.cpu_count() or 1 # os.cpu_count() can return None
137115
run_args = dict(cwd=build_dir, env=build_env)
138116
if SYSTEM_IS_UNIX:
139-
subprocess.run(os.path.join(build_dir, "autogen.sh"), **run_args)
140-
subprocess.run([os.path.join(build_dir, "configure"),
141-
"--prefix", temp_prefix], **run_args)
142-
subprocess.run(["make", "-j", str(cpu_count)], **run_args)
143-
subprocess.run(["make", "-j", str(cpu_count), "install"], **run_args)
117+
if platform.machine() == "aarch64":
118+
cflags_param = "CFLAGS_aarch64"
119+
else:
120+
cflags_param = "CFLAGS_"
121+
subprocess.run(["make", "-j", str(cpu_count), "-f", "Makefile.unx",
122+
"isa-l.h", "bin/isa-l.a",
123+
f"{cflags_param}={build_env.get('CFLAGS', '')}"],
124+
**run_args)
144125
elif SYSTEM_IS_WINDOWS:
145-
subprocess.run(["nmake", "/E", "/f", "Makefile.nmake"], **run_args)
146-
Path(temp_prefix, "include").mkdir()
147-
print(temp_prefix, file=sys.stderr)
148-
shutil.copytree(os.path.join(build_dir, "include"),
149-
Path(temp_prefix, "include", "isa-l"))
150-
shutil.copy(os.path.join(build_dir, "isa-l_static.lib"),
151-
os.path.join(temp_prefix, "isa-l_static.lib"))
152-
shutil.copy(os.path.join(build_dir, "isa-l.h"),
153-
os.path.join(temp_prefix, "include", "isa-l.h"))
126+
subprocess.run(["nmake", "/f", "Makefile.nmake"], **run_args)
154127
else:
155128
raise NotImplementedError(f"Unsupported platform: {sys.platform}")
156-
shutil.rmtree(build_dir)
129+
shutil.copytree(os.path.join(build_dir, "include"),
130+
os.path.join(build_dir, "isa-l"))
157131
if BUILD_CACHE:
158-
BUILD_CACHE_FILE.write_text(temp_prefix)
159-
return temp_prefix
132+
BUILD_CACHE_FILE.write_text(build_dir)
133+
return build_dir
160134

161135

162136
setup(
163137
name="isal",
164-
version="1.1.0",
138+
version="1.2.0",
165139
description="Faster zlib and gzip compatible compression and "
166140
"decompression by providing python bindings for the ISA-L "
167141
"library.",
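
The Unix branch of the new `build_isa_l` can be read as assembling a single `make` command line. The sketch below reproduces only that assembly as a pure function, so it can be inspected without running a build. The function name `isa_l_make_command` and its parameters are hypothetical. The flag names and targets are copied from the diff above.

```python
def isa_l_make_command(machine, user_cflags, cpu_count):
    """Build the argument list the diff passes to subprocess.run on Unix.

    Hypothetical helper: the real setup.py reads CFLAGS from the
    environment and calls platform.machine() itself.
    """
    # -fPIC is appended to whatever CFLAGS the user supplied.
    cflags = user_cflags + " -fPIC"
    # The diff passes the flags through a per-architecture make variable.
    if machine == "aarch64":
        cflags_param = "CFLAGS_aarch64"
    else:
        cflags_param = "CFLAGS_"
    return ["make", "-j", str(cpu_count), "-f", "Makefile.unx",
            "isa-l.h", "bin/isa-l.a", f"{cflags_param}={cflags}"]


print(isa_l_make_command("aarch64", "-DNDEBUG -g0", 4))
```

As in the diff, an empty `CFLAGS` yields a value with a leading space (`CFLAGS_= -fPIC`).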

src/isal/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -27,4 +27,4 @@
2727
"__version__"
2828
]
2929

30-
__version__ = "1.1.0"
30+
__version__ = "1.2.0"

src/isal/igzip.py

Lines changed: 27 additions & 4 deletions
@@ -220,6 +220,22 @@ def write(self, data):
220220
return length
221221

222222

223+
def detect_bgzip(header: bytes) -> bool:
224+
if len(header) < 18:
225+
return False
226+
magic, method, flags, mtime, xfl, os, xlen, si1, si2, slen, bsize = \
227+
struct.unpack("<HBBIBBHBBHH", header[:18])
228+
return (
229+
method == 8 and # Deflate method used
230+
flags & 4 and # There are extra fields
231+
xlen == 6 and # The extra field should be of length 6
232+
si1 == 66 and # BGZIP magic number one
233+
si2 == 67 and # BGZIP magic number two
234+
slen == 2 # The length of the 16 bit integer that stores
235+
# the size of the block
236+
)
237+
238+
223239
class _PaddedFile(gzip._PaddedFile):
224240
# Overwrite _PaddedFile from gzip as its prepend method assumes that
225241
# the prepended data is always read from its _buffer. Unfortunately in
@@ -249,6 +265,15 @@ def __init__(self, fp):
249265
# Set flag indicating start of a new member
250266
self._new_member = True
251267
self._last_mtime = None
268+
self._read_buffer_size = READ_BUFFER_SIZE
269+
if hasattr(fp, "peek") and detect_bgzip(fp.peek(18)):
270+
# bgzip consists of puny little blocks of max 64K uncompressed data
271+
# so in practice probably more around 16K in compressed size. A
272+
# 128K buffer is a massive overshoot and slows down the
273+
# decompression.
274+
# bgzip stores the block size, so it can be unpacked more
275+
# efficiently but this is outside scope for python-isal.
276+
self._read_buffer_size = 16 * 1024
252277

253278
def read(self, size=-1):
254279
if size < 0:
@@ -282,7 +307,7 @@ def read(self, size=-1):
282307

283308
# Read a chunk of data from the file
284309
if self._decompressor.needs_input:
285-
buf = self._fp.read(READ_BUFFER_SIZE)
310+
buf = self._fp.read(self._read_buffer_size)
286311
uncompress = self._decompressor.decompress(buf, size)
287312
else:
288313
uncompress = self._decompressor.decompress(b"", size)
@@ -449,9 +474,7 @@ def _argument_parser():
449474
"timestamp")
450475
parser.add_argument("-f", "--force", action="store_true",
451476
help="Overwrite output without prompting")
452-
# -b flag not taken by either gzip or igzip. Hidden attribute. Above 32K
453-
# diminishing returns hit. _compression.BUFFER_SIZE = 8k. But 32K is about
454-
# ~6% faster.
477+
# -b flag not taken by either gzip or igzip. Hidden attribute.
455478
parser.add_argument("-b", "--buffer-size",
456479
default=READ_BUFFER_SIZE, type=int,
457480
help=argparse.SUPPRESS)
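
To see how the `detect_bgzip` check added in this file behaves, the sketch below restates it in standalone form and feeds it three inputs. It differs from the diff in two ways: the result is wrapped in `bool()` so the return value is always a real boolean, and the `os` field is renamed `os_id` so it does not shadow the `os` module. The header is packed by hand with the field values the function checks. The `bsize` value of 27 is only illustrative.

```python
import gzip
import struct


def detect_bgzip(header: bytes) -> bool:
    """Standalone restatement of the check added in the diff."""
    if len(header) < 18:
        return False
    (magic, method, flags, mtime, xfl, os_id,
     xlen, si1, si2, slen, bsize) = struct.unpack("<HBBIBBHBBHH", header[:18])
    return bool(
        method == 8      # Deflate method used
        and flags & 4    # FEXTRA flag: there are extra fields
        and xlen == 6    # The extra field has length 6
        and si1 == 66    # Subfield identifier 'B'
        and si2 == 67    # Subfield identifier 'C'
        and slen == 2    # Length of the 16-bit block size field
    )


# Hand-packed 18-byte header using the same struct layout.
bgzip_header = struct.pack(
    "<HBBIBBHBBHH", 0x8B1F, 8, 4, 0, 0, 255, 6, 66, 67, 2, 27)

assert detect_bgzip(bgzip_header) is True
# A regular gzip stream has no extra field, so the flag test fails.
assert detect_bgzip(gzip.compress(b"hello")) is False
# Fewer than 18 bytes can never match.
assert detect_bgzip(b"\x1f\x8b") is False
```

As in the diff, the gzip magic number is unpacked but not checked.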
