@@ -4,7 +4,110 @@ brevity. Much more detail can be found in the git revision history:
 
     https://github.com/jemalloc/jemalloc
 
-* 5.1.0 (May 4th, 2018)
+* 5.2.0 (April 2, 2019)
+
+  This release includes a few notable improvements, which are summarized below:
+  1) improved fast-path performance from the optimizations by @djwatson; 2)
+  reduced virtual memory fragmentation and metadata usage; and 3) bug fixes on
+  setting the number of background threads. In addition, peak / spike memory
+  usage is improved with certain allocation patterns. As usual, the release and
+  prior dev versions have gone through large-scale production testing.
+
+  New features:
+  - Implement oversize_threshold, which uses a dedicated arena for allocations
+    crossing the specified threshold to reduce fragmentation. (@interwq)
+  - Add extents usage information to stats. (@tyleretzel)
+  - Log time information for sampled allocations. (@tyleretzel)
+  - Support 0 size in sdallocx. (@djwatson)
+  - Output rate for certain counters in malloc_stats. (@zinoale)
+  - Add configure option --enable-readlinkat, which allows the use of readlinkat
+    over readlink. (@davidtgoldblatt)
+  - Add configure options --{enable,disable}-{static,shared} to allow not
+    building unwanted libraries. (@Ericson2314)
+  - Add configure option --disable-libdl to enable fully static builds.
+    (@interwq)
+  - Add mallctl interfaces:
+    + opt.oversize_threshold (@interwq)
+    + stats.arenas.<i>.extent_avail (@tyleretzel)
+    + stats.arenas.<i>.extents.<j>.n{dirty,muzzy,retained} (@tyleretzel)
+    + stats.arenas.<i>.extents.<j>.{dirty,muzzy,retained}_bytes
+      (@tyleretzel)
+
+  Portability improvements:
+  - Update MSVC builds. (@maksqwe, @rustyx)
+  - Workaround a compiler optimizer bug on s390x. (@rkmisra)
+  - Make use of pthread_set_name_np(3) on FreeBSD. (@trasz)
+  - Implement malloc_getcpu() to enable percpu_arena for windows. (@santagada)
+  - Link against -pthread instead of -lpthread. (@paravoid)
+  - Make background_thread not dependent on libdl. (@interwq)
+  - Add stringify to fix a linker directive issue on MSVC. (@daverigby)
+  - Detect and fall back when 8-bit atomics are unavailable. (@interwq)
+  - Fall back to the default pthread_create if dlsym(3) fails. (@interwq)
+
+  Optimizations and refactors:
+  - Refactor the TSD module. (@davidtgoldblatt)
+  - Avoid taking extents_muzzy mutex when muzzy is disabled. (@interwq)
+  - Avoid taking large_mtx for auto arenas on the tcache flush path. (@interwq)
+  - Optimize ixalloc by avoiding a size lookup. (@interwq)
+  - Implement opt.oversize_threshold which uses a dedicated arena for requests
+    crossing the threshold, also eagerly purges the oversize extents. Default
+    the threshold to 8 MiB. (@interwq)
+  - Clean compilation with -Wextra. (@gnzlbg, @jasone)
+  - Refactor the size class module. (@davidtgoldblatt)
+  - Refactor the stats emitter. (@tyleretzel)
+  - Optimize pow2_ceil. (@rkmisra)
+  - Avoid runtime detection of lazy purging on FreeBSD. (@trasz)
+  - Optimize mmap(2) alignment handling on FreeBSD. (@trasz)
+  - Improve error handling for THP state initialization. (@jsteemann)
+  - Rework the malloc() fast path. (@djwatson)
+  - Rework the free() fast path. (@djwatson)
+  - Refactor and optimize the tcache fill / flush paths. (@djwatson)
+  - Optimize sync / lwsync on PowerPC. (@chmeeedalf)
+  - Bypass extent_dalloc() when retain is enabled. (@interwq)
+  - Optimize the locking on large deallocation. (@interwq)
+  - Reduce the number of pages committed from sanity checking in debug build.
+    (@trasz, @interwq)
+  - Deprecate OSSpinLock. (@interwq)
+  - Lower the default number of background threads to 4 (when the feature
+    is enabled). (@interwq)
+  - Optimize the trylock spin wait. (@djwatson)
+  - Use arena index for arena-matching checks. (@interwq)
+  - Avoid forced decay on thread termination when using background threads.
+    (@interwq)
+  - Disable muzzy decay by default. (@djwatson, @interwq)
+  - Only initialize libgcc unwinder when profiling is enabled. (@paravoid,
+    @interwq)
+
+  Bug fixes (all only relevant to jemalloc 5.x):
+  - Fix background thread index issues with max_background_threads. (@djwatson,
+    @interwq)
+  - Fix stats output for opt.lg_extent_max_active_fit. (@interwq)
+  - Fix opt.prof_prefix initialization. (@davidtgoldblatt)
+  - Properly trigger decay on tcache destroy. (@interwq, @amosbird)
+  - Fix tcache.flush. (@interwq)
+  - Detect whether explicit extent zero out is necessary with huge pages or
+    custom extent hooks, which may change the purge semantics. (@interwq)
+  - Fix a side effect caused by extent_max_active_fit combined with decay-based
+    purging, where freed extents can accumulate and not be reused for an
+    extended period of time. (@interwq, @mpghf)
+  - Fix a missing unlock on extent register error handling. (@zoulasc)
+
+  Testing:
+  - Simplify the Travis script output. (@gnzlbg)
+  - Update the test scripts for FreeBSD. (@devnexen)
+  - Add unit tests for the producer-consumer pattern. (@interwq)
+  - Add Cirrus-CI config for FreeBSD builds. (@jasone)
+  - Add size-matching sanity checks on tcache flush. (@davidtgoldblatt,
+    @interwq)
+
+  Incompatible changes:
+  - Remove --with-lg-page-sizes. (@davidtgoldblatt)
+
+  Documentation:
+  - Attempt to build docs by default, however skip doc building when xsltproc
+    is missing. (@interwq, @cmuellner)
+
+* 5.1.0 (May 4, 2018)
 
   This release is primarily about fine-tuning, ranging from several new features
   to numerous notable performance and portability enhancements. The release and