@@ -4,6 +4,123 @@ brevity. Much more detail can be found in the git revision history:
 
     https://github.com/jemalloc/jemalloc
 
+* 5.1.0 (May 4th, 2018)
+
+  This release is primarily about fine-tuning, ranging from several new features
+  to numerous notable performance and portability enhancements. The release and
+  prior dev versions have been running in multiple large scale applications for
+  months, and the cumulative improvements are substantial in many cases.
+
+  Given the long and successful production runs, this release is likely a good
+  candidate for applications to upgrade, from both jemalloc 5.0 and before. For
+  performance-critical applications, the newly added TUNING.md provides
+  guidelines on jemalloc tuning.
+
+  New features:
+  - Implement transparent huge page support for internal metadata. (@interwq)
+  - Add opt.thp to allow enabling / disabling transparent huge pages for all
+    mappings. (@interwq)
+  - Add maximum background thread count option. (@djwatson)
+  - Allow prof_active to control opt.lg_prof_interval and prof.gdump.
+    (@interwq)
+  - Allow arena index lookup based on allocation addresses via mallctl.
+    (@lionkov)
+  - Allow disabling initial-exec TLS model. (@davidtgoldblatt, @KenMacD)
+  - Add opt.lg_extent_max_active_fit to set the max ratio between the size of
+    the active extent selected (to split off from) and the size of the requested
+    allocation. (@interwq, @davidtgoldblatt)
+  - Add retain_grow_limit to set the max size when growing virtual address
+    space. (@interwq)
+  - Add mallctl interfaces:
+    + arena.<i>.retain_grow_limit (@interwq)
+    + arenas.lookup (@lionkov)
+    + max_background_threads (@djwatson)
+    + opt.lg_extent_max_active_fit (@interwq)
+    + opt.max_background_threads (@djwatson)
+    + opt.metadata_thp (@interwq)
+    + opt.thp (@interwq)
+    + stats.metadata_thp (@interwq)
+
+  Portability improvements:
+  - Support GNU/kFreeBSD configuration. (@paravoid)
+  - Support m68k, nios2 and SH3 architectures. (@paravoid)
+  - Fall back to FD_CLOEXEC when O_CLOEXEC is unavailable. (@zonyitoo)
+  - Fix symbol listing for cross-compiling. (@tamird)
+  - Fix high bits computation on ARM. (@davidtgoldblatt, @paravoid)
+  - Disable the CPU_SPINWAIT macro for Power. (@davidtgoldblatt, @marxin)
+  - Fix MSVC 2015 & 2017 builds. (@rustyx)
+  - Improve RISC-V support. (@EdSchouten)
+  - Set name mangling script in strict mode. (@nicolov)
+  - Avoid MADV_HUGEPAGE on ARM. (@marxin)
+  - Modify configure to determine return value of strerror_r.
+    (@davidtgoldblatt, @cferris1000)
+  - Make sure CXXFLAGS is tested with CPP compiler. (@nehaljwani)
+  - Fix 32-bit build on MSVC. (@rustyx)
+  - Fix external symbol on MSVC. (@maksqwe)
+  - Avoid a printf format specifier warning. (@jasone)
+  - Add configure option --disable-initial-exec-tls which can allow jemalloc to
+    be dynamically loaded after program startup. (@davidtgoldblatt, @KenMacD)
+  - AArch64: Add ILP32 support. (@cmuellner)
+  - Add --with-lg-vaddr configure option to support cross compiling.
+    (@cmuellner, @davidtgoldblatt)
+
+  Optimizations and refactors:
+  - Improve active extent fit with extent_max_active_fit. This considerably
+    reduces fragmentation over time and improves virtual memory and metadata
+    usage. (@davidtgoldblatt, @interwq)
+  - Eagerly coalesce large extents to reduce fragmentation. (@interwq)
+  - sdallocx: only read size info when page aligned (i.e. possibly sampled),
+    which speeds up the sized deallocation path significantly. (@interwq)
+  - Avoid attempting new mappings for in place expansion with retain, since
+    it rarely succeeds in practice and causes high overhead. (@interwq)
+  - Refactor OOM handling in newImpl. (@wqfish)
+  - Add internal fine-grained logging functionality for debugging use.
+    (@davidtgoldblatt)
+  - Refactor arena / tcache interactions. (@davidtgoldblatt)
+  - Refactor extent management with dumpable flag. (@davidtgoldblatt)
+  - Add runtime detection of lazy purging. (@interwq)
+  - Use pairing heap instead of red-black tree for extents_avail. (@djwatson)
+  - Use sysctl on startup in FreeBSD. (@trasz)
+  - Use thread local prng state instead of atomic. (@djwatson)
+  - Make decay always purge one more extent than before, because in
+    practice large extents are usually the ones that cross the decay threshold.
+    Purging the additional extent helps save memory as well as reduce VM
+    fragmentation. (@interwq)
+  - Fast division by dynamic values. (@davidtgoldblatt)
+  - Improve the fit for aligned allocation. (@interwq, @edwinsmith)
+  - Refactor extent_t bitpacking. (@rkmisra)
+  - Optimize the generated assembly for ticker operations. (@davidtgoldblatt)
+  - Convert stats printing to use a structured text emitter. (@davidtgoldblatt)
+  - Remove preserve_lru feature for extents management. (@djwatson)
+  - Consolidate two memory loads into one on the fast deallocation path.
+    (@davidtgoldblatt, @interwq)
+
+  Bug fixes (most of the issues are only relevant to jemalloc 5.0):
+  - Fix deadlock with multithreaded fork in OS X. (@davidtgoldblatt)
+  - Validate returned file descriptor before use. (@zonyitoo)
+  - Fix a few background thread initialization and shutdown issues. (@interwq)
+  - Fix an extent coalesce + decay race by taking both coalescing extents off
+    the LRU list. (@interwq)
+  - Fix a potentially unbounded increase during decay, caused by one thread
+    continually stashing memory to purge while other threads generate new
+    pages. The number of pages to purge is now checked to prevent this.
+    (@interwq)
+  - Fix a FreeBSD bootstrap assertion. (@strejda, @interwq)
+  - Handle 32 bit mutex counters. (@rkmisra)
+  - Fix an indexing bug when creating background threads. (@davidtgoldblatt,
+    @binliu19)
+  - Fix arguments passed to extent_init. (@yuleniwo, @interwq)
+  - Fix addresses used for ordering mutexes. (@rkmisra)
+  - Fix abort_conf processing during bootstrap. (@interwq)
+  - Fix include path order for out-of-tree builds. (@cmuellner)
+
+  Incompatible changes:
+  - Remove --disable-thp. (@interwq)
+  - Remove mallctl interfaces:
+    + config.thp (@interwq)
+
+  Documentation:
+  - Add TUNING.md. (@interwq, @davidtgoldblatt, @djwatson)
+
 * 5.0.1 (July 1, 2017)
 
   This bugfix release fixes several issues, most of which are obscure enough
@@ -22,7 +139,7 @@ brevity. Much more detail can be found in the git revision history:
     unlikely to be an issue with other libc implementations. (@interwq)
   - Mask signals during background thread creation. This prevents signals from
     being inadvertently delivered to background threads. (@jasone,
-    @davidgoldblatt, @interwq)
+    @davidtgoldblatt, @interwq)
   - Avoid inactivity checks within background threads, in order to prevent
     recursive mutex acquisition. (@interwq)
   - Fix extent_grow_retained() to use the specified hooks when the
@@ -515,7 +632,7 @@ brevity. Much more detail can be found in the git revision history:
   these fixes, xallocx() now tries harder to partially fulfill requests for
   optional extra space. Note that a couple of minor heap profiling
   optimizations are included, but these are better thought of as performance
-  fixes that were integral to disovering most of the other bugs.
+  fixes that were integral to discovering most of the other bugs.
 
   Optimizations:
   - Avoid a chunk metadata read in arena_prof_tctx_set(), since it is in the
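Note on the "Fast division by dynamic values" entry: the changelog names the optimization without describing it. As a rough illustration only, and not necessarily how jemalloc implements it, the sketch below precomputes a multiplier for a divisor that is known only at run time, so that each later division becomes a multiply and a shift. The names `fast_div_t`, `fast_div_init`, and `fast_div_exact` are hypothetical, and the sketch assumes the dividend is always an exact multiple of the divisor.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper type, not jemalloc's internal API.
 * Holds ceil(2^32 / d) for a divisor d chosen at run time.
 */
typedef struct {
    uint64_t magic;
} fast_div_t;

/* Precompute the multiplier once per divisor. */
static void fast_div_init(fast_div_t *fd, uint32_t d) {
    assert(d != 0); /* Division by zero is meaningless. */
    const uint64_t two_to_32 = (uint64_t)1 << 32;
    /* Round up so that exact multiples of d never come out one too low. */
    fd->magic = two_to_32 / d + (two_to_32 % d != 0 ? 1 : 0);
}

/*
 * Return n / d using one multiply and one shift.
 * Assumption: n is an exact multiple of d. For other inputs the
 * result may differ from n / d.
 */
static uint32_t fast_div_exact(const fast_div_t *fd, uint32_t n) {
    return (uint32_t)(((uint64_t)n * fd->magic) >> 32);
}
```

A caller would run `fast_div_init` once when the divisor becomes known and then call `fast_div_exact` on the hot path. The exact-multiple restriction is what keeps the multiplier small enough for a single 64-bit multiply; a general-purpose version would need a wider multiplier or a correction step.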