Commit aae8fd9

Update ChangeLog for 5.0.0.
1 parent bff8db4 commit aae8fd9

1 file changed: +187 -0 lines changed

ChangeLog

Lines changed: 187 additions & 0 deletions
@@ -4,6 +4,193 @@ brevity. Much more detail can be found in the git revision history:

 https://github.com/jemalloc/jemalloc

+* 5.0.0 (June 13, 2017)
+
+  Unlike all previous jemalloc releases, this release does not use naturally
+  aligned "chunks" for virtual memory management, and instead uses page-aligned
+  "extents". This change has few externally visible effects, but the internal
+  impacts are... extensive. Many other internal changes combine to make this
+  the most cohesively designed version of jemalloc so far, with ample
+  opportunity for further enhancements.
+
+  Continuous integration is now an integral aspect of development thanks to the
+  efforts of @davidtgoldblatt, and the dev branch tends to remain reasonably
+  stable on the tested platforms (Linux, FreeBSD, macOS, and Windows). As a
+  side effect the official release frequency may decrease over time.
+
+  New features:
+  - Implement optional per-CPU arena support; threads choose which arena to use
+    based on current CPU rather than on fixed thread-->arena associations.
+    (@interwq)
+  - Implement two-phase decay of unused dirty pages. Pages transition from
+    dirty-->muzzy-->clean, where the first phase transition relies on
+    madvise(... MADV_FREE) semantics, and the second phase transition discards
+    pages such that they are replaced with demand-zeroed pages on next access.
+    (@jasone)
+  - Increase decay time resolution from seconds to milliseconds. (@jasone)
+  - Implement opt-in per CPU background threads, and use them for asynchronous
+    decay-driven unused dirty page purging. (@interwq)
+  - Add mutex profiling, which collects a variety of statistics useful for
+    diagnosing overhead/contention issues. (@interwq)
+  - Add C++ new/delete operator bindings. (@djwatson)
+  - Support manually created arena destruction, such that all data and metadata
+    are discarded. Add MALLCTL_ARENAS_DESTROYED for accessing merged stats
+    associated with destroyed arenas. (@jasone)
+  - Add MALLCTL_ARENAS_ALL as a fixed index for use in accessing
+    merged/destroyed arena statistics via mallctl. (@jasone)
+  - Add opt.abort_conf to optionally abort if invalid configuration options are
+    detected during initialization. (@interwq)
+  - Add opt.stats_print_opts, so that e.g. JSON output can be selected for the
+    stats dumped during exit if opt.stats_print is true. (@jasone)
+  - Add --with-version=VERSION for use when embedding jemalloc into another
+    project's git repository. (@jasone)
+  - Add --disable-thp to support cross compiling. (@jasone)
+  - Add --with-lg-hugepage to support cross compiling. (@jasone)
+  - Add mallctl interfaces (various authors):
+    + background_thread
+    + opt.abort_conf
+    + opt.retain
+    + opt.percpu_arena
+    + opt.background_thread
+    + opt.{dirty,muzzy}_decay_ms
+    + opt.stats_print_opts
+    + arena.<i>.initialized
+    + arena.<i>.destroy
+    + arena.<i>.{dirty,muzzy}_decay_ms
+    + arena.<i>.extent_hooks
+    + arenas.{dirty,muzzy}_decay_ms
+    + arenas.bin.<i>.slab_size
+    + arenas.nlextents
+    + arenas.lextent.<i>.size
+    + arenas.create
+    + stats.background_thread.{num_threads,num_runs,run_interval}
+    + stats.mutexes.{ctl,background_thread,prof,reset}.
+      {num_ops,num_spin_acq,num_wait,max_wait_time,total_wait_time,max_num_thds,
+      num_owner_switch}
+    + stats.arenas.<i>.{dirty,muzzy}_decay_ms
+    + stats.arenas.<i>.uptime
+    + stats.arenas.<i>.{pmuzzy,base,internal,resident}
+    + stats.arenas.<i>.{dirty,muzzy}_{npurge,nmadvise,purged}
+    + stats.arenas.<i>.bins.<j>.{nslabs,reslabs,curslabs}
+    + stats.arenas.<i>.bins.<j>.mutex.
+      {num_ops,num_spin_acq,num_wait,max_wait_time,total_wait_time,max_num_thds,
+      num_owner_switch}
+    + stats.arenas.<i>.lextents.<j>.{nmalloc,ndalloc,nrequests,curlextents}
+    + stats.arenas.<i>.mutexes.{large,extent_avail,extents_dirty,extents_muzzy,
+      extents_retained,decay_dirty,decay_muzzy,base,tcache_list}.
+      {num_ops,num_spin_acq,num_wait,max_wait_time,total_wait_time,max_num_thds,
+      num_owner_switch}
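
The two-phase decay listed under "New features" (dirty-->muzzy-->clean) can be pictured as a small state machine. The following is a minimal C sketch under stated assumptions, not jemalloc's implementation: the names page_state_t, page_t, and page_decay_step are hypothetical, and each phase uses one fixed deadline in milliseconds, which is a simplification rather than whatever purging schedule jemalloc actually applies. A real allocator would rely on madvise(... MADV_FREE) at the first transition and discard the pages at the second; the sketch only tracks the state.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical page states, named after the changelog's dirty-->muzzy-->clean. */
typedef enum { PAGE_DIRTY, PAGE_MUZZY, PAGE_CLEAN } page_state_t;

/* Hypothetical bookkeeping for one unused page. */
typedef struct {
	page_state_t state;
	uint64_t since_ms; /* time at which the page entered its current state */
} page_t;

/*
 * Advance one page through the two decay phases, given the current time and
 * one decay time per phase (all in milliseconds).  Returns the new state.
 */
static page_state_t
page_decay_step(page_t *page, uint64_t now_ms, uint64_t dirty_decay_ms,
    uint64_t muzzy_decay_ms) {
	assert(now_ms >= page->since_ms);
	if (page->state == PAGE_DIRTY &&
	    now_ms - page->since_ms >= dirty_decay_ms) {
		/* Phase 1: a real allocator would use MADV_FREE semantics here. */
		page->state = PAGE_MUZZY;
		page->since_ms += dirty_decay_ms;
	}
	if (page->state == PAGE_MUZZY &&
	    now_ms - page->since_ms >= muzzy_decay_ms) {
		/* Phase 2: pages are discarded and demand-zeroed on next access. */
		page->state = PAGE_CLEAN;
		page->since_ms += muzzy_decay_ms;
	}
	return page->state;
}
```

With both decay times set to 10000 ms, a page that became dirty at time 0 is still dirty at 5000 ms, muzzy from 10000 ms, and clean from 20000 ms. Millisecond arguments mirror the changelog's move from second to millisecond resolution.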
+
+  Portability improvements:
+  - Improve reentrant allocation support, such that deadlock is less likely if
+    e.g. a system library call in turn allocates memory. (@davidtgoldblatt,
+    @interwq)
+  - Support static linking of jemalloc with glibc. (@djwatson)
+
+  Optimizations and refactors:
+  - Organize virtual memory as "extents" of virtual memory pages, rather than as
+    naturally aligned "chunks", and store all metadata in arbitrarily distant
+    locations. This reduces virtual memory external fragmentation, and will
+    interact better with huge pages (not yet explicitly supported). (@jasone)
+  - Fold large and huge size classes together; only small and large size classes
+    remain. (@jasone)
+  - Unify the allocation paths, and merge most fast-path branching decisions.
+    (@davidtgoldblatt, @interwq)
+  - Embed per thread automatic tcache into thread-specific data, which reduces
+    conditional branches and dereferences. Also reorganize tcache to increase
+    fast-path data locality. (@interwq)
+  - Rewrite atomics to closely model the C11 API, convert various
+    synchronization from mutex-based to atomic, and use the explicit memory
+    ordering control to resolve various hypothetical races without increasing
+    synchronization overhead. (@davidtgoldblatt)
+  - Extensively optimize rtree via various methods:
+    + Add multiple layers of rtree lookup caching, since rtree lookups are now
+      part of fast-path deallocation. (@interwq)
+    + Determine rtree layout at compile time. (@jasone)
+    + Make the tree shallower for common configurations. (@jasone)
+    + Embed the root node in the top-level rtree data structure, thus avoiding
+      one level of indirection. (@jasone)
+    + Further specialize leaf elements as compared to internal node elements,
+      and directly embed extent metadata needed for fast-path deallocation.
+      (@jasone)
+    + Ignore leading always-zero address bits (architecture-specific).
+      (@jasone)
+  - Reorganize headers (ongoing work) to make them hermetic, and disentangle
+    various module dependencies. (@davidtgoldblatt)
+  - Convert various internal data structures such as size class metadata from
+    boot-time-initialized to compile-time-initialized. Propagate resulting data
+    structure simplifications, such as making arena metadata fixed-size.
+    (@jasone)
+  - Simplify size class lookups when constrained to size classes that are
+    multiples of the page size. This speeds lookups, but the primary benefit is
+    complexity reduction in code that was the source of numerous regressions.
+    (@jasone)
+  - Lock individual extents when possible for localized extent operations,
+    rather than relying on a top-level arena lock. (@davidtgoldblatt, @jasone)
+  - Use first fit layout policy instead of best fit, in order to improve
+    packing. (@jasone)
+  - If munmap(2) is not in use, use an exponential series to grow each arena's
+    virtual memory, so that the number of disjoint virtual memory mappings
+    remains low. (@jasone)
+  - Implement per arena base allocators, so that arenas never share any virtual
+    memory pages. (@jasone)
+  - Automatically generate private symbol name mangling macros. (@jasone)
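
To see why growing an arena's virtual memory by an exponential series keeps the number of disjoint mappings low, the following C sketch counts mappings under two policies. It is an illustration under stated assumptions: the function names mappings_exponential and mappings_fixed are hypothetical, and plain doubling is assumed as the series, whereas the changelog says only "an exponential series" and does not give the growth factor.

```c
#include <assert.h>
#include <stdint.h>

/* Inputs are assumed to stay at or below 2^62 bytes so the sums cannot overflow. */
#define SKETCH_MAX_BYTES ((uint64_t)1 << 62)

/*
 * Number of mappings needed to cover at least `target` bytes when each new
 * mapping is twice the size of the previous one (assumed growth factor).
 */
static unsigned
mappings_exponential(uint64_t first_size, uint64_t target) {
	assert(first_size > 0 && first_size <= SKETCH_MAX_BYTES);
	assert(target <= SKETCH_MAX_BYTES);
	unsigned count = 0;
	uint64_t total = 0;
	uint64_t next = first_size;
	while (total < target) {
		total += next;
		next *= 2;
		count++;
	}
	return count;
}

/* Number of mappings needed when every mapping has the same fixed size. */
static uint64_t
mappings_fixed(uint64_t size, uint64_t target) {
	assert(size > 0 && size <= SKETCH_MAX_BYTES);
	assert(target <= SKETCH_MAX_BYTES);
	return (target + size - 1) / size;
}
```

Starting from a 2 MiB mapping, covering 1 GiB takes 10 mappings with doubling, compared with 512 mappings of a fixed 2 MiB each, so the mapping count grows logarithmically rather than linearly with the arena's size.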
+
+  Incompatible changes:
+  - Replace chunk hooks with an expanded/normalized set of extent hooks.
+    (@jasone)
+  - Remove ratio-based purging. (@jasone)
+  - Remove --disable-tcache. (@jasone)
+  - Remove --disable-tls. (@jasone)
+  - Remove --enable-ivsalloc. (@jasone)
+  - Remove --with-lg-size-class-group. (@jasone)
+  - Remove --with-lg-tiny-min. (@jasone)
+  - Remove --disable-cc-silence. (@jasone)
+  - Remove --enable-code-coverage. (@jasone)
+  - Remove --disable-munmap (replaced by opt.retain). (@jasone)
+  - Remove Valgrind support. (@jasone)
+  - Remove quarantine support. (@jasone)
+  - Remove redzone support. (@jasone)
+  - Remove mallctl interfaces (various authors):
+    + config.munmap
+    + config.tcache
+    + config.tls
+    + config.valgrind
+    + opt.lg_chunk
+    + opt.purge
+    + opt.lg_dirty_mult
+    + opt.decay_time
+    + opt.quarantine
+    + opt.redzone
+    + opt.thp
+    + arena.<i>.lg_dirty_mult
+    + arena.<i>.decay_time
+    + arena.<i>.chunk_hooks
+    + arenas.initialized
+    + arenas.lg_dirty_mult
+    + arenas.decay_time
+    + arenas.bin.<i>.run_size
+    + arenas.nlruns
+    + arenas.lrun.<i>.size
+    + arenas.nhchunks
+    + arenas.hchunk.<i>.size
+    + arenas.extend
+    + stats.cactive
+    + stats.arenas.<i>.lg_dirty_mult
+    + stats.arenas.<i>.decay_time
+    + stats.arenas.<i>.metadata.{mapped,allocated}
+    + stats.arenas.<i>.{npurge,nmadvise,purged}
+    + stats.arenas.<i>.huge.{allocated,nmalloc,ndalloc,nrequests}
+    + stats.arenas.<i>.bins.<j>.{nruns,reruns,curruns}
+    + stats.arenas.<i>.lruns.<j>.{nmalloc,ndalloc,nrequests,curruns}
+    + stats.arenas.<i>.hchunks.<j>.{nmalloc,ndalloc,nrequests,curhchunks}
+
+  Bug fixes:
+  - Improve interval-based profile dump triggering to dump only one profile when
+    a single allocation's size exceeds the interval. (@jasone)
+  - Use prefixed function names (as controlled by --with-jemalloc-prefix) when
+    pruning backtrace frames in jeprof. (@jasone)
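
The first bug fix above concerns interval-based dump triggering. One plausible way to get "only one profile" per oversized allocation is to keep just the remainder of the byte counter after a trigger. The C sketch below illustrates that idea under stated assumptions: dump_trigger_t and dump_trigger_add are hypothetical names, and the changelog does not say that jemalloc implements the fix this way.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical accumulator for interval-based dump triggering. */
typedef struct {
	uint64_t interval;    /* bytes of allocation between dumps */
	uint64_t accumulated; /* bytes counted since the last trigger */
} dump_trigger_t;

/*
 * Account for one allocation.  Returns true if a single dump should be
 * triggered now, false otherwise.
 */
static bool
dump_trigger_add(dump_trigger_t *t, uint64_t alloc_size) {
	assert(t->interval > 0);
	t->accumulated += alloc_size;
	if (t->accumulated < t->interval) {
		return false;
	}
	/*
	 * Keep only the remainder, so an allocation larger than the interval
	 * does not leave a backlog that triggers further dumps on later calls.
	 */
	t->accumulated %= t->interval;
	return true;
}
```

With an interval of 100 bytes, a 350-byte allocation triggers one dump and leaves 50 bytes accumulated. Subtracting the interval once instead would leave 250 bytes, and the next few allocations would each trigger another dump.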
+
 * 4.5.0 (February 28, 2017)

   This is the first release to benefit from much broader continuous integration
