Commit 48ec896

jemalloc: Import 5.3.0 54eaed1d8b56b1aa528be3bdd1877e59c56fa90c
Import jemalloc 5.3.0. This import changes how we manage the jemalloc vendor branch (which was only just started anyway). Starting with 5.3.0, we import a clean tree from the upstream GitHub repository, removing all the old files that are no longer upstream or that we had kept around for some reason. We do this because we merge from this raw version of jemalloc into the FreeBSD contrib/jemalloc, then run the autogen step, generate all the generated .h files with gmake, and finally remove most of the generated files in contrib/jemalloc using an update script.

Sponsored by: Netflix
1 parent d28d7fb commit 48ec896

399 files changed (+45626, -28408 lines)

ChangeLog

Lines changed: 100 additions & 0 deletions
@@ -4,6 +4,106 @@ brevity. Much more detail can be found in the git revision history:
 
     https://github.com/jemalloc/jemalloc
 
+* 5.3.0 (May 6, 2022)
+
+  This release contains many speed and space optimizations, from micro
+  optimizations on common paths to rework of internal data structures and
+  locking schemes, and many more too detailed to list below. Multiple percent
+  of system level metric improvements were measured in tested production
+  workloads. The release has gone through large-scale production testing.
+
+  New features:
+  - Add the thread.idle mallctl which hints that the calling thread will be
+    idle for a nontrivial period of time. (@davidtgoldblatt)
+  - Allow small size classes to be the maximum size class to cache in the
+    thread-specific cache, through the opt.[lg_]tcache_max option. (@interwq,
+    @jordalgo)
+  - Make the behavior of realloc(ptr, 0) configurable with opt.zero_realloc.
+    (@davidtgoldblatt)
+  - Add 'make uninstall' support. (@sangshuduo, @Lapenkov)
+  - Support C++17 over-aligned allocation. (@marksantaniello)
+  - Add the thread.peak mallctl for approximate per-thread peak memory tracking.
+    (@davidtgoldblatt)
+  - Add interval-based stats output opt.stats_interval. (@interwq)
+  - Add prof.prefix to override filename prefixes for dumps. (@zhxchen17)
+  - Add high resolution timestamp support for profiling. (@tyroguru)
+  - Add the --collapsed flag to jeprof for flamegraph generation.
+    (@igorwwwwwwwwwwwwwwwwwwww)
+  - Add the --debug-syms-by-id option to jeprof for debug symbols discovery.
+    (@DeannaGelbart)
+  - Add the opt.prof_leak_error option to exit with error code when leak is
+    detected using opt.prof_final. (@yunxuo)
+  - Add opt.cache_oblivious as an runtime alternative to config.cache_oblivious.
+    (@interwq)
+  - Add mallctl interfaces:
+    + opt.zero_realloc (@davidtgoldblatt)
+    + opt.cache_oblivious (@interwq)
+    + opt.prof_leak_error (@yunxuo)
+    + opt.stats_interval (@interwq)
+    + opt.stats_interval_opts (@interwq)
+    + opt.tcache_max (@interwq)
+    + opt.trust_madvise (@azat)
+    + prof.prefix (@zhxchen17)
+    + stats.zero_reallocs (@davidtgoldblatt)
+    + thread.idle (@davidtgoldblatt)
+    + thread.peak.{read,reset} (@davidtgoldblatt)
+
+  Bug fixes:
+  - Fix the synchronization around explicit tcache creation which could cause
+    invalid tcache identifiers. This regression was first released in 5.0.0.
+    (@yoshinorim, @davidtgoldblatt)
+  - Fix a profiling biasing issue which could cause incorrect heap usage and
+    object counts. This issue existed in all previous releases with the heap
+    profiling feature. (@davidtgoldblatt)
+  - Fix the order of stats counter updating on large realloc which could cause
+    failed assertions. This regression was first released in 5.0.0. (@azat)
+  - Fix the locking on the arena destroy mallctl, which could cause concurrent
+    arena creations to fail. This functionality was first introduced in 5.0.0.
+    (@interwq)
+
+  Portability improvements:
+  - Remove nothrow from system function declarations on macOS and FreeBSD.
+    (@davidtgoldblatt, @fredemmott, @leres)
+  - Improve overcommit and page alignment settings on NetBSD. (@zoulasc)
+  - Improve CPU affinity support on BSD platforms. (@devnexen)
+  - Improve utrace detection and support. (@devnexen)
+  - Improve QEMU support with MADV_DONTNEED zeroed pages detection. (@azat)
+  - Add memcntl support on Solaris / illumos. (@devnexen)
+  - Improve CPU_SPINWAIT on ARM. (@AWSjswinney)
+  - Improve TSD cleanup on FreeBSD. (@Lapenkov)
+  - Disable percpu_arena if the CPU count cannot be reliably detected. (@azat)
+  - Add malloc_size(3) override support. (@devnexen)
+  - Add mmap VM_MAKE_TAG support. (@devnexen)
+  - Add support for MADV_[NO]CORE. (@devnexen)
+  - Add support for DragonFlyBSD. (@devnexen)
+  - Fix the QUANTUM setting on MIPS64. (@brooksdavis)
+  - Add the QUANTUM setting for ARC. (@vineetgarc)
+  - Add the QUANTUM setting for LoongArch. (@wangjl-uos)
+  - Add QNX support. (@jqian-aurora)
+  - Avoid atexit(3) calls unless the relevant profiling features are enabled.
+    (@BusyJay, @laiwei-rice, @interwq)
+  - Fix unknown option detection when using Clang. (@Lapenkov)
+  - Fix symbol conflict with musl libc. (@georgthegreat)
+  - Add -Wimplicit-fallthrough checks. (@nickdesaulniers)
+  - Add __forceinline support on MSVC. (@santagada)
+  - Improve FreeBSD and Windows CI support. (@Lapenkov)
+  - Add CI support for PPC64LE architecture. (@ezeeyahoo)
+
+  Incompatible changes:
+  - Maximum size class allowed in tcache (opt.[lg_]tcache_max) now has an upper
+    bound of 8MiB. (@interwq)
+
+  Optimizations and refactors (@davidtgoldblatt, @Lapenkov, @interwq):
+  - Optimize the common cases of the thread cache operations.
+  - Optimize internal data structures, including RB tree and pairing heap.
+  - Optimize the internal locking on extent management.
+  - Extract and refactor the internal page allocator and interface modules.
+
+  Documentation:
+  - Fix doc build with --with-install-suffix. (@lawmurray, @interwq)
+  - Add PROFILING_INTERNALS.md. (@davidtgoldblatt)
+  - Ensure the proper order of doc building and installation. (@Mingli-Yu)
+
 * 5.2.1 (August 5, 2019)
 
   This release is primarily about Windows. A critical virtual memory leak is
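
The "Support C++17 over-aligned allocation" entry in the 5.3.0 notes refers to the aligned forms of `operator new` and `operator delete` that take a `std::align_val_t`. The sketch below uses only standard C++17 and does not depend on jemalloc; whether jemalloc actually serves these requests depends on how the program is linked. The type `CacheLinePadded` and the helper names are hypothetical, chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical type for illustration. Its 64-byte alignment exceeds the
// default new alignment on mainstream platforms (typically 16), so a C++17
// compiler routes `new CacheLinePadded` through the std::align_val_t overload.
struct alignas(64) CacheLinePadded {
    std::uint64_t counter;
};

// Returns true when p is a multiple of `alignment` bytes.
bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Allocates one over-aligned object with a plain new-expression and reports
// whether the returned address honours alignof(CacheLinePadded).
bool overaligned_new_is_aligned() {
    CacheLinePadded* obj = new CacheLinePadded{0};
    const bool ok = is_aligned(obj, alignof(CacheLinePadded));
    delete obj;
    return ok;
}

// Calls the aligned allocation function directly. `alignment` must be a
// power of two; the matching aligned operator delete releases the block.
bool explicit_aligned_new_is_aligned(std::size_t alignment) {
    void* raw = ::operator new(256, static_cast<std::align_val_t>(alignment));
    const bool ok = is_aligned(raw, alignment);
    ::operator delete(raw, static_cast<std::align_val_t>(alignment));
    return ok;
}

// Convenience wrapper that exercises both paths.
void run_alignment_checks() {
    assert(overaligned_new_is_aligned());
    assert(explicit_aligned_new_is_aligned(64));
}
```

Calling `run_alignment_checks()` from `main` exercises both the implicit and the explicit path. Build with `-std=c++17` or later, since the aligned overloads do not exist in earlier standards.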

INSTALL.md

Lines changed: 15 additions & 12 deletions
@@ -9,14 +9,17 @@ If building from unpackaged developer sources, the simplest command sequence
 that might work is:
 
     ./autogen.sh
-    make dist
     make
     make install
 
-Note that documentation is not built by the default target because doing so
-would create a dependency on xsltproc in packaged releases, hence the
-requirement to either run 'make dist' or avoid installing docs via the various
-install_* targets documented below.
+You can uninstall the installed build artifacts like this:
+
+    make uninstall
+
+Notes:
+ - "autoconf" needs to be installed
+ - Documentation is built by the default target only when xsltproc is
+available. Build will warn but not stop if the dependency is missing.
 
 
 ## Advanced configuration
@@ -188,13 +191,13 @@ any of the following arguments (not a definitive list) to 'configure':
 
 * `--disable-cache-oblivious`
 
-    Disable cache-oblivious large allocation alignment for large allocation
-    requests with no alignment constraints. If this feature is disabled, all
-    large allocations are page-aligned as an implementation artifact, which can
-    severely harm CPU cache utilization. However, the cache-oblivious layout
-    comes at the cost of one extra page per large allocation, which in the
-    most extreme case increases physical memory usage for the 16 KiB size class
-    to 20 KiB.
+    Disable cache-oblivious large allocation alignment by default, for large
+    allocation requests with no alignment constraints. If this feature is
+    disabled, all large allocations are page-aligned as an implementation
+    artifact, which can severely harm CPU cache utilization. However, the
+    cache-oblivious layout comes at the cost of one extra page per large
+    allocation, which in the most extreme case increases physical memory usage
+    for the 16 KiB size class to 20 KiB.
 
 * `--disable-syscall`
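
The 16 KiB to 20 KiB figure in the `--disable-cache-oblivious` text follows from adding one extra page to each large allocation. The sketch below reproduces that arithmetic, assuming a 4 KiB page size (the source does not state the page size, but the quoted numbers imply it). The function names are hypothetical and not part of jemalloc.

```cpp
#include <cstddef>

constexpr std::size_t kKiB = 1024;

// Assumption: 4 KiB pages, as implied by the 16 KiB -> 20 KiB example.
constexpr std::size_t kAssumedPageSize = 4 * kKiB;

// Physical footprint of one large allocation when the cache-oblivious
// layout costs one extra page on top of the size class.
constexpr std::size_t cache_oblivious_footprint(
    std::size_t size_class, std::size_t page = kAssumedPageSize) {
    return size_class + page;
}

// Extra memory as a percentage of the size class itself.
constexpr double overhead_percent(
    std::size_t size_class, std::size_t page = kAssumedPageSize) {
    return 100.0 * static_cast<double>(page) / static_cast<double>(size_class);
}

// The worst case quoted in INSTALL.md: 16 KiB grows to 20 KiB.
static_assert(cache_oblivious_footprint(16 * kKiB) == 20 * kKiB,
              "one extra 4 KiB page on a 16 KiB class gives 20 KiB");
```

The relative cost shrinks as the size class grows: under the same assumption, `overhead_percent(16 * kKiB)` is 25 percent, while a 1 MiB allocation pays well under 1 percent. That is why the text singles out the 16 KiB class as the most extreme case.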
