This repository was archived by the owner on Feb 5, 2019. It is now read-only.

update jemalloc to master #4

Closed
wants to merge 100 commits

Conversation

thestinger

No description provided.

thestinger and others added 30 commits May 7, 2014 18:48
By default, git will coerce LF to CRLF when files are checked out on
Windows. This causes hard to diagnose errors when compiling with
mingw-w64 from Windows rather than cross-compiling.
fix git handling of newlines on windows
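A .gitattributes entry is the usual fix for this; a minimal sketch, assuming the goal is to force LF in the working tree (the exact attributes in the actual commit may differ):

```gitattributes
# Force LF line endings on checkout for all files, overriding
# core.autocrlf, so builds with mingw-w64 on Windows see the same
# bytes as on other platforms.
* text eol=lf
```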
Add new mallctl endpoints "arena<i>.chunk.alloc" and
"arena<i>.chunk.dealloc" to allow userspace to configure
jemalloc's chunk allocator and deallocator on a per-arena
basis.
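As a rough sketch of how such endpoints would be used: the hook signatures and exact mallctl names below are assumptions based on this era of the API, not taken verbatim from the commit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <jemalloc/jemalloc.h>

/* Assumed hook signatures for this era of the API. */
typedef void *(chunk_alloc_t)(size_t size, size_t alignment, bool *zero,
    unsigned arena_ind);
typedef bool (chunk_dealloc_t)(void *chunk, size_t size, unsigned arena_ind);

static void *
my_chunk_alloc(size_t size, size_t alignment, bool *zero, unsigned arena_ind)
{
	/* Hand out 'size' bytes at 'alignment' from a custom source
	 * (e.g. a reserved mapping); NULL indicates failure. */
	return (NULL);
}

static bool
my_chunk_dealloc(void *chunk, size_t size, unsigned arena_ind)
{
	/* Return the chunk to the custom source; true indicates failure. */
	return (true);
}

static int
install_chunk_hooks(void)
{
	chunk_alloc_t *alloc_hook = my_chunk_alloc;
	chunk_dealloc_t *dealloc_hook = my_chunk_dealloc;

	/* Install both hooks for arena 0. */
	if (mallctl("arena.0.chunk.alloc", NULL, NULL, &alloc_hook,
	    sizeof(alloc_hook)) != 0)
		return (1);
	if (mallctl("arena.0.chunk.dealloc", NULL, NULL, &dealloc_hook,
	    sizeof(dealloc_hook)) != 0)
		return (1);
	return (0);
}
```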
Refactor huge allocation to be managed by arenas (though the global
red-black tree of huge allocations remains for lookup during
deallocation).  This is the logical conclusion of recent changes that 1)
made per arena dss precedence apply to huge allocation, and 2) made it
possible to replace the per arena chunk allocation/deallocation
functions.

Remove the top level huge stats, and replace them with per arena huge
stats.

Normalize function names and types to *dalloc* (some were *dealloc*).

Remove the --enable-mremap option.  As jemalloc currently operates, this
is a performance regression for some applications, but planned work to
logarithmically space huge size classes should provide similar amortized
performance.  The motivation for this change was that mremap-based huge
reallocation forced leaky abstractions that prevented refactoring.
test/integration/aligned_alloc.c needs it.
Set `STATIC_PAGE_SHIFT` to 12 when cross-compiling jemalloc. A shift of
12 corresponds to a page size of 4 KiB on practically all platforms.
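The arithmetic is a power of two; a one-line illustration (the PAGE macro name here is illustrative, not necessarily jemalloc's):

```c
#define STATIC_PAGE_SHIFT 12
/* 1 << 12 == 4096 bytes, i.e. a 4 KiB page. */
#define PAGE ((size_t)1 << STATIC_PAGE_SHIFT)
```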
Use nallocx() rather than mallctl() to trigger initialization, because
nallocx() has no side effects other than initialization, whereas
mallctl() does a bunch of internal memory allocation.
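A minimal sketch of the idea, using only the public API:

```c
#include <jemalloc/jemalloc.h>

/* Force initialization without allocating: nallocx() merely computes
 * the real size that mallocx(size, flags) would return. */
static void
trigger_init(void)
{
	(void)nallocx(1, 0);
}
```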
Add size class computation capability, currently used only as validation
of the size class lookup tables.  Generalize the size class spacing used
for bins, for eventual use throughout the full range of allocation
sizes.
Fix KZI() and KQI() to append LL rather than ULL.
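The suffix matters because ULL makes a literal unsigned, which silently turns signed comparisons into unsigned ones. A hedged illustration; these token-pasting macros are hypothetical stand-ins, not jemalloc's actual KZI()/KQI():

```c
/* Hypothetical stand-ins for literal-suffix macros. */
#define LIT_ULL(n)	n##ULL
#define LIT_LL(n)	n##LL

/* (-1 < LIT_ULL(4096)) is false: -1 converts to ULLONG_MAX.
 * (-1 < LIT_LL(4096)) is true, as intended for signed arithmetic. */
```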
Jason Evans and others added 26 commits September 4, 2014 22:27
Optimize [nmd]alloc() fast paths such that the (flags == 0) case is
streamlined, flags decoding only happens to the minimum degree
necessary, and no conditionals are repeated.
Move typedefs from jemalloc_protos.h.in to jemalloc_typedefs.h.in, so
that typedefs aren't redefined when compiling stress tests.
Building against glibc 2.19 hits a compilation error unless the type is renamed.
avoid conflict with the POSIX timer_t type
This adds a new `sdallocx` function to the external API, allowing the
size to be passed by the caller.  It avoids some extra reads in the
thread cache fast path.  In the case where stats are enabled, this
avoids the work of calculating the size from the pointer.

An assertion validates the size that's passed in, so enabling debugging
will allow users of the API to debug cases where an incorrect size is
passed in.

The performance win for a contrived microbenchmark doing an allocation
and immediately freeing it is ~10%.  It may have a different impact on a
real workload.

Closes jemalloc#28
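A minimal usage sketch, using only the public API described above:

```c
#include <jemalloc/jemalloc.h>

static void
example(void)
{
	void *p = mallocx(42, 0);

	if (p != NULL) {
		/* Pass the known size back so jemalloc can skip reading
		 * it from the pointer's metadata. */
		sdallocx(p, 42, 0);
	}
}
```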
fix isqalloct (should call isdalloct)
- Add a --thread N option to select the profile for thread N (otherwise, all threads are printed)
- The $profile map now has a {threads} element that maps each thread id to a profile with the same format as the {profile} element
- Refactor ReadHeapProfile into smaller components and use them to implement ReadThreadedHeapProfile
Refactor sdallocx() and nallocx() to share inallocx(), and fix an
sdallocx() assertion to check usize rather than size.
Fix ReadThreadedHeapProfile to pass the correct parameters to
AdjustSamples.
Fix prof_tdata_get() to avoid dereferencing an invalid tdata pointer
(when it's PROF_TDATA_STATE_{REINCARNATED,PURGATORY}).

Fix prof_tdata_get() callers to check for invalid results besides NULL
(PROF_TDATA_STATE_{REINCARNATED,PURGATORY}).

These regressions were caused by
602c8e0 (Implement per thread heap
profiling.), which did not make it into any releases prior to these
fixes.
Fix a profile sampling race that was due to preparing to sample, yet
doing nothing to assure that the context remains valid until the stats
are updated.

These regressions were caused by
602c8e0 (Implement per thread heap
profiling.), which did not make it into any releases prior to these
fixes.
Mark the following conditions as unlikely:

* assertion failure
* malloc_init failure
* malloc not already initialized (in malloc_init)
* running in valgrind
* thread cache disabled at runtime

Clang and GCC already consider a comparison with NULL or -1 to be cold,
so many branches (e.g. out-of-memory checks) are already correctly
treated as cold, and marking them explicitly is not important.
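A sketch of the typical annotation on GCC/Clang; the macro name and the initialization flag below are illustrative, not jemalloc's internals:

```c
#include <stdbool.h>

#if defined(__GNUC__) || defined(__clang__)
#  define unlikely(x)	__builtin_expect(!!(x), 0)
#else
#  define unlikely(x)	(x)
#endif

static bool malloc_initialized;	/* illustrative flag */

static void
ensure_initialized(void)
{
	/* The compiler lays out the cold path out of line. */
	if (unlikely(!malloc_initialized)) {
		/* one-time initialization */
		malloc_initialized = true;
	}
}
```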
Fix irallocx_prof() sample logic to only update the threshold counter
after it knows what size the allocation ended up being.  This regression
was caused by 6e73dc1 (Fix a profile
sampling race.), which did not make it into any releases prior to this
fix.
Don't use atomic_add_uint64(), because it isn't available on 32-bit
platforms.

Fix forking support functions to manage all prof-related mutexes.

These regressions were introduced by
602c8e0 (Implement per thread heap
profiling.), which did not make it into any releases prior to these
fixes.

@alexcrichton mentioned this pull request Sep 12, 2014
@thestinger deleted the rust-2014-09-12-do-not-delete branch October 2, 2014 04:52