nedmalloc is a VERY fast, VERY scalable, multithreaded memory allocator with little memory fragmentation. It is faster in real world code than Hoard, faster than tcmalloc and faster than ptmalloc2, and it scales with extra processing cores better than Hoard, better than tcmalloc and better than ptmalloc2 or ptmalloc3. Put another way, there is no faster portable memory allocator out there! Unlike other allocators, it is written in C and so can be used anywhere, and it also comes under the Boost software license, which permits commercial usage.

It has been tested on some very high end hardware with more than eight processing cores and more than 8 GB of RAM. It is in daily use by some of the world's major banks, root DNS servers, multinational airlines and consumer products (embedded). It also costs no money (though donations are welcome!).

Thanks to work generously sponsored by Applied Research Associates, nedmalloc can patch itself into existing binaries to replace the system allocator on Windows. For example, Microsoft Word is noticeably quicker for very large documents after the nedmalloc DLL has been injected into it!

It is more than 125 times faster than the standard Win32 memory allocator, 4-10 times faster than the standard FreeBSD memory allocator and up to twice as fast as ptmalloc2, the standard Linux memory allocator. It can sustain a minimum of between 7.3 million and 8.2 million malloc & free pair operations per second on a 3400 (2.20 GHz) AMD Athlon64 machine. It scales with extra CPUs far better than either the standard Win32 memory allocator or ptmalloc2, and can cause significantly less memory bloating than ptmalloc2. It avoids processor serialisation (locking) entirely when the requested memory size is in the thread cache, leading to the kind of scalability you can see in the graph on the right. In real world code:
If you want an explanation of the difference between the Packetised and Memory Mapped benchmarks, please see the Tn homepage (but basically, the Packetised benchmark involves performing a lot more memory ops in a more loaded multithreaded environment). As you can see above, the benefits of nedmalloc translate into real world code with more than a 50% speed increase over the default Win32 allocator. The Tn speed test is very heavy on the memory bus, so you can expect your own applications to see greater improvements than this. See below for a Frequently Asked Questions list.

Below and to the right is a series of comparisons between nedmalloc, system allocators and a number of other replacement memory allocators such as tcmalloc and Hoard. The graphs below are for v1.00 and still give a good idea of performance on a wide variety of systems, but note that nedmalloc has become much faster in recent revisions (as you can see on the right). To my knowledge, nedmalloc is the fastest portable memory allocator available.
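
The lock-free fast path mentioned earlier (requests served from a per-thread cache) can be illustrated with a small sketch. This is not nedmalloc's code: every name here (`tc_malloc`, `tc_free`, `tc_flush`, the bin constants) is hypothetical, the system `malloc` stands in for the shared backend allocator, and the caller passes the block size to `tc_free`, whereas a real allocator would read it from the block header. It assumes a C11 compiler for `_Thread_local`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical constants: 16 bins spaced 16 bytes apart (1..256 bytes). */
#define TC_BIN_SHIFT 4
#define TC_BINS      16
#define TC_MAX_SIZE  ((size_t)TC_BINS << TC_BIN_SHIFT)

/* A freed block is reused as a singly linked list node. */
typedef struct tc_node { struct tc_node *next; } tc_node;

/* One set of free lists per thread, so touching it needs no lock. */
static _Thread_local tc_node *tc_bins[TC_BINS];

/* Map a cacheable request size (1..TC_MAX_SIZE) to its bin index. */
static size_t tc_bin_for(size_t size)
{
    assert(size >= 1 && size <= TC_MAX_SIZE);
    return (size - 1) >> TC_BIN_SHIFT;
}

void *tc_malloc(size_t size)
{
    if (size == 0 || size > TC_MAX_SIZE)
        return malloc(size);              /* too big to cache: shared allocator */
    size_t bin = tc_bin_for(size);
    tc_node *n = tc_bins[bin];
    if (n) {                              /* cache hit: pop without locking */
        tc_bins[bin] = n->next;
        return n;
    }
    /* Cache miss: allocate the bin's full size so any request that maps
       to this bin can reuse the block later. */
    return malloc((bin + 1) << TC_BIN_SHIFT);
}

void tc_free(void *p, size_t size)
{
    if (!p)
        return;
    if (size == 0 || size > TC_MAX_SIZE) {
        free(p);                          /* was never cacheable */
        return;
    }
    size_t bin = tc_bin_for(size);
    tc_node *n = p;                       /* push onto this thread's list */
    n->next = tc_bins[bin];
    tc_bins[bin] = n;
}

/* Return every cached block of the calling thread to the shared allocator. */
void tc_flush(void)
{
    for (size_t bin = 0; bin < TC_BINS; bin++) {
        while (tc_bins[bin]) {
            tc_node *n = tc_bins[bin];
            tc_bins[bin] = n->next;
            free(n);
        }
    }
}
```

A thread that repeatedly allocates and frees small blocks of similar size never leaves the fast path, which is the situation where an allocator of this style would be expected to scale with core count. The sketch leaves the cache unbounded; a practical design would cap it and flush it when the thread exits.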
Downloads:
ChangeLog (from SVN)
Current: v1.05 (svn 1078) of nedmalloc (80Kb). Beta 1 of v1.06 (svn 1151) is also available.
Changes in v1.06:
ChangeLog:
-=-=-=-=-=-=-
v1.06 beta 1 13th January 2010:
* { 1079 } Fixed misdeclaration of struct mallinfo as C++ type. Thanks to James Mansion for reporting this.
* { 1082 } Fixed dlmalloc bug which caused header corruption to mmap() allocations when running under multiple threads
* { 1088 } Fixed assertion failure for nedblksize() with latest dlmalloc. Thanks to Anteru for reporting this.
* { 1088 } Added neddestroysyspool(). Thanks to Lars Wehmeyer for suggesting this.
* { 1088 } Fixed thread id high bit set bug causing SIGABRT on Mac OS X. Thanks to Chris Dillman for reporting this.
* { 1094 } Integrated dlmalloc v2.8.4 final.
* { 1095 } Added nedtrimthreadcache(). Thanks to Hayim Hendeles for suggesting this.
* { 1095 } Fixed silly assertion of null pointer dereference. Thanks to Ullrich Heinemann for reporting this.
* { 1096 } Fixed lots of level 4 warnings on MSVC. Thanks to Anteru for suggesting this.
* { 1098 } Improved non-nedmalloc block detection to 6.25% probability of being wrong. Thanks to Applied Research Associates for
sponsoring this.
* { 1099 } Added USE_MAGIC_HEADERS which allows nedmalloc to handle freeing a system allocated block. Added USE_ALLOCATOR which
allows the changing of which backend allocator to use (with choices between the system allocator and dlmalloc - choosing the system
allocator is intended for debug situations only e.g. valgrind). Thanks to Applied Research Associates for sponsoring this.
* { 1105 } Added ability to build nedmalloc as a DLL. Added support for a run time PE binary patcher which can patch all usage of
the system allocator replacing it with nedmalloc. Thanks to Applied Research Associates for sponsoring this.
* { 1108 } Added patcher loader which can load any arbitrary program injecting the nedmalloc DLL which then patches in its replacement
for the system allocator. Doesn't work on all programs, but does on most e.g. Microsoft Word. Thanks to Applied Research Associates
for sponsoring this.
* { 1116 } Finished debugging and optimising the latest additions to the codebase. The patcher now works well on x64 as well as x86.
Added support for large pages on Windows. Thanks to Applied Research Associates for sponsoring this.
* { 1125 } Added nedpoollist() which returns a snapshot of the nedpools currently existing. The Windows DLL thread exit code now
disables the thread cache for all currently existing nedpools. Thanks to Applied Research Associates for sponsoring this.
* { 1126 } Added ENABLE_TOLERANT_NEDMALLOC which allows nedmalloc to recognise system allocator blocks and to do the right thing with them.
* { 1139 } Added link time code generation support for Windows builds. This currently has zero performance improvement on x64 (on
MSVC9) but can add 15% to x86 performance (on MSVC9). Also added scons SConstruct and SConscript files.
Previous:
v1.04 (svn 1040) of nedmalloc (80Kb)
v1.03 of nedmalloc (76.4Kb)
v1.02 of nedmalloc (76.3Kb)
v1.01 of nedmalloc (71.9Kb)
v1.00 of nedmalloc (69.7Kb)
You can fetch nedmalloc from SVN here with a web view of SVN here.
Frequently Asked Questions: