![]() |
|||
|
|
![]() |
|
Improving Locality with Dynamic Memory Allocation
Download TP memory allocator: Coming Soon!
The goal of my thesis is to improve performance of dynamic C++ applications. We improve an application's data locality through careful consideration when allocating memory. We developed three locality improving memory allocators that reduce the execution time of seven large and real-world applications, such as Spec 2006 CPU2006 - 471.omnetpp, 483.xalancbmk and 447.dealII -, Atlas - a mathematical application -, mesh transport, molecular dynamic and network simulation, by an average of 7% when compared to one of the best available memory allocators in Doug Lea's allocator and by an average of 17% when compared to FreeBSD's allocator. Any C++ application that uses dynamic memory allocation benefits from this approach.
A memory allocator must keep track of the memory blocks that are in use by an application (allocated) and the ones that are available for use (freed), minimizing time and memory usage in doing so. To achieve this goal, a memory allocator need only be concerned with managing the size of the free blocks.
However, with the introduction of memory hierarchies in computer architecture design in the early 1980s, the performance of a memory allocator can no longer be measured only based on allocation speed and fragmentation; the quality of the data layout it allocates directly affects an application's locality and thus its performance. To improve data layout, a memory allocator need to manage its memory based on the same currency used to measure locality: address location.
Traditional memory allocators either do not use address as a differential attribute in their management, such as segregated storage and buddy systems, or the ones that do employ expensive techniques in doing so, such as cartesian trees and address-order sorting of blocks. This is partly because adding a address parameter to an already complex task of managing memory blocks fast and frugal further complicates the design of locality improving memory allocators.
We developed three novel memory management and allocation strategies that improve data locality. We implemented these techniques using adjustable address neighborhood approximations. This flexibly allowed us to gauge the effect of locality accuracy on the performance of allocators and applications. We tested these novel allocators on large applications of hundreds of thousands of lines of code and our allocators outperform state-of-the-art allocators. Compared with one of the best memory allocators available in Doug Lea's allocator, our allocators reduce the execution time by an average of 7%. Compared to FreeBSD's allocator, by an average of 17%.
In addition to size parameter present in a memory request, our allocators allow an address parameter, allocation hint, to be requested, representing the desired address neighborhood for the new memory block. The ISO C++ Standard recognized the importance of locality at the memory allocation level and added in 1998 a hint parameter to the standard allocator's interface. Thus, our allocators' interface is ISO C++ compatible. When allocation hints are used, our allocators improve an application's spatial locality, by trying to allocate the requested block size in the hint's neighborhood. When allocation hints are ignored, our allocators switch to improving an application's temporal locality, by allocating consecutive blocks from the same region.
// Traditional memory allocation int*x = (int*) malloc(8); or int*x = new int; free(x); or delete x; //The ISO C++ Standard interface for STL allocators std::allocator::allocate(size_t n, void* hint=0)
To explore unconventional memory allocation strategies, we developed a generic library. This library is to memory allocation what C++ STL is to programming: it allows users to focus on the allocation strategies rather than their implementation. This approach also allows memory management and allocation to be performed base on unconventional characteristics, such as cache set and allocation call site, without burden on the interface and integration. We also formalized a generic and theoretical approach to memory allocation problem.
// Traditional allocation & deallocation requests 1. int *a1 = new int; ... delete a1; 2. void *a2 = malloc(8); ...free(a2); // Generic allocation & deallocation requests // Allocation based on two attributes, call site and size 3. int *a3 = Allocator::allocate(Request< CallSiteType,int>(callsite,4)); 4. Allocator::deallocate(Request< CallSiteType,int>(callsite,4)); // Allocation based on three attributes, cache set, thread-id and size 5. int *a3 = Allocator::allocate(Request< CacheSet, ThreadId,int>(cs,0,4)); 6. Allocator::deallocate(Request< CacheSet, ThreadId,int>(cs,0,4));Download TP memory allocator: Coming Soon!
Parasol Home | Research | People | General info | Seminars | Resources Parasol Lab, 301 Harvey R. Bright Bldg, 3112 TAMU, College Station, TX 77843-3112 Contact Webmaster Phone 979.458.0722 Fax 979.458.0718
Department of Computer Science | Dwight Look College of Engineering | Texas A&M University Privacy statement: Computer Science Engineering TAMU |