When I ran a program that had a list of integers with 5000000 elements in it, I was surprised to find that it used 118MB of memory, even when the *only* thing done by the program is the creation of the list.

As far as I understand, each node in a list contains an element as well as pointers to both the following and preceding elements. Both an integer and a pointer to an integer are 4 bytes, so one would think that a node would use up 12 bytes, and therefore a list with 5M elements should use 60MB, not 118MB.

So why do lists seem to use nearly twice the amount of memory that they should be using?

I've tried implementing my own list, using the following code:

#include <iostream>

struct node
{
    node *next;
    int elem;
    node *prev;
};

int main()
{
    node *first = new node;
    node *curr=first;
    for(int i=0;i<5000000;++i)
    {
        curr->elem=i;
        node *next = new node;
        curr->next=next;
        next->prev=curr;
        curr=curr->next;
    }

    std::cin.get();

    return 0;
}

When I run the above program, it uses the *exact* same amount of memory as the below program, which uses the list from the STL:

#include <iostream>
#include <list>

int main()
{
    std::list<int> l;
    for(int i=0;i<5000000;++i)
        l.push_back(i);

    std::cin.get();

    return 0;
}

I don't understand how this can be: in both cases, each node should only be using 12 bytes of memory. Why does it appear to be using just under 24 bytes per node? I could possibly understand it if the STL list had some extra overhead I didn't know about, but there doesn't seem to be any way that the list I implemented should use more than 12 bytes per node.

Edited 6 Years Ago by Allophyl: n/a

> As far as I understand, each node in a list contains an element as well as pointers to both the following and preceding elements.
> Both an integer and a pointer to an integer are 4 bytes, so one would think that a node would use up 12 bytes,
> and therefore a list with 5M elements should use 60MB, not 118MB.

That is rather too simplistic a view.

When you allocate memory using a new, C++ guarantees that the memory pointed to will be correctly aligned for *any* complete object type whose size is no greater than the requested size.

The allocation function attempts to allocate the requested amount of storage. If it is successful, it shall return the address of the start of a block of storage whose length in bytes shall be at least as large as the requested size.
...[elided]...
The pointer returned shall be suitably aligned so that it can be converted to a pointer of any complete object type and then used to access the object or array in the storage allocated
...[elided]... - IS

For example, if the alignment requirement for an 8-byte double is 8 bytes, a 12-byte block would be aligned on an 8-byte boundary.

For performance reasons, every allocator maintains available chunks in bins, grouped by size. Because of alignment requirements, these chunk sizes are a multiple of 8 on 32-bit platforms. So a request for 12 bytes will consume at least one 16-byte chunk. For example, the code that you posted consumes exactly 80MB on FreeBSD-i86/gemalloc. Some allocators, like the Microsoft allocator in the debug version of their CRT use some extra memory for block headers.

For more information, see:
http://en.wikipedia.org/wiki/Malloc#dlmalloc_and_its_derivatives
http://gee.cs.oswego.edu/dl/html/malloc.html
http://people.freebsd.org/~jasone/jemalloc/bsdcan2006/jemalloc.pdf

Comments
Excellent reply as always
This article has been dead for over six months. Start a new discussion instead.