Timo Dickscheid's sparse web notes.

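I recently ran into a std::vector of objects that needed more memory than the system could provide. This happens easily with the default growth strategy: once the vector runs out of capacity, it allocates a new, larger buffer (typically twice the old size, although the factor is implementation-defined), moves the elements over, and only then frees the old buffer. For a moment both buffers exist, so the vector briefly needs about three times its current size. On a 32-bit system a process can usually address only about 2 GB, and the buffer has to be contiguous, so a 600+ MB vector is probably too big already: growing it would require 600 MB plus 1200 MB at the same time.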
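The growth behaviour is easy to observe. The following is a minimal sketch, not part of the original problem code: the helper name capacity_steps is made up for illustration, and the exact capacities it reports depend on your standard library implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: appends n ints to an empty vector without calling
// reserve() and records every new capacity value the vector takes on.
std::vector<std::size_t> capacity_steps(std::size_t n) {
    std::vector<int> v;
    std::vector<std::size_t> steps;
    std::size_t last = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last) {
            // A reallocation happened: old and new buffer coexisted briefly.
            last = v.capacity();
            steps.push_back(last);
        }
    }
    assert(v.size() == n);
    return steps;
}
```

Printing the result of capacity_steps(1000) shows how few, but how large, the reallocation steps are. Each step is a moment at which the old and the new buffer are both alive.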

What is the solution to this problem? First of all, you should pre-allocate the vector with reserve() after checking the requested size against max_size(), and also catch the std::bad_alloc exception that is thrown when the allocation fails. You may also consider using more memory-efficient datatypes for your problem: have a look at the Boost.Flyweight library, for example, if you have many items taking on the same values. Second, rethink your algorithm - that's what I did. In most cases there is no need to hold such massive data in memory.
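Pre-allocation with error handling could look roughly like this. It is only a sketch: try_reserve is a hypothetical helper name, and returning false is just one way to react to a failed allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <new>     // std::bad_alloc
#include <vector>

// Hypothetical helper: tries to reserve room for n elements up front.
// Returns true on success and false if the request cannot be satisfied.
template <typename T>
bool try_reserve(std::vector<T>& v, std::size_t n) {
    // reserve() would throw std::length_error beyond max_size().
    if (n > v.max_size()) {
        return false;
    }
    try {
        v.reserve(n);
    } catch (const std::bad_alloc&) {
        // Not enough (contiguous) memory; the vector is left unchanged.
        return false;
    }
    assert(v.capacity() >= n);
    return true;
}
```

If try_reserve succeeds, appending up to n elements will not trigger a reallocation, so the temporary peak described above does not occur. Note that on systems that overcommit memory, a successful reserve() does not strictly guarantee that the pages are available once you actually write to them.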

However, you might still end up looking for a way to stream objects to disk seamlessly "in the background" while keeping the familiar STL-like syntax. In that case, have a look at the STXXL library (the "Standard Template Library for Extra Large Data Sets") - it does just that.