I have a data file of about 1 GB that needs to be read at the beginning of my program. This takes about 2 minutes. I feel like the answer is "no way", but here is my question.

In MATLAB, there is the concept of a "workspace". I can read my giant file into the workspace, run code that accesses this data, modify the code, and run it again on the same data without reading it in again. Is there anything similar to this in C++?

Ed would do this by defining a "window" into the file so that only a small part of the file is in memory at any given moment. When accessing a record that isn't in the window, the window is moved appropriately. How you move the window depends on how the file is structured.
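To make the window idea concrete, here is a minimal sketch. It assumes the file is a flat array of fixed-size binary records; the `Record` layout and the `RecordWindow` class are made up for illustration, and a real file format would need its own way of locating records.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical fixed-size record; substitute the real layout of your file.
struct Record {
    std::int32_t id;
    double value;
};

// Keeps at most `window_records` records in memory at a time.
class RecordWindow {
public:
    RecordWindow(const std::string& path, std::size_t window_records)
        : file_(path, std::ios::binary), capacity_(window_records) {
        if (!file_) throw std::runtime_error("cannot open " + path);
        if (capacity_ == 0) throw std::invalid_argument("window must hold at least one record");
        file_.seekg(0, std::ios::end);
        const std::streamoff bytes = file_.tellg();
        total_ = static_cast<std::size_t>(bytes) / sizeof(Record);
    }

    // Number of records in the whole file.
    std::size_t size() const { return total_; }

    // How many times the window has been (re)loaded from disk.
    std::size_t loads() const { return loads_; }

    // Returns record i, moving the window first if i lies outside it.
    const Record& at(std::size_t i) {
        if (i >= total_) throw std::out_of_range("record index past end of file");
        if (i < first_ || i >= first_ + buffer_.size()) load(i);
        return buffer_[i - first_];
    }

private:
    // Loads the window-sized, aligned chunk of the file that contains record i.
    void load(std::size_t i) {
        first_ = (i / capacity_) * capacity_;
        const std::size_t count = std::min(capacity_, total_ - first_);
        buffer_.resize(count);
        file_.clear();
        file_.seekg(static_cast<std::streamoff>(first_ * sizeof(Record)));
        file_.read(reinterpret_cast<char*>(buffer_.data()),
                   static_cast<std::streamsize>(count * sizeof(Record)));
        if (!file_) throw std::runtime_error("short read while moving window");
        ++loads_;
    }

    std::ifstream file_;
    std::size_t capacity_;
    std::size_t total_ = 0;
    std::size_t first_ = 0;
    std::size_t loads_ = 0;
    std::vector<Record> buffer_;
};
```

Something like `RecordWindow w("data.bin", 4096); w.at(i);` would then touch the disk only when `i` falls outside the chunk currently loaded. Aligning the window to multiples of its size is just one choice; centering it on the requested record may suit other access patterns better.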

For read-only access with well-structured files, that's about ideal. If you need write access too, it gets a little harder because files aren't usually easy to update without rewriting the file entirely. One option is to cache the changes and, once the number of changes reaches a certain threshold, rewrite the file with all of them applied.
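The cache-then-rewrite idea could look roughly like this. Again this is only a sketch under assumptions: fixed-size records, a hypothetical `BufferedEdits` class, and a rewrite that goes through a temporary file (named by appending `.tmp`, which is an arbitrary choice) before replacing the original.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical fixed-size record; substitute the real layout of your file.
struct Record {
    std::int32_t id;
    double value;
};

// Collects edits in memory and rewrites the file once `threshold` is reached.
class BufferedEdits {
public:
    BufferedEdits(std::string path, std::size_t threshold)
        : path_(std::move(path)), threshold_(threshold) {}

    // Queue a change to record `index`; flush when enough have piled up.
    void set(std::size_t index, const Record& r) {
        pending_[index] = r;
        if (pending_.size() >= threshold_) flush();
    }

    // Number of edits not yet written to disk.
    std::size_t pending() const { return pending_.size(); }

    // Stream the old file into a new one, substituting edited records,
    // then swap the new file into place. Edits whose index lies past the
    // end of the file are silently dropped in this sketch.
    void flush() {
        if (pending_.empty()) return;
        const std::string tmp = path_ + ".tmp";
        {
            std::ifstream in(path_, std::ios::binary);
            std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
            if (!in || !out) throw std::runtime_error("cannot open files for rewrite");
            Record r;
            for (std::size_t i = 0; in.read(reinterpret_cast<char*>(&r), sizeof r); ++i) {
                const auto it = pending_.find(i);
                if (it != pending_.end()) r = it->second;
                out.write(reinterpret_cast<const char*>(&r), sizeof r);
            }
            if (!out) throw std::runtime_error("write failed during rewrite");
        }
        std::remove(path_.c_str());  // rename may not overwrite on every platform
        if (std::rename(tmp.c_str(), path_.c_str()) != 0)
            throw std::runtime_error("could not replace original file");
        pending_.clear();
    }

private:
    std::string path_;
    std::size_t threshold_;
    std::map<std::size_t, Record> pending_;
};
```

Reads would have to check the pending edits before falling back to the file, and you would want to call `flush()` before exiting so queued changes aren't lost.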

Working with large files is a pain. Edward recommends avoiding them entirely and using a relational database if you can get away with it. ;)
