I'm writing a scientific application that has to store a large (1GB to 500GB+) amount of data on a hard drive, and then, once written, read it back sequentially to process it. The amount of data for a particular experiment is known in advance, exact to the byte.

When I write this file to disk at the moment, it ends up extremely fragmented (500+ fragments), despite there being enough contiguous space on the drive at the start to have it in one piece. This ends up being extremely detrimental to performance when analysing the data. I imagine this happens because currently my programme does not anticipate writing a file of any particular size and just keeps writing and writing, with the OS (or whatever handles this, I don't actually know) deciding where on the disk it physically goes.

So my question is, can I allocate a contiguous region of disk (assuming that one exists, which it usually would) to write this file to in order to speed up my processing? Surely I should be able to take advantage of the fact that I know how big the file will be in advance? I feel like this should be possible, but don't really know where to look and haven't found anything helpful on the web so far.

Thanks in advance,


PS Assuming that this is possible in some way: if there is no contiguous region of disk large enough, is there a way to allocate the file with a minimum number of fragments, rather than just allowing the OS or whatever to use this as an opportunity to fill in all its gaps?

PPS Someone on the MSDN forum suggested using the Sysinternals Contig utility to create a contiguous file. This sounds good, but I don't know how to overwrite the file without deleting its contents in each write cycle. The MSDN thread where I have posted this is: http://social.msdn.microsoft.com/Forums/en-US/vcgeneral/thread/7a5c14ef-b50e-4349-9633-26add123df40


The last post in that thread told you how to write to the file without first deleting its contents.

The last post in the thread is me, and in it I'm asking for clarification! :-O

I've said that I am currently using fopen in "wb" mode. My understanding, from http://www.cplusplus.com/reference/clibrary/cstdio/fopen/, is that "w" creates an empty file for writing: if a file with the same name already exists (which it will), its content is erased and the file is treated as a new empty file. So this will just discard my old file and ignore its pre-allocated size.

However, if I use "a" then it will append, and grow the file, neither of which is quite what I want to do here. So what I'm wondering is how to overwrite the contents, but not have the file treated as an empty (i.e. new) file?


The last post was from someone named Brian Muth, and suggested opening the file with "rb+", which is what you want to do.
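
For anyone finding this later, here is a minimal sketch of the "rb+" approach in plain C. The function names (file_size, make_placeholder, overwrite_in_place) are made up for illustration, and make_placeholder is only a stand-in for however the file actually gets preallocated (e.g. with Contig). The sketch shows that "rb+" leaves the existing length alone; it does not by itself say anything about how the filesystem lays the file out on disk.

```c
#include <stdio.h>

/* Return the size of the file in bytes, or -1 on error.
 * Note: long is 32 bits on some platforms, so for multi-gigabyte files
 * a 64-bit variant (e.g. _fseeki64/_ftelli64 on MSVC, fseeko/ftello on
 * POSIX) would be needed instead. */
long file_size(const char *path)
{
    FILE *f = fopen(path, "rb");
    long size = -1;
    if (!f)
        return -1;
    if (fseek(f, 0, SEEK_END) == 0)
        size = ftell(f);
    fclose(f);
    return size;
}

/* Hypothetical stand-in for the preallocated file: create `path`
 * and fill it with `size` copies of `fill`. */
int make_placeholder(const char *path, long size, char fill)
{
    FILE *f = fopen(path, "wb");   /* "wb" truncates or creates */
    long i;
    if (!f)
        return -1;
    for (i = 0; i < size; i++) {
        if (fputc(fill, f) == EOF) {
            fclose(f);
            return -1;
        }
    }
    return fclose(f) == 0 ? 0 : -1;
}

/* Overwrite the start of an existing file without truncating it.
 * "rb+" opens for reading and writing, positions at offset 0, and
 * fails if the file does not exist. */
int overwrite_in_place(const char *path, const void *data, size_t len)
{
    FILE *f = fopen(path, "rb+");
    int rc;
    if (!f)
        return -1;
    rc = (fwrite(data, 1, len, f) == len) ? 0 : -1;
    if (fclose(f) != 0)
        rc = -1;
    return rc;
}
```

If you call make_placeholder on a file with 64 bytes and then overwrite_in_place with 5 bytes of data, file_size should still report 64: the first 5 bytes are replaced and the rest is untouched. Writing past the current end through an "rb+" stream would still grow the file, so the total written needs to stay within the preallocated size.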


Lol, my bad, MSDN can be quite laggy sometimes, I did hit refresh before posting here, I swear! :D

Yes, that does indeed answer my question, and serves me right for being so impatient as to post the same thing on two different forums!

Thanks for your help though
