I'm trying to read 2 files of approx. 1 MB each, but when I check the memory usage I get 2 MB
(this persists with more files: f files use f * 1 MB).
Is it possible to read each file with minimal memory usage?
If you read them a piece at a time and overwrite the piece each time, you can make that piece as small as you like.
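To illustrate the piece-at-a-time idea, here is a minimal sketch. The helper name count_bytes_chunked and its chunk_size parameter are made up for this example; the point is only that a single buffer is reused for every piece, so the program's own buffer never grows with the file size.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: reads `path` in pieces of at most `chunk_size` bytes,
// reusing one buffer, and returns the total number of bytes seen.
// Returns 0 if the file cannot be opened (same as for an empty file).
std::size_t count_bytes_chunked(const std::string& path, std::size_t chunk_size) {
    assert(chunk_size > 0);
    std::ifstream in(path, std::ios::binary);
    if (!in) return 0;

    std::vector<char> chunk(chunk_size);  // the only buffer we allocate
    std::size_t total = 0;
    while (in) {
        in.read(chunk.data(), static_cast<std::streamsize>(chunk.size()));
        const std::streamsize got = in.gcount();  // may be short on the last piece
        if (got <= 0) break;
        // Process chunk[0 .. got) here; the next read overwrites it.
        total += static_cast<std::size_t>(got);
    }
    return total;
}
```

Calling count_bytes_chunked("somefile.txt", 4096) would walk the whole file while holding only 4096 bytes of it in this buffer at a time (the stream keeps its own internal buffer on top of that).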
It all depends on how you read the files. In general, with buffered reading, as with ifstream, the stream reads blocks of data ahead of what you are currently reading. This is done to save time, because reading substantial blocks of data from a file is more efficient than reading it byte by byte (unbuffered). Classes like ifstream use heuristics to determine how much data should be read into the buffer. "Heuristics" just means simple rules that perform well in practice.
I have never tested the buffering behavior of ifstream, but I would not be surprised if it read 1 MB or more at a time from the files.
For example, one way to come up with a heuristic for buffering is to look at the input and output latency (the time spent waiting to get data). Waiting for data from the hard drive can easily take millions of clock cycles on typical systems. Getting RAM data that is already in cache memory can take maybe around 50 clock cycles (but it depends on the level of cache and many other things). So, very roughly, you have a ratio on the order of 1 million to 1 in the latency between where you get the data from (HDD) and where you are delivering it to (CPU cache). This means that the consumer (CPU) could read out about 1 million bytes in the time it takes the producer (HDD) to produce one chunk of data. So, if you make those chunks 1 million bytes in size (1 MB), you can compensate fully for the latency ratio between the producer and the consumer. It's as simple as that. And that's why it would not surprise me if ifstream buffered data at this rate: it's a reasonable heuristic value that will work well on typical systems.
If you want to manually reduce the buffer size, you can do so by giving the stream object your own limited-size buffer, like so:

    #include <fstream>

    std::ifstream in_file;
    // create a buffer of 256 bytes only:
    char my_buffer[256];
    // give that to the file stream (before opening the file):
    in_file.rdbuf()->pubsetbuf(my_buffer, 256);
    // then, open the file:
    in_file.open("somefile.txt");
But this will slow down the overall reading process significantly if your buffer is too small.
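If you want to see how much the buffer size matters on your own system, a rough sketch like the following could be used to time a full read with a given buffer size. The helper name timed_read is made up for this example, and keep in mind that the effect of pubsetbuf on a file stream is implementation-defined, so some standard libraries may ignore the request or honor it only when it is made before the file is opened.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <fstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: reads all of `path` one character at a time through a
// stream buffer of `buf_size` bytes. Returns {bytes read, elapsed milliseconds}.
std::pair<std::size_t, double> timed_read(const std::string& path, std::size_t buf_size) {
    assert(buf_size > 0);
    // Declared before the stream so that it outlives the stream.
    std::vector<char> stream_buf(buf_size);

    std::ifstream in_file;
    // Request our own buffer before opening the file (implementation-defined effect).
    in_file.rdbuf()->pubsetbuf(stream_buf.data(),
                               static_cast<std::streamsize>(stream_buf.size()));
    in_file.open(path, std::ios::binary);

    const auto start = std::chrono::steady_clock::now();
    std::size_t total = 0;
    char c;
    while (in_file.get(c)) {
        ++total;  // a real program would process `c` here
    }
    const auto stop = std::chrono::steady_clock::now();

    const double ms = std::chrono::duration<double, std::milli>(stop - start).count();
    return {total, ms};
}
```

Comparing, say, timed_read("somefile.txt", 256).second with timed_read("somefile.txt", 65536).second on a large file should give a feel for the trade-off. The actual numbers depend heavily on the operating system's own file caching, so run each case more than once.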