
On Wed, Aug 17, 2011 at 7:53 AM, Thomas Luzat wrote:
On 2011-08-17 14:43, Chris Cleeland wrote:
Have you considered mmap'ing the file and allowing all your activity to occur on the mmap'd file? That way the VM subsystem would worry about paging things in or out as necessary, and there wouldn't be any issues with contention across multiple threads. Of course, if you don't have mmap on your system...
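For readers following along, a minimal sketch of the mmap approach suggested above, written in Python rather than against any particular framework. The function name count_byte_mmap and the byte-counting task are illustrative assumptions, not anything from the original poster's code; the point is only that the file is accessed like a byte buffer while the OS pages data in on demand.

```python
import mmap


def count_byte_mmap(path, needle):
    """Count occurrences of the bytes `needle` in a file via a read-only mapping.

    Hypothetical helper for illustration; the task itself is arbitrary.
    """
    count = 0
    with open(path, "rb") as f:
        # Mapping an empty file raises ValueError, so handle it up front.
        if f.seek(0, 2) == 0:
            return 0
        # Length 0 maps the whole file; the VM subsystem pages it in lazily.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            pos = mm.find(needle)
            while pos != -1:
                count += 1
                pos = mm.find(needle, pos + 1)
    return count
```

Whether this beats plain buffered reads depends on the access pattern, so it is worth benchmarking both on the target system.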
I have considered mmap'ing or reading through the whole file, but benchmarking so far has shown that I am mostly I/O-limited. By synchronously working on blocks in parallel I avoid disk seeks as much as possible.
Ah, very well. I've also had similar situations where mmap provided no performance benefit over reading through well-tuned buffered I/O, since accesses were mostly sequential.
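One possible reading of the block-oriented approach discussed above, as a rough Python sketch: a single reader pulls large blocks off the disk strictly in order (so the disk never has to seek), and worker threads process the blocks in parallel. The name block_checksums, the block size, and the CRC32 workload are assumptions made for illustration only; the original design may differ.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def block_checksums(path, block_size=1 << 20, workers=4):
    """Read a file sequentially in blocks and checksum the blocks in parallel.

    Hypothetical helper: CRC32 stands in for whatever per-block work is needed.
    """
    with open(path, "rb") as f, ThreadPoolExecutor(max_workers=workers) as pool:
        futures = []
        while True:
            # Only this thread touches the file, so reads stay sequential.
            block = f.read(block_size)
            if not block:
                break
            futures.append(pool.submit(zlib.crc32, block))
        # Results come back in file order, one checksum per block.
        return [fut.result() for fut in futures]
```

This sketch does not bound the number of blocks waiting for a worker, so a real implementation would want to limit how far the reader can run ahead.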
I might offer such an implementation for cases where seeks are not that expensive (such as for SSDs or slower CPUs).
If main memory is large enough, a seek is likely to hit an existing page rather than something that must be paged in.
Another problem is that mmap alone is not a complete solution on 32-bit systems, since files may well be larger than a few GB, but this can be solved now, too.
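A common way around the address-space limit is to map the file one window at a time instead of all at once. The following Python sketch assumes that approach; the name crc32_windowed, the 64 MB default window, and the checksum workload are illustrative choices and not taken from the thread. The one hard constraint is that each mapping offset must be a multiple of the platform's allocation granularity.

```python
import mmap
import os
import zlib


def crc32_windowed(path, window_size=64 * 1024 * 1024):
    """Compute a file's CRC32 by mapping it one fixed-size window at a time.

    Hypothetical helper: only one window occupies address space at any moment.
    """
    gran = mmap.ALLOCATIONGRANULARITY
    # Offsets must be multiples of the granularity, so round the window down
    # to such a multiple (and never below one granule).
    window_size = max(gran, window_size - window_size % gran)
    size = os.path.getsize(path)
    crc = 0
    with open(path, "rb") as f:
        offset = 0
        while offset < size:
            length = min(window_size, size - offset)
            with mmap.mmap(f.fileno(), length,
                           access=mmap.ACCESS_READ, offset=offset) as mm:
                # Fold this window into the running checksum.
                crc = zlib.crc32(mm, crc)
            offset += length
    return crc
```

Data that straddles a window boundary needs extra care (for example, overlapping windows), which this sketch sidesteps by using a checksum that can be computed incrementally.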
That's a much more difficult issue to deal with. Not impossible, but definitely more difficult, and would nudge me towards the solution you originally inquired about.