On Sat, Jun 9, 2012 at 1:34 AM, Mikhail Eremin wrote:
Hello,

SETTING:
- There is an application, written using Boost Template library, meant for QUICK processing of bulk text files (cca 50-100Gb each).
- There is a huge, quick and expensive piece of hardware with HUGE amount of RAM and multiple CPU.
- There is [theoretically] any possible UNIX-like OS, even Microsoft Windows(R) is considered.
- Boost Thread Pool extension is used; previously memory mapped files through memory_segment have been used, now got rid of the entire Boost::interprocess.
- There are NO explicit data items in the application's algorithm to be shared by threads, each has its own piece of input file, thus - there is NO explicit concurrency.

PROBLEM:
- Ensure fast processing without locks and threads sleeping. Currently the threads sleep on some internal mutex. We thought it's been boost::interprocess (specifically - mmap, wrapped by a mutex), but it apparently isn't so.

SPECIFIC QUESTION:
- How could we get rid of Boost locks?

Mike
Okay, so you have enough memory to map an entire file into memory at
once? Are the files read-only? Where is the Boost lock that you want
to get rid of? Probably the threadpool library uses a lock on a queue
somewhere?
You could certainly write this without a threadpool. I'd imagine that
the cost of launching threads will be insignificant compared to
running the algorithm on these regions:
std::vector< std::pair
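
A minimal sketch of that thread-per-region idea, assuming C++11 std::thread and a buffer that is already in memory (for example a read-only mapping of the file). The names split_regions, process_region and process_all are hypothetical, and the newline count is only a stand-in for the real algorithm. Each thread reads its own region and writes to its own result slot, so no mutex is involved:

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <utility>
#include <vector>

// A region is a half-open [begin, end) byte range of the input buffer.
using Region = std::pair<const char*, const char*>;

// Hypothetical helper: cut [data, data + size) into `parts` contiguous regions.
std::vector<Region> split_regions(const char* data, std::size_t size,
                                  std::size_t parts) {
    assert(parts > 0);
    std::vector<Region> regions;
    regions.reserve(parts);
    const std::size_t chunk = size / parts;
    const char* begin = data;
    for (std::size_t i = 0; i < parts; ++i) {
        // The last region absorbs whatever is left over after the division.
        const char* end = (i + 1 == parts) ? data + size : begin + chunk;
        regions.emplace_back(begin, end);
        begin = end;
    }
    return regions;
}

// Placeholder workload (an assumption): count the newlines in one region.
std::size_t process_region(Region r) {
    std::size_t count = 0;
    for (const char* p = r.first; p != r.second; ++p) {
        if (*p == '\n') ++count;
    }
    return count;
}

// Launch one thread per region and wait for all of them.
// Thread i only reads regions[i] and only writes results[i].
std::vector<std::size_t> process_all(const std::vector<Region>& regions) {
    std::vector<std::size_t> results(regions.size(), 0);
    std::vector<std::thread> workers;
    workers.reserve(regions.size());
    for (std::size_t i = 0; i < regions.size(); ++i) {
        workers.emplace_back([&results, &regions, i] {
            results[i] = process_region(regions[i]);
        });
    }
    for (auto& t : workers) t.join();
    return results;
}
```

For real text input the region boundaries would probably need to be moved forward to the next line break so that no record is split between two threads; the sketch leaves that out. With a number of regions close to the number of cores, the only synchronization left is the final join.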