[mapreduce] Prim Calculator
Hi there,

first of all, I'm not sure whether the Boost users list should be used for boost-sandbox libraries. Anyone?

Now, I would like to create a new example for the mapreduce lib. How can I, for instance, set up a system that computes the prime numbers from 0 to 1,000,000,000? For this, I need to create the input for the map function on the fly, rather than reading files from the hard drive. How is that possible? The documentation is a bit sparse so far.

Thanks,
Christian
> I'm not sure whether the Boost users list should be used for boost-sandbox libraries. Anyone?

Yes, very appropriate IMHO.

> Now, I would like to create a new example for the mapreduce lib.

Thanks, I appreciate contributions from users.

> How can I, for instance, set up a system that computes the prime numbers from 0 to 1,000,000,000?

I'm assuming here that you don't need help with the algorithm, just how to use the library...

> For this, I need to create the input for the map function on the fly, rather than reading files from the hard drive. How is that possible?

You'll need to write a "datasource" policy and supply it as the 4th template parameter of your mapreduce::job<> type.
The datasource only needs to implement two public member functions:

    bool const setup_key(typename MapTask::key_type &key) const;
    bool const get_data(typename MapTask::key_type &key, typename MapTask::value_type &value) const;

setup_key() is called by the library to return a key for the next map task to be run, and returns false when there is no more data to map. get_data() is called by the library to retrieve the "value" data for a given key (previously returned from setup_key). The return value here is a traditional success/fail code, and false will terminate the map task.
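As a rough illustration of the two-function contract described above, here is a minimal standalone sketch that does not depend on the sandbox headers. The names demo_map_task, counting_source and drain are hypothetical, the constness of the members differs from the signatures quoted above (setup_key advances a counter here), and the sketch is single-threaded; a real datasource would need to guard its counter against concurrent callers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a map task type; only the two typedefs matter here.
struct demo_map_task {
    typedef std::size_t key_type;
    typedef std::size_t value_type;
};

// Generates the numbers [start, end) on the fly instead of reading files.
template <typename MapTask>
class counting_source {
public:
    counting_source(std::size_t start, std::size_t end)
        : next_(start), end_(end) {}

    // Hands out the key for the next map task; false means no more data.
    bool setup_key(typename MapTask::key_type& key) {
        if (next_ >= end_) return false;
        key = next_++;
        return true;
    }

    // Produces the value for a key previously returned by setup_key().
    bool get_data(typename MapTask::key_type const& key,
                  typename MapTask::value_type& value) const {
        value = key;
        return true;
    }

private:
    std::size_t next_;
    std::size_t end_;
};

// Toy driver imitating the call order described above.
// This is an assumption for illustration, not the library's scheduler.
template <typename Source>
std::vector<std::size_t> drain(Source& source) {
    std::vector<std::size_t> seen;
    std::size_t key = 0;
    std::size_t value = 0;
    while (source.setup_key(key) && source.get_data(key, value))
        seen.push_back(value);
    return seen;
}
```

Calling drain() on a counting_source<demo_map_task> built with (3, 6) visits 3, 4 and 5 and then stops, which is the "return false when there is no more data" behaviour in miniature.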
> The documentation is a bit sparse so far.

Yes, definitely a work in progress. The Reference is not yet written.
Thanks for your interest in my library. -- Craig
Hi Craig,

I did my best in translating the is_prime problem into a map_reduce solution. The problem is simple: find all prime numbers in a given range of numbers, for instance, 0...1,000. Below is the source code that I came up with. I cannot attach files from my work, so please excuse the bad formatting.

The map function takes a number and emits a key/value pair of ( size_t, number ). The size_t key basically states whether or not the number is prime. I don't quite understand how to do the reduce function. I can see that in your word count example you accumulate the same words to calculate the count. What I want is to push the prime numbers back into a vector that I can print out at the end of the program.

    #include <algorithm>
    #include <cmath>
    #include <numeric>

    #include <boost/mapreduce.hpp>

    namespace prime_calculator {

    std::size_t is_prime( std::size_t number )
    {
        long n = static_cast<long>( number );

        if( number == 0 )
            return 1;

        n = std::abs( n );
        std::size_t sqrt_number = static_cast< std::size_t >( std::sqrt( static_cast< double >( n )));

        for( std::size_t i = 2; i <= sqrt_number; i++ )
        {
            if( n % i == 0 )
                return 0;
        }

        return 1;
    }

    template< typename MapTask >
    class number_source : boost::noncopyable
    {
    public:

        number_source( std::size_t start
                     , std::size_t end
                     )
        : _start( start )
        , _end ( end )
        , _current( 0 )
        {}

        const bool setup_key( typename MapTask::key_type& key ) const
        {
            if( _current < _end )
            {
                key = _current;
                return true;
            }
            else
            {
                return false;
            }
        }

        const bool get_data( typename MapTask::key_type& key
                           , typename MapTask::value_type& value
                           )
        {
            if( _current < _end )
            {
                value.first = _current;
                value.second = _current;
                _current++;
                return true;
            }
            else
            {
                return false;
            }
        }

    private:

        std::size_t _start;
        std::size_t _end;
        std::size_t _current;
    };

    struct map_task : public boost::mapreduce::map_task< std::size_t // MapKey
                                                       , std::pair< std::size_t
                                                                  , std::size_t
                                                                  > // MapValue
                                                       >
    {
        template<typename Runtime>
        static void map( Runtime& runtime
                       , const std::size_t& /*key*/
                       , value_type& value
                       )
        {
            runtime.emit_intermediate( is_prime( value.first ), value.first );
        }
    };

    std::vector< std::size_t > prime_numbers;

    struct reduce_task : public boost::mapreduce::reduce_task< std::size_t
                                                             , unsigned
                                                             >
    {
        template< typename Runtime
                , typename It
                >
        static void reduce( Runtime& runtime
                          , const std::size_t& key
                          , It it
                          , const It ite
                          )
        {
            if( key > 0 )
            {
                copy( it, ite, back_inserter( prime_numbers ));
            }
        }
    };

    typedef boost::mapreduce::job< prime_calculator::map_task
                                 , prime_calculator::reduce_task
                                 , boost::mapreduce::null_combiner
                                 , prime_calculator::number_source< prime_calculator::map_task >
                                 > job;

    } // namespace prime_calculator

    int _tmain(int argc, _TCHAR* argv[])
    {
        boost::mapreduce::specification spec;
        boost::mapreduce::results result;

        prime_calculator::job::datasource_type datasource( 0, 1000 );

        spec.map_tasks = 0;
        spec.reduce_tasks = std::max( 1U, boost::thread::hardware_concurrency() );

        std::cout << "\nRunning Parallel Prime_Calculator MapReduce...";

        prime_calculator::job job( datasource, spec );
        job.run< boost::mapreduce::schedule_policy::cpu_parallel<prime_calculator::job> >( result );

        std::cout << "\nMapReduce Finished.";

        return 0;
    }

Thanks,
Christian
The generic form of Map/Reduce maps a key/value pair k1,v1 to a list of key/value pairs k2,v2. The reduce then takes a group of v2 values for each unique key k2 and produces a final list of v2:

    map (k1, v1) --> list(k2,v2)
    reduce (k2, list(v2)) --> list(v2)

Your input is a list of unique integers, k1, and v1 is unused. Map emits intermediates where k2 is 0 or 1, indicating not prime or prime, and v2 is the number (copied from k1). The Reduce function should then emit v2 if k2 is 1 and do nothing if k2 is 0. This will result in the final dataset containing prime numbers, i.e.

    map(1,2,3,4,5,6) --> ((0,1), (1,2), (1,3), (0,4), (1,5), (0,6))
    reduce(1, (2,3,5)) --> (2,3,5)
    reduce(0, (1,4,6)) --> null

I'll take a look at your code tomorrow and try to supply you with a working example. If you get it working in the meantime, let me know.

Regards
-- Craig
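The map --> group --> reduce flow in this message can be tried out without the library at all. The following is only a sequential sketch under that assumption: a std::map stands in for the intermediate store that groups values by key, and the function names map_phase, reduce_phase and primes_in are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Trial division; 0 and 1 are not prime.
bool is_prime(std::size_t n) {
    if (n < 2) return false;
    for (std::size_t i = 2; i * i <= n; ++i)
        if (n % i == 0) return false;
    return true;
}

typedef std::map<bool, std::vector<std::size_t> > intermediate_store;

// map: (k1, unused) --> (k2, v2) with k2 = is_prime(k1) and v2 = k1.
// Inserting into the std::map groups all v2 values under their k2 key.
intermediate_store map_phase(std::vector<std::size_t> const& input) {
    intermediate_store intermediates;
    for (std::size_t i = 0; i < input.size(); ++i)
        intermediates[is_prime(input[i])].push_back(input[i]);
    return intermediates;
}

// reduce: (k2, list(v2)) --> list(v2); emits only for the "prime" key.
std::vector<std::size_t> reduce_phase(bool key,
                                      std::vector<std::size_t> const& values) {
    return key ? values : std::vector<std::size_t>();
}

// Runs both phases in sequence and concatenates whatever reduce emits.
std::vector<std::size_t> primes_in(std::vector<std::size_t> const& input) {
    std::vector<std::size_t> result;
    intermediate_store groups = map_phase(input);
    for (intermediate_store::const_iterator g = groups.begin();
         g != groups.end(); ++g) {
        std::vector<std::size_t> out = reduce_phase(g->first, g->second);
        result.insert(result.end(), out.begin(), out.end());
    }
    return result;
}
```

Feeding the numbers 1 to 6 through primes_in() yields 2, 3 and 5; the group under the "not prime" key is reduced to nothing.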
Hi Craig,

what you're suggesting is exactly what I do. For reasons I don't understand I'm getting the following assertion:

    Running Parallel Prime_Calculator MapReduce...Assertion failed: map_key != typename map_task_type::key_type(), file c:\chh\boost\boost\mapreduce\job.hpp, line 210

I don't understand why that happens. Can you help me out here?

Thanks,
Christian

On Thu, Aug 20, 2009 at 6:09 PM, Craig Henderson<cdm.henderson@googlemail.com> wrote:
[snip]
_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users
-----Original Message----- From: boost-users-bounces@lists.boost.org [mailto:boost-users- bounces@lists.boost.org] On Behalf Of Christian Henning Sent: 21 August 2009 15:05 To: boost-users@lists.boost.org Subject: Re: [Boost-users] [mapreduce] Prim Calculator
[snip]
Hi Christian,

The assertion is an incorrect error condition which meant that the key could not be a default-constructed value (in the case of your numeric key, zero). I have updated the sandbox; thanks for finding this. Get the latest version of job.hpp & in_memory.hpp from the sandbox.

There are a couple of other problems in your implementation, too.

* The number_source constructor should initialize _current to start, not zero.
* Using a global to hold prime_numbers isn't thread-safe, and adding locking will introduce contention. The MapReduce library provides mechanisms to avoid this. In reduce_task::reduce, replace the line

      copy( it, ite, back_inserter( prime_numbers ));

  with

      for_each(it, ite, boost::bind(&Runtime::emit, &runtime, _1, 0));

  which will emit the final results.

To display the results:

    for (prime_calculator::job::const_result_iterator it=job.begin_results(); it!=job.end_results(); ++it)
        std::cout << it->first << " ";

For performance, the datasource should return a range of integers so each task tests many numbers rather than one at a time. This will reduce lock contention in retrieving map keys.

Finally, 0 & 1 are not prime numbers; this is a bug in your is_prime function.

Regards
-- Craig

    #include <algorithm>
    #include <cmath>
    #include <numeric>

    #include <boost/mapreduce.hpp>

    namespace prime_calculator {

    std::size_t is_prime( std::size_t number )
    {
        long n = static_cast<long>( number );

        if( number == 0 )
            return 1;

        n = std::abs( n );
        std::size_t sqrt_number = static_cast< std::size_t >( std::sqrt( static_cast< double >( n )));

        for( std::size_t i = 2; i <= sqrt_number; i++ )
        {
            if( n % i == 0 )
                return 0;
        }

        return 1;
    }

    template< typename MapTask >
    class number_source : boost::noncopyable
    {
    public:

        number_source( std::size_t start
                     , std::size_t end
                     )
        : _start( start )
        , _end ( end )
        , _current( start ) // CH
        {}

        const bool setup_key( typename MapTask::key_type& key ) const
        {
            if( _current < _end )
            {
                key = _current;
                return true;
            }
            else
            {
                return false;
            }
        }

        const bool get_data( typename MapTask::key_type& key
                           , typename MapTask::value_type& value
                           )
        {
            if( _current < _end )
            {
                value.first = _current;
                value.second = _current;
                _current++;
                return true;
            }
            else
            {
                return false;
            }
        }

    private:

        std::size_t _start;
        std::size_t _end;
        std::size_t _current;
    };

    struct map_task : public boost::mapreduce::map_task< std::size_t // MapKey
                                                       , std::pair< std::size_t
                                                                  , std::size_t
                                                                  > // MapValue
                                                       >
    {
        template<typename Runtime>
        static void map( Runtime& runtime
                       , const std::size_t& /*key*/
                       , value_type& value
                       )
        {
            runtime.emit_intermediate( is_prime( value.first ), value.first );
        }
    };

    //std::vector< std::size_t > prime_numbers; // CH

    struct reduce_task : public boost::mapreduce::reduce_task< std::size_t
                                                             , unsigned
                                                             >
    {
        template< typename Runtime
                , typename It
                >
        static void reduce( Runtime& runtime
                          , const std::size_t& key
                          , It it
                          , const It ite
                          )
        {
            if( key > 0 )
            {
                // copy( it, ite, back_inserter( prime_numbers )); // CH
                for_each(it, ite, boost::bind(&Runtime::emit, &runtime, _1, 0)); // CH
            }
        }
    };

    typedef boost::mapreduce::job< prime_calculator::map_task
                                 , prime_calculator::reduce_task
                                 , boost::mapreduce::null_combiner
                                 , prime_calculator::number_source< prime_calculator::map_task >
                                 > job;

    } // namespace prime_calculator

    int main(int argc, char* argv[])
    {
        boost::mapreduce::specification spec;
        boost::mapreduce::results result;

        prime_calculator::job::datasource_type datasource( 0, 1000 );

        spec.map_tasks = 0;
        spec.reduce_tasks = std::max( 1U, boost::thread::hardware_concurrency() );

        std::cout << "\nRunning Parallel Prime_Calculator MapReduce...";

        prime_calculator::job job( datasource, spec );
        job.run< boost::mapreduce::schedule_policy::cpu_parallel<prime_calculator::job> >( result );

        std::cout << "\nMapReduce Finished.";

        for (prime_calculator::job::const_result_iterator it=job.begin_results(); it!=job.end_results(); ++it)
            std::cout << it->first << " ";

        return 0;
    }
Where in the sandbox is mapreduce? Thanks for all your comments! I'll get back to you once I can run the prime calculator. On Fri, Aug 21, 2009 at 3:52 PM, Craig Henderson<cdm.henderson@googlemail.com> wrote:
[snip]
Here's a direct link to the files you need.

https://svn.boost.org/svn/boost/sandbox/boost/mapreduce/job.hpp
https://svn.boost.org/svn/boost/sandbox/boost/mapreduce/intermediates/in_memory.hpp

-- Craig
I've had more time to look at this now. While the code I posted here yesterday works with the library in the sandbox, there is a flaw in the logic that I missed. I have rewritten the Prime Calculator with tighter type correctness and discovered the problem.

Christian, you are using std::size_t and unsigned for your reduce key/value types, which allows the emit() to swap keys and values around, so the final output has the prime numbers in the key field instead of the value. This is incorrect, and the result should be

    (true, (3,5,7,...))

and not

    ((3,0),(5,0),(7,0),...)

This has led to me discovering a problem in the library when iterating over the results. The assertion in boost/mapreduce/intermediates/in_memory.hpp, line 141 catches cases where there are multiple Values for a reduce Key. This should be valid, and the assertion is incorrect; however, the iteration code cannot currently cope with this. I need to revisit this and post a fix.

In the meantime, remove the iteration at the end of main() and define the job as

    typedef
    boost::mapreduce::job<
        prime_calculator::map_task
      , prime_calculator::reduce_task
      , boost::mapreduce::null_combiner
      , prime_calculator::number_source<prime_calculator::map_task>
      , boost::mapreduce::intermediates::in_memory<prime_calculator::map_task,prime_calculator::reduce_task>
      , boost::mapreduce::intermediates::reduce_file_output<prime_calculator::map_task,prime_calculator::reduce_task>
    > job;

which will create a file mapreduce_2_of_2 containing the prime numbers in column 2. mapreduce_1_of_2 will be empty because the non-primes are not emitted in the reduce task.

Regards
-- Craig
Hi Craig,

I have also rewritten the prime_calculator. I never liked the fact that the reduce key type was std::size_t. It should be bool. is_prime now returns a boolean. I have also changed the map value type from std::pair<std::size_t,std::size_t> to just std::size_t. I hope this is correct in terms of the mapreduce methodologies.

To adopt the mapreduce problem description notation, this is what I want:

    map: ( number, number ) -----> list( boolean, number )
    reduce: ( boolean, list( number ) ) ---------> list( number )

Well, it all compiles and runs but the result is empty. I have intercepted the reduce function and the supplied list is correct, meaning all primes are in there. Weird. Dunno what's wrong here.

To make sure we are on the same page, I have added this project to my subversion on Google Code. Here is the link:

http://gil-contributions.googlecode.com/svn/trunk/prime_calculator

Regards,
Christian
Did you try defining the job to write the results to file? This will work until I fix the iterator issue.

A small optimization in is_prime is to check for %2 first. This avoids the expensive sqrt and loop for even numbers, which (apart from 2) are never prime:

    bool const is_prime(long const number)
    {
        if (number == 0 || number == 1)
            return false;
        else if (number == 2)
            return true;
        else if (number%2 == 0)
            return false;
        ...
    }

Regards
-- Craig
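For readers who want something they can compile, here is one possible way to fill in the elided part of that function. It is a guess at a reasonable body, not the version in the sandbox: it keeps the early exits from the snippet above and then tries odd divisors only, using a division-based loop bound instead of sqrt.

```cpp
#include <cassert>

// One possible completion (an assumption, not the sandbox code):
// early exits first, then trial division by odd candidates only.
bool is_prime(long number) {
    if (number < 2) return false;        // 0, 1 and negatives are not prime
    if (number == 2) return true;        // the only even prime
    if (number % 2 == 0) return false;   // every other even number
    // i <= number / i is equivalent to i * i <= number without overflow.
    for (long i = 3; i <= number / i; i += 2)
        if (number % i == 0) return false;
    return true;
}
```

The even-number check means the loop only ever runs for odd inputs, so stepping the divisor by 2 skips half of the candidates.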
Craig,

>> Well, it all compiles and runs but the result is empty. I have intercepted the reduce function and the supplied list is correct, meaning all primes are in there. Weird. Dunno what's wrong here.
>
> Did you try defining the job to write the results to file? This will work until I fix the iterator issue.

Yes, I'm using reduce_file_output. But all files are empty. Can you reproduce that? Please make sure to get the latest from my subversion.

> A small optimization in is_prime is to check for %2 first. This avoids the expensive sqrt and loop for even numbers, which (apart from 2) are never prime:
>
>     bool const is_prime(long const number)
>     {
>         if (number == 0 || number == 1)
>             return false;
>         else if (number == 2)
>             return true;
>         else if (number%2 == 0)
>             return false;
>         ...
>     }

Thanks. ;-) Most important for me right now is to learn how to solve problems with mapreduce. Ideas for some other problems?

Christian
I'm afraid to admit that I f***ed up. The program is now working. I invalidated the iterators with a for loop before the for_each call. When I remove my for loop, the results are inside the output files.
Cool stuff, Christian
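The thread does not show the offending loop, so the following is only a guess at how such a mistake can happen: if a debugging loop advances the `it` parameter itself, the later for_each(it, ite, ...) sees an empty range and forwards nothing. The helper forward_values and its flag are hypothetical and exist only to contrast the two variants.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reduce-like helper: inspects the range, then forwards each
// value to `out`. The flag selects whether the inspection loop walks a copy
// of the iterator or `it` itself.
template <typename It>
std::size_t forward_values(It it, It const ite,
                           std::vector<std::size_t>& out,
                           bool inspect_with_copy) {
    std::size_t inspected = 0;
    if (inspect_with_copy) {
        // Safe: `probe` is advanced, `it` still points at the first value.
        for (It probe = it; probe != ite; ++probe) ++inspected;
    } else {
        // Problematic: after this loop `it` equals `ite`.
        for (; it != ite; ++it) ++inspected;
    }
    // With the second variant this range is empty, so nothing is forwarded.
    std::for_each(it, ite, [&out](std::size_t v) { out.push_back(v); });
    return inspected;
}
```

With a vector holding 2, 3 and 5, both variants inspect three values, but only the copy-based one leaves anything for the forwarding step. Whether the library's iterators can be copied and re-walked like this is an assumption; for single-pass iterators even the copy would not help.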
Great! I've added my version to the sandbox at http://svn.boost.org/svn/boost/sandbox/libs/mapreduce/examples/prime/ This includes a modified number_source that provides a range for each map task to work on, reducing locking contention. Regards -- Craig
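The range idea can be sketched independently of the library. The class below is not the number_source from the sandbox example; chunked_source, next_range and the chunk parameter are hypothetical names, and std::mutex is used only to show that the lock is taken once per chunk rather than once per number.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <utility>

// Hypothetical range-producing source: each call hands out a half-open
// chunk [first, last) so that one task tests many numbers per request.
class chunked_source {
public:
    typedef std::pair<std::size_t, std::size_t> range_type;

    chunked_source(std::size_t start, std::size_t end, std::size_t chunk)
        : next_(start), end_(end), chunk_(chunk == 0 ? 1 : chunk) {}

    // Returns false once the whole interval has been handed out.
    bool next_range(range_type& range) {
        std::lock_guard<std::mutex> lock(mutex_);  // one lock per chunk
        if (next_ >= end_) return false;
        range.first = next_;
        range.second = next_ + std::min(chunk_, end_ - next_);
        next_ = range.second;
        return true;
    }

private:
    std::mutex mutex_;
    std::size_t next_;
    std::size_t end_;
    std::size_t chunk_;
};
```

For the interval 0 to 10 with a chunk size of 4, the source hands out [0,4), [4,8) and [8,10) and then reports that it is exhausted. Larger chunks mean fewer trips through the lock, at the cost of coarser load balancing between tasks.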
Cool. I have changed my code to use ranges as well. Also, my map now emits only prime numbers and no non-prime numbers, which lowers the memory requirements. When I run ranges 1,000,000 numbers wide, I can see that my 4 cores are 100% busy and kernel time is close to nothing. So the processor utilization seems fine.

When do you think you'll fix the problem with iterating over the results?

Regards,
Christian

On Sat, Aug 22, 2009 at 3:07 PM, Craig Henderson<cdm.henderson@googlemail.com> wrote:
[snip]
> Cool. I have changed my code to use ranges as well. Also, my map now emits only prime numbers and no non-prime numbers, which lowers the memory requirements. When I run ranges 1,000,000 numbers wide, I can see that my 4 cores are 100% busy and kernel time is close to nothing. So the processor utilization seems fine.

Sounds good.

> When do you think you'll fix the problem with iterating over the results?

I'll try to get to it tomorrow (Sunday). I'll be away from this list after that until next weekend.

Regards
-- Craig
From: boost-users-bounces@lists.boost.org [mailto:boost-users- bounces@lists.boost.org] On Behalf Of Christian Henning Sent: 22 August 2009 20:43 To: boost-users@lists.boost.org Subject: Re: [Boost-users] [mapreduce] Prim Calculator
[snip]
When do you think you'll fix the problem with iterating over the results?
Sorry for the delay - I've been on vacation this week.

I have committed the fix to the sandbox. Updating from https://svn.boost.org/svn/boost/sandbox/boost/mapreduce/intermediates/in_memory.hpp (Revision 55862) should fix the problem for you. I've also updated my version of your Prime Calculator.

Let me know if this works for you.

Thanks
-- Craig
Thanks Craig, I'll try your fix asap. What are the next steps for your library? Christian On Sat, Aug 29, 2009 at 7:40 AM, Craig Henderson<cdm.henderson@googlemail.com> wrote:
[snip]
Hi Craig, thanks for your fix. Everything works now as expected! On Sat, Aug 29, 2009 at 11:22 AM, Christian Henning<chhenning@gmail.com> wrote:
[snip]
Craig,

I have added your comments and the system now seems to finish. But when I iterate over the results I get another assertion, in in_memory.hpp[141]. Commenting it out helps.

Christian

    #include <algorithm>
    #include <cmath>
    #include <numeric>

    #include <boost/mapreduce.hpp>

    namespace prime_calculator {

    std::size_t is_prime( std::size_t number )
    {
        long n = static_cast<long>( number );

        if( number == 0 || number == 1 )
            return 0;

        n = std::abs( n );
        std::size_t sqrt_number = static_cast< std::size_t >( std::sqrt( static_cast< double >( n )));

        for( std::size_t i = 2; i <= sqrt_number; i++ )
        {
            if( n % i == 0 )
                return 0;
        }

        return 1;
    }

    template< typename MapTask >
    class number_source : boost::noncopyable
    {
    public:

        number_source( std::size_t start
                     , std::size_t end
                     )
        : _start ( start )
        , _end ( end )
        , _current( start )
        {}

        const bool setup_key( typename MapTask::key_type& key ) const
        {
            if( _current < _end )
            {
                key = _current;
                return true;
            }
            else
            {
                return false;
            }
        }

        const bool get_data( typename MapTask::key_type& key
                           , typename MapTask::value_type& value
                           )
        {
            if( _current < _end )
            {
                value.first = _current;
                value.second = _current;
                _current++;
                return true;
            }
            else
            {
                return false;
            }
        }

    private:

        std::size_t _start;
        std::size_t _end;
        std::size_t _current;
    };

    struct map_task : public boost::mapreduce::map_task< std::size_t // MapKey
                                                       , std::pair< std::size_t
                                                                  , std::size_t
                                                                  > // MapValue
                                                       >
    {
        template<typename Runtime>
        static void map( Runtime& runtime
                       , const std::size_t& //key
                       , value_type& value
                       )
        {
            runtime.emit_intermediate( is_prime( value.first ), value.first );
        }
    };

    struct reduce_task : public boost::mapreduce::reduce_task< std::size_t
                                                             , unsigned
                                                             >
    {
        template< typename Runtime
                , typename It
                >
        static void reduce( Runtime& runtime
                          , const std::size_t& key
                          , It it
                          , const It ite
                          )
        {
            if( key > 0 )
            {
                for_each( it
                        , ite
                        , boost::bind( &Runtime::emit
                                     , &runtime
                                     , _1
                                     , 0
                                     )
                        );
            }
        }
    };

    typedef boost::mapreduce::job< prime_calculator::map_task
                                 , prime_calculator::reduce_task
                                 , boost::mapreduce::null_combiner
                                 , prime_calculator::number_source< prime_calculator::map_task >
                                 > job;

    } // namespace prime_calculator

    int _tmain(int argc, _TCHAR* argv[])
    {
        boost::mapreduce::specification spec;
        boost::mapreduce::results result;

        prime_calculator::job::datasource_type datasource( 0, 1000 );

        spec.map_tasks = 0;
        spec.reduce_tasks = std::max( 1U, boost::thread::hardware_concurrency() );

        std::cout << "\nRunning Parallel Prime_Calculator MapReduce..." << std::endl;

        prime_calculator::job job( datasource, spec );
        job.run< boost::mapreduce::schedule_policy::cpu_parallel<prime_calculator::job> >( result );

        std::cout << "\nMapReduce Finished." << std::endl;

        for( prime_calculator::job::const_result_iterator it = job.begin_results()
           ; it!=job.end_results()
           ; ++it
           )
        {
            std::cout << it->first << " ";
        }

        return 0;
    }
Hmm, I compiled & ran your pasted code and it runs fine. What compiler/platform are you on?

Did you get the latest in_memory.hpp from the sandbox (https://svn.boost.org/svn/boost/sandbox/boost/mapreduce/intermediates/in_memory.hpp) and save it in the intermediates directory?

-- Craig
I'm using VS2005. I'll make sure tomorrow. Did you check the output? For some reason "0" is displayed but not "2" and "3", which is all wrong.

On Fri, Aug 21, 2009 at 5:44 PM, Craig Henderson<cdm.henderson@googlemail.com> wrote:
[snip]
The problems you see are, I'm sure, because for some reason you don't have the latest version of in_memory.hpp with the fixes I made yesterday. Here's my output:

    Running Parallel Prime_Calculator MapReduce...

    MapReduce Finished.
    2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 101 103 107 109 113 127 131 137 139 149 151 157 163 167 173 179 181 191 193 197 199 211 223 227 229 233 239 241 251 257 263 269 271 277 281 283 293 307 311 313 317 331 337 347 349 353 359 367 373 379 383 389 397 401 409 419 421 431 433 439 443 449 457 461 463 467 479 487 491 499 503 509 521 523 541 547 557 563 569 571 577 587 593 599 601 607 613 617 619 631 641 643 647 653 659 661 673 677 683 691 701 709 719 727 733 739 743 751 757 761 769 773 787 797 809 811 821 823 827 829 839 853 857 859 863 877 881 883 887 907 911 919 929 937 941 947 953 967 971 977 983 991 997

Regards
-- Craig
participants (2)
-
Christian Henning
-
Craig Henderson