Sorry for not getting back sooner, and thanks for the initial reply. I added a print to packed_archive_send, so it now looks like:

void
packed_archive_send(MPI_Comm comm, int dest, int tag,
                    const packed_oarchive& ar)
{
  std::cout << "ar size: " << ar.size() << std::endl;
  const void* size = &ar.size();
  ....

When I run this in release mode, I get:

ar size: 4265919640

MPI_Send: Invalid count, error stack:
MPI_Send(176): MPI_Send(buf=0x01E8AE18, count=-29047656, MPI_PACKED, dest=1, tag=2147483647, MPI_COMM_WORLD) failed
MPI_Send(101): Negative count, value is -29047656

When I run it in debug mode, I don't get the error, and the output is:

ar size: 1622.
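
One observation on the release-mode numbers: the negative count in the error is exactly what the printed size becomes when its low 32 bits are read as a signed int (4265919640 - 2^32 = -29047656). So the bad value appears to be in ar.size() already, rather than being corrupted on the way into MPI_Send. A minimal sketch of that arithmetic follows; wrap_to_int32 is a hypothetical helper name used only for illustration and is not part of Boost.MPI:

```cpp
#include <cstdint>

// Hypothetical helper (illustration only): compute the value a 32-bit
// signed int would hold if it received the low 32 bits of `size`.
// Done with 64-bit arithmetic so no out-of-range conversion is involved.
inline std::int64_t wrap_to_int32(std::uint64_t size) {
    const std::uint64_t low = size & 0xFFFFFFFFULL;  // keep the low 32 bits
    if (low >= 0x80000000ULL) {
        // Sign bit set: the signed reading is low - 2^32.
        return static_cast<std::int64_t>(low) - 0x100000000LL;
    }
    return static_cast<std::int64_t>(low);
}
```

Calling wrap_to_int32(4265919640ULL) gives -29047656, the count reported by MPI_Send, while the debug-mode size of 1622 is unchanged.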

Any suggestions how to proceed are much appreciated. 

Thanks,

Nick


On Oct 19, 2010, at 8:20 PM, Matthias Troyer wrote:


On 19 Oct 2010, at 14:04, Nick Collier wrote:

Hi,

I've got an MPI application that uses the Boost MPI libraries. It runs fine on OSX and Linux when compiled with both debug and more optimized compiler flags. Unfortunately, on Windows, the application only runs when compiled using the Visual Studio 2008 debug configuration. When I run the executable compiled in the release configuration, it crashes, but not in the same place in my code every time. It only occurs, though, when I make communicator::send and communicator::recv calls, sending and receiving vectors of ints. In those cases, I get exceptions like:

{routine_=0x00d37d38 "MPI_Send" result_code_=805931010 message="MPI_Send: Invalid count, error stack:
MPI_Send(176): MPI_Send(buf=0x001BF950, count=-1833296, MPI_PACKED, dest=1, tag=2001, MPI_COMM_WORLD) failed
MPI_Send(101): Negative count, value is -1833296" }

Sometimes it's for a send and sometimes for a receive, but the exception is always for a "negative count", which looks suspiciously like an uninitialized integer.

I figure I'm doing something wrong either in how I've compiled the Boost libraries or in my own code (although it's odd that it works on both Linux and OSX). Any suggestions are obviously appreciated.

This is for boost-1.39. We are prevented from using anything newer because the machine we ultimately deploy on has boost-1.39 installed. 

On Windows, I'm using Microsoft's MPI implementation (based on MPICH2) under Windows 7. The Boost MPI libraries are compiled in both static debug and release mt versions.


Not having Windows MPI and not having your code I cannot help much. But here are some suggestions:

The send is probably the one in packed_archive_send in point_to_point.cpp.
Could you add some print statements there to check the size of the archive, ar.size()? This is very strange, since ar.size() is the size of a std::vector, and that cannot be negative. This sounds like some strange compiler bug that we should identify and then work around.
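
Beyond a plain print, one possible diagnostic is to check the size before it is narrowed to the int count that MPI_Send takes, so that a bogus value is reported at the point it appears. The following is only a sketch under that assumption; checked_count is a hypothetical helper, not an existing Boost.MPI function:

```cpp
#include <climits>
#include <cstddef>
#include <sstream>
#include <stdexcept>

// Hypothetical diagnostic helper (not part of Boost.MPI): convert an
// unsigned size to an int count, throwing if the value cannot be
// represented instead of letting it silently turn negative.
inline int checked_count(std::size_t n) {
    if (n > static_cast<std::size_t>(INT_MAX)) {
        std::ostringstream msg;
        msg << "size " << n << " does not fit in an int count";
        throw std::range_error(msg.str());
    }
    return static_cast<int>(n);
}
```

Wrapping ar.size() in checked_count(...) at the call site would turn the "Negative count" failure into an exception that carries the original unsigned value.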

Matthias

