[serialization] Boost.Serialization 1.33.0 Breaks Serialization Code in DLLs

I've been reading this thread with increasing alarm. I agree with Martin Ecker and Vladimir Prus that the synchronization in question belongs inside the serialization library.

The system we are building will have several inter-process communication mechanisms, which I'm hoping to build on top of Boost.Serialization. (As noted in another thread, I've run across some performance issues, which might be addressed by resetting and reusing archives rather than constructing new archives all the time.) Our system design expects that it will be possible to dynamically load DLLs containing new types that support serialization, and then create new IPC connections (with associated archives) to/from which instances of those newly loaded types can be serialized. It is not a requirement that previously existing IPC connections (and their associated archives, either presently in use or available for reuse if archive resetting is used) support these new types, since the type information for a connection is specified at connection creation time. We expect a substantial number of such connections to exist at any one time, with many threads involved, and they have some latency requirements.

We presently plan to punt on support for unloading DLLs, on the basis that the benefit (in our system) would be small compared to the design/implementation/documentation cost of supporting unloading safely and correctly. (We might revisit the decision about unloading, but for now supporting it is not a requirement for us.)

I'm pretty sure that intertwining runtime DLL loading with all of our clients of the serialization library is just not going to fly. I think that making it a configuration option whether the library protects its internal data structures is a reasonable approach, since some applications won't need it (either because they are single-threaded or because they are willing to deal with the issues at the application level) and might not want to pay the cost.
But for some applications, having the library be responsible for the consistency of its internal data structures is really the only realistic option. I think the only way I will be able to sell the use of the serialization library for our project is to either convince you to change on this issue, or to obtain or develop a patch and carry it forward. The latter might be less work in the long term than developing the subset of serialization features that we actually need, but the long-term maintenance headache for a patch is worrisome.

Kim Barrett wrote:
The system we are building will have several inter-process communication mechanisms, which I'm hoping to be able to build on top of Boost.Serialization. (As noted in another thread, I've run across some performance issues, which might be addressed by resetting and reusing archives, rather than constructing new archives all the time.)
So, we have received only anecdotal data on where such performance bottlenecks might be.
It is expected in our system design that it will be possible to dynamically load DLLs containing new types which support serialization, and then create new IPC connections (with associated archives) to/from which instances of those newly loaded types can be serialized.
I'm pretty sure that intertwining runtime DLL loading with all of our clients of the serialization library is just not going to fly.
Note that as presently implemented, an archive used for marshalling something like an IPC transaction would be a short operation: open an archive on a string stream, serialize, close the archive, send the string over the IPC connection. My previously suggested solution of deriving from an existing archive class and adding locking would work very well in this case and would be indistinguishable from a solution that built threading in at a lower level.
I think the only way I will be able to sell the use of the serialization library for our project is to either convince you to change on this issue,
As yet, I see no reason to change my view.
or to obtain or develop a patch and carry it forward.
Good luck, Robert Ramey

At 8:32 PM -0700 8/19/05, Robert Ramey wrote:
Kim Barrett wrote:
(As noted in another thread, I've run across some performance issues, which might be addressed by resetting and reusing archives, rather than constructing new archives all the time.)
So, we have received only anecdotal data on where such performance bottlenecks might be.
Reset and reuse demonstrably helps; the question was why. And Alex Besogonov <cyberax@elewise.com> wrote (15-Aug-2005):
I've managed to pinpoint the bottleneck: it's the containers' reallocation of dynamic storage. Each time serialization is performed, at least one dynamic memory allocation per container is necessary.
If containers are reused, these allocations are performed only once, because STL containers don't deallocate their underlying storage in their clear/reset/... methods.
That seems like more than anecdotal data...
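Alex's observation is easy to verify with a plain standard-library container: `clear()` resets the size but, in every common implementation, leaves the allocated storage in place, so refilling a reused container needs no further allocations. A minimal check (the function name is ours, not from the thread):

```cpp
#include <cstddef>
#include <vector>

// Demonstrates the reuse effect Alex Besogonov describes: clear()
// empties the vector but retains its allocated capacity.
bool clear_keeps_capacity() {
    std::vector<char> buf;
    for (int i = 0; i < 1000; ++i)
        buf.push_back('x');            // forces several reallocations

    const std::size_t cap = buf.capacity();
    buf.clear();                       // size -> 0, storage retained
    return buf.empty() && buf.capacity() == cap;
}
```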
I'm pretty sure that intertwining runtime DLL loading with all of our clients of the serialization library is just not going to fly.
Note that as presently implemented, an archive used for marshalling something like an IPC transaction would be a short operation: open an archive on a string stream, serialize, close the archive, send the string over the IPC connection.
And as noted previously, that approach of creating a new archive for each marshalling operation has a significant performance impact.
My previously suggested solution of deriving from an existing archive class and adding locking would work very well in this case and would be indistinguishable from a solution that built threading in at a lower level.
Even without the performance implications for creating new archives, this would not be acceptable in our system. These IPC transactions are part of a robot control system. We can withstand some fine-grained latencies due to lock contention for synchronized data structures. Turning off the whole IPC system for however long it takes to load a DLL is something else entirely.
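For contrast, the kind of application-level wrapper under discussion can be sketched with a single coarse lock. This is a stdlib-only sketch under stated assumptions: `Archive` is a stand-in for a real Boost archive class, and `locked_marshal` is a hypothetical helper, not anything from the thread. DLL loading would have to take the same lock exclusively, which is precisely what stalls every concurrent IPC transaction for the duration of the load.

```cpp
#include <mutex>
#include <sstream>
#include <string>

// Stand-in for a real Boost output archive: anything with operator<<.
using Archive = std::ostringstream;

// One global lock guarding the serialization library's shared state.
std::mutex serialization_mutex;

// Serialize a value while holding the global lock, then return the
// bytes for the IPC layer. Every marshalling operation contends here.
template <class T>
std::string locked_marshal(const T& value) {
    std::lock_guard<std::mutex> guard(serialization_mutex);
    Archive oa;          // open archive
    oa << value;         // serialize under the lock
    return oa.str();     // close archive, hand bytes to IPC
}
```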
Participants (2):
- Kim Barrett
- Robert Ramey