boost::interprocess, shared memory and multi-core
Can shared memory or other constructs from boost::interprocess be used on multi-core computers? Will there be a performance degradation if two processes are allocated to different cores rather than to the same core? The same with threads? Another way to look at this question: should one program the inter-process/inter-thread communication first and worry about multi-core later, or should something be planned at the development stage?
QuitePlace wrote:
Can shared memory or other constructs from boost::interprocess be used on multi-core computers? Will there be a performance degradation if two processes are allocated to different cores rather than to the same core? The same with threads?
In theory, there is no problem with multi-core computers. The OS does the job of mapping memory between two processes wherever they are. Regards, Ion
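For reference, a minimal sketch of what this looks like on the creating side, using Boost.Interprocess; the segment and object names ("MySharedMemory", "my_counter") are placeholders invented for this example:

#include <boost/interprocess/managed_shared_memory.hpp>
#include <boost/interprocess/shared_memory_object.hpp>
#include <iostream>

int main()
{
    using namespace boost::interprocess;

    // Clean up any segment left over from a previous run, then create one.
    shared_memory_object::remove("MySharedMemory");
    managed_shared_memory segment(create_only, "MySharedMemory", 65536);

    // Construct a named integer inside the segment and use it.
    int* counter = segment.construct<int>("my_counter")(0);
    ++*counter;
    std::cout << "counter = " << *counter << std::endl;

    shared_memory_object::remove("MySharedMemory");
    return 0;
}

A second process would open the same segment with open_only and retrieve the object with find<int>("my_counter"); the OS maps the segment into each process's address space no matter which cores the two processes run on.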
On Dec 17, 2008, at 10:09 AM, QuitePlace wrote:
Another way to look at this question: should one program the inter-process/inter-thread communication first and worry about multi-core later, or should something be planned at the development stage?
As with other areas such as exception handling, I would advise you to take multi-core technology into account at design time. Programming efficiently for more than one core is certainly not as easy as making cookies, and so if you don't plan up front you are likely to lose efficiency later on, especially in your example of inter-process/inter-thread communication. Ciao, Andreas
Andreas Masur
On Thu, Dec 18, 2008 at 4:55 PM, QPlace wrote:
This is exactly what worries me. Should that planning be done "outside" of the Boost framework? For example, if I am using features provided by the ::interprocess library, how am I supposed to take multi-core technology into account if, say, "shared memory" already locks me into a solution where I don't have much control over anything multi-core and where "multi-core" is not even present as a concept? Your comments are very much appreciated.
Just some side comments on multi-CPU program design. If you want a program to be truly multi-CPU, without slowing down as it uses more CPUs, then the programming style you need is the Actor style, or one of its kin. Basically, treat every Actor as owning its own state and pretend there is no such thing as global state, so no statics, no globals, and so on. It is not easy to do in C++, but it can be replicated well enough. Your problem domain will also need to be easily separable so it can be worked on in parts; if it cannot, then you have a bigger problem than just the design, though almost all programs can be split up to some extent. Read up on the Actor model; it will give you plenty of ideas. Perhaps work with Erlang a bit to get a feel for the Actor style. The knowledge you come away with is invaluable for designing scalable multi-threaded apps.
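To make that concrete, here is a toy sketch of the Actor style in C++; the names (Actor, post) are invented for this example, and it uses the standard threading facilities for brevity, although Boost.Thread would look much the same:

#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>
#include <string>
#include <thread>

// Toy actor: all state is private to the actor, and the only way to interact
// with it is to post a message into its mailbox. No globals, no statics.
class Actor
{
public:
    Actor() : worker_(&Actor::run, this) {}
    ~Actor()
    {
        post("");          // empty message acts as a shutdown sentinel
        worker_.join();
    }

    void post(std::string msg)
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            mailbox_.push(std::move(msg));
        }
        cv_.notify_one();
    }

private:
    void run()
    {
        for (;;)
        {
            std::unique_lock<std::mutex> lock(mutex_);
            cv_.wait(lock, [this] { return !mailbox_.empty(); });
            std::string msg = mailbox_.front();
            mailbox_.pop();
            lock.unlock();

            if (msg.empty())
                return;
            // The actor's own state is touched only from this one thread.
            ++handled_;
            std::cout << "handled: " << msg << " (" << handled_ << ")\n";
        }
    }

    std::queue<std::string> mailbox_;
    std::mutex mutex_;
    std::condition_variable cv_;
    unsigned handled_ = 0;
    std::thread worker_;
};

int main()
{
    Actor a;
    a.post("hello");
    a.post("world");
}   // destructor drains the remaining messages, then shuts the actor down

Scaling out then means running many such actors, each with its own mailbox and its own state, and never letting two of them touch the same data directly.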
OvermindDL1
On Fri, Dec 19, 2008 at 10:47 AM, QPlace wrote:
Thank you for your comments, I will definitely follow your advice. But coming back to the "shared memory" issue and its usage, what is your opinion on the following scenario? Say there is a producer of data on one core and multiple consumers on other cores. "Shared memory" needs some sort of exclusive lock on it in order to support write/read operations, doesn't it? If so, couldn't that lock become a bottleneck when using "shared memory" to pump data around a multi-core system like that? Maybe a network data exchange between the cores is better from a scalability standpoint?
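For concreteness, the lock-based exchange described here could be built on Boost.Interprocess's message_queue, which handles the synchronization internally; the queue name "work_queue" is made up, and both the producer and consumer loops are shown in a single process only to keep the sketch short:

#include <boost/interprocess/ipc/message_queue.hpp>
#include <iostream>

int main()
{
    using namespace boost::interprocess;

    // Producer side: create the queue and push a few integers. In the real
    // scenario this would be one process, and the receive loop below would
    // live in each consumer process (opening the queue with open_only).
    message_queue::remove("work_queue");
    message_queue mq(create_only, "work_queue", 100, sizeof(int));

    for (int i = 0; i < 3; ++i)
        mq.send(&i, sizeof(i), 0);

    // Consumer side: pop the integers back out. send/receive wait while
    // another process holds the queue's internal synchronization, which is
    // exactly the potential bottleneck being asked about.
    for (int i = 0; i < 3; ++i)
    {
        int value = 0;
        message_queue::size_type recvd = 0;
        unsigned int priority = 0;
        mq.receive(&value, sizeof(value), recvd, priority);
        std::cout << "received " << value << std::endl;
    }

    message_queue::remove("work_queue");
    return 0;
}

Whether that internal locking becomes a bottleneck depends mostly on how large and how frequent the messages are; the reply below describes one way to avoid locks entirely.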
OvermindDL1 wrote:
I use shared memory in my Actor-style libraries, but they do not use locks; rather they use atomic CAS assembly (I have to use assembly for it since it is not in the current C++ standard; it is in the next C++ standard though, and I will be glad to drop the assembly then). Compared to using locks, a circular linked list in shared memory managed with non-locking primitives is very fast. You have to change the backend coding style a touch, but I have noticed definite performance improvements, and since nothing blocks there should be no speed hit for any number of threads. The only issue is contention (multiple threads writing to the same memory at the *exact* same time, and I mean exact nearly to the nanosecond), in which case the atomic CAS fails, causing a stall in the pipeline for about 12 cycles on AMD CPUs and about 40 cycles on Intel CPUs (although that might have changed with the Core 2 Duos; I have not tried those yet). In that case you just reissue the command after reading in and accounting for the new data, and all is good.
If you intend to scale arbitrarily, then there are two main things I have learned. First, make sure that what you are coding *can* be split up in the first place. Second, do not use locks except in a few very rare circumstances; use atomic CAS (which is what kernels use, for example). There is plenty of info on atomic Compare-And-Swap on the 'net, and a library or two if you do not fancy writing the assembly yourself (or just wait till C++09 comes out with supporting compilers).
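As an illustration of that retry-on-failure pattern, here is a minimal CAS loop using the atomics the upcoming standard provides in place of hand-written assembly; the names (Node, Stack, push) are invented for this sketch, and inside a shared-memory segment the next field would hold an offset (for example a Boost.Interprocess offset_ptr) rather than a raw pointer:

#include <atomic>
#include <cstdio>

// One node of an intrusive singly linked list.
struct Node
{
    int value;
    Node* next;
};

struct Stack
{
    std::atomic<Node*> head{nullptr};

    void push(Node* n)
    {
        // Read the current head, link the new node in front of it, then try
        // to swing head over to the new node. If another thread changed head
        // in the meantime, the CAS fails, 'expected' is reloaded with the new
        // head, and the loop simply reissues the operation -- the contention
        // case described above.
        Node* expected = head.load(std::memory_order_relaxed);
        do
        {
            n->next = expected;
        }
        while (!head.compare_exchange_weak(expected, n,
                                           std::memory_order_release,
                                           std::memory_order_relaxed));
    }
};

int main()
{
    Stack s;
    Node a{1, nullptr};
    Node b{2, nullptr};
    s.push(&a);
    s.push(&b);
    for (Node* p = s.head.load(); p != nullptr; p = p->next)
        std::printf("%d\n", p->value);
    return 0;
}

No thread ever blocks here; a failed CAS just costs the retry, which matches the pipeline-stall figures quoted above.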
participants (5)
- Andreas Masur
- Ion Gaztañaga
- OvermindDL1
- QPlace
- QuitePlace