[context review] Several Questions

Hello, Before I start working with this library for review I need a clarification of several statements to understand actual usefulness of this library:
A context switch between threads costs usually thousands of CPU cycles on x86 compared to a user-level context switch with few hundreds of cycles.
Performance
-----------

One of the first things I did after reading this statement and looking at the measurements was to write my own context-switch benchmark. I ran it on:

- Intel i5 2.5GHz CPU, 2 cores, 4 threads
- Linux x86_64, Ubuntu 10.10

I compared context-switch performance using sched_yield and jump_to. I used the default boost::context<> settings and the default build (against Boost-1.46.0), and used a dummy switch to measure the cost of everything other than the switch itself.

Two threads each gave the other its time quantum, and I measured how long a context switch takes (after some warm-up, of course):

sched_yield   - 377us
Boost.Context - 214us
Dummy         - 10us

All tests were done on a single CPU using taskset 0x1 ./test params

So in the end I see that Boost.Context does not behave **much** better than an OS context switch?

I understand that I probably used ucontext by default, but it is the default, and this is how most distributions are going to ship it, as it is probably the safest.

I need to see the rationale, limitations and so on, stated very explicitly.

Usefulness of N:M model (or even 1:M model)
-------------------------------------------

A long time ago OS developers used the N:M threading model, where several kernel threads were mapped to several user-space threads:

- Solaris < 9 or 10
- Linux <= 2.4
- FreeBSD < 8

Today all of these OSs have moved to the 1:1 model as the most efficient one, so I can't buy that the N:M or even the 1:N model would give a performance advantage in this case.

As you know, even POSIX 2008 deprecated ucontext entirely.

So I would like to see a very good, well-founded rationale with a description of specific use cases and examples.

I understand that it may be a very useful paradigm, but as far as I can see most implementations are moving away from the N:M model... Why do you bring it back despite the huge drawbacks user-space threads have, such as their interaction with blocking system calls, with physical CPUs and so on?
--------------------

Thanks, I'd like to see an answer on these topics before I continue.

Artyom

-------- Original Message --------
Date: Mon, 21 Mar 2011 01:26:31 -0700 (PDT)
From: Artyom <artyomtnk@yahoo.com>
To: boost@lists.boost.org
Subject: [boost] [context review] Several Questions
Hello,
Before I start working with this library for review I need a clarification of several statements to understand actual usefulness of this library:
A context switch between threads costs usually thousands of CPU cycles on x86 compared to a user-level context switch with few hundreds of cycles.
Performance
-----------
One of the first things I did after reading this statement and looking at the measurements was to write my own context-switch benchmark.
I ran it on:
- Intel i5 2.5GHz CPU, 2 cores, 4 threads
- Linux x86_64, Ubuntu 10.10
I compared context-switch performance using sched_yield and jump_to. I used the default boost::context<> settings and the default build (against Boost-1.46.0), and used a dummy switch to measure the cost of everything other than the switch itself.
Two threads each gave the other its time quantum, and I measured how long a context switch takes (after some warm-up, of course):
sched_yield - 377us, Boost.Context - 214us, Dummy - 10us
All tests were done on a single CPU using taskset 0x1 ./test params
So in the end I see that Boost.Context does not behave **much** better than an OS context switch?
I understand that I probably used ucontext by default, but it is the default, and this is how most distributions are going to ship it, as it is probably the safest.
I need to see the rationale, limitations and so on, stated very explicitly.
Hmm - did you see the performance-measuring test that comes with the lib (libs/context/performace)? It counts the CPU cycles taken by a switch. The tool was used to compare ucontext_t (which makes system calls to preserve the signal mask) and fcontext_t (which doesn't preserve the signal mask and is implemented in assembler).

I modified it and measured the following (AMD Athlon(tm) 64 X2 Dual Core Processor 4400+):

cmd: performace -s -i 100

sched_yield(): 2108 cycles
ucontext_t:    1795 cycles
fcontext_t:     156 cycles

Maybe you can send me your test code?
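For readers who want to reproduce the sched_yield() versus ucontext_t comparison without the library's own tool, here is a minimal sketch. It is an assumption-laden stand-in, not the benchmark from libs/context: it uses only the POSIX calls sched_yield() and swapcontext(), it reports wall-clock nanoseconds via std::chrono rather than CPU cycles, and the names in the bench namespace are made up for illustration. Note that with a single runnable thread sched_yield() mostly measures the system-call cost, not a full switch between two threads.

```cpp
#include <sched.h>
#include <ucontext.h>
#include <cassert>
#include <chrono>

namespace bench {  // hypothetical names, not part of Boost.Context

ucontext_t main_ctx, other_ctx;

// Body of the second context: switch straight back to the caller, forever.
void ping() { for (;;) swapcontext(&other_ctx, &main_ctx); }

// Average nanoseconds per call of `op` over `iterations` runs.
template <typename Op>
double ns_per_call(Op op, int iterations) {
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) op();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::nano>(stop - start).count() / iterations;
}

struct result { double sched_yield_ns; double swapcontext_ns; };

result run(int iterations) {
    static char stack[64 * 1024];              // stack for the second context
    getcontext(&other_ctx);
    other_ctx.uc_stack.ss_sp = stack;
    other_ctx.uc_stack.ss_size = sizeof stack;
    other_ctx.uc_link = nullptr;               // ping() never returns
    makecontext(&other_ctx, ping, 0);

    result r;
    r.sched_yield_ns = ns_per_call([] { sched_yield(); }, iterations);
    // One iteration is a round trip (two swapcontext calls), hence the division.
    r.swapcontext_ns =
        ns_per_call([] { swapcontext(&main_ctx, &other_ctx); }, iterations) / 2;
    return r;
}

}  // namespace bench
```

Calling bench::run(1000000) after a warm-up call and printing both fields gives numbers that can be set next to the cycle counts above; pinning the process with taskset, as Artyom did, keeps the comparison on one CPU.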
Usefulness of N:M model (or even 1:M model) -------------------------------------------
A long time ago OS developers used the N:M threading model, where several kernel threads were mapped to several user-space threads:
- Solaris < 9 or 10
- Linux <= 2.4
- FreeBSD < 8
Today all of these OSs have moved to the 1:1 model as the most efficient one, so I can't buy that the N:M or even the 1:N model would give a performance advantage in this case.
As you know, even POSIX 2008 deprecated ucontext entirely.
So I would like to see a very good, well-founded rationale with a description of specific use cases and examples.
Indeed the OS developers moved from N:M to 1:1 (as Solaris did), and the N:M paradigm may not be useful for an OS, but it may be for user-land apps.

I use it, for instance, in boost.task to allow a task to create and wait on sub-tasks using boost.context. This allows work-stealing by other worker-threads. I'm working on boost.strand, which does things like StratifiedJS ( http://onilabs.com/stratifiedjs - thanks to Fernando Pelliccioni).

In general it is usable in all cases where the code may want to jump to another execution path but wants to come back with all local data preserved in order to continue its work.

so long,
Oliver

Two threads each gave the other its time quantum, and I measured how long a context switch takes (after some warm-up, of course):
sched_yield - 377us, Boost.Context - 214us, Dummy - 10us
All tests were done on a single CPU using taskset 0x1 ./test params
So in the end I see that Boost.Context does not behave **much** better than an OS context switch?
Hmm - did you see the performance-measuring test that comes with the lib (libs/context/performace)? It counts the CPU cycles taken by a switch. The tool was used to compare ucontext_t (which makes system calls to preserve the signal mask) and fcontext_t (which doesn't preserve the signal mask and is implemented in assembler).
I modified it and measured the following (AMD Athlon(tm) 64 X2 Dual Core Processor 4400+):
cmd: performace -s -i 100
sched_yield(): 2108 cycles, ucontext_t: 1795 cycles, fcontext_t: 156 cycles
Actually this is quite consistent with what you showed: Boost.Context is better (but not significantly) than a real OS (kernel) context switch.
Maybe you can send me your test code?
I'll send it later (no access to that PC now)
Usefulness of N:M model (or even 1:M model) -------------------------------------------
Indeed the OS developers moved from N:M to 1:1 (as Solaris did), and the N:M paradigm may not be useful for an OS, but it may be for user-land apps.
I use it, for instance, in boost.task to allow a task to create and wait on sub-tasks using boost.context. This allows work-stealing by other worker-threads.
It would be very useful to see real usability cases. Because otherwise it is hard to estimate usefulness of this library
In general it is usable in all cases where the code may want to jump to another execution path but wants to come back and have all local data preserved in order to continue its work.
I understand, it may be useful for "yield"-like iterators and so on, but it is not clear how it works and what the advantage is in comparison to an ordinary event-driven approach.

Regards,
Artyom

sched_yield(): 2108 cycles ucontext_t: 1795 cycles fcontext_t: 156 cycles
Actually this is quite consistent with what you showed: Boost.Context is better (but not significantly) than a real OS (kernel) context switch.
factor 13x ?
It would be very useful to see real usability cases. Because otherwise it is hard to estimate usefulness of this library
The lib is intended to be a building block for higher abstractions like coroutines.
I understand, it may be useful for "yield"-like iterators and so on, but it is not clear how it works and what the advantage is in comparison to an ordinary event-driven approach.
I remember that Factor (http://factorcode.org/) does context switching for its internal working.

You can preserve local data:

    void myfunc() {
        for ( int i = 0; i < 10; ++i) {
            if ( 5 == i) yield();

From: Oliver Kowalke <oliver.kowalke@gmx.de>
sched_yield(): 2108 cycles ucontext_t: 1795 cycles fcontext_t: 156 cycles
Actually this is quite consistent with what you showed: Boost.Context is better (but not significantly) than a real OS (kernel) context switch.
factor 13x ?
I was talking about the _default_ build, which uses ucontext.

Artyom

-------- Original Message --------
Date: Mon, 21 Mar 2011 06:38:58 -0700 (PDT)
From: Artyom <artyomtnk@yahoo.com>
To: boost@lists.boost.org
Subject: Re: [boost] [context review] Several Questions
From: Oliver Kowalke <oliver.kowalke@gmx.de>
sched_yield(): 2108 cycles ucontext_t: 1795 cycles fcontext_t: 156 cycles
Actually this is quite consistent with what you showed: Boost.Context is better (but not significantly) than a real OS (kernel) context switch.
factor 13x ?
I was talking about _default_ build which uses ucontext.
OK - ucontext makes system calls, which are time consuming.

Anyway, with sched_yield() you do not achieve the same thing as with ucontext. The idea is not to block your current process/thread if some conditions are not met, but to proceed with other work and return when the conditions are met.

    void my_action_1() {
        // compute something
        ...
        // some conditions are not met
        // do not block the thread;
        // jump out and do other stuff (for instance my_action_2())
        // in the meantime
        yield();
        // conditions are now met and we can proceed with our computation
    }

    void my_action_2() {...}

The code has chosen to return execution control and let the thread do other things (like processing my_action_2()). If, for instance, an external event happened and the conditions for my_action_1() are met, execution control is given back to my_action_1() with all its local state (variables, stack frame, etc.).

Using event loops you can achieve similar things (not fully equivalent), but you always have to be aware of the event loop - and at least I don't find this paradigm straightforward.

If you take a look at boost.tasklet you will see that this lib has some classes like mutex, condition- and event-variables etc. which hide the yield() call. You can program as you would with boost.thread.

boost.context and boost.tasklet are derived from boost.task. boost.task implements a thread-pool, and my aim was that the tasks I push to the pool don't block the worker-thread if the current computation cannot be completed because some criteria are not met. For instance, a task creates a certain number of sub-tasks and has to wait for their results. Until all sub-tasks are finished, the parent task blocks its worker-thread inside the pool. If you create more sub-tasks than worker-threads, then your thread-pool is blocked. In order to solve this problem I developed boost.tasklet (formerly boost.fiber) and boost.context.

Phil Endecott requested that I move the context switching code into a library (== boost.context) so that other libs could benefit from it (boost.coroutine).

Several other languages provide such context switching facilities:

Lua : http://lua-users.org/wiki/CoroutinesTutorial
Go : http://sites.google.com/site/gopatterns/concurrency/coroutines
Scheme : http://en.wikipedia.org/wiki/Scheme_%28programming_language%29#First-class_c...
Stackless Python : http://www.stackless.com/

Why shouldn't C++ provide such a facility? (At least this is my motivation.)

best regards,
Oliver
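The my_action_1()/my_action_2() pattern described in this message can be tried out with a small sketch. It deliberately uses the POSIX ucontext calls (getcontext, makecontext, swapcontext) rather than the Boost.Context interface under review, whose exact API is not shown in this thread; every name in the sketch namespace, the trace vector and the value 21 are assumptions made for illustration only.

```cpp
#include <ucontext.h>
#include <cassert>
#include <string>
#include <vector>

namespace sketch {  // hypothetical names, not the Boost.Context API

ucontext_t scheduler_ctx;        // the thread that runs the actions
ucontext_t action_ctx;           // execution context of my_action_1
std::vector<std::string> trace;  // records the order of execution
bool condition_met = false;

// Hand control back to the scheduler; the caller's locals stay on its own stack.
void yield() { swapcontext(&action_ctx, &scheduler_ctx); }

void my_action_1() {
    int partial = 21;                // local state that must survive the yield
    trace.push_back("action_1: start");
    while (!condition_met) yield();  // do not block the thread, jump out instead
    trace.push_back("action_1: resumed, result " + std::to_string(partial * 2));
}

void my_action_2() {
    trace.push_back("action_2: ran in the meantime");
    condition_met = true;            // stands in for an external event
}

std::vector<std::string> run() {
    static char stack[64 * 1024];    // stack memory for my_action_1
    trace.clear();
    condition_met = false;
    getcontext(&action_ctx);
    action_ctx.uc_stack.ss_sp = stack;
    action_ctx.uc_stack.ss_size = sizeof stack;
    action_ctx.uc_link = &scheduler_ctx;       // continue here when my_action_1 returns
    makecontext(&action_ctx, my_action_1, 0);

    swapcontext(&scheduler_ctx, &action_ctx);  // run my_action_1 until it yields
    my_action_2();                             // the thread does other work
    swapcontext(&scheduler_ctx, &action_ctx);  // resume my_action_1 after yield()
    return trace;
}

}  // namespace sketch
```

The point of the sketch is that partial is an ordinary local variable: it keeps its value across the yield because my_action_1 has its own stack, which is what an event-loop callback would have to emulate by hand.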

Hello,
sched_yield(): 2108 cycles
ucontext_t: 1795 cycles fcontext_t: 156 cycles
Actually this is quite consistent with what you showed: Boost.Context is better (but not significantly) than a real OS (kernel) context switch.
factor 13x ?
I was talking about _default_ build which uses ucontext.
OK - ucontext does system calls which are time consuming.
OK, I understand, and actually I already see several things I would like to try with Boost.Context.

Let's continue...
-----------------

Several problems I see:

a) (BIG ONE) The shared object link using fcontext crashes!

Using boost_1_46_0

Building boost_context:

bjam toolset=gcc context-impl=asm architecture=x86 instruction-set=native address-model=64 --with-context variant=release stage

Then I compile the example link.cpp

Run it

It crashes!

When linking with the static libboost_context.a it works...

OS: Linux artik-laptop 2.6.35-24-generic #42-Ubuntu SMP Thu Dec 2 02:41:37 UTC 2010 x86_64 GNU/Linux
Compiler: gcc-4.4.5
CPU: Intel(R) Core(TM) i5 CPU M 460 @ 2.53GHz

b) Building with fcontext is... Too complicated

Options: context-impl=asm architecture=x86 instruction-set=native address-model=64

Why should I specify all these parameters? They should be fully auto-configured.

I understand that BBv2 is far from being a friendly and powerful system, but I still expect most parameters to be defined by default; otherwise there is no chance that users will actually be able to build it cleanly.

Other Questions
----------------

1. Please provide a rationale for why both methods are not compiled into the same library.

Sometimes it would be better to be able to choose which method you actually use, especially if the user may want to have both methods.

2. Why is boost::contexts::context a template class? How does it benefit from this? Please provide a rationale.

Thanks,
Artyom

Message of 21/03/11 21:04
From: "Artyom"
To: boost@lists.boost.org
Cc:
Subject: Re: [boost] [context review] Several Questions
Let's continue... ----------------- b) Building with fcontext is... Too complicated
Options: context-impl=asm architecture=x86 instruction-set=native address-model=64
Why should I specify all these parameters? They should be fully auto-configured.
I understand that BBv2 is far from being too friendly and powerful system I still expect that most of parameters should be defined by default otherwise there is no chance that users would be actually able to build it clearly.
Hi Artyom,

let me explain why things are like that.

In previous versions of Boost.Context, Oliver had built bjam with a Python option that allowed the Jamfile to identify the configuration using a Python script. The major problem with this approach is that it forced everyone to rebuild Boost.Build with this option. Vladimir and I asked him to change the Jamfile to avoid this dependency. I really think that decision was the right one, but I could still be wrong.

You are right that setting the parameters is not simple, but if you need to cross-compile you will need them. The solution I see now is:

1. Oliver creates a feature request to extend Boost.Build so that he is able to get the configuration.
2. In the meantime, people like you and me who want to evaluate the library take a little more time to find out how to build it.
3. When Boost.Build provides these new features, Oliver will adapt the Jamfiles.

I hope this difficulty doesn't keep you and others from reviewing the library.

Thanks,
Vicente

On 21.03.2011 21:50, Vicente BOTET wrote:
Message of 21/03/11 21:04
From: "Artyom"
To: boost@lists.boost.org
Cc:
Subject: Re: [boost] [context review] Several Questions
Let's continue... ----------------- b) Building with fcontext is... Too complicated
Options: context-impl=asm architecture=x86 instruction-set=native address-model=64
Why should I specify all these parameters? They should be fully auto-configured.
I understand that BBv2 is far from being a friendly and powerful system, but I still expect most parameters to be defined by default; otherwise there is no chance that users will actually be able to build it cleanly.
Hi Artyom,
let me explain why things are like that.
In previous versions of Boost.Context, Oliver had built bjam with a Python option that allowed the Jamfile to identify the configuration using a Python script. The major problem with this approach is that it forced everyone to rebuild Boost.Build with this option. Vladimir and I asked him to change the Jamfile to avoid this dependency. I really think that decision was the right one, but I could still be wrong.
This is correct - Vladimir told me he is working on a BBv2 version that provides architecture, instruction-set and address-model as regular boost.build options, so you won't have to specify them on the command line. Until he has finished his work we have to specify them on the bjam command line.
You are right that setting the parameters is not simple, but if you need to cross-compile you will need them. The solution I see now is:
1. Oliver creates a feature request to extend Boost.Build so that he is able to get the configuration.
2. In the meantime, people like you and me who want to evaluate the library take a little more time to find out how to build it.
3. When Boost.Build provides these new features, Oliver will adapt the Jamfiles.
I hope this difficulty doesn't keep you and others from reviewing the library.
correct

thx,
Oliver

Let's continue... -----------------
Several problems I see:
a) (BIG ONE) The shared object link using fcontext crashes!
Using boost_1_46_0
boost-1.46 not tested - use 1.45, please
Building boost_context:
bjam toolset=gcc context-impl=asm architecture=x86 instruction-set=native address-model=64 --with-context variant=release stage
Which platform, i386? In the documentation I list, for all the platforms (arm, i386, x86_64, mips, ppc), the bjam options you have to use.
- for x86 64bit: bjam toolset=gcc architecture=x86 instruction-set=yorksfield address-model=64 context-impl=asm
Then I compile example link.cpp
Run it
It crashes!
not in my test
b) Building with fcontext is... Too complicated
This is not my fault - boost.build in its current version provides the properties architecture, instruction-set and address-model as optional, which means that it does not set values for them. boost.context requires those properties in order to select the correct assembler file (see context/build/Jamfile.v2). Vladimir Prus is working on a new version of boost.build which provides these properties as regular ones, so that you don't have to specify them.
Options: context-impl=asm architecture=x86 instruction-set=native address-model=64
instruction-set=native is not correct
Why should I specify all these parameters? They should be fully auto-configured.
Because in the Jamfile I have to select the assembler file implementing the context switching functions for the correct CPU + address-model + ABI + binary-format -> LINUX on x86_64: architecture = x86, instruction-set = YORKSFIELD, address-model = 64, abi = SYSV, binary-format = ELF
I understand that BBv2 is far from being too friendly and powerful system I still expect that most of parameters should be defined by default otherwise there is no chance that users would be actually able to build it clearly.
As I wrote, boost.build doesn't set the values for those properties (architecture, instruction-set and address-model are optional). Even worse, the Application Binary Interface (ABI -> calling conventions) and the binary format (ELF, MACH-O, Windows PE) are not available from the boost.build system.
Other Questions ----------------
1. Please provide a rationale why not both methods are compiled to same library?
Sometimes wouldn't it be better to have option about the type of the method you actually use especially if user may want to have both methods.
Do you mean why you can choose between the context swapping provided by the operating system and the assembler code (== fcontext)?

1.) ucontext (on UNIX) is slower (by a factor of 13, as you can see in my previous email) than the fcontext assembler implementation, because it makes system calls to the kernel, which take time, in order to preserve the UNIX signal mask. If you require handling of UNIX signals you have to use ucontext instead of fcontext (but keep in mind that the signal handler may invoke only obstruction-free and async-safe functions).

2.) fcontext (UNIX and Windows) does context switching in assembler without any system call and without preserving the UNIX signal mask. That is the reason why it is faster than ucontext (no system calls - kernel calls). If you need UNIX signal handling you could use a separate thread that calls sigwait() and delivers the signal synchronously (so you don't have the obstruction-free and async-safe limitations).

3.) The implementation of Windows Fibers may be equivalent to fcontext, but it doesn't allow you to specify the memory for the stack. You only have the option of setting the stack size (see CreateFiber in the Windows Fiber API). That means that for each Windows fiber you have a memory allocation and deallocation. With the fcontext provided by boost.context you can allocate and reuse your own memory for the context swapping/jumping. An example is boost::tasklets::scheduler from the boost.tasklet lib (it caches the stacks used by the boost::context instances, so allocations/deallocations are reduced).
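Point 2.) mentions handling UNIX signals in a separate thread with sigwait(). A minimal sketch of that idea follows, assuming a POSIX system and a build that links the thread library (for example g++ -pthread). The function name wait_for_usr1 and the choice of SIGUSR1 are illustrative assumptions, not something taken from boost.context.

```cpp
#include <signal.h>
#include <unistd.h>
#include <cassert>
#include <thread>

namespace sig {  // hypothetical helper, for illustration only

// Blocks SIGUSR1, waits for it in a dedicated thread and returns the
// signal number that the thread received.
int wait_for_usr1() {
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    // Block the signal before creating threads so that every thread
    // inherits the mask and no asynchronous handler ever runs.
    pthread_sigmask(SIG_BLOCK, &set, nullptr);

    int received = 0;
    std::thread waiter([&] {
        // sigwait() returns in ordinary thread context, so the code that
        // follows it is not limited to async-signal-safe functions.
        sigwait(&set, &received);
    });

    kill(getpid(), SIGUSR1);  // stands in for an external signal
    waiter.join();
    return received;
}

}  // namespace sig
```

Because the signal stays pending until sigwait() accepts it, it does not matter whether kill() runs before or after the waiter thread reaches sigwait(). Contexts switched with fcontext never see the signal, so no signal mask has to be saved on a switch.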
2. Why boost::contexts::context is template class? How does it benefit from this?
The template argument of boost::context specifies the type that abstracts the stack required for your context (remember that your execution context is determined by the CPU registers, instruction pointer, stack pointer and the memory area used as stack). This memory must be allocated and deallocated.

boost.context provides a stack implementation, 'protected_stack', which allocates memory and appends a guard page at the end of the stack, so that it protects against running past the end of the stack (because its size was chosen too small). What happens then is a segmentation fault/access violation (otherwise it could happen that you overwrite the memory of your own application, resulting in unexpected/undefined behaviour).

Because it is a template argument, you are free to use your own stack class.

Oliver
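To make the guard-page idea concrete, here is a sketch of what a user-supplied stack class could look like. It is not the source of protected_stack: the class name guarded_stack, its members, and the use of mmap/mprotect are assumptions, and it places the guard page at the lowest address on the assumption that the stack grows downwards, as it does on x86.

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical stack class; the interface boost::context expects may differ.
class guarded_stack {
public:
    explicit guarded_stack(std::size_t min_size) {
        page_ = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
        // Round the usable size up to whole pages and add one guard page.
        usable_ = (min_size + page_ - 1) / page_ * page_;
        total_ = usable_ + page_;
        base_ = mmap(nullptr, total_, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (base_ == MAP_FAILED) throw std::runtime_error("mmap failed");
        // Any access to the guard page faults instead of silently
        // overwriting whatever memory lies beyond the stack.
        if (mprotect(base_, page_, PROT_NONE) != 0) {
            munmap(base_, total_);
            throw std::runtime_error("mprotect failed");
        }
    }
    ~guarded_stack() { munmap(base_, total_); }
    guarded_stack(const guarded_stack&) = delete;
    guarded_stack& operator=(const guarded_stack&) = delete;

    // Lowest usable address, just above the guard page.
    void* bottom() const { return static_cast<char*>(base_) + page_; }
    // Number of usable bytes (a multiple of the page size).
    std::size_t size() const { return usable_; }

private:
    void* base_ = nullptr;
    std::size_t page_ = 0, usable_ = 0, total_ = 0;
};
```

A scheduler that caches such objects, as boost::tasklets::scheduler is said to do with its stacks, would pay for mmap and mprotect once per stack rather than once per context.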

a) (BIG ONE) The shared object link using fcontext crashes!
Using boost_1_46_0
boost-1.46 not tested - use 1.45, please
Sorry, I can't accept this, also because crash happens inside boost_context function I don't think it is related to 1.45 / 1.46 differences
Building boost_context:
bjam toolset=gcc context-impl=asm architecture=x86
instruction-set=native
address-model=64 --with-context variant=release stage
Which platform, i386? In the documentation I list, for all the platforms (arm, i386, x86_64, mips, ppc), the bjam options you have to use.
- for x86 64bit: bjam toolset=gcc architecture=x86 instruction-set=yorksfield address-model=64 context-impl=asm
Then I compile example link.cpp
Run it
It crashes!
not in my test
Unfortunately that is not such a good answer :-)

Fresh build with bjam:

bjam toolset=gcc context-impl=asm architecture=x86 instruction-set=yorksfield address-model=64 --with-context stage

Build of the program:

g++ -g -O3 -Wall link.cpp -I ../../boost_1_46_0 -L ../../boost_1_46_0/stage/lib/ -Wl,-rpath=../../boost_1_46_0/stage/lib/ -lboost_context -lrt

$ ./a.out
Segmentation fault
$ gdb ./a.out
GNU gdb (GDB) 7.2-ubuntu
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/artik/Packages/boost/reviews/bc/a.out...done.
(gdb) r
Starting program: /home/artik/Packages/boost/reviews/bc/a.out
[Thread debugging using libthread_db enabled]

Program received signal SIGSEGV, Segmentation fault.
0x00000000006033f0 in get_fcontext ()
(gdb) bt
#0 0x00000000006033f0 in get_fcontext ()
#1 0x00000000004018cd in context () at ../../boost_1_46_0/boost/context/context.hpp:68
#2 main () at link.cpp:24

I'd suggest digging deeper, because this can have a very critical effect on how reviewers will see this library.
this is not my fault - boost.build in its current version
No more explanations required... I know all drawbacks of BBv2 and accept as fully reasonable answer! (When do we move to CMake?..)
Do mean why you can specify to use the context swapping provided by the Operating System or the assembler code (==fcontext)?
I mean something like

boost::context<> // some default
boost::ucontext<>
boost::fcontext<>

So you can have two implementations in the same code.

Thanks,
Artyom

>>> a) (BIG ONE) The shared object link using fcontext crashes!
>>>
>>> Using boost_1_46_0
>> boost-1.46 not tested - use 1.45, please
> Sorry, I can't accept this, also because crash happens
> inside boost_context function I don't think it is related
> to 1.45 / 1.46 differences
>

Could you please be so kind as to get my working tree?

git clone git://git.gitorious.org/boost-dev/boost-dev.git

It would help if we were using the same code base.

>>> Building boost_context:
>>>
>>> bjam toolset=gcc context-impl=asm architecture=x86 instruction-set=native
>>> address-model=64 --with-context variant=release stage
>> Which platform, i386? In the documentation I list, for all the platforms
>> (arm, i386, x86_64, mips, ppc), the bjam options you have to use.
>> - for x86 64bit: bjam toolset=gcc architecture=x86 instruction-set=yorksfield
>> address-model=64 context-impl=asm
>>
>>> Then I compile example link.cpp
>>>
>>> Run it
>>>
>>> It crashes!
>>>
>> not in my test
>>
> Unfortunately it is not so good answer :-)

;-) I'm a little bit surprised, because it worked and I tried it on several Intel computers. But I have an idea, because if I use your command line for building link.cpp I get a segfault too.

g++ -g link.cpp -I /opt/boost/include/ -L /opt/boost/lib/ -Wl,-rpath=/opt/boost/lib -lboost_context -lrt
segfault at get_fcontext ()

This is because you are missing the -fPIC option. On x86_64 the code is built position independent.

g++ -g link.cpp -I /opt/boost/include/ -L /opt/boost/lib/ -Wl,-rpath=/opt/boost/lib -lboost_context -lrt -fPIC
-> works

I think it is better to build the examples with bjam. That means:

1.) go to <boost-root>/libs/context/examples
2.) compile the examples: bjam toolset=gcc architecture=x86 instruction-set=native address-model=64 context-impl=asm
3.) test the examples in
<boost-root>/bin.v2/libs/context/example/gcc-4.4.5/release/address-model-64/architecture-x86/context-impl-asm/instruction-set-native/link-static/
or, if Jamfile.v2 was modified to link against the shared lib (<link>shared), in
<boost-root>/bin.v2/libs/context/example/gcc-4.4.5/release/address-model-64/architecture-x86/context-impl-asm/instruction-set-native/

>> Do mean why you can specify to use the context swapping provided by the
>> Operating System or the assembler code (==fcontext)?
> I mean something like
>
> boost::context<> // some default
> boost::ucontext<>
> boost::fcontext<>
>
> So you can have two implementations in same code.
>

This is an option too - my original intention was to hide this from the user. At the time the boost libs are compiled, the user decides whether he wants the fast version or the UNIX-signal-safe version.

It is also hard to provide boost::fcontext<> for all platforms - for instance, on MIPS I would have to implement 4 versions of fcontext (because of the multiple MIPS ABIs).

Oliver

Message of 21/03/11 23:57
From: "Oliver Kowalke"
To: boost@lists.boost.org
Cc:
Subject: Re: [boost] [context review] Several Questions
a) (BIG ONE) The shared object link using fcontext crashes!
Using boost_1_46_0
boost-1.46 not tested - use 1.45, please
Sorry, I can't accept this, also because crash happens inside boost_context function I don't think it is related to 1.45 / 1.46 differences
Could you please be so kind as to get my working tree?
git clone git://git.gitorious.org/boost-dev/boost-dev.git
It would help if we were using the same code base.
Oliver, I think the best you can do is to try with the compressed file we are reviewing (the one on the Vault) and Boost 1.46, to see whether you get the same errors as Artyom. Be sure to run exactly the same commands as he did.

Best,
Vicente

On 22.03.2011 00:19, Vicente BOTET wrote:
Message of 21/03/11 23:57
From: "Oliver Kowalke"
To: boost@lists.boost.org
Cc:
Subject: Re: [boost] [context review] Several Questions
a) (BIG ONE) The shared object link using fcontext crashes!
Using boost_1_46_0
boost-1.46 not tested - use 1.45, please
Sorry, I can't accept this, also because crash happens inside boost_context function I don't think it is related to 1.45 / 1.46 differences
Could you please be so kind as to get my working tree?
git clone git://git.gitorious.org/boost-dev/boost-dev.git
It would help if we were using the same code base.
Oliver, I think the best you can do is to try with the compressed file we are reviewing (the one on the Vault) and Boost 1.46, to see whether you get the same errors as Artyom. Be sure to run exactly the same commands as he did.
I believe it is solved - it was a missing -fPIC option when compiling the example.

Maybe I will migrate to 1.46.1 in the next few days.

Oliver

Message of 22/03/11 00:28
From: "Oliver Kowalke"
To: boost@lists.boost.org
Cc:
Subject: Re: [boost] [context review] Several Questions
On 22.03.2011 00:19, Vicente BOTET wrote:
It would help if we were using the same code base.
Oliver, I think the best you can do is to try with the compressed file we are reviewing (the one on the Vault) and Boost 1.46, to see whether you get the same errors as Artyom. Be sure to run exactly the same commands as he did.
I believe it is solved - it was a missing -fPIC option when compiling the example. Maybe I will migrate to 1.46.1 in the next few days.
Hi,

Usually with Bjam the user doesn't have to take care of this option, as Bjam includes it when compiling objects for a shared library. Could you please show where you missed the -fPIC option?

Thanks,
Vicente

Usually with Bjam the user doesn't take care of this option as Bjam include it when compiling objects for a shared library. Please could you show where you missed the -fPIC option?
Artyom compiled the libs with bjam (-fPIC is required for shared libs on AMD64), but he compiled the example by hand (see his last posts for the command line) without -fPIC, and he gets a segfault in the executable at the function get_fcontext() (which is implemented in assembler). If -fPIC is added to Artyom's command line the example works.

If I compile the examples with bjam (as I did until now) I see -fPIC applied by bjam automatically:

bjam toolset=gcc architecture=x86 instruction-set=yorksfield address-model=64 context-impl=asm -d 2

A look into the 'X86_64 ABI - System V Application Binary Interface':

'References from within the executable file to a function defined in a shared object will normally be resolved by the link editor to the address of the procedure linkage table entry for that function within the executable file.'

I have to investigate why it doesn't work.

Oliver

> could you please so kind to get my working tree?
>
> git clone git://git.gitorious.org/boost-dev/boost-dev.git
>
> It would help that we are using the same code basis.
>

OK, I can take a look at this, but see the notes below.

> >>> Building boost_context:
> >>>
> >>> bjam toolset=gcc context-impl=asm architecture=x86 instruction-set=native
> >>> address-model=64 --with-context variant=release stage
> >> Which platform, i386? In the documentation I list, for all the platforms
> >> (arm, i386, x86_64, mips, ppc), the bjam options you have to use.
> >> - for x86 64bit: bjam toolset=gcc architecture=x86 instruction-set=yorksfield
> >> address-model=64 context-impl=asm
> >>
> >>> Then I compile example link.cpp
> >>>
> >>> Run it
> >>>
> >>> It crashes!
> >>>
> >> not in my test
> >>
> > Unfortunately it is not so good answer :-)
>
> ;-) I'm a little bit surprised because it worked and I tried it on
> several intel computers. But I've an idea because if I use your
> command line for building link.cpp I get a segfault too.
>
> g++ -g link.cpp -I /opt/boost/include/ -L /opt/boost/lib/
> -Wl,-rpath=/opt/boost/lib -lboost_context -lrt
> segfault at get_fcontext ()
>
> This is because you are missing -fPIC option. On x86_64 the code is
> build position independent.
>

I'm not missing the -fPIC option. -fPIC is used only for building shared objects; you don't use it for executables!

I don't know why bjam adds it for executables. It is plain wrong (harmless but wrong). No other mature build system, like CMake or autotools, does this.

In the real world users would not use bjam and would not compile their executables with -fPIC, as it is not required.

> g++ -g link.cpp -I /opt/boost/include/ -L /opt/boost/lib/
> -Wl,-rpath=/opt/boost/lib -lboost_context -lrt -fPIC
> -> works
>
> I think it is better to build the examples with bjam. That means:
>

No, I should build the libraries with bjam, but by no means should any code I write have to be built with it. :-)

> 1.) go to <boost-root>/libs/context/examples
> 2.) compile the examples: bjam toolset=gcc architecture=x86
> instruction-set=native address-model=64 context-impl=asm
> 3.) test the examples in
> <boost-root>/bin.v2/libs/context/example/gcc-4.4.5/release/address-model-64/architecture-x86/context-impl-asm/instruction-set-native/link-static/
> or if Jamfile.v2 was modified to link against shared lib (<link>shared)
> <boost-root>/bin.v2/libs/context/example/gcc-4.4.5/release/address-model-64/architecture-x86/context-impl-asm/instruction-set-native/
>
> >> Do mean why you can specify to use the context swapping provided by the
> >> Operating System or the assembler code (==fcontext)?
> > I mean something like
> >
> > boost::context<> // some default
> > boost::ucontext<>
> > boost::fcontext<>
> >
> > So you can have two implementations in same code.
> >
>
> This is a option too - my previous intention was to hide this for the
> user. At the time the boost-libs are compiled the user decides if it
> wants the fast version or the UNIX signal safe version.
> It is also hard to provide boost::fcontext<> for all platforms - for
> instance on MIPS I would have to implement 4 version of fcontext
> (because of the multiple MIPS ABIs).
>

You can always fall back to ucontext if the assembly version is not provided.

I think it would be very useful, because different parts of a program may require different contexts, and the inability to use both of them in the same executable is quite problematic.

Artyom

-------- Original Message --------
Date: Mon, 21 Mar 2011 22:04:54 -0700 (PDT)
From: Artyom <artyomtnk@yahoo.com>
To: boost@lists.boost.org
Subject: Re: [boost] [context review] Several Questions
could you please be so kind as to get my working tree?
git clone git://git.gitorious.org/boost-dev/boost-dev.git
It would help if we used the same code base.
OK, I can take a look at this, but see the notes below.
not necessary - thx
I'm not missing the -fPIC option. -fPIC is used only for building shared objects; you don't use it for executables!
I don't know why bjam adds it for executables. It is plain wrong (harmless, but wrong). No other mature build system, like CMake or autotools, does this.
In the real world users would not use bjam and would not compile their executables with -fPIC, as it is not required.
it was too late last night - I'll take care of it this afternoon
I think it is better to build the examples with bjam. That means:
No, I should build libraries with bjam, but by no means should any code I write be built with it. :-)
I thought bjam was the default for compiling Boost libs? At least the docs say so.
I mean something like
    boost::context<>    // some default
    boost::ucontext<>
    boost::fcontext<>
You can always fall back to ucontext if the assembly version is not provided.
I think it would be very useful, because different parts of a program may require different contexts, and the inability to use two of them in the same executable is quite problematic.
The problem is that it is unlikely that I can implement fcontext for all platforms, and ucontext is not available on some platforms (for instance ARM; I know about glibc-ports, but you don't always have the possibility to patch or recompile the C library). That means that on some systems boost::ucontext<> or boost::fcontext<> may not be available. I have some doubt that this is acceptable.

regards, Oliver

The point is that you may provide something like this:

    boost::context<>                        // some default: fcontext if available, otherwise ucontext
    boost::context<ucontext_with_fallback>  // ucontext; if unavailable, fall back to fcontext
    boost::context<fcontext_with_fallback>  // fcontext; if unavailable, fall back to ucontext
    boost::context<ucontext>                // ucontext, fail to compile if not available
    boost::context<fcontext>                // fcontext, fail to compile if not available

It would be much more useful, and I think it would be fully acceptable for users...

Another issue: if somebody builds Boost with ucontext and links against the shared object, and somebody else builds Boost with fcontext, the interface is exactly the same but the ABI is different. So, for example, if Red Hat provides Boost built with fcontext and Debian provides it built with ucontext, then the same executable would not run on both, even though it is linked against the same library.

What may be even worse: when some binary library you use was built with Xcontext and you compile with Ycontext, it would fail badly. Especially if both of you linked boost_context statically, it would crash unpredictably in different places.

I'm really concerned about the ABI issues, especially since the non-default build is likely the one that is desired.

Artyom
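
Editorial sketch of the selection scheme proposed above, in C++11. Every name in it (ucontext_tag, fcontext_tag, with_fallback, is_available, select_backend, chosen_backend) is hypothetical and is not part of Boost.Context; the availability traits are hard-coded, with fcontext marked unavailable on purpose so that the fallback path is exercised. It only illustrates how "prefer X, fall back to Y" and "X or fail to compile" could be told apart at compile time.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical backend tags; these names do not exist in Boost.Context.
struct ucontext_tag {};
struct fcontext_tag {};
template <class Preferred> struct with_fallback {};

// Hypothetical availability traits. A real build would set these per platform;
// here fcontext is deliberately left unavailable to exercise the fallback.
template <class Tag> struct is_available : std::false_type {};
template <> struct is_available<ucontext_tag> : std::true_type {};

// Maps each backend to the one it may fall back to.
template <class Tag> struct other_backend;
template <> struct other_backend<ucontext_tag> { using type = fcontext_tag; };
template <> struct other_backend<fcontext_tag> { using type = ucontext_tag; };

// Strict policy: the requested backend, or a compile error.
template <class Tag>
struct select_backend {
    static_assert(is_available<Tag>::value, "requested backend is not available");
    using type = Tag;
};

// Fallback policy: the preferred backend if available, otherwise the other one.
template <class Preferred>
struct select_backend<with_fallback<Preferred>> {
    using alternative = typename other_backend<Preferred>::type;
    static_assert(is_available<Preferred>::value || is_available<alternative>::value,
                  "no context backend is available");
    using type = typename std::conditional<is_available<Preferred>::value,
                                           Preferred, alternative>::type;
};

// Default policy: fcontext if available, otherwise ucontext.
template <class Policy = with_fallback<fcontext_tag>>
struct context {
    using backend = typename select_backend<Policy>::type;
};

inline const char* backend_name(ucontext_tag) { return "ucontext"; }
inline const char* backend_name(fcontext_tag) { return "fcontext"; }

// Reports which backend a given policy resolves to.
template <class Policy = with_fallback<fcontext_tag>>
std::string chosen_backend() {
    return backend_name(typename context<Policy>::backend{});
}
```

With the traits as written, chosen_backend<>() resolves to the ucontext backend, while context<fcontext_tag> would be rejected by the static_assert. Note that a scheme like this changes the type of the context per backend, so it does not by itself address the ABI concern raised above.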

Another option may be to create:

a) boost::abstract_context, with ucontext, fcontext and fiber_context all implemented in terms of the abstract context, so that all changes would be binary compatible.

b) make boost::context use the pimpl idiom, so that all details are hidden behind the scenes in the sources and switching Xcontext would not change the ABI of the library.

Artyom

On 22.03.2011 09:54, Artyom wrote:
b) make boost::context use the pimpl idiom, so that all details are hidden behind the scenes in the sources and switching Xcontext would not change the ABI of the library.
Well, this was one of my earlier solutions, but some people in the community asked me to remove the pimpl because it requires an additional memory allocation. I'll think about your other suggestions; maybe I can provide an alternative implementation.

regards, Oliver
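
Editorial sketch of the trade-off discussed here, in C++11. The class below is a hypothetical stand-in, not the Boost.Context class: the public type holds only a pointer to an implementation struct that would be defined in the library's source file, so its size and layout stay the same whichever backend the library was built with. The cost is the heap allocation in the constructor, which is the objection mentioned above.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// --- What would live in the public header (hypothetical class) ---
class context {
public:
    explicit context(std::string backend_name);
    ~context();
    context(context&&) noexcept;
    context& operator=(context&&) noexcept;

    const std::string& backend() const;

private:
    struct impl;                  // defined only in the source file
    std::unique_ptr<impl> impl_;  // public layout: one pointer, whatever the backend
};

// --- What would live in the library's .cpp file ---
struct context::impl {
    std::string backend_name;
    // Backend-specific state (saved registers, stack pointer, ...) would go
    // here, invisible to code compiled against the header.
};

// The allocation below is the extra cost of the idiom.
context::context(std::string backend_name)
    : impl_(new impl{std::move(backend_name)}) {}

// Defined here, where impl is a complete type, as std::unique_ptr requires.
context::~context() = default;
context::context(context&&) noexcept = default;
context& context::operator=(context&&) noexcept = default;

const std::string& context::backend() const { return impl_->backend_name; }
```

Client code would only ever see the header part, for example `context c("fcontext");` followed by `c.backend()`. Swapping the contents of impl then requires relinking against the new library but no recompilation of the clients.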

No, I should build libraries with bjam, but by no means should any code I write be built with it. :-)
So I was able to track the bug down: I forgot the assembler directive '.type <name>,@function' in the asm (it marks the symbol <name> as a function symbol). I'll commit the fix to my git repo this afternoon.

Oliver

bjam toolset=gcc context-impl=asm architecture=x86 instruction-set=native address-model=64 --with-context variant=release stage
Then I compile example link.cpp
Run it
It crashes!
When linking with static libboost_context.a it works...
OS: Linux artik-laptop 2.6.35-24-generic #42-Ubuntu SMP Thu Dec 2 02:41:37 UTC 2010 x86_64 GNU/Linux
Compiler: gcc-4.4.5
CPU: Intel(R) Core(TM) i5 CPU M 460 @ 2.53GHz
I have not tested it on an i5, but it should work there too. I tried it on my system (Intel(R) Core(TM)2 Quad CPU Q6700 @ 2.66GHz, Linux c2dlx01 2.6.35-28-generic #49-Ubuntu SMP Tue Mar 1 14:39:03 UTC 2011 x86_64 GNU/Linux).

    cd libs/context/examples
    bjam toolset=gcc context-impl=asm architecture=x86 instruction-set=native address-model=64 --with-context variant=release
    ../../../bin.v2/libs/context/example/gcc-4.4.5/release/address-model-64/architecture-x86/context-impl-asm/instruction-set-native/link
    -> worked

I changed <link> from 'static' to 'shared' in Jamfile.v2 in the context/examples directory.

    bjam toolset=gcc context-impl=asm architecture=x86 instruction-set=native address-model=64 --with-context variant=release
    ../../../bin.v2/libs/context/example/gcc-4.4.5/release/address-model-64/architecture-x86/context-impl-asm/instruction-set-native/link
    -> worked

    ldd ../../../bin.v2/libs/context/example/gcc-4.4.5/release/address-model-64/architecture-x86/context-impl-asm/instruction-set-native/link
        linux-vdso.so.1 => (0x00007fffdc098000)
        libboost_context.so.1.45.0 => /home/graemer/Projekte/C++/boost-dev/bin.v2/libs/context/build/gcc-4.4.5/release/address-model-64/architecture-x86/context-impl-asm/instruction-set-native/libboost_context.so.1.45.0 (0x00007fcfe8343000)
        libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007fcfe801a000)
        libm.so.6 => /lib/libm.so.6 (0x00007fcfe7d97000)
        libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00007fcfe7b81000)
        libc.so.6 => /lib/libc.so.6 (0x00007fcfe77fd000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fcfe855a000)
    -> shared library libboost_context.so.1.45.0 is loaded

I'm not sure what you did to make it crash. Can you tell me at which line of code it happens (gdb -> stack trace)?

The assembler file is compiled like this:

    "g++" -x assembler-with-cpp -O3 -finline-functions -Wno-inline -Wall -march=native -DBOOST_ALL_NO_LIB=1 -DBOOST_CONTEXT_DYN_LINK=1 -DNDEBUG -I"../../.." -c -o "../../../bin.v2/libs/context/build/gcc-4.4.5/release/address-model-64/architecture-x86/context-impl-asm/instruction-set-native/asm/fcontext_x86_64_sysv_elf_gas.o" "../../../libs/context/src/asm/fcontext_x86_64_sysv_elf_gas.S"

and the *.cpp files like this:

    "g++" -ftemplate-depth-128 -O3 -finline-functions -Wno-inline -Wall -march=native -fPIC -m64 -DBOOST_ALL_NO_LIB=1 -DBOOST_CONTEXT_DYN_LINK=1 -DNDEBUG -I"../../.." -c -o "../../../bin.v2/libs/context/build/gcc-4.4.5/release/address-model-64/architecture-x86/context-impl-asm/instruction-set-native/protected_stack_posix.o" "../../../libs/context/src/protected_stack_posix.cpp"

thx, Oliver

I'm not sure what you did to make it crash. Can you tell me at which line of code it happens (gdb -> stack trace)?
This is what I got from gdb so far:

    Program received signal SIGSEGV, Segmentation fault.
    0x00000000006033f0 in get_fcontext ()
    (gdb) bt
    #0  0x00000000006033f0 in get_fcontext ()
    #1  0x00000000004018cd in context () at ../../boost_1_46_0/boost/context/context.hpp:68
    #2  main () at link.cpp:24
    (gdb) frame 0
    #0  0x00000000006033f0 in get_fcontext ()
    (gdb) disassemble
    Dump of assembler code for function get_fcontext:
    => 0x00000000006033f0 <+0>:   mov    %rbx,(%rdi)
       0x00000000006033f3 <+3>:   mov    %r12,0x8(%rdi)
       0x00000000006033f7 <+7>:   mov    %r13,0x10(%rdi)
       0x00000000006033fb <+11>:  mov    %r14,0x18(%rdi)
       0x00000000006033ff <+15>:  mov    %r15,0x20(%rdi)
       0x0000000000603403 <+19>:  mov    %rbp,0x28(%rdi)
       0x0000000000603407 <+23>:  stmxcsr 0x48(%rdi)
       0x000000000060340b <+27>:  fnstcw 0x4c(%rdi)
       0x000000000060340e <+30>:  lea    0x8(%rsp),%rcx
       0x0000000000603413 <+35>:  mov    %rcx,0x30(%rdi)
       0x0000000000603417 <+39>:  mov    (%rsp),%rcx
       0x000000000060341b <+43>:  mov    %rcx,0x38(%rdi)
       0x000000000060341f <+47>:  xor    %rax,%rax
       0x0000000000603422 <+50>:  retq
    End of assembler dump.
    (gdb) info registers
    rax     0x0             0
    rbx     0x7fffffffe0b0  140737488347312
    rcx     0x0             0
    rdx     0x0             0
    rsi     0x7fffffffe1e0  140737488347616
    rdi     0x7fffffffe0b0  140737488347312
    rbp     0x7fffffffe140  0x7fffffffe140
    rsp     0x7fffffffe098  0x7fffffffe098
    r8      0x5             5
    r9      0x0             0
    r10     0x7fffffffde20  140737488346656
    r11     0x7ffff7bc7378  140737349710712
    r12     0x4015f0        4199920
    r13     0x7fffffffe1d0  140737488347600
    r14     0x0             0
    r15     0x0             0
    rip     0x6033f0        0x6033f0 <get_fcontext>
    eflags  0x10246         [ PF ZF IF RF ]
    cs      0x33            51
    ss      0x2b            43
    ds      0x0             0
    es      0x0             0
    fs      0x0             0
    gs      0x0             0
    (gdb)

Artyom

On 21.03.2011 21:02, Artyom wrote:
a) (BIG ONE) The shared object link using fcontext crashes!
Using boost_1_46_0
Building boost_context:
bjam toolset=gcc context-impl=asm architecture=x86 instruction-set=native address-model=64 --with-context variant=release stage
Then I compile example link.cpp
Run it
It crashes!
the assembler file was missing the .type directive. I've appended the corrected version - it should replace libs/context/src/asm/fcontext_x86_64_sysv_elf_gas.S. (alternatively git://git.gitorious.org/boost-dev/boost-dev.git) recompiling the lib should fix the problem. regards, Oliver

sched_yield(): 2108 cycles
ucontext_t: 1795 cycles
fcontext_t: 156 cycles
Actually it is quite consistent with what you have shown: Boost.Context is better (but not significantly) than a real OS (kernel) context switch.
A factor of 13x (2108 vs. 156 cycles)?
It would be very useful to see real usability cases. Because otherwise it is hard to estimate usefulness of this library
The lib is intended to be a building block for higher abstractions like coroutines.
I understand it may be useful for "yield"-like iterators and so on, but it is not clear how it works and what the advantage is in comparison to the ordinary event-driven approach.
I remember that Factor (http://factorcode.org/) does context switching for its internal workings. With a context switch you can preserve local data:

    void myfunc() {
        std::string s = "abc";
        for (int i = 0; i < 10; ++i) {
            if (5 == i)
                // jump out to another code path not related to myfunc()
                // and preserve the values of i and s
                yield();
            // if we jump back, we return from yield() with i == 5, s == "abc"
        }
    }

regards, Oliver
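
Editorial sketch: a runnable variant of the idea above, written against the POSIX ucontext calls (getcontext, makecontext, swapcontext) rather than Boost.Context, and assuming Linux with glibc. The helper names yield, run_demo, main_ctx, func_ctx and trace are made up for this illustration. It records that i and s still hold their values when myfunc() is resumed after the yield.

```cpp
#include <cassert>
#include <string>
#include <vector>
#include <ucontext.h>

namespace {
ucontext_t main_ctx;             // context of the caller
ucontext_t func_ctx;             // context running myfunc()
std::vector<std::string> trace;  // records the order of events

// Suspend myfunc() and continue in the caller.
void yield() { swapcontext(&func_ctx, &main_ctx); }

void myfunc() {
    std::string s = "abc";
    for (int i = 0; i < 10; ++i) {
        if (i == 5) {
            trace.push_back("before yield: i=" + std::to_string(i) + " s=" + s);
            yield();  // jump out to unrelated code
            // Resumed here: i and s live on this context's own stack.
            trace.push_back("after yield: i=" + std::to_string(i) + " s=" + s);
        }
    }
}  // returning from myfunc() resumes the context named in uc_link
}  // namespace

std::vector<std::string> run_demo() {
    static char stack[64 * 1024];  // private stack for myfunc()
    trace.clear();

    getcontext(&func_ctx);
    func_ctx.uc_stack.ss_sp = stack;
    func_ctx.uc_stack.ss_size = sizeof stack;
    func_ctx.uc_link = &main_ctx;  // where to continue when myfunc() returns
    makecontext(&func_ctx, myfunc, 0);

    swapcontext(&main_ctx, &func_ctx);  // run myfunc() until it yields
    trace.push_back("main: between");
    swapcontext(&main_ctx, &func_ctx);  // resume myfunc() until it returns
    return trace;
}
```

Calling run_demo() returns three entries in order: the state before the yield, the entry written by the caller in between, and the state after resumption, with i == 5 and s == "abc" in both myfunc() entries.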
Participants (3): Artyom, Oliver Kowalke, Vicente BOTET