[transaction] New Boost.Transaction library under discussion

vicente.botet

18 Jan 2010

1:09 p.m.

Hi Stefan, Bob,

I've created the transaction directory in the sandbox (http://svn.boost.org/svn/boost/sandbox/transaction/) so we can share our work. I've also created a wiki page (http://svn.boost.org/trac/boost/wiki/BoostTransaction) on which we can compile the results of our discussions, requirements for the library, design, ...

If you don't have access rights to the Boost wiki pages, please request them through this ML. Of course, anyone is welcome to participate, or comment on this wiki.

HTH,
_____________________
Vicente Juan Botet Escribá


Stefan Strasser

18 Jan

3:39 p.m.

New subject: [transaction] New Boost.Transaction library under discussion

Am Monday 18 January 2010 14:09:11 schrieb vicente.botet:

...

Hi Stefan, Bob,

I've created in the sandbox the transaction directory (http://svn.boost.org/svn/boost/sandbox/transaction/) so we can share our work.

I've uploaded some code that we've discussed and/or I think needs discussion to transaction/dev: https://svn.boost.org/svn/boost/sandbox/transaction/dev/

it is not meant as an initial code base for "transaction", but only for this discussion.

basic_transaction_manager.hpp: my implementation of the TransactionManager concept

basic_transaction.hpp: the transaction scope class, and the atomic{}retry; macros.

transaction_manager.hpp: the (configurable) default transaction manager

basic_loc.hpp and loc.hpp: an example of a C++98 pseudo-template-alias. note the anonymous namespace and the conversion operators in loc.hpp. I think this technically violates the One Definition Rule but I don't think this actually causes a problem.

exception.hpp: defines isolation_exception, with some stuff to unwind the "nested transaction stack" up to the transaction that caused the isolation exception. have a look at this, all of our libraries need an exception that is a request to the user to repeat the transaction, be it because of a MVCC failure or because of a deadlock.

...

If you don't have access rights to the Boost wiki pages, please request them through this ML.

I don't have wiki access. If this is connected to SVN please use my svn l/p.

vicente.botet

8:52 p.m.

New subject: [transaction] New Boost.Transaction library under discussion

Hi Stefan,

thanks for making your code available in the sandbox. This will help a lot in discussing the different points of view.

----- Original Message -----
From: "Stefan Strasser" <strasser@uni-bremen.de>
To: <boost@lists.boost.org>
Sent: Monday, January 18, 2010 4:39 PM
Subject: Re: [boost] [transaction] New Boost.Transaction library under discussion

...

Am Monday 18 January 2010 14:09:11 schrieb vicente.botet:

...
Hi Stefan, Bob,

I've created in the sandbox the transaction directory (http://svn.boost.org/svn/boost/sandbox/transaction/) so we can share our work.

I've uploaded some code that we've discussed and/or I think needs discussion to transaction/dev: https://svn.boost.org/svn/boost/sandbox/transaction/dev/

it is not meant as an initial code base for "transaction", but only for this discussion.

Looking at the code, I think that the lazy creation of the resource associated with the created transaction introduces more problems than it solves. How can you ensure that each resource has an equivalent stack of local nested transactions if you create them only when the application accesses a resource in the context of a global transaction?

...

basic_transaction_manager.hpp: my implementation of the TransactionManager concept

I find the way basic_transaction_manager is made a singleton strange.

    static bool has_active(){ return active_; }
    static basic_transaction_manager &active(){
        if(active_) return *active_;
        else persistent::detail::throw_(no_active_transaction_manager());
    }
    void bind(){ active_=this; }
    static void unbind(){ active_=0; }

What can the application do when there is no transaction_manager? Nothing in my opinion, so the system needs to ensure this invariant. I would prefer just an instance() static function. If basic_transaction_manager is not able to define this function, we can make basic_transaction_manager a mixin that takes the final transaction_manager as a parameter.

    template<class Final, class Resources,bool Threads=true,bool TThreads=true>
    class basic_transaction_manager {
        static Final& instance() { return Final::instance(); }
        ...
    }

Another issue is the current interface for the active transaction.

    void bind_transaction(transaction &tx){
        this->active_transaction(&tx,mpl::bool_<Threads>());
    }
    void unbind_transaction(){
        this->active_transaction(0,mpl::bool_<Threads>());
    }
    transaction &active_transaction() const{
        if(transaction *tx=this->active_transaction(mpl::bool_<Threads>())) return *tx;
        else persistent::detail::throw_(no_active_transaction());
    }
    bool has_active_transaction() const{
        return this->active_transaction(mpl::bool_<Threads>()) ? true : false;
    }

As the active transaction can be null, it is better that the function returns a pointer to the active transaction. I find this prototype simpler.

    transaction* active_transaction() const{
        return this->active_transaction(mpl::bool_<Threads>());
    }
    void active_transaction(transaction* tx) const{
        this->active_transaction(tx,mpl::bool_<Threads>());
    }

With this interface the functions are clearer. Instead of

    typename detail::transaction_construct_t begin_transaction(){
        if(this->has_active_transaction())
            return typename detail::transaction_construct_t(&this->active_transaction());
        else
            return typename detail::transaction_construct_t(0);
    }

we can have

    typename detail::transaction_construct_t begin_transaction(){
        return typename detail::transaction_construct_t(this->active_transaction());
    }

We don't need to test. Just do it. And instead of

    if(tx.parent) this->bind_transaction(*tx.parent);
    else this->unbind_transaction();

we can write

    this->bind_transaction(tx.parent);

...

basic_transaction.hpp: the transaction scope class, and the atomic{}retry; macros.

I would separate the language-like macros from the basic_transaction class.

...

transaction_manager.hpp: the (configurable) default transaction manager

basic_loc.hpp and loc.hpp: an example of a C++98 pseudo-template-alias. note the anonymous namespace and the conversion operators in loc.hpp. I think this technically violates the One Definition Rule but I don't think this actually causes a problem.

If I have understood correctly, there will be only one transaction_manager class for an application. Why add the template parameter TxMgr to all the other classes? In other words, which class other than transaction_manager can be a template argument for the basic_loc class?

    template<class T> class loc : public basic_loc<T,transaction_manager>{...}
    template<class T> class loc2 : public basic_loc<T,???>{...}

Here is how I would implement transaction_manager.hpp:

    // transaction_manager.hpp
    #ifndef BOOST_TRANSACTION_TRANSACTION_MANAGER_HPP
    #define BOOST_TRANSACTION_TRANSACTION_MANAGER_HPP
    #include BOOST_TRANSACTION_USER_TRANSACTION_MANAGER_HPP
    #endif

Of course this requires BOOST_TRANSACTION_USER_TRANSACTION_MANAGER_HPP to always be defined at compile time, but it avoids the template parameter. Thus, instead of

    template<class TxMgr> class basic_transaction;
    typedef basic_transaction<transaction_manager> transaction;

we can have just

    class transaction; // using the single transaction_manager

...

exception.hpp: defines isolation_exception, with some stuff to unwind the "nested transaction stack" up to the transaction that caused the isolation exception. have a look at this, all of our libraries need an exception that is a request to the user to repeat the transaction, be it because of a MVCC failure or because of a deadlock.

I'll come to this point later. Regards, Vicente

Stefan Strasser

11:10 p.m.

New subject: [transaction] New Boost.Transaction library under discussion

Am Monday 18 January 2010 21:52:49 schrieb vicente.botet:

...

Looking at the code, I think that the lazy creation of the resource associated with the created transaction introduces more problems than it solves. How can you ensure that each resource has an equivalent stack of local nested transactions if you create them only when the application accesses a resource in the context of a global transaction?

why couldn't you ensure that? line 170 recursively calls resource_transaction() to do exactly that.

...

...
basic_transaction_manager.hpp: my implementation of the TransactionManager concept

I find the way basic_transaction_manager is made a singleton strange.

static bool has_active(){ return active_; }

static basic_transaction_manager &active(){ if(active_) return *active_; else persistent::detail::throw_(no_active_transaction_manager()); } void bind(){ active_=this; } static void unbind(){ active_=0; }

What can the application do when there is no transaction_manager? Nothing in my opinion, so the system needs to ensure this invariant. I would prefer just an instance() static function.

correct, nothing. and it is a singleton. but it is not invariant.

1. the transaction manager is constructed by the user, and it must be. the entire runtime configuration of a library depends on that, including the filename of a database file, desired cache size, etc. (by passing those to a resource manager and then passing the resource manager to the transaction manager constructor).

how do you construct a transaction manager (and the resource managers used by it) at the point of (lazy) singleton construction if you don't have any constructor arguments?

2. why shouldn't it be possible to do:

    { open and use database in file db1.db }
    { open and use database in file db2.db }

?
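A minimal sketch of the point above, assuming a single-threaded program: the user constructs the manager with run-time arguments and binds it, so two databases can be opened one after the other. Only bind, unbind, active and has_active mirror the quoted interface; file_resource_manager, simple_transaction_manager, use_database and the unbinding destructor are hypothetical names and choices made for this illustration, not code from the sandbox.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical exception type, named after the one in the quoted code.
struct no_active_transaction_manager : std::runtime_error {
    no_active_transaction_manager()
        : std::runtime_error("no active transaction manager") {}
};

// Hypothetical resource manager that needs run-time configuration.
class file_resource_manager {
public:
    explicit file_resource_manager(const std::string& filename)
        : filename_(filename) {}
    const std::string& filename() const { return filename_; }
private:
    std::string filename_;
};

// Hypothetical manager: constructed by the user, then bound as "active".
class simple_transaction_manager {
public:
    explicit simple_transaction_manager(file_resource_manager& res)
        : resource_(&res) {}
    // Assumption of this sketch: unbind on destruction so that active()
    // never hands out a dangling reference.
    ~simple_transaction_manager() { if (active_ == this) active_ = 0; }

    static bool has_active() { return active_ != 0; }
    static simple_transaction_manager& active() {
        if (!active_) throw no_active_transaction_manager();
        return *active_;
    }
    void bind() { active_ = this; }
    static void unbind() { active_ = 0; }

    file_resource_manager& resource() { return *resource_; }
private:
    file_resource_manager* resource_;
    static simple_transaction_manager* active_;
};

simple_transaction_manager* simple_transaction_manager::active_ = 0;

// Opens one "database", binds a manager for it, and reports which file
// the currently active manager is using.
std::string use_database(const std::string& filename) {
    file_resource_manager res(filename);
    simple_transaction_manager mgr(res);
    mgr.bind();
    return simple_transaction_manager::active().resource().filename();
}
```

Calling use_database("db1.db") and then use_database("db2.db") corresponds to the two scopes in point 2; between the calls no manager is bound, which is the state in which active() throws.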

...

As the active transaction can be null the better is that the function return the pointer to the active transaction. I find this prototype simple.

transaction* active_transaction() const{ return this->active_transaction(mpl::bool_<Threads>()); }

this is not uber-important to me and returning a pointer would be ok, but I nevertheless disagree with that.

you are looking at the implementation of basic_transaction_manager. I was foremost trying to define a concept TransactionManager without looking at implementation details.

when the active transaction is needed for an operation, for example by a basic_loc to access a persistent object, there are generally two cases:

a) "I need a transaction for the following operation, if there is none an exception must be thrown." for example: writing to a transactional object.

    write_to_object(txmgr.active_transaction()); //throws

b) "although I could do this operation without a transaction, if there is one, another code path must be followed." for example: reading from a transactional object.

    if(txmgr.has_active_transaction()){
        read_transactional();
    }else{
        read_global();
    }

especially case a) is much more verbose using your interface. every operation that requires a transaction (which is most of them) has to check for a 0 return and manually throw an exception.

...

With this interface the functions are clearer. Instead of

typename detail::transaction_construct_t begin_transaction(){ if(this->has_active_transaction()) return typename detail::transaction_construct_t(&this->active_transaction()); else return typename detail::transaction_construct_t(0); }

implementation details.

...

...
basic_transaction.hpp: the transaction scope class, and the atomic{}retry; macros.

I would separate the language-like macros from the basic_transaction class.

into a separate header? ok. I'd also prefer another name for the macro itself, considering boost::atomic...

...

...
transaction_manager.hpp: the (configurable) default transaction manager

basic_loc.hpp and loc.hpp: an example of a C++98 pseudo-template-alias. note the anonymous namespace and the conversion operators in loc.hpp. I think this technically violates the One Definition Rule but I don't think this actually causes a problem.

If I have understood correctly, there will be only one transaction_manager class for an application. Why add the template parameter TxMgr to all the other classes? In other words, which class other than transaction_manager can be a template argument for the basic_loc class?

template<class T> class loc : public basic_loc<T,transaction_manager>{...} template<class T> class loc2 : public basic_loc<T,???>{...}

think of loc/loc2 not as classes but as typedefs. (they really are, they are only classes to work around the lack of template aliases in C++98.). you could make up an example that uses 2 different locs, but a much simpler example is 2 different transactions: an application might use Boost.Persistent to store some application state on disk. then, in a completely different part of the application, in another translation unit, someone tries to use Boost.STM and fails, because "transaction" is already configured to use Boost.Persistent. (either because the translation unit includes a header that uses Boost.Persistent or because of a linker error caused by a One Definition Rule violation).

...

template<class TxMgr> class basic_transaction; typedef basic_transaction<transaction_manager> transaction;

we can have just

class transaction; // using the single transaction_manager

...for the entire application, possibly including linked libraries. Regards,

vicente.botet

20 Jan

11:49 p.m.

New subject: [transaction] New Boost.Transaction library under discussion

Hi,

sorry to reply so late.

----- Original Message -----
From: "Stefan Strasser" <strasser@uni-bremen.de>
To: <boost@lists.boost.org>
Sent: Tuesday, January 19, 2010 12:10 AM
Subject: Re: [boost] [transaction] New Boost.Transaction library under discussion

...

Am Monday 18 January 2010 21:52:49 schrieb vicente.botet:

...
Looking at the code, I think that the lazy creation of the resource associated with the created transaction introduces more problems than it solves. How can you ensure that each resource has an equivalent stack of local nested transactions if you create them only when the application accesses a resource in the context of a global transaction?

why couldn't you ensure that? line 170 recursively calls resource_transaction() to do exactly that.

I've seen that call. This ensures that you have a complete stack at the moment of the resource_transaction call, but not when the application commits. This means that you can have resources that participate in the transaction at the outer levels but not at the inner ones. As such a resource will not be committed when the inner transaction is committed, we are unable to see conflicts with other transactions with respect to this resource. This is why I said that it is better to ensure that all the resources have the same stack of nested transactions.

...

...
...
basic_transaction_manager.hpp: my implementation of the TransactionManager concept

I find the way basic_transaction_manager is made a singleton strange.

static bool has_active(){ return active_; }

static basic_transaction_manager &active(){ if(active_) return *active_; else persistent::detail::throw_(no_active_transaction_manager()); } void bind(){ active_=this; } static void unbind(){ active_=0; }

What can the application do when there is no transaction_manager? Nothing in my opinion, so the system needs to ensure this invariant. I would prefer just an instance() static function.

correct, Nothing. and it is a singleton. but it is not invariant.

1. the transaction manager is constructed by the user, and it must be. the entire runtime configuration of a library depends on that, including the filename of a database file, desired cache size, etc. (by passing those to a resource manager and then passing the resource manager to the transaction manager constructor).

how do you construct a transaction manager (and the resource managers used by it) at the point of (lazy) singleton construction if you don't have any constructor arguments?

The answer was already in my preceding post. If basic_transaction_manager is not able to define this function, we can make basic_transaction_manager a mixin that takes the final transaction_manager as a parameter. Note the new parameter Final.

    template<class Final, class Resources,bool Threads=true,bool TThreads=true>
    class basic_transaction_manager {
        static Final& instance() { return Final::instance(); }
        ...
    }
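To make the mixin proposal concrete, here is a small sketch under stated assumptions. The base class only forwards instance() to the class named by Final, so it is the final class that has to know the constructor arguments. The names basic_transaction_manager_mixin, my_transaction_manager, database_file_of and the hard-coded file name "app.db" are hypothetical and exist only for this illustration; the Resources and Threads parameters are left out.

```cpp
#include <cassert>
#include <string>

// Mixin: knows nothing about how the final manager is built; it only
// forwards instance() to the derived class named by Final.
template<class Final>
class basic_transaction_manager_mixin {
public:
    static Final& instance() { return Final::instance(); }
};

// Hypothetical final manager. It owns the configuration, so it can supply
// the constructor arguments that the generic mixin does not have.
class my_transaction_manager
    : public basic_transaction_manager_mixin<my_transaction_manager> {
public:
    static my_transaction_manager& instance() {
        // Assumption of this sketch: the configuration is fixed here.
        static my_transaction_manager mgr("app.db");
        return mgr;
    }
    const std::string& database_file() const { return file_; }
private:
    explicit my_transaction_manager(const std::string& file) : file_(file) {}
    std::string file_;
};

// Generic code reaches the manager through the mixin only.
template<class Final>
const std::string& database_file_of() {
    return basic_transaction_manager_mixin<Final>::instance().database_file();
}
```

In this form the question of where the run-time configuration comes from is moved into Final::instance() rather than removed, which is the trade-off the two positions in this thread turn on.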

...

2. why shouldn't it be possible to do:

{ open and use database in file db1.db } { open and use database in file db2.db } ?

This could be a good use case justifying the bind function of basic_transaction_manager, but IMO this operation corresponds to something I would expect to be associated with the database resource manager. But maybe I'm wrong.

...

...
As the active transaction can be null the better is that the function return the pointer to the active transaction. I find this prototype simple.

transaction* active_transaction() const{ return this->active_transaction(mpl::bool_<Threads>()); }

this is not uber-important to me and returning a pointer would be ok, but I nevertheless disagree with that.

you are looking at the implementation of basic_transaction_manager. I was foremost trying to define a concept TransactionManager without looking at implementation details.

I'm really sorry to report at this level. The concepts are important, as are the details at the interface level. Note however that with the documentation alone we were not able to have this discussion. I will leave the implementation details aside for the moment and concentrate on the concepts from now on, but I will still reply to your comments.

...

when the active transaction is needed for an operation, for example by a basic_loc to access a persistent object, there are generally two cases:

a) "I need a transaction for the following operation, if there is none an exception must be thrown." for example: writing to a transactional object.

write_to_object(txmgr.active_transaction()); //throws

what can the application do with the exception thrown? terminate?

...

b) "although I could do this operation without a transaction, if there is one, another code path must be followed." for example: reading from a transactional object.

if(txmgr.has_active_transaction()){ read_transactional(); }else{ read_global(); }

This seems reasonable.

...

especially case a) is much more verbose using your interface. every operation that requires a transaction (which is most of them) has to check for a 0 return and manually throw an exception.

Well, in this case it seems that both operations should be provided: one that throws and the other that returns 0 if there is no active transaction.

...

...
With this interface the functions are clearer. Instead of

typename detail::transaction_construct_t begin_transaction(){ if(this->has_active_transaction()) return typename detail::transaction_construct_t(&this->active_transaction()); else return typename detail::transaction_construct_t(0); }

implementation details.

Sure.

...

...
...
basic_transaction.hpp: the transaction scope class, and the atomic{}retry; macros.

I would separate the language-like macros from the basic_transaction class.

into a separate header? ok. I'd also prefer another name for the macro itself, considering boost::atomic...

...
...
transaction_manager.hpp: the (configurable) default transaction manager

basic_loc.hpp and loc.hpp: an example of a C++98 pseudo-template-alias. note the anonymous namespace and the conversion operators in loc.hpp. I think this technically violates the One Definition Rule but I don't think this actually causes a problem.

If I have understood correctly, there will be only one transaction_manager class for an application. Why add the template parameter TxMgr to all the other classes? In other words, which class other than transaction_manager can be a template argument for the basic_loc class?

template<class T> class loc : public basic_loc<T,transaction_manager>{...} template<class T> class loc2 : public basic_loc<T,???>{...}

think of loc/loc2 not as classes but as typedefs. (they really are, they are only classes to work around the lack of template aliases in C++98.).

I have a question. How does loc find the persistent resource manager?

...

you could make up an example that uses 2 different locs, but a much simpler example is 2 different transactions: an application might use Boost.Persistent to store some application state on disk. then, in a completely different part of the application, in another translation unit, someone tries to use Boost.STM and fails, because "transaction" is already configured to use Boost.Persistent. (either because the translation unit includes a header that uses Boost.Persistent or because of a linker error caused by a One Definition Rule violation).

I think I am starting to understand what you are looking for. Are you saying that transaction_manager will be part of the interface of the Persistent library (persistent::transaction_manager) and not of the Transaction library? That we can have another stm::transaction_manager and possibly also a third persistent_and_stm::transaction_manager, and that the developer will choose the transaction_manager depending on the context? If this was your goal, I now understand why you have the parameter TxMgr.

While I can understand the need for independence, I don't see why what I would consider a resource manager takes the place of a transaction manager. Parts of the application that use a single resource manager will not conflict with other parts of the application that use other resource managers, as long as the resource managers don't manage the same resource. The following declaration

    typedef basic_transaction_manager<vector<persistent::resource_manager, stm::resource_manager>> transaction_manager;

should work for them.

I have another use case that is quite close to yours: an application might use Boost.Persistent to store some application state on disk. Then, in a completely different part of the application, in another translation unit, someone tries to use Boost.Persistent, and it fails neither at compile time nor at link time, but at run time, because both parts use the same class and have bound it twice. Only the part doing the last bind will work. You can even have

    typedef basic_transaction_manager<vector<persistent::resource_manager1, persistent::resource_manager2>> transaction_manager;

...

...
template<class TxMgr> class basic_transaction; typedef basic_transaction<transaction_manager> transaction;

we can have just

class transaction; // using the single transaction_manager

...for the entire application, possibly including linked libraries.

Ah, I see the real problem. If you want dynamic libraries to participate in a transaction independently, I don't see how to achieve that unless we have a *single* transaction_manager with dynamically polymorphic resource managers, and moreover a transaction with dynamically polymorphic resources.

...

From my side, the goal is to have a single transactional system, with no more than one transaction per thread. I don't see how independent transactions can be nested without adding a lot of trouble to the application. I am not saying that this is impossible, just that it will need some kind of synchronization. How does your design behave in this context? Could you clarify in which context your design allowing multiple transaction managers works? Which parts of the application can use which transaction manager, can resources (objects) be shared by transaction managers, ...? Should these multiple transaction managers be used only by completely independent parts of the application running in separate threads and accessing completely separate data?

I think I am starting to see the limits of having a static list of resource managers associated with a transaction_manager, and the high risk of having multiple transaction managers. I hope the discussion has not been limited to implementation details and that we will find a design that covers the different expectations. Regards, Vicente

Stefan Strasser

21 Jan

2:55 a.m.

New subject: [transaction] New Boost.Transaction library under discussion

Am Thursday 21 January 2010 00:49:25 schrieb vicente.botet:

...

...
...
solves. How can you ensure that each resource has an equivalent stack of local nested transactions if you create them only when the application accesses a resource in the context of a global transaction?

I've seen that call. This ensures that you have a complete stack at the moment of the resource_transaction call, but not when the application commits. This means that you can have resources that participate in the transaction at the outer levels but not at the inner ones. As such a resource will not be committed when the inner transaction is committed, we are unable to see conflicts with other transactions with respect to this resource.

what conflict can the other resource cause if there is no nested transaction in it?

global_root_tx(  resource1_root_tx,   resource2_root_tx)
                         ^                    ^
                         |                    |
                         |                    |
global_nested_tx(resource1_nested_tx,       none)

global_nested_tx.commit(): resource1_nested_tx is published into resource1_root_tx. how can this call cause a conflict in resource2?
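The lazy scheme under discussion can be sketched as follows. This is an illustration of the idea only, not the code in basic_transaction_manager.hpp: resource_tx, global_tx, participates and the index-based slots are hypothetical, and only the name resource_transaction() and its recursive call into the parent come from the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-resource transaction; it only records its parent so
// that the nesting depth can be inspected.
struct resource_tx {
    resource_tx* parent;
    explicit resource_tx(resource_tx* p) : parent(p) {}
    std::size_t depth() const { return parent ? parent->depth() + 1 : 0; }
};

// Hypothetical global transaction with one lazily filled slot per resource.
class global_tx {
public:
    global_tx(global_tx* parent, std::size_t resources)
        : parent_(parent), slots_(resources, static_cast<resource_tx*>(0)) {}
    ~global_tx() {
        for (std::size_t i = 0; i < slots_.size(); ++i) delete slots_[i];
    }

    // Creates the resource transaction on first use. The recursive call
    // makes sure every enclosing global transaction has one for this
    // resource before the nested one is created.
    resource_tx& resource_transaction(std::size_t index) {
        if (!slots_[index]) {
            resource_tx* outer =
                parent_ ? &parent_->resource_transaction(index) : 0;
            slots_[index] = new resource_tx(outer);
        }
        return *slots_[index];
    }

    bool participates(std::size_t index) const { return slots_[index] != 0; }
private:
    global_tx(const global_tx&);            // non-copyable (C++98 style)
    global_tx& operator=(const global_tx&);
    global_tx* parent_;
    std::vector<resource_tx*> slots_;
};

// Mirrors the diagram: resource 1 (index 0) is touched inside the nested
// transaction, resource 2 (index 1) only in the root.
bool nested_touches_only_first_resource() {
    global_tx root(0, 2);
    root.resource_transaction(1);                      // resource2_root_tx
    global_tx nested(&root, 2);
    resource_tx& r1 = nested.resource_transaction(0);  // also creates the root's
    return r1.depth() == 1 && root.participates(0)
        && nested.participates(0) && !nested.participates(1);
}
```

In this sketch the nested transaction ends up with no entry for the second resource, which is the "none" column of the diagram and the state whose consequences at commit time the two authors disagree on.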

...

...
how do you construct a transaction manager (and the resource managers used by it) at the point of (lazy) singleton construction if you don't have any constructor arguments?

The answer was already in my preceding post. If basic_transaction_manager is not able to define this function, we can make basic_transaction_manager a mixin, that will have the final transaction_manager as parameter. Note the new parameter Final.

template<class Final, class Resources,bool Threads=true,bool TThreads=true> class basic_transaction_manager { static Final& instance() { return Final::instance(); } ... }

I don't understand what difference that makes. I thought your point was that the transaction manager should be stored as an invariant singleton. How is that possible, with or without mixin? What does the mixin accomplish? What is the difference between instance() and the current active()?

...

...
think of loc/loc2 not as classes but as typedefs. (they really are, they are only classes to work around the lack of template aliases in C++98.).

I have a question. How does loc find the persistent resource manager?

TxMgr::resource<ResourceTag>() (this relates to the following question).

...

...
2. why shouldn't it be possible to do:

{ open and use database in file db1.db } { open and use database in file db2.db } ?

This could be a good use case justifying the bind function of basic_transaction_manager, but IMO this operation corresponds to something I would expect associated to the database resource manager. But maybe I'm wrong.

at the moment resource managers cannot be de-registered from the transaction manager (in order to be destructed and then re-constructed using another database file). this is not part of the TransactionManager concept though, you could write a transaction manager that supports that.

but looking at the overall picture, the user must be able to construct his resources at some point at run time, so either:

1. TxMgr::active() can not be an invariant singleton and must be able to throw a no_active_transaction_manager exception, or
2. TxMgr::resource<Tag> can not be an invariant singleton and must be able to throw.

I went with 1, we could also use 2 if you see an advantage with that approach (I don't at the moment).

on a separate note, you could argue that TxMgr::active() could have an existing transaction manager as a precondition and cause undefined behaviour otherwise. it's hard to draw the line which programming errors should cause exceptions and which are allowed to crash. (precedent: std::vector::operator[] vs. std::vector::at())

...

I'm really sorry to report at this level. The concepts are important, as are the details at the interface level. Note however that with the

there is nothing wrong with discussing the implementation details and you're welcome to do so, but we should not change the concepts in favor of an implementation detail. I understood your argument for returning a pointer from active_transaction() as doing just that.

...

...
when the active transaction is needed for an operation, for example by a basic_loc to access a persistent object, there are generally two cases:

a) "I need a transaction for the following operation, if there is none an exception must be thrown." for example: writing to a transactional object.

write_to_object(txmgr.active_transaction()); //throws

what can the application do with the exception thrown? terminate?

or continue doing something else. I don't think it is acceptable for code like the following (outside of a transaction scope):

    loc<pers_type const> l=...;
    std::cerr << l->value << std::endl;

to be ok but code like the following to cause undefined behaviour:

    loc<pers_type> l=...;
    std::cerr << l->value << std::endl;

so it throws, even though it is a programming error. I guess there are similar subtle cases in your library.

...

...
b) "although I could do this operation without a transaction, if there is one, another code path must be followed." for example: reading from a transactional object.

if(txmgr.has_active_transaction()){ read_transactional(); }else{ read_global(); }

This seems reasonable.

...
especially case a) is much more verbose using your interface. every operation that requires a transaction (which is most of them) has to check for a 0 return and manually throw an exception.

Well, in this case it seems that both operations should be provided: one that throws and the other that returns 0 if there is no active transaction.

yes, I went with

    if(txmgr.has_active_transaction()) read(txmgr.active_transaction());

if you prefer

    if(TxMgr::transaction *tx=txmgr.try_get_active_transaction()) read(*tx);

I guess that's only a matter of style. but there should be a function like the current active_transaction() that requires an active transaction and throws otherwise.
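Both accessors can live side by side, as this single-threaded sketch shows. The member names come from the messages above; txmgr_sketch, the transaction struct with an id field, and the integer return values of write_to_object and read_object are assumptions made so that the example is self-contained. The mpl::bool_<Threads> dispatch of the real code is omitted.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical exception type, named after the one in the quoted code.
struct no_active_transaction : std::runtime_error {
    no_active_transaction() : std::runtime_error("no active transaction") {}
};

// Hypothetical transaction; the id exists only so the example can observe
// which transaction was used.
struct transaction { int id; };

class txmgr_sketch {
public:
    txmgr_sketch() : active_(0) {}
    void bind_transaction(transaction& tx) { active_ = &tx; }
    void unbind_transaction() { active_ = 0; }

    // Non-throwing queries: null or false when no transaction is active.
    transaction* try_get_active_transaction() const { return active_; }
    bool has_active_transaction() const { return active_ != 0; }

    // Throwing accessor for operations that require a transaction.
    transaction& active_transaction() const {
        if (!active_) throw no_active_transaction();
        return *active_;
    }
private:
    transaction* active_;
};

// Case a): writing requires a transaction; returns the id of the one used.
int write_to_object(txmgr_sketch& mgr) {
    return mgr.active_transaction().id;   // throws if none is active
}

// Case b): reading works either way but follows a different path.
int read_object(txmgr_sketch& mgr) {
    if (transaction* tx = mgr.try_get_active_transaction())
        return tx->id;                     // transactional read
    return -1;                             // global (non-transactional) read
}
```

With this shape, case a) stays a one-liner at every call site, while case b) can use either has_active_transaction() or the pointer-returning query, whichever style is preferred.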

...

...
you could make up an example that uses 2 different locs, but a much simpler example is 2 different transactions: an application might use Boost.Persistent to store some application state on disk. then, in a completely different part of the application, in another translation unit, someone tries to use Boost.STM and fails, because "transaction" is already configured to use Boost.Persistent. (either because the translation unit includes a header that uses Boost.Persistent or because of a linker error caused by a One Definition Rule violation).

I think I am starting to understand what you are looking for. Are you saying that transaction_manager will be part of the interface of the Persistent library (persistent::transaction_manager) and not of the Transaction library?

no, see below.

...

typedef basic_transaction_manager<vector<persistent::resource_manager, stm::resource_manager>> transaction_manager;

now you're using a typedef. your argument I was responding to was why it is not sufficient to use one global define instead of a typedef. you suggested something like:

    #define TRANSACTION_MANAGER basic_transaction_manager<...>
    class transaction{
        ...{ TRANSACTION_MANAGER::active()->...(); }
    }

for the reasons mentioned (different parts of the application using Boost.Transaction in different configurations using different resource managers, linked libraries, One Definition Rule, etc) this will fail in these cases. that's why the current code is instead:

    template<class TxMgr>
    class basic_transaction{
        ...{ TxMgr::active()->...() }
    }

    #define TRANSACTION_MANAGER basic_transaction_manager<...>
    typedef basic_transaction<TRANSACTION_MANAGER> transaction;

the transaction-typedef is not a definition and can be local to the part of the application using the library, and can be typedef'd to another transaction manager in another part of the application.

...

should work for them.

I have another use case that is quite close to yours: an application might use Boost.Persistent to store some application state on disk. Then, in a completely different part of the application, in another translation unit, someone tries to use Boost.Persistent, and it fails neither at compile time nor at link time, but at run time, because both parts use the same class and have bound it twice. Only the part doing the last bind will work.

right, a transaction manager has to be a unique type to be used simultaneously with another one. that's a downside of our approach of using a singleton instead of passing everything, like the active transaction, as function arguments. and it is the reason why my resource manager implementation has a configurable tag, so you can do the following:

struct application_preferences_tag;
typedef multiversion_object_resource<...,application_preferences_tag> prefres;
typedef basic_transaction_manager<prefres> preftxmgr;

struct application_data_tag;
typedef multiversion_object_resource<...,application_data_tag> datares;
typedef basic_transaction_manager<datares> datatxmgr;

so preftxmgr::active() yields a different result than datatxmgr::active(), and a basic_transaction<preftxmgr> is completely independent of a basic_transaction<datatxmgr>.

Regards,
Stefan
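
A minimal sketch of the tagging idea above, under stated assumptions: `tagged_resource`, the `void *` slot, and the helper `managers_are_independent()` are hypothetical stand-ins, and the real `basic_transaction_manager` in the sandbox will look different (for example, it would keep the active transaction per thread). The sketch only shows that two instantiations with different tags get separate static state.

```cpp
#include <cassert>

// Hypothetical stand-in for a resource manager. The Tag parameter only
// serves to turn otherwise identical instantiations into distinct types.
template<class Tag>
struct tagged_resource {};

// Hypothetical, much simplified transaction manager: one "active
// transaction" slot per instantiation (a real one would be per thread).
template<class Resource>
class basic_transaction_manager {
public:
    static void *&active() {
        static void *active_tx = nullptr; // one static per instantiation
        return active_tx;
    }
};

// The transaction scope registers itself with the manager it is
// parameterized on and restores the previous value on destruction.
template<class TxMgr>
class basic_transaction {
public:
    basic_transaction() : parent_(TxMgr::active()) { TxMgr::active() = this; }
    ~basic_transaction() { TxMgr::active() = parent_; }
    basic_transaction(basic_transaction const &) = delete;
    basic_transaction &operator=(basic_transaction const &) = delete;
private:
    void *parent_;
};

struct application_preferences_tag;
struct application_data_tag;

typedef basic_transaction_manager<tagged_resource<application_preferences_tag> > preftxmgr;
typedef basic_transaction_manager<tagged_resource<application_data_tag> > datatxmgr;

// Returns true if opening a preferences transaction leaves the data
// manager untouched.
inline bool managers_are_independent() {
    basic_transaction<preftxmgr> tx;
    return preftxmgr::active() == &tx && datatxmgr::active() == nullptr;
}
```

Because each typedef names a different class template instantiation, the function-local static in `active()` exists once per manager type, which is what makes the two configurations independent.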

vicente.botet

9:15 a.m.

New subject: [transaction] New Boost.Transaction library under discussion

----- Original Message ----- From: "Stefan Strasser" <strasser@uni-bremen.de> To: <boost@lists.boost.org> Sent: Thursday, January 21, 2010 3:55 AM Subject: Re: [boost] [transaction] New Boost.Transactionlibraryunderdiscussion

...

On Thursday 21 January 2010 00:49:25, vicente.botet wrote:

...
...
...
solves. How can you ensure that each resource has an equivalent stack of local nested transactions if you create them only when the application accesses a resource in the context of a global transaction?

I've seen that call. This ensures that you have a complete stack at the moment of the resource_transaction call, but not when the application commits. This means that you can have resources that participate in the transaction at the outer levels but not at the inner ones. As this resource will not be committed when the inner transaction is committed, we are unable to see conflicts with other transactions with respect to this resource.

what conflict can the other resource cause if there is no nested transaction in it?

global_root_tx(  resource1_root_tx,   resource2_root_tx)
       ^                 ^
       |                 |
       |                 |
global_nested_tx(resource1_nested_tx, none)

global_nested_tx.commit(): resource1_nested_tx is published into resource1_root_tx. how can this call cause a conflict in resource2?

I said

...

...
As this resource will not be committed when the inner transaction is committed, we are unable to see conflicts with other transactions with respect to this resource.

What I want is for the commit of global_nested_tx to check whether this transaction must be aborted because resource2 has conflicts with other transactions that use resource2. There is no need to continue committing N nesting levels on resource1 if resource2 already has conflicts. This may be just an optimization, but we use things like that in Boost.STM.

...

...
...
how do you construct a transaction manager (and the resource managers used by it) at the point of (lazy) singleton construction if you don't have any constructor arguments?

The answer was already in my preceding post. If basic_transaction_manager is not able to define this function, we can make basic_transaction_manager a mixin, that will have the final transaction_manager as parameter. Note the new parameter Final.

template<class Final, class Resources, bool Threads=true, bool TThreads=true>
class basic_transaction_manager {
  static Final& instance() { return Final::instance(); }
  ...
};

I don't understand what difference that makes. I thought your point was that the transaction manager should be stored as an invariant singleton. How is that possible, with or without mixin? What does the mixin accomplish? What is the difference between instance() and the current active()?

The trick is that instance() is implemented by the Final class, and so can call a specific constructor. Your active() function is implemented by basic_transaction_manager, which cannot make such distinctions. The instance() function of basic_transaction_manager could in addition ensure that there is a single instance, but it doesn't build it.

Best,
Vicente
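
A minimal sketch of the mixin arrangement described above, assuming a reduced `basic_transaction_manager` with only the `Final` parameter. The members `log_name()` and `db_filename()` are hypothetical and exist only to show base-class code reaching the singleton. The function-local static is a swapped-in technique (its initialization is thread-safe since C++11), not what was proposed in the thread.

```cpp
#include <string>

// Mixin base (hypothetical, reduced to the singleton part): it forwards
// instance() to the derived class and never constructs anything itself.
template<class Final>
class basic_transaction_manager {
public:
    static Final &instance() { return Final::instance(); }
    // Example of base-class code that needs the singleton.
    static std::string const &log_name() { return instance().db_filename(); }
};

// The final class owns construction, so it can pass constructor arguments.
class txmgr : public basic_transaction_manager<txmgr> {
public:
    static txmgr &instance() {
        // Function-local static: initialized once, on first use.
        static txmgr the_instance("file.db");
        return the_instance;
    }
    std::string const &db_filename() const { return db_filename_; }
private:
    explicit txmgr(std::string const &filename) : db_filename_(filename) {}
    std::string db_filename_;
};
```

Here `txmgr::instance()` hides the base-class function of the same name, so both `basic_transaction_manager<txmgr>::instance()` and `txmgr::instance()` end up in the code that knows the constructor arguments.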

strasser＠uni-bremen.de

7:48 p.m.

New subject: [transaction] New Boost.Transaction library under discussion

Quoting "vicente.botet" <vicente.botet@wanadoo.fr>:

...

...
what conflict can the other resource cause if there is no nested transaction in it?

global_root_tx(  resource1_root_tx,   resource2_root_tx)
       ^                 ^
       |                 |
       |                 |
global_nested_tx(resource1_nested_tx, none)

global_nested_tx.commit(): resource1_nested_tx is published into resource1_root_tx. how can this call cause a conflict in resource2?

What I want is for the commit of global_nested_tx to check whether this transaction must be aborted because resource2 has conflicts with other transactions that use resource2. There is no need to continue committing N nesting levels on resource1 if resource2 already has conflicts. This may be just an optimization, but we use things like that in Boost.STM.

ok, I see. consider the following example:

transaction roottx;

//modify res1
//modify res2 (1)

{
  transaction nestedtx;
  //modify res1
  nestedtx.commit(); // (2)
}

roottx.commit();

like we've said, res2 isn't taking part in the nested transaction, there is nothing between lines (1) and (2) that could create a conflict in res2.

so if Boost.STM tries to discover conflicts as early as possible, why wasn't the conflict discovered in line (1)?

is Boost.STM checking for conflicts regularly, but not on every access, so line (1) doesn't check for conflicts but you want to check for conflicts if the nestedtx takes, like, 10 seconds of runtime?

if that's the case we could implement something in basic_transaction_manager that regularly calls a resource manager so it can check for conflicts e.g. once every second even if a resource doesn't take part in a transaction.

what exactly triggers that in Boost.STM at the moment?

I understand that this must be part of regular calls by the user and cannot be part of another thread, because the exception must be thrown in the user's thread. but I don't think that is a reason to drop the lazy construction of resource transactions entirely.

take e.g. using Boost.STM and using a remote SQL database together, both managed by Boost.Transaction. starting a transaction to access Boost.STM would automatically send a BEGIN to a, possibly remote, SQL database, if resource transactions aren't lazily constructed, even though all you wanted to do is access some transactional memory.

...

...
I don't understand what difference that makes. I thought your point was that the transaction manager should be stored as an invariant singleton. How is that possible, with or without mixin? What does the mixin accomplish? What is the difference between instance() and the current active()?

The trick is that instance() is implemented by the Final class, and so can call a specific constructor. Your active() function is implemented by basic_transaction_manager, which cannot make such distinctions. The instance() function of basic_transaction_manager could in addition ensure that there is a single instance, but it doesn't build it.

I think I know what you mean now, but are you really sure you prefer that approach? I see several problems with it, IIUC.

example:

typedef basic_transaction_manager<my_mixin,...> txmgr;

struct my_mixin{
  static txmgr &instance(){
    //construct transaction manager and accompanying resources,
    //using db_filename;
  }
  static char *db_filename;
  ...
};

int main(){
  my_mixin::db_filename="file.db";
  my_mixin::...=...;
  ...
  basic_transaction<txmgr> tx;
  ...
  tx.commit();
}

1. it's strange code in my book. requiring the user to initialize runtime configuration options before it is lazily constructed by a call to instance() of basic_transaction.
2. the fact that it is lazily constructed requires the call to instance() be protected by a mutex, or call_once. this is a global mutex, spanning all threads, locked at each call to basic_transaction_manager::active().

did I get that right? if this is not what you meant please provide a code example.

Regards,
Stefan

Vicente Botet Escriba

10:56 p.m.

New subject: [transaction] New Boost.Transaction library under discussion

Stefan Strasser-2 wrote:

...

Quoting "vicente.botet" <vicente.botet@wanadoo.fr>:

...
...
what conflict can the other resource cause if there is no nested transaction in it?

global_root_tx(  resource1_root_tx,   resource2_root_tx)
       ^                 ^
       |                 |
       |                 |
global_nested_tx(resource1_nested_tx, none)

global_nested_tx.commit(): resource1_nested_tx is published into resource1_root_tx. how can this call cause a conflict in resource2?

What I want is for the commit of global_nested_tx to check whether this transaction must be aborted because resource2 has conflicts with other transactions that use resource2. There is no need to continue committing N nesting levels on resource1 if resource2 already has conflicts. This may be just an optimization, but we use things like that in Boost.STM.

ok, I see. consider the following example:

transaction roottx;

//modify res1
//modify res2 (1)

{
  transaction nestedtx;
  //modify res1
  nestedtx.commit(); // (2)
}

roottx.commit();

like we've said, res2 isn't taking part in the nested transaction, there is nothing between lines (1) and (2) that could create a conflict in res2.

so if Boost.STM tries to discover conflicts as early as possible, why wasn't the conflict discovered in line (1)?

Because another transaction on another thread has committed an update on res2 between (1) and (2). Depending on the validation policy, the STM system will either invalidate every transaction that cannot succeed (setting its state to rollback-only), or do nothing. When validation is used, the victim transaction will see that it is a victim when it tries to use this resource res2, and then abort itself.

Stefan Strasser-2 wrote:

...

is Boost.STM checking for conflicts regularly, but not on every access, so line (1) doesn't check for conflicts but you want to check for conflicts if the nestedtx takes, like, 10 seconds of runtime?

STM checks for conflicts on every access, so it is able to see a conflict in (1).

Stefan Strasser-2 wrote:

...

if that's the case we could implement something in basic_transaction_manager that regularly calls a resource manager so it can check for conflicts e.g. once every second even if a resource doesn't take part in a transaction.

I don't think polling will be satisfactory.

Stefan Strasser-2 wrote:

...

what exactly triggers that in Boost.STM at the moment?

STM also checks for the rollback-only state on every access and on nested transaction creation. This rollback-only state can be set by another transaction, as explained above.

Stefan Strasser-2 wrote:

...

I understand that this must be part of regular calls by the user and cannot be part of another thread, because the exception must be thrown in the user's thread. but I don't think that is a reason to drop the lazy construction of resource transactions entirely.

take e.g. using Boost.STM and using a remote SQL database together, both managed by Boost.Transaction. starting a transaction to access Boost.STM would automatically send a BEGIN to a, possibly remote, SQL database, if resource transactions aren't lazily constructed, even though all you wanted to do is access some transactional memory.

I understand the intent of lazily constructing the local transactions. IMO this lazy construction can be done, but as soon as a resource already has a transaction, any global nested transaction should create a local transaction on it. All the resources then have the same nesting level of transactions, except those that have none. This should solve the preceding issue, and let STM check for rollback-only as soon as possible.

Stefan Strasser-2 wrote:

...

...
...
I don't understand what difference that makes. I thought your point was that the transaction manager should be stored as an invariant singleton. How is that possible, with or without mixin? What does the mixin accomplish? What is the difference between instance() and the current active()?

The trick is that instance() is implemented by the Final class, and so can call a specific constructor. Your active() function is implemented by basic_transaction_manager, which cannot make such distinctions. The instance() function of basic_transaction_manager could in addition ensure that there is a single instance, but it doesn't build it.

I think I know what you mean now, but are you really sure you prefer that approach? I see several problems with it, IIUC.

example:

typedef basic_transaction_manager<my_mixin,...> txmgr;

struct my_mixin{
  static txmgr &instance(){
    //construct transaction manager and accompanying resources,
    //using db_filename;
  }
  static char *db_filename;
  ...
};

int main(){
  my_mixin::db_filename="file.db";
  my_mixin::...=...;
  ...
  basic_transaction<txmgr> tx;
  ...
  tx.commit();
}

1. it's strange code in my book. requiring the user to initialize runtime configuration options before it is lazily constructed by a call to instance() of basic_transaction.
2. the fact that it is lazily constructed requires the call to instance() be protected by a mutex, or call_once. this is a global mutex, spanning all threads, locked at each call to basic_transaction_manager::active().

did I get that right? if this is not what you meant please provide a code example.

It is strange to me too. Well, I don't use the mixin idiom this way. For the mutex, a double-checked locking pattern solves the issue of locking at each call to active(). I would do it as follows:

struct txmgr : basic_transaction_manager<txmgr,...> {
  static txmgr &instance(){
    static txmgr* instance_;
    static mutex mtx_;
    if (instance_==0) {
      scoped_lock<mutex> lk(mtx_);
      if (instance_==0) instance_ = new txmgr(...);
    }
    // here instance_!= 0
    return *instance_;
  }
  static char *db_filename;
  ...
};

char * txmgr::db_filename="file.db";

int main(){
  basic_transaction<txmgr> tx;
  ...
  tx.commit();
}

Can you find this code in your book? ;-)

Best,
Vicente

-- View this message in context: http://old.nabble.com/-transaction--New-Boost.Transaction-library-under-disc... Sent from the Boost - Dev mailing list archive at Nabble.com.

Stefan Strasser

9:43 p.m.

New subject: [transaction] New Boost.Transaction library under discussion

On Thursday 21 January 2010 23:56:07, Vicente Botet Escriba wrote:

...

I understand the intent of lazily constructing the local transactions. IMO this lazy construction can be done, but as soon as a resource already has a transaction, any global nested transaction should create a local transaction on it. All the resources then have the same nesting level of transactions, except those that have none. This should solve the preceding issue, and let STM check for rollback-only as soon as possible.

ok, that STM can check on each access makes it even easier: let's introduce a ResourceManager::check(transaction &) function, which is called by the TransactionManager every time a transaction is created or committed (or on every access?). then there is no need to create empty nested transactions, is there? this could even be exposed to the user as basic_transaction<>::check(), which checks each resource, so he can do something like this:

transaction tx;

//modify res

tx.check();

//do something unrelated, non-transactional, that takes a while

//continue modifying res
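
A minimal sketch of the proposed check() hook, under stated assumptions: the `resource_manager` interface, the `stm_like_resource` type, and this reduced `isolation_exception` are hypothetical stand-ins, and check() takes no transaction argument here to keep the example short. The point is only that a resource can report a conflict at creation, at commit, or on an explicit user call, without a nested resource transaction having been created for it.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the isolation_exception mentioned in the thread.
struct isolation_exception : std::runtime_error {
    isolation_exception() : std::runtime_error("transaction must be repeated") {}
};

// Hypothetical resource manager interface with the proposed check().
struct resource_manager {
    virtual ~resource_manager() {}
    virtual void check() = 0; // throws isolation_exception on conflict
};

// Example resource whose transaction may be marked rollback-only by
// someone else (in real code: by another thread's commit).
struct stm_like_resource : resource_manager {
    bool rollback_only = false;
    void check() override { if (rollback_only) throw isolation_exception(); }
};

class transaction {
public:
    explicit transaction(std::vector<resource_manager *> const &resources)
        : resources_(resources) { check(); }           // checked on creation
    void check() {                                      // user-visible check
        for (resource_manager *r : resources_) r->check();
    }
    void commit() { check(); /* publishing would happen here */ }
private:
    std::vector<resource_manager *> resources_;
};

// Returns true if a conflict flagged after (1) is reported by check().
inline bool check_detects_rollback_only() {
    stm_like_resource res;
    std::vector<resource_manager *> resources;
    resources.push_back(&res);
    transaction tx(resources);
    tx.check();               // no conflict yet, must not throw
    res.rollback_only = true; // stands in for another transaction's commit
    try { tx.check(); } catch (isolation_exception const &) { return true; }
    return false;
}
```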

...

if (instance_==0) { scoped_lock<mutex> lk(mtx_); if (instance_==0) instance_ = new txmgr(...); } // here instance_!= 0

I used something like that in basic_transaction_manager::resource_transaction before, but I believe this is invalid code. what stops the compiler from inlining the txmgr constructor and decomposing it into code like this?

if(instance_==0){
  scoped_lock<mutex> lk(mtx_);
  if(instance_==0){
    instance_=operator new(sizeof(txmgr));
    //construct the txmgr in instance_;
  }
}

...

return *instance_; } static char *db_filename; ... }; char * txmgr::db_filename="file.db";

I put that in main() because configuration options might be determined at runtime. but ok, that might be quite rare. I'd still prefer that it's the user's obligation to construct a transaction manager, but maybe that's also just a difference in style. what exactly is your issue with that? that the user could make a programming error by omitting the construction?

...

int main(){ basic_transaction<txmgr> tx; ... tx.commit(); }

Can you find this code in your book? ;-)

no, not there :)

vicente.botet

22 Jan 22 Jan

6:42 a.m.

New subject: [transaction] New Boost.Transaction library under discussion

Hi, ----- Original Message ----- From: "Stefan Strasser" <strasser@uni-bremen.de> To: <boost@lists.boost.org> Sent: Thursday, January 21, 2010 10:43 PM Subject: Re: [boost] [transaction] NewBoost.Transactionlibraryunderdiscussion

...

On Thursday 21 January 2010 23:56:07, Vicente Botet Escriba wrote:

...
I understand the intent of lazily constructing the local transactions. IMO this lazy construction can be done, but as soon as a resource already has a transaction, any global nested transaction should create a local transaction on it. All the resources then have the same nesting level of transactions, except those that have none. This should solve the preceding issue, and let STM check for rollback-only as soon as possible.

ok, that STM can check on each access makes it even easier: let's introduce a ResourceManager::check(transaction &) function, which is called by the TransactionManager every time a transaction is created or committed(or on every access?). then there is no need to create empty nested transactions, is there?

No, not really. So you were right: there is no need to create all the local transaction contexts. Do you need to create all the nested transactions lazily up to the bottom? In the implementation you talked about an optimization.

...

this could even be exposed to the user as basic_transaction<>::check(), which checks each resource, so he can do something like this:

transaction tx;

//modify res

tx.check();

//do something unrelated, non-transactional, that takes a while

//continue modifying res

check doesn't mean much; we will need something more specific. But we will see.

...

...
if (instance_==0) { scoped_lock<mutex> lk(mtx_); if (instance_==0) instance_ = new txmgr(...); } // here instance_!= 0

I used something like that in basic_transaction_manager::resource_transaction before, but I believe this is invalid code.

This is not correct code but it is portable to a lot of compilers. The implementation can of course be made more robust by using atomics to access the instance_ field on compilers that don't ensure atomicity when reading/writing a pointer. Anyway, you see the idea: it is the Final class that constructs the instance, not the mixin. And the singleton protocol can be ensured by another mixin, or by basic_transaction_manager itself.

...

what stops the compiler from inlining the txmgr constructor and decomposing it into code like this?

if(instance_==0){ scoped_lock<mutex> lk(mtx_); if(instance_==0){ instance_=operator new(sizeof(txmgr)); //construct the txmgr in instance_; } }

I don't see what the issue is.

...

...
return *instance_; } static char *db_filename; ... }; char * txmgr::db_filename="file.db";

I put that in main() because configuration options might be determined at runtime. but ok, that might be quite rare.

This is why I said that this should be associated with a resource. There are some resources that could need specific open/connect and close/disconnect phases. These resources cannot participate in a transaction outside of this [open,close] interval.

...

From my point of view, an application should be able to work with transactional data and connect to/disconnect from a database as many times as needed, without needing to stop the transaction manager for this, but maybe I'm wrong.

...

I'd still prefer that it's the user's obligation to construct a transaction manager, but maybe that's also just a difference in style. what exactly is your issue with that? that the user could make a programming error by omitting the construction?

As I said before, if the user can do the bind himself, the last bind overrides the preceding one. This introduces interaction problems between different parts of the application. Which one should make the bind? This does not mean that a resource manager cannot have its own bind (connect/disconnect) protocol.

Vicente

strasser＠uni-bremen.de

12:26 p.m.

New subject: [transaction] New Boost.Transaction library under discussion

Quoting "vicente.botet" <vicente.botet@wanadoo.fr>:

...

Do you need to create all the nested transactions lazily up to the bottom?

that's how it's implemented now, but I *think* you can avoid that. but I haven't thought this through entirely.

global_roottx  (res1_roottx,  none)
global_nestedtx(res1_nestedtx,res2_roottx)

note that there is a root transaction in resource2 used by the global nested transaction.

then, when global_nestedtx is committed, res1_nestedtx is committed, and res2_roottx is moved to the global_roottx, without asking resource2 to do anything. result:

global_roottx(res1_roottx(incorporated res1_nestedtx), res2_roottx)

I think this is equivalent to creating both transactions in resource2, but the TODO comment was only the result of this idea off the top of my head, there might be some subtle problem with that approach I'm not seeing right now. the same could be done when there is a gap in the "nested transaction stack".
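
The hand-over described above can be sketched as follows. This is an illustration under stated assumptions, not the sandbox code: `resource_tx`, `global_tx`, `access()`, `commit_nested()` and the `writes` counter are all hypothetical names, and "publishing" is modelled as adding up a counter.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical resource-level transaction; `writes` stands in for its changes.
struct resource_tx { int writes = 0; };

// Hypothetical global transaction with one optional slot per resource.
class global_tx {
public:
    global_tx(global_tx *parent, std::size_t resource_count)
        : parent_(parent), slots_(resource_count) {}

    // Lazily create this level's transaction in resource i on first access.
    resource_tx &access(std::size_t i) {
        if (!slots_[i]) slots_[i].reset(new resource_tx());
        return *slots_[i];
    }

    // Commit this nested transaction into its parent.
    void commit_nested() {
        assert(parent_ != nullptr);
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            if (!slots_[i]) continue; // resource untouched at this level
            if (parent_->slots_[i]) {
                // Normal case: the nested changes are published to the parent.
                parent_->slots_[i]->writes += slots_[i]->writes;
                slots_[i].reset();
            } else {
                // Gap: the parent has no transaction in this resource, so
                // the whole resource transaction is moved up unchanged.
                parent_->slots_[i] = std::move(slots_[i]);
            }
        }
    }
private:
    global_tx *parent_;
    std::vector<std::unique_ptr<resource_tx> > slots_;
};

// Mirrors the diagram: res1 is used at both levels, res2 only in the nested one.
inline bool nested_commit_moves_root() {
    global_tx root(nullptr, 2);
    root.access(0).writes = 1;              // res1_roottx
    global_tx nested(&root, 2);
    nested.access(0).writes = 2;            // res1_nestedtx
    resource_tx *res2 = &nested.access(1);  // res2_roottx, created lazily
    res2->writes = 5;
    nested.commit_nested();
    return root.access(0).writes == 3 && &root.access(1) == res2
        && root.access(1).writes == 5;
}
```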

...

...
tx.check();

//do something unrelated, non-transactional, that takes a while

//continue modifying res

check doesn't mean much; we will need something more specific. But we will see.

what do you mean? the function name or anything beyond that?

...

...
if(instance_==0){ scoped_lock<mutex> lk(mtx_); if(instance_==0){ instance_=operator new(sizeof(txmgr)); //(**) //construct the txmgr in instance_; } }

...

This is not correct code but it is portable to a lot of compilers.

that the pointer might not be an atomic type on some exotic platform is not the issue, I believe this code is also invalid on x86. when there is a thread switch at (**) another thread goes ahead with an unconstructed transaction manager!

...

...
I'd still prefer that it's the user's obligation to construct a transaction manager, but maybe that's also just a difference in style. what exactly is your issue with that? that the user could make a programming error by omitting the construction?

As I said before, if the user can do the bind himself, the last bind overrides the preceding one. This introduces interaction problems between different parts of the application. Which one should make the bind?

This does not mean that a resource manager cannot have its own bind (connect/disconnect) protocol.

ok, I agree with the concern. I'd like to find another solution to it though.

there are more problems with that approach in addition to the ones I already mentioned:

- unexpected construction, and unexpected exceptions that follow from construction

e.g. the first access to an object of a resource:

pers_obj->value=1;

can throw all kinds of exceptions related to resource construction, including: recovery_failure, failure to connect to a remote database, etc. which is kind of unexpected from the user's perspective.

- it makes default configurations more difficult. Boost.Persistent lets you construct a default transaction manager that only uses Boost.Persistent. when there is a mixin that is always user defined, I don't think there can be a default configuration? the user would always have to #define the configuration.

- destruction. when is the lazily constructed transaction manager destructed?

what do you think about simply removing TransactionManager::bind() and ::has_active(), and making its constructor throw if there already is a TransactionManager? then it's a real singleton, and the TransactionManager could then support connecting/disconnecting resources to support opening one database file first and then another, like shown in the example we've used in a previous mail.
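
A minimal sketch of that proposal, assuming a non-template `transaction_manager` whose resources are identified by name only. The members `connect()`, `disconnect()` and `connected()` and the use of `std::logic_error` are hypothetical choices for the illustration.

```cpp
#include <atomic>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical singleton-by-construction transaction manager: the user
// constructs it explicitly, and a second construction attempt throws.
class transaction_manager {
public:
    transaction_manager() {
        // exchange() returns the previous value, so only one constructor
        // call can observe `false` and proceed.
        if (exists_.exchange(true))
            throw std::logic_error("a transaction manager already exists");
    }
    ~transaction_manager() { exists_ = false; }
    transaction_manager(transaction_manager const &) = delete;
    transaction_manager &operator=(transaction_manager const &) = delete;

    // Resources can come and go while the manager stays alive.
    void connect(std::string const &resource) { resources_.insert(resource); }
    void disconnect(std::string const &resource) { resources_.erase(resource); }
    bool connected(std::string const &resource) const {
        return resources_.count(resource) != 0;
    }
private:
    static std::atomic<bool> exists_;
    std::set<std::string> resources_;
};

std::atomic<bool> transaction_manager::exists_(false);

inline bool second_manager_throws() {
    transaction_manager first;
    try { transaction_manager second; }
    catch (std::logic_error const &) { return true; }
    return false;
}

// Opens one database file first and then another, with the same manager.
inline bool reconnect_works() {
    transaction_manager mgr;
    mgr.connect("a.db");
    mgr.disconnect("a.db");
    mgr.connect("b.db");
    return !mgr.connected("a.db") && mgr.connected("b.db");
}
```

With this arrangement construction errors surface where the user constructs the manager, not at the first access to a persistent object.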

Vicente Botet Escriba

3:29 p.m.

New subject: [transaction] New Boost.Transaction library under discussion

strasser wrote:

...

Quoting "vicente.botet" <vicente.botet@wanadoo.fr>:

...
Do you need to create all the nested transactions lazily up to the bottom?

that's how it's implemented now, but I *think* you can avoid that. but I haven't thought this through entirely.

global_roottx  (res1_roottx,  none)
global_nestedtx(res1_nestedtx,res2_roottx)

note that there is a root transaction in resource2 used by the global nested transaction.

then, when global_nestedtx is committed, res1_nestedtx is committed, and res2_roottx is moved to the global_roottx, without asking resource2 to do anything. result:

global_roottx(res1_roottx(incorporated res1_nestedtx), res2_roottx)

I think this is equivalent to creating both transactions in resource2, but the TODO comment was only the result of this idea off the top of my head, there might be some subtle problem with that approach I'm not seeing right now. the same could be done when there is a gap in the "nested transaction stack".

I agree. We will look at this point later, in the implementation.

strasser wrote:

...

...
...
tx.check();

//do something unrelated, non-transactional, that takes a while

//continue modifying res

check doesn't mean much; we will need something more specific. But we will see.

what do you mean? the function name or anything beyond that?

The name. Something more explicit, such as abort_if_rollback_only().

strasser wrote:

...

...
...
if(instance_==0){ scoped_lock<mutex> lk(mtx_); if(instance_==0){ instance_=operator new(sizeof(txmgr)); //(**) //construct the txmgr in instance_; } }

...
This is not correct code but it is portable to a lot of compilers.

that the pointer might not be an atomic type on some exotic platform is not the issue, I believe this code is also invalid on x86. when there is a thread switch at (**) another thread goes ahead with an unconstructed transaction manager!

Oh, you were looking for that error. What about

txmgr* tmp=operator new(sizeof(txmgr)); //(**)
//construct the txmgr in tmp;
instance_=tmp;

strasser wrote:

...

...
...
I'd still prefer that it's the user's obligation to construct a transaction manager, but maybe that's also just a difference in style. what exactly is your issue with that? that the user could make a programming error by omitting the construction?

As I said before, if the user can do the bind himself, the last bind overrides the preceding one. This introduces interaction problems between different parts of the application. Which one should make the bind?

This does not mean that a resource manager cannot have its own bind (connect/disconnect) protocol.

ok, I agree with the concern. I'd like to find another solution to it though.

there are more problems with that approach in addition to the ones I already mentioned:

- unexpected construction, and unexpected exceptions that follow from construction

e.g. the first access to an object of a resource:

pers_obj->value=1;

can throw all kinds of exceptions related to resource construction, including: recovery_failure, failure to connect to a remote database, etc. which is kind of unexpected from the user's perspective.

this is already an issue, as the instance is created at the first access.

strasser wrote:

...

- it makes default configurations more difficult. Boost.Persistent lets you construct a default transaction manager that only uses Boost.Persistent. when there is a mixin that is always user defined, I don't think there can be a default configuration? the user would always have to #define the configuration.

Boost.Persistent could define a default configuration using the mixin, as it does now, couldn't it?

strasser wrote:

...

- destruction. when is the lazily constructed transaction manager destructed?

From my point of view, at program termination.

strasser wrote:

...

what do you think about simply removing TransactionManager::bind() and ::has_active(), and making its constructor throw if there already is a TransactionManager? then it's a real singleton, and the TransactionManager could then support connecting/disconnecting resources to support opening one database file first and then another, like shown in the example we've used in a previous mail.

+1 for removing TransactionManager::bind() and ::has_active(), and making its constructor throw if there already is a TransactionManager.

+1 for supporting connecting/disconnecting resources, to allow opening one database file first and then another.

Best,
Vicente

Stefan Strasser

6:55 p.m.

New subject: [transaction] New Boost.Transaction library under discussion

...

+1 for removing TransactionManager::bind() and ::has_active(), and make its constructor throw if there already is a TransactionManager

+1 to support connecting/disconnecting resources to support opening one database file first and then another

I have now implemented that, with one difference: why even construct a basic_transaction_manager when it is a singleton and all of its state is mutable (connecting and disconnecting resources)? basic_transaction_manager now has only static members, e.g. TxMgr::active_transaction(): https://svn.boost.org/svn/boost/sandbox/transaction/dev/basic_transaction_ma... changeset: https://svn.boost.org/trac/boost/changeset/59230

On Friday 22 January 2010 16:29:33, Vicente Botet Escriba wrote:

...

strasser wrote:

...
...
...
tx.check();

//do something unrelated, non-transactional, that takes a while

//continue modifying res

check doesn't mean much; we will need something more specific. But we will see.

what do you mean? the function name or anything beyond that?

The name. Something more explicit, such as abort_if_rollback_only().

I'd use check_transaction or something like that. calling it abort_if_rollback_only() sounds like calling commit_transaction() publish_if_no_conflict().

...

...
that the pointer might not be an atomic type on some exotic platform is not the issue, I believe this code is also invalid on x86. when there is a thread switch at (**) another thread goes ahead with an unconstructed transaction manager!

Oh, you were looking for that error. What about

txmgr* tmp=operator new(sizeof(txmgr)); //(**)
//construct the txmgr in tmp;
instance_=tmp;

how do you control that? Boost.Atomic? that would probably be close to using a mutex (at least a memory fence), but the issue is now irrelevant anyway.
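
For reference, a sketch of how the publication order can be enforced with the atomics that C++11 later standardized. This is a swapped-in technique: `std::atomic` and `std::mutex` did not exist in the standard when this thread was written, and the `constructed()` member is a hypothetical probe added for the example. The release store guarantees that the constructor's writes are visible to any thread whose acquire load sees the non-null pointer.

```cpp
#include <atomic>
#include <mutex>

class txmgr {
public:
    static txmgr &instance() {
        // Fast path: acquire load pairs with the release store below.
        txmgr *p = instance_.load(std::memory_order_acquire);
        if (p == nullptr) {
            std::lock_guard<std::mutex> lk(mtx_);
            // Second check under the lock; the mutex already orders this load.
            p = instance_.load(std::memory_order_relaxed);
            if (p == nullptr) {
                p = new txmgr(); // fully constructed before publication
                instance_.store(p, std::memory_order_release);
            }
        }
        return *p; // intentionally never deleted in this sketch
    }
    bool constructed() const { return constructed_; }
    txmgr(txmgr const &) = delete;
    txmgr &operator=(txmgr const &) = delete;
private:
    txmgr() : constructed_(true) {}
    bool constructed_;
    static std::atomic<txmgr *> instance_;
    static std::mutex mtx_;
};

std::atomic<txmgr *> txmgr::instance_(nullptr);
std::mutex txmgr::mtx_;
```

The mutex is only taken while the instance does not exist yet; later calls cost one acquire load.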

Stefan Strasser

23 Jan 23 Jan

8:37 p.m.

New subject: [transaction] New Boost.Transaction library under discussion

On Friday 22 January 2010 19:55:07, you wrote:

...

...
+1 for removing TransactionManager::bind() and ::has_active(), and make its constructor throw if there already is a TransactionManager

+1 to support connecting/disconnecting resources to support opening one database file first and then another

I have now implemented that, with one difference: why even construct a basic_transaction_manager when it is a singleton and all of its state is mutable (connecting and disconnecting resources)?

unfortunately, the answer to this question is: because basic_transaction_manager will have state beyond that as soon as it maintains a distributed transaction log. so that part will have to be reverted, so the user can construct it as we've discussed, using e.g. the filename for the log.

Oliver Kowalke

18 Jan 18 Jan

6:27 p.m.

New subject: [transaction] New Boost.Transaction library under discussion

I would suggest a three-phase model for transactions operating on a shared resource:

1.) prepare phase: multiple readers/one writer have access to the resource
2.) modification phase: only one writer has access and can modify the resource
3.) commit/rollback phase: the writer publishes its local modifications and releases the exclusive lock

With this pattern you could support queries and transactions in parallel (at least for phases 1 and 3).

Oliver

vicente.botet wrote:

...

Hi Stefan, Bob,

I've created in the sandbox the transaction directory (http://svn.boost.org/svn/boost/sandbox/transaction/) so we can share our work. I've created also a wiki page (http://svn.boost.org/trac/boost/wiki/BoostTransaction) on which we can compile the result of our discussions, requirements for the library, design, ...

If you don't have right access to the Boost wiki pages, please request them through this ML.

Of course, anyone is welcome to participate, or comment on this wiki.

HTH, _____________________ Vicente Juan Botet Escribá _______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Stefan Strasser

9:02 p.m.

New subject: [transaction] New Boost.Transaction library under discussion

On Monday 18 January 2010 19:27:25 Oliver Kowalke wrote:

...

I would suggest a three-phase model for transactions operating on a shared resource:

1.) prepare phase: multiple readers/one writer have access to the resource
2.) modification phase: only one writer has access and can modify the resource
3.) commit/rollback phase: the writer publishes its local modifications and releases the exclusive lock

With this pattern you could support queries and transactions in parallel (at least for phases 1 and 3).

I would leave this up to the resources. the resource only has to make sure that the state that was the basis for signalling a successful prepare-phase in a two-phase-commit is upheld until commit:

prepare: returns whether the transaction can be committed.
commit: publish the local modifications. if prepare signalled a successful transaction, this has to succeed.

an exclusive lock to the entire resource could be used to uphold the state of the prepare phase until commit, but there are other ways. my implementation e.g. never blocks another transaction from reading an object, even during commit, based on http://en.wikipedia.org/wiki/Optimistic_concurrency_control and http://en.wikipedia.org/wiki/Multiversion_concurrency_control

Bob (STLdb) is more concerned about unnecessary rollbacks because he expects a higher rate of concurrent access in indexes, so locks are even acquired in your "phase 1" and are upheld until commit. both approaches might be right for their specific use case.
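The prepare/commit contract described above can be sketched as a small driver over abstract resources. The interface below (`resource_manager`, `two_phase_commit`, `recording_resource`) is hypothetical and only illustrates the contract: `prepare()` reports whether the transaction can be committed, and `commit()` is only called, and must succeed, after every resource has prepared successfully. How a resource upholds its prepared state (locks, MVCC, ...) is left to the resource.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical resource interface, assumed for this sketch.
struct resource_manager {
    virtual ~resource_manager() {}
    virtual bool prepare() = 0;  // true: the transaction can be committed
    virtual void commit() = 0;   // must succeed after a successful prepare()
    virtual void rollback() = 0;
};

// Returns true if the transaction was committed on all resources.
inline bool two_phase_commit(std::vector<resource_manager*> const& rms) {
    for (std::size_t i = 0; i < rms.size(); ++i) {
        if (!rms[i]->prepare()) {
            // One resource cannot commit: roll back everywhere.
            for (std::size_t j = 0; j < rms.size(); ++j) rms[j]->rollback();
            return false;
        }
    }
    for (std::size_t i = 0; i < rms.size(); ++i) rms[i]->commit();
    return true;
}

// Toy resource that records which calls it received.
struct recording_resource : resource_manager {
    bool can_commit, prepared, committed, rolled_back;
    explicit recording_resource(bool c)
        : can_commit(c), prepared(false), committed(false), rolled_back(false) {}
    bool prepare() { prepared = can_commit; return prepared; }
    void commit() { assert(prepared); committed = true; }
    void rollback() { prepared = false; rolled_back = true; }
};
```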

Bob Walters

19 Jan

4:39 a.m.

New subject: [transaction] New Boost.Transaction library under discussion

On Mon, Jan 18, 2010 at 4:02 PM, Stefan Strasser <strasser@uni-bremen.de> wrote:

...

an exclusive lock to the entire resource could be used to uphold the state of the prepare phase until commit, but there are other ways. my implementation e.g. never blocks another transaction from reading an object, even during commit, based on http://en.wikipedia.org/wiki/Optimistic_concurrency_control and http://en.wikipedia.org/wiki/Multiversion_concurrency_control

How do you control atomicity with this design? i.e. If a transaction modifies A and B and you are using a lock-free approach to update the cached versions of these objects from their shadow copies, how do you ensure that any other threads which might be reading A and B (or B and A) see the two entries consistently? Is this isolation enforced, or can other threads see a mixture of A and B from before/after the commit point?

Stefan Strasser

12:23 p.m.

New subject: [transaction] New Boost.Transaction library under discussion

On Tuesday 19 January 2010 05:39:30 Bob Walters wrote:

...

On Mon, Jan 18, 2010 at 4:02 PM, Stefan Strasser <strasser@uni-bremen.de> wrote:

...
an exclusive lock to the entire resource could be used to uphold the state of the prepare phase until commit, but there are other ways. my implementation e.g. never blocks another transaction from reading an object, even during commit, based on http://en.wikipedia.org/wiki/Optimistic_concurrency_control and http://en.wikipedia.org/wiki/Multiversion_concurrency_control

How do you control atomicity with this design? i.e. If a transaction modifies A and B and you are using a lock-free approach to update the cached versions of these objects from their shadow copies, how do you ensure that any other threads which might be reading A and B (or B and A) see the two entries consistently?

that's part of the optimistic approach, a transaction is allowed to see an inconsistent (inter-object) state, but it is guaranteed to fail with an isolation_exception, at commit at the latest, if it did. (there were also pessimistic transactions once, but they fell victim to some refactoring...)

...

I'm not against throwing away portions of my code base to better unify.

I'm neither. could you point me to your free list/allocation code? my library spends a lot of time allocating and deallocating disk space, I guess there could be improvements.

my logger is already pretty generic (because I planned to use it for the storage engine log and for the log of the transaction manager for distributed transaction). here's some example code on how to use it:

    typedef mpl::vector<
      map_inserted, //must be POD
      map_entry_removed,
      ...,
    > entries;

    olog<entries> myolog(...);
    myolog << map_inserted(1,2,3);
    myolog << map_entry_removed(4,5);

    ---

    struct listener{
      void operator()(map_inserted){ }
      void operator()(map_entry_removed){ }
    };

    ilog<entries> myilog(...);
    myilog >> listener();

hidden behind olog is all the log rolling and the syncing interval stuff we've discussed to ensure sequential writing performance.
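A rough sketch of how such a typed log could work, assuming an in-memory byte buffer instead of a file and a variadic template parameter list instead of `mpl::vector`. Each entry is written as a one-byte type index followed by the raw bytes of the entry, and the reader dispatches on the index to the matching `operator()` of the listener. All class layouts here are guesses for illustration; the log rolling and syncing mentioned above are not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <type_traits>
#include <vector>

// Position of T in the entry list (compile error if T is not listed).
template<class T, class... Ts> struct index_of;
template<class T, class... Ts>
struct index_of<T, T, Ts...> { static const std::size_t value = 0; };
template<class T, class U, class... Ts>
struct index_of<T, U, Ts...> { static const std::size_t value = 1 + index_of<T, Ts...>::value; };

// Reads one entry of the type selected by `index` and passes it to the listener.
template<class... Ts> struct dispatcher;
template<> struct dispatcher<> {
    template<class Listener>
    static std::size_t read(std::size_t, char const*, std::size_t, Listener&) {
        throw std::runtime_error("unknown log entry type");
    }
};
template<class T, class... Ts> struct dispatcher<T, Ts...> {
    template<class Listener>
    static std::size_t read(std::size_t index, char const* p, std::size_t remaining,
                            Listener& listener) {
        if (index != 0) return dispatcher<Ts...>::read(index - 1, p, remaining, listener);
        if (remaining < sizeof(T)) throw std::runtime_error("truncated log entry");
        T entry;
        std::memcpy(&entry, p, sizeof entry);
        listener(entry);
        return sizeof entry;
    }
};

template<class... Entries>
class olog {
public:
    explicit olog(std::vector<char>& buffer) : buffer_(buffer) {}
    template<class E> olog& operator<<(E const& e) {
        static_assert(std::is_trivially_copyable<E>::value, "entries must be POD-like");
        buffer_.push_back(static_cast<char>(index_of<E, Entries...>::value));
        char const* p = reinterpret_cast<char const*>(&e);
        buffer_.insert(buffer_.end(), p, p + sizeof e);
        return *this;
    }
private:
    std::vector<char>& buffer_;
};

template<class... Entries>
class ilog {
public:
    explicit ilog(std::vector<char> const& buffer) : buffer_(buffer) {}
    template<class Listener> void operator>>(Listener&& listener) {
        std::size_t pos = 0;
        while (pos < buffer_.size()) {
            std::size_t index = static_cast<unsigned char>(buffer_[pos++]);
            pos += dispatcher<Entries...>::read(index, buffer_.data() + pos,
                                                buffer_.size() - pos, listener);
            assert(pos <= buffer_.size());
        }
    }
private:
    std::vector<char> const& buffer_;
};

// Example entries and a listener, with made-up fields.
struct map_inserted { int map_id, key, value; };
struct map_entry_removed { int map_id, key; };

struct counting_listener {
    int inserted, removed, last_removed_key;
    counting_listener() : inserted(0), removed(0), last_removed_key(0) {}
    void operator()(map_inserted const&) { ++inserted; }
    void operator()(map_entry_removed const& e) { ++removed; last_removed_key = e.key; }
};
```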

Bob Walters

6:53 p.m.

New subject: [transaction] New Boost.Transaction library under discussion

On Tue, Jan 19, 2010 at 7:23 AM, Stefan Strasser <strasser@uni-bremen.de> wrote:

...

On Tuesday 19 January 2010 05:39:30 Bob Walters wrote:

...
On Mon, Jan 18, 2010 at 4:02 PM, Stefan Strasser <strasser@uni-bremen.de> wrote:

...
an exclusive lock to the entire resource could be used to uphold the state of the prepare phase until commit, but there are other ways. my implementation e.g. never blocks another transaction from reading an object, even during commit, based on http://en.wikipedia.org/wiki/Optimistic_concurrency_control and http://en.wikipedia.org/wiki/Multiversion_concurrency_control

How do you control atomicity with this design? i.e. If a transaction modifies A and B and you are using a lock-free approach to update the cached versions of these objects from their shadow copies, how do you ensure that any other threads which might be reading A and B (or B and A) see the two entries consistently?

that's part of the optimistic approach, a transaction is allowed to see an inconsistent (inter-object) state, but it is guaranteed to fail with an isolation_exception, at commit at the latest, if it did. (there were also pessimistic transactions once, but they fell victim to some refactoring...)

OK. This would worry me if I was planning to use Boost.Persistent in an app. I understand one aspect of the isolation in ACID to imply that when a given thread commits a transaction, all of the changes within that transaction become visible to other threads at a single moment in time. Or put another way:

t1: modifies A and B, commits.
t2: reads A and sees the newly committed value from t1
t2: reads B; it *must* see the committed value of B from t1, or from some later commit, and not a value from before t1.

IIUC there is a race condition which could allow t2 to see the older value of B if t1 and t2 overlapped in just the right way. (?) Although your optimistic locking will protect t2 from an update to an old version of B, it won't prevent t2 from using B in a read-only manner (IIUC?), thinking that the value is current and correct. To me that's a problem.

I *think* my interpretation of the isolation requirement is correct here. Or at least, I am pretty sure that all RDBMS provide this kind of assurance. This is one of the main reasons that I ended up having to lock the map during commit processing - to ensure the moment-in-time application of all changes - I couldn't just update entries sequentially.

...

I'm neither. could you point me to your free list/allocation code? my library spends a lot of time allocating and deallocating disk space, I guess there could be improvements.

I'm working on this currently with the checkpoint algorithm, and it isn't checked in yet. But I'm using a multimap keyed by chunk size to implement best fit. One consequence of this is that I'm not (currently) writing to the checkpoint sequentially. It can seek around, putting free space to use. I don't sync the file until all entries for the checkpoint have been written, so hopefully, that will help mitigate this.

...

my logger is already pretty generic (because I planned to use it for the storage engine log and for the log of the transaction manager for distributed transaction). here's some example code on how to use it:

<snip>

This looks very usable and even a bit familiar in the sense that I'm also accumulating operations for later execution during commit.

strasser＠uni-bremen.de

7:42 p.m.

Quoting Bob Walters <bob.s.walters@gmail.com>:

...

t1: modifies A and B, commits.
t2: reads A and sees the newly committed value from t1
t2: reads B; it *must* see the committed value of B from t1, or from some later commit, and not a value from before t1.

IIUC there is a race condition which could allow t2 to see the older value of B if t1 and t2 overlapped in just the right way. (?) Although your optimistic locking will protect t2 from an update to an old version of B, it won't prevent t2 from using B in a read-only manner (IIUC?), thinking that the value is current and correct. To me that's a problem.

this is indeed something an optimistic implementation has to take care of, but it is taken care of:

case 1:
t1: reads A
t2: modifies A, commits
t1: reads A again //throws

case 2:
t1: reads A
t2: modifies A, commits
t1: uses the value read above from A to modify B.
t1: commits //throws

there are some problems with optimistic transactions with algorithms that can't handle this kind of inter-object inconsistency, but it can never lead to an inconsistent database state. I think this is more an academic problem than it is practical, and a lot of databases don't allow these kinds of algorithms I think. for example:

    int *array=new int[123];
    loc<pers_type> a=...;
    loc<pers_type> b=...;
    array[a->value - b->value]=1;

this depends on a consistent inter-object state at any time. even though the transaction would fail on commit, your application has already crashed before the commit is reached. a pessimistic transaction would have to be used here. I haven't (re-)implemented those as I figured it is too obscure a feature for anyone to ever use.
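The two cases above can be sketched with per-object version numbers. This is a single-threaded illustration under assumed names (`store`, `versioned`, `optimistic_transaction`), not the Boost.Persistent implementation: each transaction remembers the version of every object it reads, a repeated read of a changed object throws (case 1), and commit re-validates the whole read set before publishing writes (case 2). A real implementation would need to make validation and publication atomic with respect to other committers.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

struct isolation_exception : std::runtime_error {
    isolation_exception() : std::runtime_error("isolation failure, repeat the transaction") {}
};

struct versioned { int value; unsigned version; };
typedef std::map<std::string, versioned> store; // committed state

class optimistic_transaction {
public:
    explicit optimistic_transaction(store& s) : store_(s) {}

    int read(std::string const& key) {
        std::map<std::string, int>::const_iterator w = writes_.find(key);
        if (w != writes_.end()) return w->second; // read own pending write
        versioned const& current = store_.at(key);
        std::map<std::string, unsigned>::iterator seen = read_versions_.find(key);
        if (seen == read_versions_.end())
            read_versions_[key] = current.version;  // first read: remember version
        else if (seen->second != current.version)
            throw isolation_exception();            // case 1: changed since first read
        return current.value;
    }

    void write(std::string const& key, int value) { writes_[key] = value; }

    void commit() {
        // case 2: everything this transaction read must still be current.
        for (std::map<std::string, unsigned>::const_iterator r = read_versions_.begin();
             r != read_versions_.end(); ++r)
            if (store_.at(r->first).version != r->second) throw isolation_exception();
        // Publish local modifications (blind writes are not validated in this sketch).
        for (std::map<std::string, int>::const_iterator w = writes_.begin();
             w != writes_.end(); ++w) {
            versioned& v = store_[w->first];
            v.value = w->second;
            ++v.version;
        }
        read_versions_.clear();
        writes_.clear();
    }

private:
    store& store_;
    std::map<std::string, unsigned> read_versions_;
    std::map<std::string, int> writes_;
};
```

Note that the sketch does nothing for the `array[a->value - b->value]` example: the inconsistent values are still observable before the exception is thrown at commit.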

...

I'm working on this currently with the checkpoint algorithm, and it isn't checked in yet. But I'm using a multimap keyed by chunk size to implement best fit.

I currently also use an intrusive multiset by size and an intrusive set by offset, to be able to match by size and merge blocks on deallocation. since even a modified object requires an allocation it can be quite costly managing the maps.
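A minimal sketch of the two-index free list described in both messages, using `std::multimap` and `std::map` rather than intrusive containers: one index keyed by block size for best-fit allocation, one keyed by offset for merging neighbouring blocks on deallocation. The class name and interface are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>

class free_list {
    typedef std::multimap<std::size_t, std::size_t> by_size_t; // size -> offset
    typedef std::map<std::size_t, std::size_t> by_offset_t;    // offset -> size

public:
    static const std::size_t npos = static_cast<std::size_t>(-1);

    explicit free_list(std::size_t capacity) { insert(0, capacity); }

    // Best fit: the smallest free block that is large enough.
    std::size_t allocate(std::size_t size) {
        assert(size > 0);
        by_size_t::iterator it = by_size_.lower_bound(size);
        if (it == by_size_.end()) return npos;
        std::size_t const block = it->first, offset = it->second;
        by_size_.erase(it);
        by_offset_.erase(offset);
        if (block > size) insert(offset + size, block - size); // keep the remainder
        return offset;
    }

    void deallocate(std::size_t offset, std::size_t size) {
        // Merge with the free block that starts right after this one.
        by_offset_t::iterator next = by_offset_.lower_bound(offset);
        if (next != by_offset_.end() && next->first == offset + size) {
            size += next->second;
            remove(next);
        }
        // Merge with the free block that ends right before this one.
        by_offset_t::iterator prev = by_offset_.lower_bound(offset);
        if (prev != by_offset_.begin()) {
            --prev;
            if (prev->first + prev->second == offset) {
                offset = prev->first;
                size += prev->second;
                remove(prev);
            }
        }
        insert(offset, size);
    }

    std::size_t free_blocks() const { return by_offset_.size(); }

private:
    void insert(std::size_t offset, std::size_t size) {
        if (size == 0) return;
        by_size_.insert(std::make_pair(size, offset));
        by_offset_.insert(std::make_pair(offset, size));
    }
    // Erases one block from both indexes.
    void remove(by_offset_t::iterator it) {
        std::pair<by_size_t::iterator, by_size_t::iterator> range =
            by_size_.equal_range(it->second);
        for (by_size_t::iterator s = range.first; s != range.second; ++s)
            if (s->second == it->first) { by_size_.erase(s); break; }
        by_offset_.erase(it);
    }

    by_size_t by_size_;
    by_offset_t by_offset_;
};
```

Every allocation and deallocation touches both indexes, which is the maintenance cost mentioned above.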

Bob Walters

24 Jan

2:34 a.m.

New subject: [transaction] New Boost.Transaction library under discussion

On Mon, Jan 18, 2010 at 4:02 PM, Stefan Strasser <strasser@uni-bremen.de> wrote:

...

the resource only has to make sure that the state that was the basis for signalling a successful prepare-phase in a two-phase-commit is upheld until commit: prepare: returns if the transaction can be committed. commit: publish the local modifications. if prepare signalled a successful transaction, this has to succeed.

Stefan, IIUC, you are essentially proposing we implement resource managers per the XA standard, correct? (http://www.opengroup.org/onlinepubs/009680699/toc.pdf) I'm basing this on your previous reference to BerkeleyDB's XA APIs.

How far do you intend to go with ensuring the recoverability of these multi-resource transactions? Specifically, the XA standard requires that RMs persist the state of the transaction during prepare() so that the transaction can still be completed or rolled back, consistently across all RMs, even if the two-phase commit is interrupted by process failures. The TM itself must have some form of persistent storage to track the state of all such transactions, for recovery processing. Am I understanding your intention correctly?

- Bob

strasser＠uni-bremen.de

3:58 a.m.

Quoting Bob Walters <bob.s.walters@gmail.com>:

...

Stefan, IIUC, you are essentially proposing we implement resource managers per the XA standard, correct? (http://www.opengroup.org/onlinepubs/009680699/toc.pdf) I'm basing this on your previous reference to BerkeleyDB's XA APIs.

How far do you intend to go with ensuring the recoverability of these multi-resource transactions? Specifically, the XA standard requires that RMs persist the state of the transaction during prepare() so that the transaction can still be completed or rolled back, consistently across all RMs, even if the two-phase commit is interrupted by process failures.

I haven't read much of the XA standard, but I guess it's pretty much the same thing. you also use write-ahead redo-logging, don't you? so what the RMs need to do is split up the commit in 2 phases with a prepare message written to the log at the end of the prepare phase, and change the recovery process a bit. right now when the log is read a transaction is replayed based on if there's a commit message or not. when there is a prepare message (but no commit or rollback) the RM needs to report that to the TM through a function like recover() and then the TM decides if that transaction ought to be replayed or rolled back.

...

The TM itself must have some form of persistent storage to track the state of all such transactions, for recovery processing. Am I understanding your intention correctly?

yes, I've been referring to that by "distributed transaction log". it's a quite simple log that contains commit messages for distributed transactions and a list of the IDs of the resource transactions for each. at least in my mind, I haven't implemented any of this. basic_transaction_manager throws if more than one RM took part in the transaction.
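The recovery rule discussed in this exchange can be sketched as two steps. The RM scans its own log: transactions with a commit message are replayed, transactions with only a prepare message are in doubt. The TM then resolves the in-doubt ones against its distributed transaction log. Record layout and function names (`scan_log`, `resolve_in_doubt`) are assumptions for illustration, since neither library had implemented this at the time of the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

enum record_kind { prepare_record, commit_record, rollback_record };
struct log_record { unsigned tx; record_kind kind; };

struct recovery_result {
    std::set<unsigned> replay;   // transactions whose changes must be replayed
    std::set<unsigned> in_doubt; // prepared, but outcome unknown to the RM
};

// RM side: classify transactions found in the write-ahead log.
inline recovery_result scan_log(std::vector<log_record> const& log) {
    recovery_result r;
    for (std::size_t i = 0; i < log.size(); ++i) {
        switch (log[i].kind) {
        case prepare_record:
            r.in_doubt.insert(log[i].tx);
            break;
        case commit_record:      // also covers one-phase commits without prepare
            r.in_doubt.erase(log[i].tx);
            r.replay.insert(log[i].tx);
            break;
        case rollback_record:
            r.in_doubt.erase(log[i].tx);
            break;
        }
    }
    return r;
}

// TM side: an in-doubt transaction is replayed only if the distributed
// transaction log contains a commit message for it; otherwise it is dropped.
inline void resolve_in_doubt(recovery_result& r, std::set<unsigned> const& tm_committed) {
    for (std::set<unsigned>::const_iterator it = r.in_doubt.begin();
         it != r.in_doubt.end(); ++it)
        if (tm_committed.count(*it)) r.replay.insert(*it);
    r.in_doubt.clear();
}
```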

Bob Walters

25 Jan

6:46 p.m.

New subject: [transaction] New Boost.Transaction library under discussion

On Sat, Jan 23, 2010 at 10:58 PM, <strasser@uni-bremen.de> wrote:

...

Quoting Bob Walters <bob.s.walters@gmail.com>:

...
Stefan, IIUC, you are essentially proposing we implement resource managers per the XA standard, correct? (http://www.opengroup.org/onlinepubs/009680699/toc.pdf) I'm basing this on your previous reference to BerkeleyDB's XA APIs.

How far do you intend to go with ensuring the recoverability of these multi-resource transactions? Specifically, the XA standard requires that RMs persist the state of the transaction during prepare() so that the transaction can still be completed or rolled back, consistently across all RMs, even if the two-phase commit is interrupted by process failures.

I haven't read much of the XA standard, but I guess it's pretty much the same thing.

you also use write-ahead redo-logging, don't you?

Yes.

...

so what the RMs need to do is split up the commit in 2 phases with a prepare message written to the log at the end of the prepare phase, and change the recovery process a bit.

Typically, this results in a lot of sync()s, but I'm not sure how to avoid that. e.g. prepare() has to sync because the resource manager is not allowed to forget the transaction due to machine death. commit() presumably must sync again. The TM does its own logging (or perhaps shares a resource manager) to keep track of transaction progress. Once a two-phase commit protocol comes into play, things slow down. One result of this is that we should be sure that the TM is optimized for one-phase commits when only one RM is in play.

...

right now when the log is read a transaction is replayed based on if there's a commit message or not. when there is a prepare message (but no commit or rollback) the RM needs to report that to the TM through a function like recover() and then the TM decides if that transaction ought to be replayed or rolled back.

This makes sense. There is more to the RM API than just prepare(), rollback() and commit(). We will probably end up needing recover(), forget(), etc. per the XA API, when this is all worked out.

I wonder if either of you have also considered the notion of sharing a transactional disk infrastructure (e.g. write-ahead log and checkpointing mechanism) so that it would actually be possible for the 3 different libraries to share a transactional context and result in a single write-ahead log record per transaction. i.e. One log overall, instead of one log per RM.

Specifically: prepare() might have each RM create a buffer containing the data that it needs in order to recover that RM's changes for the transaction. The sum of this data for all RMs is written to the write-ahead log for the transaction. As a result, there is no need for a two-phase commit. Instead, during recovery processing, each RM is told to recover based on its particular portion of the log record contents. Checkpoints would likewise need to be coordinated in some fashion like this.

The reason I mention this is because I think that the write-ahead logging and checkpointing being done by STLdb and Boost.Persistent do sound comparable at this point, so I'm starting to think that looking at the problem from the vantage point of shared logging might prove optimal.

Is this alternative worth looking at? Vicente, I don't know enough about your library's design to tell whether this is feasible, and so must ask...

- Bob

strasser＠uni-bremen.de

26 Jan

6:36 p.m.

Quoting Bob Walters <bob.s.walters@gmail.com>:

...

...
so what the RMs need to do is split up the commit in 2 phases with a prepare message written to the log at the end of the prepare phase, and change the recovery process a bit.

Typically, this results in a lot of sync()s, but I'm not sure how to avoid that. e.g. prepare() has to sync because the resource manager is not allowed to forget the transaction due to machine death. commit() presumably must sync again. The TM does its own logging (or perhaps shares a resource manager) to keep track of transaction progress. Once a two-phase commit protocol comes into play, things slow down.

One result of this is that we should be sure that the TM is optimized for one-phase commits when only one RM is in play.

I agree. it's probably 5 syncs for a transaction across 2 RMs. the TM should also take into consideration if resources are persistent or not. Boost.STM for example is transactional memory only and does not maintain a log on disk, so when it is used together with 1 other persistent resource, the TM can perform a two-phase commit on the transient resource and a one-phase on the persistent resource.

    transient_resource->prepare();
    persistent_resource->commit(); //not prepared
    transient_resource->commit();

the same is probably the case when your library is used in a non-file region?

I've been thinking about introducing a RM category for that:

    typedef one_phase_tag category; //only supports one-phase commit.
or
    typedef persistent_tag category; //supports two-phase, persistent
or
    typedef transient_tag category; //supports two-phase, non-persistent

only when 2 or more persistent_tag RMs are used in a global transaction the TM has to prepare both and write to its log to have a unique commit point, otherwise the commit of the only persistent_tag RM is the commit point. (any ideas for a better name for "one_phase_tag"?)
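One way the category tags could drive the TM's choice of protocol at compile time is sketched below, assuming C++11 variadic templates (which were not available to the libraries in 2010). The trait `needs_tm_log` and the example resource types are hypothetical. It encodes only the rule stated above: the TM needs its own log when two or more `persistent_tag` RMs take part. How a `one_phase_tag` RM combines with others is not specified in the thread and is not modelled.

```cpp
#include <cassert>
#include <type_traits>

struct one_phase_tag {};  // only supports one-phase commit
struct persistent_tag {}; // supports two-phase, persistent
struct transient_tag {};  // supports two-phase, non-persistent

// Number of RMs in the list whose category is persistent_tag.
template<class... RMs> struct persistent_count;
template<> struct persistent_count<> { static const int value = 0; };
template<class RM, class... Rest>
struct persistent_count<RM, Rest...> {
    static const int value =
        (std::is_same<typename RM::category, persistent_tag>::value ? 1 : 0)
        + persistent_count<Rest...>::value;
};

// The TM needs its own log for a unique commit point only if at least two
// persistent two-phase RMs take part in the global transaction.
template<class... RMs>
struct needs_tm_log {
    static const bool value = persistent_count<RMs...>::value >= 2;
};

// Hypothetical resource managers for illustration.
struct stm_resource   { typedef transient_tag category; };
struct file_resource  { typedef persistent_tag category; };
struct rdbms_resource { typedef persistent_tag category; };

static_assert(!needs_tm_log<stm_resource, file_resource>::value,
              "one persistent RM: its commit is the commit point");
static_assert(needs_tm_log<file_resource, rdbms_resource>::value,
              "two persistent RMs: the TM log provides the commit point");
```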

...

I wonder if either of you have also considered the notion of sharing a transactional disk infrastructure (e.g. write-ahead log and checkpointing mechanism) so that it would actually be possible for the 3 different libraries to share a transactional context and result in a single write-ahead log record per transaction. i.e. One log overall, instead of one log per RM.

do you see a reason why that would need support from the TM? when we've talked about this off-list I always assumed to implement that on the RM level. so when RM 1 offers service A and RM 2 offers service B, and they should share a log, a RM 3 is implemented that maintains the shared log and offers services A and B, forwarding service calls to RM 1 and 2. so as far as the TM is concerned, there is only one RM, so it performs a one-phase commit.

...

Specifically: prepare() might have each RM create a buffer containing the data that it needs in order to recover that RM's changes for the transaction. The sum of this data for all RMs is written to the write-ahead log for the transaction. As a result, there is no need for a two-phase commit. Instead, during recovery processing, each RM is told to recover based on its particular portion of the log record contents. Checkpoints would likewise need to be coordinated in some fashion like this.

I don't use checkpoints, so I guess there would be no coordination needed, as long as your RM has access to the shared log?

...

The reason I mention this is because I think that the write-ahead logging and checkpointing being done by STLdb and Boost.Persistent do sound comparable at this point, so I'm starting to think that looking at the problem from the vantage point of shared logging might prove optimal.

Is this alternative worth looking at?

I think it's definitely worth looking at as an optimization, but imho not as an alternative to the distributed transaction approach, as you cannot make all resources share a log, e.g. a RDBMS resource.

Vicente Botet Escriba

7:08 p.m.

strasser wrote:

...

Quoting Bob Walters <bob.s.walters@gmail.com>:

I've been thinking about introducing a RM category for that:

typedef one_phase_tag category; //only supports one-phase commit.

or

typedef persistent_tag category; //supports two-phase, persistent

or

typedef transient_tag category; //supports two-phase, non-persistent

only when 2 or more persistent_tag RMs are used in a global transaction the TM has to prepare both and write to its log to have a unique commit point, otherwise the commit of the only persistent_tag RM is the commit point. (any ideas for a better name for "one_phase_tag"?)

non_recoverable?

Vicente

strasser＠uni-bremen.de

27 Jan

5:33 p.m.

Quoting Vicente Botet Escriba <vicente.botet@wanadoo.fr>:

...

strasser wrote:

...
Quoting Bob Walters <bob.s.walters@gmail.com>:

I've been thinking about introducing a RM category for that:

typedef one_phase_tag category; //only supports one-phase commit.

or

typedef persistent_tag category; //supports two-phase, persistent

or

typedef transient_tag category; //supports two-phase, non-persistent

only when 2 or more persistent_tag RMs are used in a global transaction the TM has to prepare both and write to its log to have a unique commit point, otherwise the commit of the only persistent_tag RM is the commit point. (any ideas for a better name for "one_phase_tag"?)

non_recoverable?

is there a reason why this can't be a Service of the resource manager? I don't have a clear definition of a Service so far, and neither does the XA standard linked above. services usually are domain-specific functions intended to be called by the user interface (e.g. a function to get an instance of a persistent object), not functions handling transactions themselves, but is there a reason why the optional interface for distributed transactions can't use the usual service interface? I can't think of one right now.

the concept would probably look something like this:

    concept DistributedTransactionService{
      typedef unspecified services; //MPL sequence containing distributed_transaction_service_tag
      typedef unspecified transaction;
      typedef unspecified transaction_id; //must be POD
      static bool const persistent=unspecified;
      bool prepare_transaction(transaction &);
      transaction_id get_transaction_id(transaction const &);
      template<class InputIterator> //must yield transaction_id
      void recover(InputIterator b,InputIterator e);
    };

see https://svn.boost.org/svn/boost/sandbox/persistent/libs/persistent/doc/html/... and https://svn.boost.org/svn/boost/sandbox/persistent/libs/persistent/doc/html/... for information about services.

the one_phase_tag RMs discussed above would be a RM that doesn't offer the DistributedTransactionService.

Bob Walters

7:54 p.m.

New subject: [transaction] New Boost.Transaction library under discussion

On Tue, Jan 26, 2010 at 1:36 PM, <strasser@uni-bremen.de> wrote:

...

Quoting Bob Walters <bob.s.walters@gmail.com>:

I agree. it's probably 5 syncs for a transaction across 2 RMs. the TM should also take into consideration if resources are persistent or not. Boost.STM for example is transactional memory only and does not maintain a log on disk, so when it is used together with 1 other persistent resource, the TM can perform a two-phase commit on the transient resource and a one-phase on the persistent resource.

    transient_resource->prepare();
    persistent_resource->commit(); //not prepared
    transient_resource->commit();

the same is probably the case when your library is used in a non-file region?

There could be variations which are non-persistent, yes.

...

do you see a reason why that would need support from the TM? when we've talked about this off-list I always assumed to implement that on the RM level. so when RM 1 offers service A and RM 2 offers service B, and they should share a log, a RM 3 is implemented that maintains the shared log and offers services A and B, forwarding service calls to RM 1 and 2.

Perfect. For some reason, I had assumed that different logs per RM were required as part of this design.

...

I don't use checkpoints, so I guess there would be no coordination needed, as long as your RM has access to the shared log?

Yes, I can deal with it in that way. You mention that you have no checkpoint, but I thought you would need occasional msync() of the backing store, in order to eliminate the need for some of the logs, and permit log archival. Regardless, I think the coordination around that could be the domain of the RM exclusively, and not part of the TM / RM interface.

...

I think it's definitely worth looking at as an optimization, but imho not as an alternative to the distributed transaction approach, as you cannot make all resources share a log, e.g. a RDBMS resource.

True. It would be great to ensure that when the different boost transaction-capable libraries are used together, the log can be shared. This is the first that I had heard that your scope included full distributed transaction support that could incorporate RDBMS systems.

strasser＠uni-bremen.de

10:15 p.m.

Quoting Bob Walters <bob.s.walters@gmail.com>:

...

Yes, I can deal with it in that way. You mention that you have no checkpoint, but I thought you would need occasional msync() of the backing store, in order to eliminate the need for some of the logs,

is that what you mean by checkpoints? I assumed you meant exporting the entire state from time to time, so checkpoint + log = current state. anyway, this is currently managed by the storage log itself, though this is not set in stone: in addition to a commit message there is a "success" message written to the log, i.e. "this transaction has reached disk". the log only removes old logs when there's been a success message for each transaction in it. so the RM can delay the success messages (and therefore the sync) as long as it wants, it is only prompted by storage_log::overflow()==true to post its success messages. so I guess it won't be too much coordination if we end up using this log.
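The "success" message scheme can be sketched as a rolling log made of segments, where an old segment may only be removed once every transaction committed in it has also been reported as having reached disk. The class below is a hypothetical in-memory model for illustration (segment size counted in commit records, success messages not stored as records); only the name `overflow()` is taken from the message above.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <set>

class storage_log {
    struct segment {
        std::size_t records;                 // commit messages in this segment
        std::set<unsigned> awaiting_success; // committed, not yet known to be on disk
        segment() : records(0) {}
    };

public:
    storage_log(std::size_t records_per_segment, std::size_t max_segments)
        : records_per_segment_(records_per_segment), max_segments_(max_segments) {
        assert(records_per_segment > 0);
        segments_.push_back(segment());
    }

    // Appends a commit message, rolling over to a new segment when full.
    void commit(unsigned tx) {
        if (segments_.back().records == records_per_segment_)
            segments_.push_back(segment());
        segments_.back().records += 1;
        segments_.back().awaiting_success.insert(tx);
    }

    // "this transaction has reached disk": old segments whose transactions
    // have all succeeded are no longer needed for recovery and are removed.
    void success(unsigned tx) {
        for (std::deque<segment>::iterator s = segments_.begin();
             s != segments_.end(); ++s)
            s->awaiting_success.erase(tx);
        while (segments_.size() > 1 && segments_.front().awaiting_success.empty())
            segments_.pop_front();
    }

    // Prompts the RM to sync and post its outstanding success messages.
    bool overflow() const { return segments_.size() > max_segments_; }
    std::size_t segment_count() const { return segments_.size(); }

private:
    std::size_t records_per_segment_;
    std::size_t max_segments_;
    std::deque<segment> segments_;
};
```

The RM is free to delay `success()` calls, and therefore its sync, until `overflow()` becomes true.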

...

and permit log archival. Regardless, I think the coordination around that could be the domain of the RM exclusively, and not part of the TM / RM interface.

...
I think it's definitely worth looking at as an optimization, but imho not as an alternative to the distributed transaction approach, as you cannot make all resources share a log, e.g. a RDBMS resource.

True. It would be great to ensure that when the different boost transaction-capable libraries are used together, the log can be shared.

do you also include the TM log in this? e.g. when there is a RDBMS, and a logging boost library, the boost RM maintains a log and the TM does for the distributed transactions with the RDBMS RM. so the RM and the TM could also use a shared log, but I tend to think this is not worth the effort, as this would span the interface between RM and TM.

Bob Walters

11:01 p.m.

New subject: [transaction] New Boost.Transaction library under discussion

On Wed, Jan 27, 2010 at 5:15 PM, <strasser@uni-bremen.de> wrote:

...

Quoting Bob Walters <bob.s.walters@gmail.com>:

...
Yes, I can deal with it in that way. You mention that you have no checkpoint, but I thought you would need occasional msync() of the backing store, in order to eliminate the need for some of the logs,

is that what you mean by checkpoints? I assumed you meant exporting the entire state from time to time, so checkpoint + log = current state.

Well, not the whole state, but rather just the changes since the last checkpoint. In effect, it is the equivalent of writing to the log, but doing a lazy msync() of the memory mapped region only once every N seconds so that if there is good spatial locality to the updates being done by the user, there is a chance of reducing the I/O load, and also of using more sequential I/O. My checkpoint is (unfortunately) more a matter of explicitly writing out the changes, rather than just msync(), but the concept is the same, and so fits with the algorithm discussed below.

...

anyway, this is currently managed by the storage log itself, though this is not set in stone: in addition to a commit message there is a "success" message written to the log, i.e. "this transaction has reached disk". the log only removes old logs when there's been a success message for each transaction in it. so the RM can delay the success messages (and therefore the sync) as long as it wants, it is only prompted by storage_log::overflow()==true to post its success messages. so I guess it won't be too much coordination if we end up using this log.

It sounds like it. I can always have a thread which does periodic checkpointing, then interacts with the log when prompted by overflow to indicate the transactions which have been written to disk.

...

...
True. It would be great to ensure that when the different boost transaction-capable libraries are used together, the log can be shared.

do you also include the TM log in this?

No. I'm assuming here that one/both of us eventually gets an RM created which combines our two RMs under a common log, as you had mentioned previously. As a result, the TM would recognize only one RM, and thus could avoid any need for a log of its own, and just do 1 phase commit calls (pass-through.) IIUC it also wouldn't need a log in the case of 1 persistent RM and 1 non-persistent RM. So that means all 3 libraries under discussion could be used together without the overhead of a distributed TM having its own log and sync points.

...

e.g. when there is a RDBMS, and a logging boost library, the boost RM maintains a log and the TM does for the distributed transactions with the RDBMS RM.

so the RM and the TM could also use a shared log, but I tend to think this is not worth the effort, as this would span the interface between RM and TM.

I think it's been done both ways. i.e. the TM has its own log resource, or the TM shares a log with one of the RMs it is managing (e.g. Oracle RDBMS). However, any sharing probably isn't worth much, because the TM would still need its own sync()s as it orchestrated the different RMs, even if it was sharing a log with one of them.

...


strasser＠uni-bremen.de

28 Jan

10:21 p.m.

I was on a long train ride today so I started putting together an initial code base for Transaction. and as always, the most obvious problems were overlooked: we have to find another name for the library/namespace, or for the transaction class:

    namespace transaction{
      class transaction;
    }

    using namespace boost;
    using namespace transaction;

    int main(){
      boost::transaction::transaction tx; //namespace spec required
    }

I don't think this is acceptable, and I'd like to keep the name "transaction" for the transaction class. I've seen someone use "transaction_scope" instead but I don't think that makes sense.

any ideas for a namespace?

also, would you like to make the usage of the transaction library by the individual libraries transparent to the user wrt namespaces?

e.g.

    namespace persistent{
      using namespace transaction;
    }
    namespace stldb{
      using namespace transaction;
    }

on the one hand it's easier for the user. a beginner using only one of the transactional libraries isn't bothered with only one simple class being part of another library, from his perspective.

downsides on the other hand:
- headers. you'd have boost/persistent/transaction.hpp, boost/stldb/transaction.hpp, ... and all they'd do is include boost/transaction/transaction.hpp and a "using" declaration.
- it might be confusing that is_same<persistent::transaction,stldb::transaction>::value==true

vicente.botet

29 Jan

12:08 a.m.

----- Original Message ----- From: <strasser@uni-bremen.de> To: <boost@lists.boost.org> Sent: Thursday, January 28, 2010 11:21 PM Subject: Re: [boost] [transaction] New Boost.Transaction library under discussion

...

I was on a long train ride today so I started putting together an initial code base for Transaction. and as always, the most obvious problems were overlooked: we have to find another name for the library/namespace, or for the transaction class:

namespace boost{
namespace transaction{ class transaction{}; }
}

using namespace boost;
using namespace transaction;

int main(){
  boost::transaction::transaction tx; //namespace spec required
}

I don't think this is acceptable, and I'd like to keep the name "transaction" for the transaction class. I've seen someone use "transaction_scope" instead but I don't think that makes sense.

any ideas for a namespace?

transactions?

...

also, would you like to make the usage of the transaction library by the individual libraries transparent to user wrt to namespaces?

e.g.

namespace persistent{ using namespace transaction; } namespace stldb{ using namespace transaction; }

on the one hand it's easier for the user. a beginner using only one of the transactional libraries isn't bothered with only one simple class being part of another library, from his perspective.

downsides on the other hand:
- headers. you'd have boost/persistent/transaction.hpp, boost/stldb/transaction.hpp, ... and all they'd do is include boost/transaction/transaction.hpp and a "using" declaration.
- it might be confusing that is_same<persistent::transaction,stldb::transaction>::value==true

Well, this does not concern Boost.Transaction. From my side, I would let stm::transaction refer to the single-phase transaction for STM. Some users may prefer to depend on only one library in particular contexts. Of course you can do the same for Boost.Persistent. Best, Vicente

strasser＠uni-bremen.de

5:38 a.m.

Zitat von "vicente.botet" <vicente.botet@wanadoo.fr>:

...

...
any ideas for a namespace?

transactions?

hmm, I assumed that was against some boost naming guideline, but I see there are a number of libraries that use the plural (for no apparent reason?). better. if we can come up with something to the effect of "transaction processing library" (but not "tpl") I'd prefer that but I guess "transactions" is ok.

...

Well, this does not concern Boost.Transaction. From my side, I would let stm::transaction refer to the single-phase transaction for STM.

ok, agreed. but just out of curiosity, what is your reasoning for doing that? using boost.transactions should not have any overhead when used with a single resource(no lazy transaction start either), so why do you want to duplicate the code obtaining the active transaction etc. in your library?

vicente.botet

7:11 a.m.

----- Original Message ----- From: <strasser@uni-bremen.de> To: <boost@lists.boost.org> Sent: Friday, January 29, 2010 6:38 AM Subject: Re: [boost] [transaction] New Boost.Transaction library under discussion

...

Zitat von "vicente.botet" <vicente.botet@wanadoo.fr>:

...
...
any ideas for a namespace?

transactions?

hmm, I assumed that was against some boost naming guideline, but I see there are a number of libraries that use the plural (for no apparent reason?).

better. if we can come up with something to the effect of "transaction processing library" (but not "tpl") I'd prefer that but I guess "transactions" is ok.

I'd also prefer a short name, but we need one that is significant enough to be adopted by the Boost community. Quite often transaction is abbreviated as "tx"; what about "txl"?

...

...
Well, this does not concern Boost.Transaction. From my side, I would let stm::transaction refer to the single-phase transaction for STM.

ok, agreed. but just out of curiosity, what is your reasoning for doing that? using boost.transactions should not have any overhead when used with a single resource(no lazy transaction start either), so why do you want to duplicate the code obtaining the active transaction etc. in your library?

OK, I see. In this case it could be reasonable to add the shortcut. This depends on whether the user will need other classes from Boost.Transaction. Vicente

Giovanni Piero Deretta

11 a.m.

On Fri, Jan 29, 2010 at 7:11 AM, vicente.botet <vicente.botet@wanadoo.fr> wrote:

...

...
Zitat von "vicente.botet" <vicente.botet@wanadoo.fr>:

...
...
any ideas for a namespace?

transactions?

hmm, I assumed that was against some boost naming guideline, but I see there are a number of libraries that use the plural (for no apparent reason?).

The guidelines discourage giving plural names, but (quoting from the guidelines): "The library's primary namespace (in parent ::boost) is given that same name, except when there's a component with that name (e.g., boost::tuple), in which case the namespace name is pluralized. For example, ::boost::filesystem."

...

...
better. if we can come up with something to the effect of "transaction processing library" (but not "tpl") I'd prefer that but I guess "transactions" is ok.

I'd also prefer a short name, but we need one that is significant enough to be adopted by the Boost community. Quite often transaction is abbreviated as "tx"; what about "txl"?

you can always use namespace aliases. FWIW, when I use boost::filesystem, this is the first thing I do:

namespace fs = boost::filesystem;

Makes for a nice short name which can still be limited to a translation unit. -- gpd
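Applied to the library under discussion, the alias approach might look like the following minimal sketch (assuming a C++11 compiler for the check). The namespace sketch stands in for boost so that the example compiles without Boost, and the alias name tx is only an illustration.

```cpp
#include <type_traits>

namespace sketch {          // stand-in for namespace boost
namespace transactions {    // the long, descriptive library namespace
class transaction {};       // placeholder for the transaction class
}
}

// A short alias, visible only in the translation unit that declares it.
namespace tx = sketch::transactions;

// The alias names the same namespace, so both spellings denote one type.
static_assert(std::is_same<tx::transaction, sketch::transactions::transaction>::value,
              "alias and full namespace name refer to the same class");
```

Because the alias is chosen by the user, the library itself can keep a long, distinctive namespace name without forcing users to type it everywhere.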

Stewart, Robert

1:17 p.m.

vicente.botet wrote:

...

From: <strasser@uni-bremen.de>

...
Zitat von "vicente.botet" <vicente.botet@wanadoo.fr>:

...
...
any ideas for a namespace?

transactions?

hmm, I assumed that was against some boost naming guideline, but I see there are a number of libraries that use the plural (for no apparent reason?).

As already noted, it isn't against the guidelines, but is discouraged when another, suitable alternative exists.

...

...
better. if we can come up with something to the effect of "transaction processing library" (but not "tpl") I'd prefer that but I guess "transactions" is ok.

I'm no fan of library names that include the word "library," even in abbreviated form. Besides, as noted elsewhere, short names can be produced via aliases. The full name should be distinct so as to avoid conflicts with future libraries. transact?

...

I'd also prefer a short name, but we need one that is significant enough to be adopted by the Boost community. Quite often transaction is abbreviated as "tx"; what about "txl"?

Again, no "l" on the end. I've never seen "tx" as the abbreviation, though I have seen "txn." Still, I wouldn't want either to be the namespace name. I think "transactions" is fine, but "transact" is shorter and seems appropriate. _____ Rob Stewart robert.stewart@sig.com Software Engineer, Core Software using std::disclaimer; Susquehanna International Group, LLP http://www.sig.com

strasser＠uni-bremen.de

3:18 p.m.

Zitat von "Stewart, Robert" <Robert.Stewart@sig.com>:

...

but "transact" is shorter and seems appropriate.

I also think Boost.Transact/namespace transact is the most appropriate. anyone object? (since I had to look it up myself, for the non-native speakers among us: it's not an abbreviation, but the corresponding verb of "transaction")
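A minimal sketch of why the rename helps, assuming a C++11 compiler: once the namespace is called transact, the unqualified class name no longer competes with a namespace of the same name. The namespace sketch stands in for boost, and unqualified_tx is a hypothetical typedef used only to show that the plain name resolves.

```cpp
#include <type_traits>

namespace sketch {        // stand-in for namespace boost
namespace transact {      // library namespace, now distinct from the class name
class transaction {};     // placeholder for the transaction scope class
}
}

using namespace sketch;
using namespace transact;

// With a namespace that was itself named "transaction", this unqualified use
// would find both the namespace and the class and be rejected as ambiguous.
typedef transaction unqualified_tx;

static_assert(std::is_same<unqualified_tx, sketch::transact::transaction>::value,
              "the unqualified name resolves to the class");
```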

Vicente Botet Escriba

4:13 p.m.

strasser wrote:

...

Zitat von "Stewart, Robert" <Robert.Stewart@sig.com>:

...
but "transact" is shorter and seems appropriate.

I also think Boost.Transact/namespace transact is the most appropriate. anyone object?

(since I had to look it up myself, for the non-native speakers among us: it's not an abbreviation, but the corresponding verb of "transaction")

OK with Boost.Transact. Vicente

Bob Walters

2 Feb

3 a.m.

On Fri, Jan 29, 2010 at 11:13 AM, Vicente Botet Escriba <vicente.botet@wanadoo.fr> wrote:

...

OK with Boost.Transact

Likewise OK with Boost.Transact.

Bob Walters

2:32 a.m.

On Fri, Jan 29, 2010 at 2:11 AM, vicente.botet <vicente.botet@wanadoo.fr> wrote:

...

I'd also prefer a short name, but we need one that is significant enough to be adopted by the Boost community. Quite often transaction is abbreviated as "tx"; what about "txl"?

I've seen 'txn' used as an abbreviation for transaction, and boost::txn might be a viable namespace.

Bob Walters

19 Jan

4:51 a.m.

On Mon, Jan 18, 2010 at 1:27 PM, Oliver Kowalke <k-oli@gmx.de> wrote:

...

I would suggest a three-phase model for transactions operating on a shared resource:

1.) prepare phase: multiple readers/one writer have access to the resource
2.) modification phase: only one writer has access and can modify the resource
3.) commit/rollback phase: the writer publishes its local modifications and releases the exclusive lock

I'm not sure what the granularity of 'shared resource' is in your statement. Currently, for a shared resource of "map", I do permit concurrent modifications of the map. However, what you describe is the model I'm using for modifications to individual entries within the map, with the exception that I'm using multi-version concurrency control, so that when one thread acquires exclusive write access (2) to an entry, read-only access by other threads continues without contention. The moment of commit (3) does include some exclusive access to the map while all local modifications are written back into the shared resource.
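The per-entry behaviour described above can be sketched as follows, assuming a C++11 compiler. This is not STLdb's implementation: versioned_entry, begin_write, commit and demo_reader_isolation are hypothetical names, the map-level locking at commit is left out, and the internal mutex only guards the brief pointer swap, so readers never wait for the duration of a write transaction.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical map entry under multi-version concurrency control.
class versioned_entry {
public:
    explicit versioned_entry(std::string initial)
        : committed_(std::make_shared<std::string>(std::move(initial))),
          write_locked_(false) {}

    // Readers get a snapshot of the last committed version.
    std::shared_ptr<const std::string> read() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return committed_;
    }

    // Phase 2: a single writer gains exclusive write access and stages a new version.
    bool begin_write(std::string pending) {
        std::lock_guard<std::mutex> guard(mutex_);
        if (write_locked_) return false;  // another writer already holds the entry
        write_locked_ = true;
        pending_ = std::make_shared<std::string>(std::move(pending));
        return true;
    }

    // Phase 3: publish the staged version and release the write lock.
    void commit() {
        std::lock_guard<std::mutex> guard(mutex_);
        assert(write_locked_ && "commit() requires a preceding begin_write()");
        committed_ = pending_;
        pending_.reset();
        write_locked_ = false;
    }

private:
    mutable std::mutex mutex_;
    std::shared_ptr<const std::string> committed_;
    std::shared_ptr<const std::string> pending_;
    bool write_locked_;
};

// Walks through the phases on one thread to show which version a reader sees.
inline bool demo_reader_isolation() {
    versioned_entry entry("v1");
    bool first = entry.begin_write("v2");    // writer acquires the entry
    bool second = entry.begin_write("v3");   // a second writer is refused
    std::string during = *entry.read();      // reader still sees the committed "v1"
    entry.commit();
    std::string after = *entry.read();       // reader now sees "v2"
    return first && !second && during == "v1" && after == "v2";
}
```

A reader that obtained its snapshot before the commit keeps a valid pointer to the old version, which is what lets read-only access continue while a writer holds the entry.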

Bob Walters

10:18 p.m.

On Mon, Jan 18, 2010 at 8:09 AM, vicente.botet <vicente.botet@wanadoo.fr> wrote:

...

I've created in the sandbox the transaction directory (http://svn.boost.org/svn/boost/sandbox/transaction/) so we can share our work. I've created also a wiki page (http://svn.boost.org/trac/boost/wiki/BoostTransaction) on which we can compile the result of our discussions, requirements for the library, design, ...

If you don't have write access to the Boost wiki pages, please request it through this ML.

I do need write access granted to the wiki.

vicente.botet

11:46 p.m.

----- Original Message ----- From: "Bob Walters" <bob.s.walters@gmail.com> To: <boost@lists.boost.org> Sent: Tuesday, January 19, 2010 11:18 PM Subject: Re: [boost] [transaction] New Boost.Transaction library under discussion

...

On Mon, Jan 18, 2010 at 8:09 AM, vicente.botet <vicente.botet@wanadoo.fr> wrote:

...
I've created in the sandbox the transaction directory (http://svn.boost.org/svn/boost/sandbox/transaction/) so we can share our work. I've created also a wiki page (http://svn.boost.org/trac/boost/wiki/BoostTransaction) on which we can compile the result of our discussions, requirements for the library, design, ...

If you don't have write access to the Boost wiki pages, please request it through this ML.

I do need write access granted to the wiki.

Bob, it would be better to request this in a separate post with a meaningful subject. Vicente
