Retry: merging all utf8 files

Seems like the discussion about how to merge the copies of utf8 code used by serialization and program_options has died. I'd like to revive it, given that I've received yet another patch for that code, and the longer we have two versions, the harder it will be to merge them later.
The basic question is whether to have a new compiled library in libs/detail, or just source files which will be included by serialization and program_options. The problem with a compiled library is that users will have to know that in addition to -lboost_program_options they'll have to add -lboost_utf8, which is an inconvenience. I've mentioned this in
http://article.gmane.org/gmane.comp.lib.boost.devel/106047/
The proposed course of action is documented in:
http://article.gmane.org/gmane.comp.lib.boost.devel/106122
Now:
1. Are there any objections?
2. Who does it? I can do it soon.
Robert, what do you think?
- Volodya

"Vladimir Prus" <ghost@cs.msu.su> wrote in message news:200412081123.32817.ghost@cs.msu.su...
The problem with a compiled library is that users will have to know that in addition to -lboost_program_options they'll have to add -lboost_utf8, which is an inconvenience. I've mentioned this in
http://article.gmane.org/gmane.comp.lib.boost.devel/106047/
The proposed course of action is documented in:
http://article.gmane.org/gmane.comp.lib.boost.devel/106122
1. put new header to boost/detail
2. put new source to libs/detail/utf
3. #include new source in program_options.
Objections? As I understood, Robert does not mind having this issue handed to me.
No objection at all.
Robert, would you want me to change serialization to include the file, too?
Just create the new place. I'll tweak my stuff to include it from the new spot and remove the utf8 stuff from my project.
1. Are there any objections?
I still have a couple of questions:
a) how do we ensure that if a program uses several libraries each of which in turn uses the common utf8 module, we don't get multiply defined symbols at
b) Please compare your version and my version to make sure that the new version is the union of any changes. There are really annoying issues related to the namespaces where things like mbstate_t are found on different platforms. My version has been tested back to bcb 5.51 and I believe includes accommodations for all these platforms. Likely your version includes some differences to address issues that you've come across. So I would like to see the two versions "melded". I don't think this is hard, as the differences, if any, should be very small.
c) This brings up the question of testing. I wrote a test which I would like to see incorporated in the normal testing routine. This has been indispensable in smoking out small incompatibilities between library versions. In many cases this is the only test that detects this stuff, as serialization and program options are the only libraries that really use wide character i/o.
d) I would like to see the manual page moved into the appropriate spot as well.
All this suggests it has all the features of a full-blown library - with only one module. I also think it's more than an implementation "detail". So I would like to see boost/libs/utility/utf8/... created and be sort of "official". There have been at least three recent proposals to make a more complete utf8 implementation. If any of these were to be completed and accepted into boost they could replace what's already in here. Making a libboost_utility would fix the issue of multiple definitions of the same module. It would also mean that utf8 isn't some weird special case that has different rules as to its usage compared to other boost libraries.
In effect, as a practical matter this module has been accepted into boost as part of two libraries. Ample opportunity has been available to object to its inclusion during the course of two reviews. It should be considered accepted until something better comes along. When that happens, it can be replaced with the "next great thing".
2. Who does it? I can do it soon.
I'm fine with this. Create the new spot. I'll switch over to it after any dust settles.
Robert, what do you think?
Robert Ramey

Robert Ramey wrote:
Robert, would you want me to change serialization to include the file, too?
Just create the new place. I'll tweak my stuff to include it from the new spot and remove the utf8 stuff from my project.
1. Are there any objections?
I still have a couple of questions:
a) how do we ensure that if a program uses several libraries each of which in turn uses the common utf8 module, we don't get multiply defined symbols at
The "common" files will be included into a wrapper file (one for each library), with different namespaces:
namespace boost { namespace program_options {
#include "../../libs/detail/utf8/utf8_codecvt.cpp"
b) Please compare your version and my version to make sure that the new version is the union of any changes. There are really annoying issues related to the namespaces where things like mbstate_t are found on different platforms. My version has been tested back to bcb 5.51 and I believe includes accommodations for all these platforms. Likely your version includes some differences to address issues that you've come across. So I would like to see the two versions "melded". I don't think this is hard, as the differences, if any, should be very small.
Sure, I'll do my best.
c) This brings up the question of testing. I wrote a test which I would like to see incorporated in the normal testing routine. This has been indispensable in smoking out small incompatibilities between library versions. In many cases this is the only test that detects this stuff, as serialization and program options are the only libraries that really use wide character i/o.
Where is the test? You mean utf8_codecvt_test, which is part of serialization? Sure, it should be included. The more tests, the better.
d) I would like to see the manual page moved into the appropriate spot as well.
All this suggests it has all the features of a full-blown library - with only one module. I also think it's more than an implementation "detail". So I would like to see boost/libs/utility/utf8/... created and be sort of "official". There have been at least three recent proposals to make a more complete utf8 implementation. If any of these were to be completed and accepted into boost they could replace what's already in here. Making a libboost_utility would fix the issue of multiple definitions of the same module. It would also mean that utf8 isn't some weird special case that has different rules as to its usage compared to other boost libraries.
I think I'm not at liberty to bypass the regular rules for library acceptance. I'd rather add it to detail. Moving it to a different location is always possible later.
In effect, as a practical matter this module has been accepted into boost as part of two libraries. Ample opportunity has been available to object to its inclusion during the course of two reviews. It should be considered accepted until something better comes along. When that happens, it can be replaced with the "next great thing".
Given that those parts are implementation details, I think nobody cares about them much.
2. Who does it? I can do it soon.
I'm fine with this. Create the new spot. I'll switch over to it after any dust settles.
Will do this on Tuesday. (Monday is a holiday here). - Volodya
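The wrapper-file answer to question (a) can be sketched in a single file. This is only an illustration under stated assumptions: the macro SKETCH_SHARED_UTF8_BODY stands in for the text of the shared source file (the thread proposes a real #include of that file), and the sketch namespaces and the bump() function are hypothetical names, not part of either library. The point it shows is that each library ends up with its own copy of the code under a different qualified name, so the two copies do not collide.

```cpp
// Stand-in for the contents of the shared source file. A macro is used
// only so that the example fits in one file; it is not what the thread
// proposes, which is a real #include of the shared .cpp.
#define SKETCH_SHARED_UTF8_BODY                                      \
    /* Every expansion gets its own function and its own counter. */ \
    inline int bump() { static int calls = 0; return ++calls; }

// Hypothetical wrapper for the first library.
namespace sketch { namespace program_options {
SKETCH_SHARED_UTF8_BODY
} } // program_options, sketch

// Hypothetical wrapper for the second library.
namespace sketch { namespace serialization {
SKETCH_SHARED_UTF8_BODY
} } // serialization, sketch
```

Calling sketch::program_options::bump() twice and then sketch::serialization::bump() once yields 1, 2 and then 1: the two copies keep separate state because they are separate entities with different qualified names.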

Vladimir Prus wrote:
Robert Ramey wrote:
a) how do we ensure that if a program uses several libraries each of which in turn uses the common utf8 module, we don't get multiply defined symbols at
The "common" files will be included into a wrapper file (one for each library), with different namespaces:
namespace boost { namespace program_options {
#include "../../libs/detail/utf8/utf8_codecvt.cpp"
Will this work without a problem? The first lines of utf8_codecvt_facet.cpp (my copy) contain:
#include <boost/config.hpp>
#ifdef BOOST_NO_STD_WSTREAMBUF
#error "wide char i/o not supported on this platform"
#else
#include <cassert>
#include <cstddef>
#include <boost/utf8_codecvt_facet.hpp>
So the above will expand to something like:
namespace boost { namespace program_options {
// #include "../../libs/detail/utf8/utf8_codecvt.cpp"
#include <boost/config.hpp>
#ifdef BOOST_NO_STD_WSTREAMBUF
#error "wide char i/o not supported on this platform"
#else
#include <cassert>
#include <cstddef>
#include <boost/utf8_codecvt_facet.hpp>
// } // program_options
// } // boost
which would put all the lower-level includes in the program_options namespace as well. I don't see how that can work.
Robert Ramey

On Friday 10 December 2004 20:11, Robert Ramey wrote:
The "common" files will be included into a wrapper file (one for each library), with different namespaces:
namespace boost { namespace program_options {
#include "../../libs/detail/utf8/utf8_codecvt.cpp"
Will this work without a problem? The first lines of utf8_codecvt_facet.cpp (my copy) contain:
...
So the above will expand to something like:
namespace boost { namespace program_options {
// #include "../../libs/detail/utf8/utf8_codecvt.cpp"
#include <boost/config.hpp>
#ifdef BOOST_NO_STD_WSTREAMBUF
#error "wide char i/o not supported on this platform"
#else
#include <cassert>
#include <cstddef>
#include <boost/utf8_codecvt_facet.hpp>
// } // program_options
// } // boost
which would put all the lower-level includes in the program_options namespace as well. I don't see how that can work.
I'd use:
#include <boost/config.hpp>
#ifdef BOOST_NO_STD_WSTREAMBUF
#error "wide char i/o not supported on this platform"
#else
#include <cassert>
#include <cstddef>
#include <boost/utf8_codecvt_facet.hpp>
namespace boost { namespace program_options {
#include "../../libs/detail/utf8/utf8_codecvt.cpp"
} // program_options
} // boost
There's only a single include inside the namespace declaration. Also, #include <boost/utf8_codecvt_facet.hpp> will have to be changed too.
- Volodya
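A self-contained sketch of that reordered layout follows. It rests on assumptions: the sketch namespace and utf8_sequence_length() are hypothetical stand-ins for whatever the shared file really defines, and the function body is written inline where the real wrapper would have its single #include. What it illustrates is the ordering: the standard headers are pulled in at global scope first, so only library-specific code sits between the namespace braces and names such as std::size_t still refer to the global std.

```cpp
// Lower-level headers come first, at global scope (this mirrors the
// header list quoted in the thread; <cassert> is not used by the body).
#include <cassert>
#include <cstddef>

namespace sketch { namespace program_options {

// In the proposed layout this spot would hold the one #include of the
// shared source file. A simplified stand-in body is written inline so
// the example compiles on its own.
//
// Returns the length in octets of a UTF-8 sequence given its lead
// octet, or 0 for a continuation octet or an unrecognised pattern.
// It classifies by bit pattern only and does not reject overlong or
// out-of-range lead octets.
inline std::size_t utf8_sequence_length(unsigned char lead)
{
    if ((lead & 0x80) == 0x00) return 1; // 0xxxxxxx
    if ((lead & 0xE0) == 0xC0) return 2; // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3; // 1110xxxx
    if ((lead & 0xF8) == 0xF0) return 4; // 11110xxx
    return 0;
}

} } // program_options, sketch
```

A second wrapper for serialization would repeat the same pattern with its own namespace, leaving the header list at the top unchanged.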

On 12/8/04 3:23 AM, "Vladimir Prus" <ghost@cs.msu.su> wrote: [SNIP]
The proposed course of action is documented in:
http://article.gmane.org/gmane.comp.lib.boost.devel/106122
Now:
1. Are there any objections?
Yes. I'm looking at:
//====================================================
Yeah, I think that's possible. So I'm going to:
1. put new header to boost/detail
2. put new source to libs/detail/utf
3. #include new source in program_options.
//====================================================
I don't think any #include to the "libs" directory is a good idea. It works only if an expanded Boost archive stays as-is. If the sub-directories are scattered, e.g. to meet Unix header placements, then the idea fails. I think some existing code tries to #include "libs"; that code should be changed. This could be a further argument to finally move mandatory source files to a distinct root-level directory.
2. Who does it? I can do it soon. Robert, what do you think?
-- Daryle Walker Mac, Internet, and Video Game Junkie darylew AT hotmail DOT com

On Tuesday 21 December 2004 03:33, Daryle Walker wrote:
The proposed course of action is documented in:
http://article.gmane.org/gmane.comp.lib.boost.devel/106122
Now:
1. Are there any objections?
Yes. I'm looking at:
//====================================================
Yeah, I think that's possible. So I'm going to:
1. put new header to boost/detail
2. put new source to libs/detail/utf
3. #include new source in program_options.
//====================================================
I don't think any #include to the "libs" directory is a good idea. It works only if an expanded Boost archive stays as-is. If the sub-directories are scattered, e.g. to meet Unix header placements, then the idea fails. I think some existing code tries to #include "libs"; that code should be changed. This could be a further argument to finally move mandatory source files to a distinct root-level directory.
That include of a file in "libs" will be made from a .cpp file. After installation, that line will already be compiled to .obj and included in the .so, and the include will not be visible to the user. Do you still have the objection?
- Volodya

On 12/21/04 2:33 AM, "Vladimir Prus" <ghost@cs.msu.su> wrote:
On Tuesday 21 December 2004 03:33, Daryle Walker wrote:
The proposed course of action is documented in:
http://article.gmane.org/gmane.comp.lib.boost.devel/106122
Now:
1. Are there any objections?
Yes. I'm looking at:
//====================================================
Yeah, I think that's possible. So I'm going to:
1. put new header to boost/detail
2. put new source to libs/detail/utf
3. #include new source in program_options.
//====================================================
I don't think any #include to the "libs" directory is a good idea. It works only if an expanded Boost archive stays as-is. If the sub-directories are scattered, e.g. to meet Unix header placements, then the idea fails. I think some existing code tries to #include "libs"; that code should be changed. This could be a further argument to finally move mandatory source files to a distinct root-level directory.
That include of a file in "libs" will be made from a .cpp file. After installation, that line will already be compiled to .obj and included in the .so, and the include will not be visible to the user.
So the #includes would be one file in "libs" including a sibling file in "libs," right?
Do you still have the objection?
Depends if compilers can easily be made to look for a source file's siblings. (Since source files are listed in projects/makefiles directly, their containing directories don't have to be in any search path.) -- Daryle Walker Mac, Internet, and Video Game Junkie darylew AT hotmail DOT com

Daryle Walker wrote:
That include of a file in "libs" will be made from a .cpp file. After installation, that line will already be compiled to .obj and included in the .so, and the include will not be visible to the user.
So the #includes would be one file in "libs" including a sibling file in "libs," right?
Right.
Do you still have the objection?
Depends if compilers can easily be made to look for a source file's siblings. (Since source files are listed in projects/makefiles directly, their containing directories don't have to be in any search path.)
There will be an include with explicit path ("../../detail/utf8/utf8_codecvt.cpp"). If there's any compiler which does not consider "." for quoted includes, it's always possible to add "." to the search path. - Volodya
participants (3): Daryle Walker, Robert Ramey, Vladimir Prus