Regression Tests, Boost.Serialization

Hi,
Houston, we have a problem.
The 209 recently added Boost.Serialization tests apparently result in ~20MB binaries each for the gcc-3.x compilers. For me this means the tests will need ~20GB more disk space. All the other tests, for all the compilers I'm testing, together take ~10GB.
This is not only slightly unbalanced, it also came as a surprise to me when I suddenly saw the unexpected disk-full errors. Why does this kind of thing always happen so shortly before a release?
Please, everyone, if you do something that will have a serious impact on the tests being run, please notify the testers in advance. (I still think we should have a separate list for regression test related stuff, btw.)
I had a few GB of disk space left a few days ago. Now, I'm unable to run all the tests. I could drop one of the gcc-3.4 versions. However, I'd still be short by ~10GB after that.
Is there a way of reducing the number of tests in Boost.Serialization? Would it be viable to strip the test binaries (thereby losing debug information)? Any other suggestions? Compilers to drop?
Regards,
m
PS: Yes, I'm already trying to clean my disks ;-)

Martin Wille <mw8329@yahoo.com.au> writes:
Hi,
Houston, we have a problem.
The 209 recently added Boost.Serialization tests apparently result in ~20MB binaries each for the gcc-3.x compilers. For me this means the tests will need ~20GB more disk space. All the other tests, for all the compilers I'm testing, together take ~10GB.
This is not only slightly unbalanced, it also came as a surprise to me when I suddenly saw the unexpected disk-full errors. Why does this kind of thing always happen so shortly before a release?
Please, everyone, if you do something that will have serious impact on the tests being run, please, notify the testers in advance. (I still think we should have a separate list for regression test related stuff, btw.)
I don't recall ever seeing the suggestion before. I'd be happy to set it up. -- Dave Abrahams Boost Consulting http://www.boost-consulting.com

David Abrahams <dave@boost-consulting.com> writes:
Martin Wille <mw8329@yahoo.com.au> writes:
Hi,
[...]
(I still think we should have a separate list for regression test related stuff, btw.)
Yes, we think it is a good idea too. That would help us and all the regression testers to communicate more effectively.
I don't recall ever seeing the suggestion before. I'd be happy to set it up.
Please. Could you also set it up to be mirrored on news.gmane.org? -- Misha Bergal MetaCommunications Engineering

Misha Bergal <mbergal@meta-comm.com> writes:
David Abrahams <dave@boost-consulting.com> writes:
Martin Wille <mw8329@yahoo.com.au> writes:
Hi,
[...]
(I still think we should have a separate list for regression test related stuff, btw.)
Yes, we think it is a good idea too. That would help us and all the regression testers to communicate more effectively.
I don't recall ever seeing the suggestion before. I'd be happy to set it up.
It's been set up for a while, but I don't remember the admin password so I'm not sure if you can post. I'm getting that from the OSL guys.
Please. Could you also set it up to be mirrored on news.gmane.org?
Please submit the request to GMane yourself; instructions are on the site. -- Dave Abrahams Boost Consulting http://www.boost-consulting.com

On Thu, 15 Jul 2004 22:45:11 +0200, Martin Wille wrote
Hi,
Houston, we have a problem.
The 209 recently added Boost.Serialization tests apparently result in ~20MB binaries each for the gcc-3.x compilers. For me this means the tests will need ~20GB more disk space. All the other tests, for all the compilers I'm testing, together take ~10GB.
This is not only slightly unbalanced, it also came as a surprise to me when I suddenly saw the unexpected disk-full errors. Why does this kind of thing always happen so shortly before a release?
Please, everyone, if you do something that will have serious impact on the tests being run, please, notify the testers in advance. (I still think we should have a separate list for regression test related stuff, btw.)
Martin -
I think there has been some prior notification on the list that serialization was going to have an impact. I don't think we anticipated disk space as an issue, but we did see that serialization was going to affect regression testers. The first time Robert and I raised the issue, no one came to the discussion:
http://lists.boost.org/MailArchives/boost/msg64471.php
The second time there was an extended discussion under the title "[Boost.Test] New testing procedure"; I'll let you search for the beginning if you want. The bottom line was that there was an acknowledgement of enhancements that would help, but no one has really had time to do anything.
I had a few GB of disk space left a few days ago. Now, I'm unable to run all the tests. I could drop one of the gcc-3.4 versions. However, I'd still be short by ~10GB after that.
It would be nice if we could drop serialization on compilers that just aren't going to work.
Is there a way of reducing the number of tests in Boost.Serialization? Would it be viable to strip the test binaries (thereby losing debug information)? Any other suggestions? Compilers to drop?
This is where we were hoping to be able to allow regression levels like 'basic' and 'torture' that would provide standard ways of controlling the number of tests. But none of this is currently available.
As for what we should do now, it would be nice to have an answer to the question I posed this morning: is serialization going into 1.32, or are we cutting it out of the final release?
http://lists.boost.org/MailArchives/boost/msg67299.php
If we are cutting it out, then we can just remove it from the status Jamfile now....
Jeff

Jeff Garland wrote:
On Thu, 15 Jul 2004 22:45:11 +0200, Martin Wille wrote
Hi,
Houston, we have a problem.
The 209 recently added Boost.Serialization tests apparently result in ~20MB binaries each for the gcc-3.x compilers. For me this means the tests will need ~20GB more disk space. All the other tests, for all the compilers I'm testing, together take ~10GB.
This is not only slightly unbalanced, it also came as a surprise to me when I suddenly saw the unexpected disk-full errors. Why does this kind of thing always happen so shortly before a release?
Please, everyone, if you do something that will have serious impact on the tests being run, please, notify the testers in advance. (I still think we should have a separate list for regression test related stuff, btw.)
Martin -
I think there has been some prior notification on the list that serialization was going to have an impact. I don't think we anticipated disk space as an issue, but we did see that serialization was going to affect regression testers. The first time Robert and I raised the issue, no one came to the discussion.
Yes, I remember that message and I appreciate the suggestions made. I didn't consider time as an issue, either. Apparently, nobody gave disk space a thought at that time.
The second time there was an extended discussion under the title:
[Boost.Test] New testing procedure
I'll let you search for the beginning if you want. Bottom line was there was an acknowledgement of enhancements that would help, but no one really has had time to do anything.
Right. I'd appreciate a warning message when someone actually is going to commit changes to the CVS which have a serious impact on the tests, like adding a collection of tests, removing tests, renaming tests, moving tests to a different location, or changes to the build system (the latter actions require the old test results to be deleted to avoid stale results).
I had a few GB of disk space left a few days ago. Now, I'm unable to run all the tests. I could drop one of the gcc-3.4 versions. However, I'd still be short by ~10GB after that.
It would be nice if we could drop serialization on compilers that just aren't going to work.
Right. I once suggested that this should be implemented for all libraries. It's nonsense to run the tests for libraries which are marked as non-working for certain compilers. This should be a feature of the build system. We don't have that feature, yet. I'm not aware of anyone working on it. Meanwhile we could #ifdef the tests. Spirit does that, btw, by #including a config.hpp file which produces an #error for unsupported compilers. I think this would be helpful for a user, too.
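For illustration, a minimal sketch of the kind of guard header described above (the file name, macro checks, and version cut-offs are made up for this example; this is not Spirit's actual config.hpp):

    // unsupported_compiler_check.hpp - #include at the top of each test source
    #if defined(__BORLANDC__) && __BORLANDC__ <= 0x551
    #   error "This library is not supported with Borland C++ 5.5.1 or earlier"
    #endif
    #if defined(_MSC_VER) && _MSC_VER <= 1200
    #   error "This library is not supported with MSVC 6.0 or earlier"
    #endif

A test guarded this way still shows up as a compile failure on the unsupported toolset, but it fails immediately with a readable message instead of pages of template errors; it does not save the build time or disk space that a build-system-level skip would.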
Is there a way of reducing the number of tests in Boost.Serialization? Would it be viable to strip the test binaries (thereby losing debug information)? Any other suggestions? Compilers to drop?
This is where we were hoping to be able to allow regression levels like 'basic' and 'torture' that would provide standard ways of controlling the number of tests. But none of this is currently available.
Well, this would help with respect to time issues. I don't see how it would help with respect to space: at release time, all the tests have to be run.
As for what we should do now, it would be nice to have an answer to the question I posed this morning: is serialization going into 1.32, or are we cutting it out of the final release?
http://lists.boost.org/MailArchives/boost/msg67299.php
If we are cutting it out, then we can just remove it from the status Jamfile now....
That should be decided by the release manager and the authors. Regards, m

Martin Wille <mw8329@yahoo.com.au> writes:
It would be nice if we could drop serialization on compilers that just aren't going to work.
Right. I once suggested that this should be implemented for all libraries. It's nonsense to run the tests for libraries which are marked as non-working for certain compilers. This should be a feature of the build system. We don't have that feature, yet. I'm not aware of anyone working on it.
I'd be willing to try, but I think we might get more bang-for-the-buck if I set things up so that failed tests don't run until they're outdated. That part can be done entirely within Boost.Build rather than trying to figure out how to combine some XML markup with it... unless something in the Jamfile that causes the test to be skipped for certain toolsets is enough for you. Thoughts? -- Dave Abrahams Boost Consulting http://www.boost-consulting.com

David Abrahams writes:
Martin Wille <mw8329@yahoo.com.au> writes:
It would be nice if we could drop serialization on compilers that just aren't going to work.
Right. I once suggested that this should be implemented for all libraries. It's nonsense to run the tests for libraries which are marked as non-working for certain compilers. This should be a feature of the build system. We don't have that feature, yet. I'm not aware of anyone working on it.
I'd be willing to try, but I think we might get more bang-for-the-buck if I set things up so that failed tests don't run until they're outdated.
That won't help with clean runs, though, and it would be really wonderful to have them speed up a little.
That part can be done entirely within Boost.Build rather than trying to figure out how to combine some XML markup with it...
I think duplicating markup in Jamfiles or, preferably, near them (in some form) won't be too bad. E.g., in the library "test" directory we could have a simple "unusable-tools.jam" which could go like this:
    unusable-tools = borland-5.5.1 msvc msvc-stlport ;
If we can do something like that, it's not even necessarily a duplication -- for the XSL reports, we can always walk through the library directories, collect the markup and transform it into an XML for later processing. In fact, we already do something like this anyway.
unless something in the Jamfile that causes the test to be skipped for certain toolsets is enough for you.
It's the opposite -- a toolset marked as unusable should be skipped. -- Aleksey Gurtovoy MetaCommunications Engineering

Aleksey Gurtovoy <agurtovoy@meta-comm.com> writes:
David Abrahams writes:
Martin Wille <mw8329@yahoo.com.au> writes:
It would be nice if we could drop serialization on compilers that just aren't going to work.
Right. I once suggested that this should be implemented for all libraries. It's nonsense to run the tests for libraries which are marked as non-working for certain compilers. This should be a feature of the build system. We don't have that feature, yet. I'm not aware of anyone working on it.
I'd be willing to try, but I think we might get more bang-for-the-buck if I set things up so that failed tests don't run until they're outdated.
That won't help with clean runs, though, and it would be really wonderful to have them speed up a little.
OK.
That part can be done entirely within Boost.Build rather than trying to figure out how to combine some XML markup with it...
I think duplicating markup in Jamfiles or, preferably, near them (in some form) won't be too bad. E.g., in the library "test" directory we could have a simple "unusable-tools.jam" which could go like this:
unusable-tools = borland-5.5.1 msvc msvc-stlport ;
If we can do something like that, it's not even necessarily a duplication -- for the XSL reports, we can always walk through the library directories, collect the markup and transform it into an XML for later processing. In fact, we already do something like this anyway.
unless something in the Jamfile that causes the test to be skipped for certain toolsets is enough for you.
It's the opposite -- a toolset marked as unusable should be skipped.
What does it mean for the toolset to be marked, and how does it differ from what I suggested?
I've just checked in changes as follows:
To disable building any target, add
    <build>no
to its requirements. Obviously, you'd want to do that with a qualified requirement, something like
    <msvc><*><build>no
You can also use rule names in requirement lists, for example:
    rule disable-intel ( toolset variant : requirements )
    {
        if [ MATCH ([Ii]ntel) : $(toolset) ]
        {
            requirements += <build>no ;
        }
        return requirements ;
    }
    exe foo : foo.cpp : disable-intel <define>FOO ;
    run my-test.cpp : : : disable-intel ;
That sort of matching is likely to be more useful for our tests.
If you supply --dump-unbuilt on the command-line, Boost.Build will write a line like
    **** skipping build of $(target); toolset= $(toolset) variant= $(variant) ****
for each explicitly-skipped target.
HTH,
-- Dave Abrahams Boost Consulting http://www.boost-consulting.com

David Abrahams writes:
I'd be willing to try, but I think we might get more bang-for-the-buck if I set things up so that failed tests don't run until they're outdated.
That won't help with clean runs, though, and it would be really wonderful to have them speed up a little.
OK.
That part can be done entirely within Boost.Build rather than trying to figure out how to combine some XML markup with it...
I think duplicating markup in Jamfiles or, preferably, near them (in some form) won't be too bad. E.g., in the library "test" directory we could have a simple "unusable-tools.jam" which could go like this:
unusable-tools = borland-5.5.1 msvc msvc-stlport ;
If we can do something like that, it's not even necessarily a duplication -- for the XSL reports, we can always walk through the library directories, collect the markup and transform it into an XML for later processing. In fact, we already do something like this anyway.
unless something in the Jamfile that causes the test to be skipped for certain toolsets is enough for you.
It's the opposite -- a toolset marked as unusable should be skipped.
What does it mean for the toolset to be marked, and how does it differ from what I suggested?
I've just checked in changes as follows:
To disable building any target, add
<build>no
to its requirements. Obviously, you'd want to do that with a qualified requirement, something like
<msvc><*><build>no
You can also use rule names in requirement lists, for example:
rule disable-intel ( toolset variant : requirements )
{
    if [ MATCH ([Ii]ntel) : $(toolset) ]
    {
        requirements += <build>no ;
    }
    return requirements ;
}
exe foo : foo.cpp : disable-intel <define>FOO ;
run my-test.cpp : : : disable-intel ;
That sort of matching is likely to be more useful for our tests.
So, if I, let's say, want to skip the algorithm/minmax test suite:
    subproject libs/algorithm/minmax/test ;
    # bring in rules for testing
    import testing ;
    # Make tests run by default.
    DEPENDS all : test ;
    {
        test-suite algorithm/minmax
            :
            [ run minmax_element_test.cpp : : : : minmax_element ]
            [ run minmax_test.cpp : : : : minmax ]
            ;
    }
for "msvc", what do I put where?
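One possible answer, as a sketch only: assuming the empty fourth field of the run rule above is its requirements list (the target name sits in the fifth), the qualified requirement Dave showed could be placed there for each test:

    test-suite algorithm/minmax
        :
        [ run minmax_element_test.cpp : : : <msvc><*><build>no : minmax_element ]
        [ run minmax_test.cpp : : : <msvc><*><build>no : minmax ]
        ;

A rule in the style of disable-intel above, matching "msvc" instead, would keep the per-test requirement lists shorter; as Dave notes further down, there is as yet no shorthand for injecting a requirement into a whole suite at once.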
If you supply --dump-unbuilt on the command-line, Boost.Build will write a line like
    **** skipping build of $(target); toolset= $(toolset) variant= $(variant) ****
for each explicitly-skipped target.
Super, and thank you! -- Aleksey Gurtovoy MetaCommunications Engineering

David Abrahams writes:
Aleksey Gurtovoy <agurtovoy@meta-comm.com> writes:
David Abrahams writes:
Martin Wille <mw8329@yahoo.com.au> writes:
It would be nice if we could drop serialization on compilers that just aren't going to work.
Right. I once suggested that this should be implemented for all libraries. It's nonsense to run the tests for libraries which are marked as non-working for certain compilers. This should be a feature of the build system. We don't have that feature, yet. I'm not aware of anyone working on it.
I'd be willing to try, but I think we might get more bang-for-the-buck if I set things up so that failed tests don't run until they're outdated.
That won't help with clean runs, though, and it would be really wonderful to have them speed up a little.
OK.
That part can be done entirely within Boost.Build rather than trying to figure out how to combine some XML markup with it...
I think duplicating markup in Jamfiles or, preferably, near them (in some form) won't be too bad. E.g., in the library "test" directory we could have a simple "unusable-tools.jam" which could go like this:
unusable-tools = borland-5.5.1 msvc msvc-stlport ;
If we can do something like that, it's not even necessarily a duplication -- for the XSL reports, we can always walk through the library directories, collect the markup and transform it into an XML for later processing. In fact, we already do something like this anyway.
unless something in the Jamfile that causes the test to be skipped for certain toolsets is enough for you.
It's the opposite -- a toolset marked as unusable should be skipped.
What does it mean for the toolset to be marked,
Sorry, imprecise wording. Of course it's the library that is marked as unusable with a particular toolset.
and how does it differ from what I suggested?
You seemed to imply that it's skipping of individual tests for particular toolsets that is of interest to us. I was trying to make the point that the most important use case is to mark the whole library's *test suite* as unusable. Maybe from an implementation standpoint there is no difference -- but since I have no idea, I thought I'd point it out. -- Aleksey Gurtovoy MetaCommunications Engineering

Aleksey Gurtovoy <agurtovoy@meta-comm.com> writes:
What does it mean for the toolset to be marked,
Sorry, imprecise wording. Of course it's the library that is marked as unusable with a particular toolset.
and how does it differ from what I suggested?
You seemed to imply that it's skipping of individual tests for particular toolsets that is of interest to us. I was trying to make the point that the most important use case is to mark the whole library's *test suite* as unusable. Maybe from an implementation standpoint there is no difference -- but since I have no idea, I thought I'd point it out.
You have to use a template or a variable to get <build>no into the requirements of each element of the suite. We don't have any shorthand way to inject requirements into a suite as a whole. It might be possible to write something, but I'm going on vacation this afternoon, so I guess someone else will have to do it. Cheers, -- Dave Abrahams Boost Consulting http://www.boost-consulting.com

Martin Wille <mw8329@yahoo.com.au> writes:
Martin Wille wrote:
Any other suggestions?
Would it be viable to modify the build system such that test binaries which ran successfully get stripped after they ran?
It might be possible...
Having the then unneeded debug symbols removed would help a lot.
... but it might be painful. What are the required commands to strip an executable? -- Dave Abrahams Boost Consulting http://www.boost-consulting.com

On Fri, Jul 16, 2004 at 07:41:23AM -0400, David Abrahams wrote:
Martin Wille <mw8329@yahoo.com.au> writes:
Martin Wille wrote:
Any other suggestions?
Would it be viable to modify the build system such that test binaries which ran successfully get stripped after they ran?
It might be possible...
Having the then unneeded debug symbols removed would help a lot.
... but it might be painful.
What are the required commands to strip an executable?
With the GNU binutils, the simplest command is:
    strip <filename>
There is a whole bunch of command-line switches that give you more control over what gets stripped from the file; see strip(1). But I can't remember ever using this command myself, so I can't say which switches are more helpful than others.
Regards, Christoph -- http://www.informatik.tu-darmstadt.de/TI/Mitarbeiter/cludwig.html LiDIA: http://www.informatik.tu-darmstadt.de/TI/LiDIA/Welcome.html
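For reference, a few of the strip(1) switches alluded to above (the file names are placeholders; a plain "strip file" behaves like --strip-all):

    # remove only the debugging sections, keep the symbol table
    strip --strip-debug bin/some_test
    # remove all symbol and debugging information
    strip --strip-all bin/some_test
    # remove symbols not needed for relocation processing (safer for shared objects)
    strip --strip-unneeded bin/libsome_lib.so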

Christoph Ludwig wrote:
On Fri, Jul 16, 2004 at 07:41:23AM -0400, David Abrahams wrote:
...
What are the required commands to strip an executable?
with the GNU binutils the most simple command is:
strip <filename>
There is a whole bunch of command-line switches that give you more control over what gets stripped from the file; see strip(1). But I can't remember ever using this command myself, so I can't say which switches are more helpful than others.
I don't think I ever used strip with any switches. For template-heavy code, strip shrinks the sizes of executables compiled in debug mode by 90% or more. Regards, m

David Abrahams wrote:
Martin Wille writes:
Martin Wille wrote:
Any other suggestions?
Would it be viable to modify the build system such that test binaries which ran successfully get stripped after they ran?
It might be possible...
Having the then unneeded debug symbols removed would help a lot.
... but it might be painful.
For test binaries which ran successfully? What's the use of debug symbols in a test binary that did not produce any errors? I'm not thinking of removing debug information from test binaries which produced errors.
What are the required commands to strip an executable?
strip filename
Test results should be available within a few hours (including Serialization; I removed gcc-3.4.0 from the list of compilers, 3.4.1 is still present).
While tests are running, I'm seeing link errors for Boost.Serialization:
    /boost/head-regression/boost/boost/test/test_tools.hpp:179: undefined reference to `pthread_mutex_destroy'
I think my installation of libstdc++ requires -pthread on the command line. Is this a Boost.Build problem, a Boost.Serialization problem, or do I just have to add some options to my toolset definition for gcc-3.4.1?
Regards,
m

On Fri, Jul 16, 2004 at 08:34:30PM +0200, Martin Wille wrote: [...]
While tests are running, I'm seeing link errors for Boost.Serialization:
    /boost/head-regression/boost/boost/test/test_tools.hpp:179: undefined reference to `pthread_mutex_destroy'
I think my installation of libstdc++ requires -pthread on the command line. Is this a Boost.Build problem, a Boost.Serialization problem, or do I just have to add some options to my toolset definition for gcc-3.4.1?
Maybe you encountered the problem discussed in the thread that started with http://tinyurl.com/454e5 ? Bottom line, if you don't call gcc with -pthread then you need to explicitly define BOOST_DISABLE_THREADS. Regards Christoph -- http://www.informatik.tu-darmstadt.de/TI/Mitarbeiter/cludwig.html LiDIA: http://www.informatik.tu-darmstadt.de/TI/LiDIA/Welcome.html
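Spelled out as compile lines (illustrative only -- the file name is a placeholder, and in a regression run the flag would go into the gcc toolset's compile/link options rather than be typed by hand):

    # link against the pthread-enabled runtime, as this libstdc++ build expects
    g++ -pthread some_test.cpp -o some_test
    # or build without thread support and tell Boost so
    g++ -DBOOST_DISABLE_THREADS some_test.cpp -o some_test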

Martin Wille <mw8329@yahoo.com.au> writes:
Martin Wille wrote:
Any other suggestions?
Would it be viable to modify the build system such that test binaries which ran successfully get stripped after they ran? Having the then unneeded debug symbols removed would help a lot.
I just modified it so that by default, they're deleted after they run successfully. This only deletes top-level targets (executables most of the time, shared objects for Python tests), so object files sit around sucking up disk space. You can turn this behavior off by adding --preserve-test-targets to the bjam command-line. I did this after running out of disk myself ;-) -- Dave Abrahams Boost Consulting http://www.boost-consulting.com
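For testers who want to keep the executables around (for example, to poke at a failure afterwards), the opt-out mentioned above is just an extra bjam flag; the rest of the command line here is only a placeholder for whatever invocation the regression setup already uses:

    bjam -sTOOLS=gcc --preserve-test-targets test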

David Abrahams wrote:
Martin Wille writes:
...
Would it be viable to modify the build system such that test binaries which ran successfully get stripped after they ran? Having the then unneeded debug symbols removed would help a lot.
I just modified it so that by default, they're deleted after they run successfully. This only deletes top-level targets (executables most of the time, shared objects for Python tests), so object files sit around sucking up disk space. You can turn this behavior off by adding --preserve-test-targets to the bjam command-line.
Great! Thank you very much! This reduces the space requirements for testing by 13 GB on my box.
I did this after running out of disk myself ;-)
That motivates ;-) Regards, m

On Thu, 15 Jul 2004 at 22:45, Martin Wille wrote:
Hi,
Houston, we have a problem.
The 209 recently added Boost.Serialization tests apparently result in ~20MB binaries each for the gcc-3.x compilers. For me this means the tests will need ~20GB more disk space. All the other tests, for all the compilers I'm testing, together take ~10GB.
[...]
Is there a way of reducing the number of tests in Boost.Serialization? Would it be viable to strip the test binaries (thereby losing debug information)? Any other suggestions? Compilers to drop?
I also thought about it. But why simply strip the test binaries? Once the test was successful, the regression data are generated, and the binary and object files are not necessary anymore. They could simply be erased. If the files' existence is necessary, they could be emptied (or replaced by dummy versions). A 1k dummy file is a lot nicer than a 4M test binary that is never used again.
Regards,
Guillaume
PS: Regression testing works just fine after emptying the files:
    find bin/ -type f -name "*.o" -exec cp /dev/null "{}" ";"
    find bin/ -type f -perm +100 -exec cp /dev/null "{}" ";"

Guillaume Melquiond <guillaume.melquiond@ens-lyon.fr> writes:
On Thu, 15 Jul 2004 at 22:45, Martin Wille wrote:
Hi,
Houston, we have a problem.
The 209 recently added Boost.Serialization tests apparently result in ~20MB binaries each for the gcc-3.x compilers. For me this means the tests will need ~20GB more disk space. All the other tests, for all the compilers I'm testing, together take ~10GB.
[...]
Is there a way of reducing the number of tests in Boost.Serialization? Would it be viable to strip the test binaries (thereby losing debug information)? Any other suggestions? Compilers to drop?
I also thought about it. But why simply strip the test binaries? Once the test was successful, the regression data are generated, and the binary and object files are not necessary anymore. They could simply be erased. If the files' existence is necessary, they could be emptied (or replaced by dummy versions). A 1k dummy file is a lot nicer than a 4M test binary that is never used again.
Regards,
Guillaume
PS: Regression testing works just fine after emptying files:
find bin/ -type f -name "*.o" -exec cp /dev/null "{}" ";"
find bin/ -type f -perm +100 -exec cp /dev/null "{}" ";"
Right now if tests have input files the test executable won't be rebuilt when the input files change. I know that's not a very common case, but I don't think you can throw everything (including objects) out for small size and also get high speed. -- Dave Abrahams Boost Consulting http://www.boost-consulting.com
participants (7)
- Aleksey Gurtovoy
- Christoph Ludwig
- David Abrahams
- Guillaume Melquiond
- Jeff Garland
- Martin Wille
- Misha Bergal