[1.35.0] Release criteria compilers

In doing a postmortem of the past couple of releases with Thomas Witt, he made a very strong case that testing is a major bottleneck. If developers and release managers have to wait several days to find out if a fix works, it slows progress to a crawl. One of the things we can do to eliminate that bottleneck is to cut the number of release criteria compilers down to a more manageable number, and to compilers where testing is very reliable and runs several times a day. My candidates for the release criteria compilers are:
* Microsoft VC++ 8.0 on Win32
* Intel 10.0 on Win32
* GCC on Linux
* GCC on Darwin
I've got all of those here, so can verify test results myself if necessary. I expect people will want to add several others, and that's OK. But for this first release on our quarterly schedule, let's keep the number down to be sure we meet our targets. Comments? --Beman

Beman Dawes wrote:
My candidates for the release criteria compilers are:
* Microsoft VC++ 8.0 on Win32 * Intel 10.0 on Win32 * GCC on Linux * GCC on Darwin
For Darwin we need to distinguish which variant of the compiler, GNU or Apple. I think we only regularly test the Apple variant so the GNU variant would not be a release toolset. There's also the difference between x86 and PPC. For GCC on Linux, we need to be specific as to which versions. My incremental tests run only the latest 4.1 and 4.2 on x86 which have traditionally been release toolsets. -- -- Grafik - Don't Assume Anything -- Redshift Software, Inc. - http://redshift-software.com -- rrivera/acm.org - grafik/redshift-software.com -- 102708583/icq - grafikrobot/aim - grafikrobot/yahoo

Rene Rivera wrote:
Beman Dawes wrote:
My candidates for the release criteria compilers are:
* Microsoft VC++ 8.0 on Win32 * Intel 10.0 on Win32 * GCC on Linux * GCC on Darwin
For Darwin we need to distinguish which variant of the compiler, GNU or Apple.
Apple. Latest Xcode release. I think we only regularly test the Apple variant so the GNU
variant would not be a release toolset. There's also the difference between x86 and PPC.
I've got one of the original PPC mac minis. Sloooow. But it is nice to have a big-endian machine to test on.
For GCC on Linux, we need to be specific as to which versions. My incremental tests run only the latest 4.1 and 4.2 on x86 which have traditionally been release toolsets.
4.2.1 is the latest release, so that should definitely be a target. GCC is important enough I'm willing to include 4.1. Likewise VC++ is so widely used I'm willing to target both 8.0 and 7.1. Maybe one more? Sun? --Beman

At 9:30 PM -0400 10/4/07, Beman Dawes wrote:
Rene Rivera wrote:
Beman Dawes wrote:
For Darwin we need to distinguish which variant of the compiler, GNU or Apple.
Apple. Latest Xcode release.
Note that the "latest Xcode release" is likely to change in the next few weeks..... -- -- Marshall Marshall Clow Idio Software <mailto:marshall@idio.com> It is by caffeine alone I set my mind in motion. It is by the beans of Java that thoughts acquire speed, the hands acquire shaking, the shaking becomes a warning. It is by caffeine alone I set my mind in motion.

Marshall Clow wrote:
At 9:30 PM -0400 10/4/07, Beman Dawes wrote:
Rene Rivera wrote:
Beman Dawes wrote:
For Darwin we need to distinguish which variant of the compiler, GNU or Apple. Apple. Latest Xcode release.
Note that the "latest Xcode release" is likely to change in the next few weeks.....
Good point! It should be the compiler's current release at the time the Boost release cycle starts. Release criteria compilers should not be changed in the middle of a Boost release cycle. --Beman

On 5 Oct 2007, at 03:30, Beman Dawes wrote:
Rene Rivera wrote:
Beman Dawes wrote:
My candidates for the release criteria compilers are:
* Microsoft VC++ 8.0 on Win32 * Intel 10.0 on Win32 * GCC on Linux * GCC on Darwin
For Darwin we need to distinguish which variant of the compiler, GNU or Apple.
Apple. Latest Xcode release.
I think we only regularly test the Apple variant so the GNU
variant would not be a release toolset. There's also the difference between x86 and PPC.
I've got one of the original PPC mac minis. Sloooow. But it is nice to have a big-endian machine to test on.
I still have the last model of dual-CPU dual-core PPC PowerMacs on which I could run tests.

Matthias Troyer wrote:
On 5 Oct 2007, at 03:30, Beman Dawes wrote:
Rene Rivera wrote:
Beman Dawes wrote:
My candidates for the release criteria compilers are:
* Microsoft VC++ 8.0 on Win32 * Intel 10.0 on Win32 * GCC on Linux * GCC on Darwin For Darwin we need to distinguish which variant of the compiler, GNU or Apple. Apple. Latest Xcode release.
I think we only regularly test the Apple variant so the GNU
variant would not be a release toolset. There's also the difference between x86 and PPC. I've got one of the original PPC mac minis. Sloooow. But it is nice to have a big-endian machine to test on.
I still have the last model of dual-CPU dual-core PPC PowerMacs on which I could run tests.
We are going to need more testers because of the need to test both the trunk and the release branch. Anyone interested might want to subscribe to boost-testing@lists.boost.org where we are about to have a discussion of testing during release cycles. Thanks, --Beman

On 05/10/2007, Beman Dawes <bdawes@acm.org> wrote:
Rene Rivera wrote:
Beman Dawes wrote:
My candidates for the release criteria compilers are:
* Microsoft VC++ 8.0 on Win32 * Intel 10.0 on Win32 * GCC on Linux * GCC on Darwin
For Darwin we need to distinguish which variant of the compiler, GNU or Apple.
Apple. Latest Xcode release.
I think we only regularly test the Apple variant so the GNU
variant would not be a release toolset. There's also the difference between x86 and PPC.
I've got one of the original PPC mac minis. Sloooow. But it is nice to have a big-endian machine to test on.
For GCC on Linux, we need to be specific as to which versions. My incremental tests run only the latest 4.1 and 4.2 on x86 which have traditionally been release toolsets.
4.2.1 is the latest release, so that should definitely be a target.
GCC is important enough I'm willing to include 4.1. Likewise VC++ is so widely used I'm willing to target both 8.0 and 7.1.
Maybe one more? Sun?
Instead, why not have at least one compiler per platform, e.g. VC++ 8.0 SP1 on Windows, gcc 4.2.1 on Linux, Sun Studio 12 on Solaris 8, xlC 9.0 on AIX and aCC on HP-UX. Additional compiler/platform combinations can be taken up later.
--Beman
-- regards, Prashant Thakre

Prashant Thakre wrote:
On 05/10/2007, Beman Dawes <bdawes@acm.org> wrote: ...
Maybe one more? Sun?
Instead, why not have at least one compiler per platform, e.g. VC++ 8.0 SP1 on Windows, gcc 4.2.1 on Linux, Sun Studio 12 on Solaris 8, xlC 9.0 on AIX and aCC on HP-UX. Additional compiler/platform combinations can be taken up later.
That's certainly the eventual goal. It's really a question of how much to try to bite off for the 1.35.0 release. It does seem a bit unfair to include two versions of a compiler in the release criteria while there are other important compilers not included at all. --Beman

On Oct 4, 2007 4:20 PM, Rene Rivera <grafikrobot@gmail.com> wrote:
Beman Dawes wrote:
My candidates for the release criteria compilers are:
* Microsoft VC++ 8.0 on Win32 * Intel 10.0 on Win32 * GCC on Linux * GCC on Darwin
For Darwin we need to distinguish which variant of the compiler, GNU or Apple. I think we only regularly test the Apple variant so the GNU variant would not be a release toolset. There's also the difference between x86 and PPC. [snip]
In my opinion, the test configuration that would give the most useful coverage would be 32-bit universal binaries built against the 10.4u SDK. I believe that those options would be spelled something like: architecture=combined address-model=32 macosx-version=10.4 - Mat
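For reference, Mat's suggested options would presumably be passed to Boost.Build on the bjam command line roughly as follows. This is a sketch only: the feature names come straight from his post, while the `darwin` toolset name and the `stage` target are assumptions, so verify them against the Boost.Build documentation for your release before relying on them.

```shell
# Hypothetical bjam invocation for 32-bit universal binaries built
# against the 10.4u SDK, per Mat's suggestion. Feature spellings
# (architecture, address-model, macosx-version) are taken from the
# post above; toolset and target names are assumptions.
bjam toolset=darwin \
     architecture=combined \
     address-model=32 \
     macosx-version=10.4 \
     stage
```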

Beman Dawes wrote:
In doing a postmortem of the past couple of releases with Thomas Witt, he made a very strong case that testing is a major bottleneck. If developers and release managers have to wait several days to find out if a fix works, it slows progress to a crawl.
One of the things we can do to eliminate that bottleneck is to cut the number of release criteria compilers down to a more manageable number, and to compilers where testing is very reliable and runs several times a day.
My candidates for the release criteria compilers are:
* Microsoft VC++ 8.0 on Win32
My feeling is that VC++ 7.1 is at least as commonly used as 8.0. Unmanaged C++ users never really had any major reason to upgrade from 7.1. / Johan

Johan Nilsson wrote:
Beman Dawes wrote:
My candidates for the release criteria compilers are:
* Microsoft VC++ 8.0 on Win32
My feeling is that VC++ 7.1 is at least as commonly used as 8.0. Unmanaged C++ users never really had any major reason to upgrade from 7.1.
Yup. The latest SP for 8.0 has improved things, but I think there are a fair number of people out there who haven't switched just yet. (Counting myself. :) Cheers /Marcus

Beman Dawes:
In doing a postmortem of the past couple of releases with Thomas Witt, he made a very strong case that testing is a major bottleneck. If developers and release managers have to wait several days to find out if a fix works, it slows progress to a crawl.
One of the things we can do to eliminate that bottleneck is to cut the number of release criteria compilers down to a more manageable number, and to compilers where testing is very reliable and runs several times a day.
My candidates for the release criteria compilers are:
* Microsoft VC++ 8.0 on Win32 * Intel 10.0 on Win32 * GCC on Linux * GCC on Darwin
If these compilers provide reliable testing, running several times a day, they will deliver their results regardless of whether they are the only ones in the list. Trimming the list will not increase their reliability, it will just decrease the reliability of the compilers that are left off the list. In short, I believe that this is a step backwards. It does enable us to get a release out that doesn't work on an important compiler that is not on the list. Is this a feature?

Peter Dimov wrote:
Beman Dawes:
In doing a postmortem of the past couple of releases with Thomas Witt, he made a very strong case that testing is a major bottleneck. If developers and release managers have to wait several days to find out if a fix works, it slows progress to a crawl.
One of the things we can do to eliminate that bottleneck is to cut the number of release criteria compilers down to a more manageable number, and to compilers where testing is very reliable and runs several times a day.
My candidates for the release criteria compilers are:
* Microsoft VC++ 8.0 on Win32 * Intel 10.0 on Win32 * GCC on Linux * GCC on Darwin
If these compilers provide reliable testing, running several times a day, they will deliver their results regardless of whether they are the only ones in the list.
Trimming the list will not increase their reliability, it will just decrease the reliability of the compilers that are left off the list.
In short, I believe that this is a step backwards. It does enable us to get a release out that doesn't work on an important compiler that is not on the list. Is this a feature?
Depends. If we could get a release out on a certain date, regardless of how many compilers are considered in the release criteria, then it would certainly be better to include more compilers. But that isn't the case. Every added compiler delays the release. So the question becomes what compilers can we include and still grind out a release by, say, mid-November? I'm not willing to wait a year for the next release of Boost. So I'm pushing hard to limit our goals to what we can do, rather than what we would like to do. If anyone feels strongly that compiler X on platform Y should be included, they should consider becoming a tester for compiler X on platform Y. And for reliability, we really need more than one tester for compiler X on platform Y, unless past history shows a particular tester is unusually reliable. --Beman

Beman Dawes wrote: [...]
But that isn't the case. Every added compiler delays the release. So the question becomes what compilers can we include and still grind out a release by, say, mid-November?
You seem to be saying that the last release was delayed for so long because there were so many platforms with overly long turnaround times for testing. I don't think that was the case. It certainly isn't what my experience as a long-term regression tester for an exotic platform was. In the year it took for the release to happen, I had Tru64/CXX down to zero regressions quite a few times, despite the long test cycle. Then, the code would sit there for a few weeks without any changes, only to be broken by some minor or major check-in afterwards. [...] Markus

Beman Dawes <bdawes <at> acm.org> writes: [...]
My candidates for the release criteria compilers are:
* Microsoft VC++ 8.0 on Win32 * Intel 10.0 on Win32 * GCC on Linux * GCC on Darwin
I've got all of those here, so can verify test results myself if necessary.
I expect people will want to add several others, and that's OK. But for this first release on our quarterly schedule, let's keep the number down to be sure we meet our targets.
CodeGear (Borland) C++ Builder 2007 on Win32 I know it's not the most compliant compiler in the world, but it's improving. Furthermore, I believe Alisdair Meredith has some patches to submit and I have a few myself in store. As soon as all the patches I'm aware of are in, I can stop testing the earlier releases and start testing the latest one more often. Cheers, Nicola Musatti

Nicola Musatti wrote:
Beman Dawes <bdawes <at> acm.org> writes: [...]
My candidates for the release criteria compilers are:
* Microsoft VC++ 8.0 on Win32 * Intel 10.0 on Win32 * GCC on Linux * GCC on Darwin
I've got all of those here, so can verify test results myself if necessary.
I expect people will want to add several others, and that's OK. But for this first release on our quarterly schedule, let's keep the number down to be sure we meet our targets.
CodeGear (Borland) C++ Builder 2007 on Win32
I know it's not the most compliant compiler in the world, but it's improving. Furthermore, I believe Alisdair Meredith has some patches to submit and I have a few myself in store.
As soon as all the patches I'm aware of are in, I can stop testing the earlier releases and start testing the latest one more often.
We really can't include a compiler and/or platform that is only being tested every few days. We just have to have reliable daily tests at the minimum. And for compilers/platforms with a lot of failures, we need even more testing, so developers have a reasonable shot at fixing the failures. I see that it is now official that Borland has hired Alisdair as C++ product manager. Perhaps that will lead to better compliance for their compiler, too. But until that plays out I'm really having trouble seeing how we can include Borland as a release criteria compiler. --Beman
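To make the "reliable daily tests at the minimum" bar concrete, a tester's setup could be as simple as a pair of cron entries. Everything here is a hypothetical placeholder -- the `run_tests.sh` script, paths, and schedule are not part of any actual Boost tooling -- it just illustrates the cadence being asked for.

```shell
# Hypothetical crontab entries for a regression tester. The script
# name and paths are placeholders; only the twice-daily cadence
# reflects the discussion above.
# Trunk tests every night at 01:00, release-branch tests at 13:00:
0 1  * * * /home/tester/boost/run_tests.sh trunk   >> /home/tester/logs/trunk.log 2>&1
0 13 * * * /home/tester/boost/run_tests.sh release >> /home/tester/logs/release.log 2>&1
```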

Beman Dawes wrote: [...]
My candidates for the release criteria compilers are:
* Microsoft VC++ 8.0 on Win32 * Intel 10.0 on Win32 * GCC on Linux * GCC on Darwin
Which means that you would be dropping official boost support for a wide range of platforms, if I understand you correctly. This would make the cross-platform aspect of boost kind of moot, wouldn't it? Markus

Markus Schöpflin wrote:
Beman Dawes wrote:
[...]
My candidates for the release criteria compilers are:
* Microsoft VC++ 8.0 on Win32 * Intel 10.0 on Win32 * GCC on Linux * GCC on Darwin
Which means that you would be dropping official boost support for a wide range of platforms, if I understand you correctly.
*No* It only means that for the 1.35.0 release we are only testing on platforms where enough testing resources have been contributed, and are running reliably.
This would make the cross-platform aspect of boost kind of moot, wouldn't it?
No. Boost developers aren't going to deliberately cripple cross-platform support in their libraries. They will continue to watch the tests results for all platforms, and apply reasonable fixes when they can. --Beman

Beman Dawes wrote:
*No* It only means that for the 1.35.0 release we are only testing on platforms where enough testing resources have been contributed, and are running reliably.
I've been running Boost tests on HP-UX/acc (full, non-incremental test run) on a daily basis for more than a year, with a few exceptions: for example, this weekend, the lab machines will be shut down for infrastructure upgrade. This is what allowed HP-UX/acc to become a release platform in 1.34. Since you did not include HP-UX/acc in the list of release compilers for 1.35.0, you probably think that this platform does not have "enough testing resources". What makes you think so? Thanks, Boris

Boris Gubenko wrote:
Beman Dawes wrote:
*No* It only means that for the 1.35.0 release we are only testing on platforms where enough testing resources have been contributed, and are running reliably.
I've been running Boost tests on HP-UX/acc (full, non-incremental test run) on a daily basis for more than a year, with a few exceptions: for example, this weekend, the lab machines will be shut down for infrastructure upgrade. This is what allowed HP-UX/acc to become a release platform in 1.34.
Since you did not include HP-UX/acc in the list of release compilers for 1.35.0, you probably think that this platform does not have "enough testing resources". What makes you think so?
I started with a small list that I know can be supported because I've personally got easy access to the systems involved. As I said in other posts, I'm willing to add a few more. But I'm concerned about systems with only one tester. In those cases I'd like someone to step forward (as you seem to be doing) and volunteer to ensure tests run regularly. Can you test on both trunk and release branch? Or do you know someone who can test the release branch? Incremental tests are OK. --Beman

Beman Dawes wrote:
[...] But I'm concerned about systems with only one tester. In those cases I'd like someone to step forward (as you seem to be doing) and volunteer to ensure tests run regularly.
Yes, I'll make sure tests continue to run regularly.
Can you test on both trunk and release branch?
Yes I can. If testing on both becomes difficult, I can drop gcc testing, but I hope I won't have to. Thanks, Boris

Boris Gubenko wrote:
Beman Dawes wrote:
[...] But I'm concerned about systems with only one tester. In those cases I'd like someone to step forward (as you seem to be doing) and volunteer to ensure tests run regularly.
Yes, I'll make sure tests continue to run regularly.
Good!
Can you test on both trunk and release branch?
Yes I can. If testing on both becomes difficult, I can drop gcc testing, but I hope I won't have to.
Great! I just took a look at the acc tests, and they are looking very good. There are only four libraries failing, and two of those are also failing on many other platforms so will probably start passing on acc as soon as the underlying problem gets fixed. So no problem including acc as a release compiler. --Beman

Beman Dawes wrote:
I started with a small list that I know can be supported because I've personally got easy access to the systems involved. As I said in other posts, I'm willing to add a few more. But I'm concerned about systems with only one tester. In those cases I'd like someone to step forward (as you seem to be doing) and volunteer to ensure tests run regularly.
FWIW, I'm planning to resume testing soon. (Release branch, likely. Probably a reduced set of compilers.) Regards, m

Martin Wille wrote:
Beman Dawes wrote:
I started with a small list that I know can be supported because I've personally got easy access to the systems involved. As I said in other posts, I'm willing to add a few more. But I'm concerned about systems with only one tester. In those cases I'd like someone to step forward (as you seem to be doing) and volunteer to ensure tests run regularly.
FWIW, I'm planning to resume testing soon. (Release branch, likely. Probably a reduced set of compilers.)
Great! There is a discussion going on now on the testing list on "Release branch testing". --Beman

Beman Dawes wrote:
Markus Schöpflin wrote:
Beman Dawes wrote:
[...]
My candidates for the release criteria compilers are:
* Microsoft VC++ 8.0 on Win32 * Intel 10.0 on Win32 * GCC on Linux * GCC on Darwin Which means that you would be dropping official boost support for a wide range of platforms, if I understand you correctly.
*No* It only means that for the 1.35.0 release we are only testing on platforms where enough testing resources have been contributed, and are running reliably.
This would make the cross-platform aspect of boost kind of moot, wouldn't it?
No. Boost developers aren't going to deliberately cripple cross-platform support in their libraries. They will continue to watch the tests results for all platforms, and apply reasonable fixes when they can.
Let me put it another way. We will continue testing and fixing on all the compilers/platforms for which we have volunteers. The difference is that we aren't going to hold the release up to ensure that less standard-compliant compilers are all green. By reducing the number of platforms we can reduce the cycle time for developers to be sure changes are working. This allows us to get new libraries and fixes released to the whole of the Boost community in a more timely fashion. Right now we are essentially delaying releases to provide for a minority of our users. And ultimately, these compilers/platforms aren't left out in any way, because fixes/ports can go into the next quarterly release. The main thing we need to do at this juncture is 'whatever it takes' to break out of the current release quagmire -- we have important libraries and fixes that have been backed up for over a year now. Jeff

Jeff Garland:
Let me put it another way. We will continue testing and fixing on all the compilers/platforms for which we have volunteers. The difference is that we aren't going to hold the release up to ensure that less standard-compliant compilers are all green. By reducing the number of platforms we can reduce the cycle time for developers to be sure changes are working.
I agree that dropping compilers that do not provide rapid testing turnaround will improve matters, but I'm not seeing the connection with the level of standard compliance or the number of the release platforms. The tests run in parallel, so 1 day is 1 day. Some platforms may prove problematic for other reasons, of course, but we can and should deal with that if and when it happens, not preemptively.

Peter Dimov wrote:
Jeff Garland:
Let me put it another way. We will continue testing and fixing on all the compilers/platforms for which we have volunteers. The difference is that we aren't going to hold the release up to ensure that less standard-compliant compilers are all green. By reducing the number of platforms we can reduce the cycle time for developers to be sure changes are working.
I agree that dropping compilers that do not provide rapid testing turnaround will improve matters, but I'm not seeing the connection with the level of standard compliance or the number of the release platforms. The tests run in parallel, so 1 day is 1 day.
Simple -- if you write a standards-compliant C++ library, then porting to a 'closely compliant' compiler requires little to no effort -- very few macro hacks, etc. Of course there are still some differences among the newer and better compilers that can cause failures, but they're few and far between compared to trying to port to a non-compliant compiler. So, focusing our effort on standard-compliant compilers will save time for new libraries that are being added to Boost. Also, changes to existing libraries can move forward even if they break the non-compliant platforms -- again less work for the developers, fewer one-day cycles to get the testing board all green.
Some platforms may prove problematic for other reasons, of course, but we can and should deal with that if and when it happens, not preemptively.
Again, we're not stopping the testing on these platforms, just de-focusing developer effort on fixing any failure results for the 1.35 release. If a non-designated platform turns out to be highly compliant then we can certainly change the list at a later time. And again, it doesn't mean developers won't fix issues on those compilers/platforms... just that the release team isn't going to pester them about it or remove their code from the release because that compiler/platform is broken. Jeff

Peter Dimov <pdimov <at> pdimov.com> writes:
Jeff Garland:
Let me put it another way. We will continue testing and fixing on all the compilers/platforms for which we have volunteers. The difference is that we aren't going to hold the release up to ensure that less standard-compliant compilers are all green. By reducing the number of platforms we can reduce the cycle time for developers to be sure changes are working.
I agree that dropping compilers that do not provide rapid testing turnaround will improve matters, but I'm not seeing the connection with the level of standard compliance or the number of the release platforms. The tests run in parallel, so 1 day is 1 day. Some platforms may prove problematic for other reasons, of course, but we can and should deal with that if and when it happens, not preemptively.
I must say I feel very similar. I believe Beman's approach is way too pessimistic. And kinda diminishing to the configurations he has no direct access to ;) How is a VC 8.0 failure more critical than, say, gcc on SPARC Solaris? Assuming that both have similar tester resources. What I believe we need is more objective criteria for what is a "release compiler". Specific requirements on the number of testers and testing turnaround time come to mind. Additionally it's important to split new failures from regressions. New failures we just mark as expected on the day of release and ignore. At the beginning of each release we can clean up the "expected failure" status from the test and it becomes a new failure again (unless it is fixed, of course). Regressions do need to be fixed before release. And here is the only place where we may get into a grey area. If library A is failing on platform P, while it was working before, what should we do? I personally believe we don't have many cases like this. What I can propose is some kind of tolerance-based approach:
1) If platform P has more than V1% of failures - we name platform P as unsupported.
2) If a library fails on more than V2% of platforms - we revert the library.
3) Otherwise we mark the test as expected and add it to a special section of the release notes (saying this test for library A is now failing on platform P).
Values V1 and V2 we can agree upon. Note that all this can and should be done in one shot on the day of release. No second chances guys ;) Gennadiy
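Gennadiy's three tolerance rules are mechanical enough to sketch in code. The following is an illustrative sketch only: the threshold values, the data shapes, and the idea of applying the rules in this particular order are assumptions about how the proposal might work, not an actual Boost release tool.

```python
# Sketch of the tolerance-based release rules proposed above.
# V1, V2 and all example data are illustrative placeholders.

V1 = 60.0  # a platform failing on more than V1% of libraries is unsupported
V2 = 50.0  # a library failing on more than V2% of supported platforms is reverted

def classify(failures, platforms, libraries):
    """failures: set of (library, platform) pairs that currently fail.

    Returns (unsupported_platforms, reverted_libraries, expected_failures)
    following rules 1), 2), 3) above, applied in that order."""
    # Rule 1: drop platforms with too high a failure rate.
    unsupported = {
        p for p in platforms
        if 100.0 * sum((l, p) in failures for l in libraries) / len(libraries) > V1
    }
    remaining = [p for p in platforms if p not in unsupported]
    # Rule 2: revert libraries that fail on too many supported platforms.
    reverted = {
        l for l in libraries
        if remaining
        and 100.0 * sum((l, p) in failures for p in remaining) / len(remaining) > V2
    }
    # Rule 3: what is left becomes an "expected failure" for the release notes.
    expected = {
        (l, p) for (l, p) in failures if p in remaining and l not in reverted
    }
    return unsupported, reverted, expected
```

With hypothetical thresholds of 60% and 50%, a platform failing three of four libraries would be dropped, a library failing on every remaining platform would be reverted, and an isolated failure would simply be recorded as expected.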

Gennadiy Rozental wrote:
At the beginning of each release we can clean up the "expected failure" status from the test and it becomes a new failure again (unless it is fixed, of course).
I think this is a very important idea. Otherwise, after we mark a failure expected on some compiler just to get the release out of the door, nobody will ever look at that failure. There should be some form of periodic nagging for all failures. - Volodya

on Sun Oct 07 2007, Vladimir Prus <ghost-AT-cs.msu.su> wrote:
Gennadiy Rozental wrote:
At the beginning of each release we can clean up the "expected failure" status from the test and it becomes a new failure again (unless it is fixed, of course).
I think this is a very important idea. Otherwise, after we mark a failure expected on some compiler just to get the release out of the door, nobody will ever look at that failure.
But we should never do that, should we? -- Dave Abrahams Boost Consulting http://www.boost-consulting.com

Gennadiy Rozental wrote:
I must say I feel very similar.
I believe Beman's approach is way too pessimistic. And kinda diminishing to the configurations he has no direct access to ;) How is a VC 8.0 failure more critical than, say, gcc on SPARC Solaris? Assuming that both have similar tester resources.
They haven't typically had similar resources. We've usually had 2-3 VC 8 testers and maybe one gcc on solaris.
What I believe we need is more objective criteria for what is a "release compiler". Specific requirements on the number of testers and testing turnaround time come to mind.
Instead of getting bogged down in trying to define requirements, I think we should simply agree on the 'primary platform list' (essentially what Beman is doing). Don't think of this list as release compilers, just a good cross section of the platforms that provide a) good testing support and b) high standards compliance -- thus minimizing the typical hackery needed to port to this compiler. Again, no compiler/platform is being excluded from testing, and we want library authors and people with an interest in a platform to continue to port at their discretion -- but I hope by now we'd all agree that bugging authors of new libs to port to VC6 and holding the release is a bad policy because it harms more users than it helps.
Additionally it's important to split new failures from regressions. New failures we just mark as expected on the day of release and ignore. At the
Well, this doesn't quite work for me. If a new library can't pass tests on the 'primary platform list' then it needs to be removed from the release because it's not ready.
beginning of each release we can clean up the "expected failure" status from the test and it becomes a new failure again (unless it is fixed, of course). Regressions do need to be fixed before release. And here is the only place where we may go into a grey area. If library A is failing on platform P, while it was working before, what should we do? I personally believe we don't have many cases like this. What I can propose is some kind of tolerance-based approach: 1) If platform P has more than V1% of failures - we name platform P as unsupported. 2) If a library fails on more than V2% of platforms - we revert the library. 3) Otherwise we mark the test as expected and add it into a special section of the release notes (saying this test for library A is now failing on platform P).
Values V1 and V2 we can agree upon.
I don't really object to the idea except that we all know it's more complicated because of library dependencies. For example, date-time is failing on the trunk right now -- its source is unchanged since 1.34 -- so it's some other change that has broken some of the regressions. You can't revert date-time because it hasn't changed. We need to track down the change that's breaking it and revert... of course, that might break some new feature some other lib depends on. The bottom line here is that check-ins that break other libraries won't be tolerated for long...
Note that all this can and should be done in one shot on the day of release. No second chances, guys ;)
I don't think it can work like that, sorry. Jeff
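Gennadiy's tolerance-based rules above can be sketched mechanically. A minimal illustration follows; the V1/V2 threshold values and the function name are placeholders of my choosing, not anything agreed upon in the thread:

```python
def release_action(platform_failure_pct, library_failure_pct,
                   v1=25.0, v2=50.0):
    """Sketch of the proposed tolerance rules.

    platform_failure_pct: % of tests failing on platform P
    library_failure_pct:  % of platforms on which library A fails
    v1, v2: thresholds still to be agreed upon (placeholder values here)
    """
    if platform_failure_pct > v1:
        return "name platform unsupported"            # rule 1
    if library_failure_pct > v2:
        return "revert library"                       # rule 2
    return "mark expected; note in release notes"     # rule 3

print(release_action(40.0, 10.0))  # platform exceeds V1
print(release_action(10.0, 60.0))  # library fails on too many platforms
print(release_action(10.0, 10.0))  # tolerated: mark expected
```

Jeff's objection below is that this per-library view ignores dependencies: a "failing" library may be the victim of a change elsewhere, so reverting it does nothing.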

2007/10/7, Jeff Garland <jeff@crystalclearsoftware.com>:
Gennadiy Rozental wrote:
Additionally, it's important to split new failures from regressions. New failures we just mark as expected on the day of release and ignore. At the
Well, this doesn't quite work for me. If a new library can't pass tests on the 'primary platform list' then it needs to be removed from the release because it's not ready.
The definition of "New Failure" might be problematic. E.g. If a test is added in 1.35, due to a bug found (and not fixed) in 1.34, a new failure occurs in the test output. If functionality with poor quality is added to an old library, then the code should not be accepted, not just marked as an expected failure. /$

Henrik Sundberg <storangen <at> gmail.com> writes:
2007/10/7, Jeff Garland <jeff <at> crystalclearsoftware.com>:
Gennadiy Rozental wrote:
Additionally, it's important to split new failures from regressions. New failures we just mark as expected on the day of release and ignore. At the
Well, this doesn't quite work for me. If a new library can't pass tests on the 'primary platform list' then it needs to be removed from the release because it's not ready.
The definition of "New Failure" might be problematic. E.g. If a test is added in 1.35, due to a bug found (and not fixed) in 1.34, a new failure occurs in the test output. If functionality with poor quality is added to an old library, then the code should not be accepted, not just marked as an expected failure.
Why? What if I add a feature that works on gcc 4.0, but do not have time to port it to VC 7.1? I've added a corresponding test. No regressions appear. IMO what should be done is that this test should be marked as expected to fail everywhere it fails, and next release I'll try to port it to VC 7.1. Next to CW, and so on. Gennadiy

on Sun Oct 07 2007, Gennadiy Rozental <rogeeff-AT-gmail.com> wrote:
Henrik Sundberg <storangen <at> gmail.com> writes:
2007/10/7, Jeff Garland <jeff <at> crystalclearsoftware.com>:
Gennadiy Rozental wrote:
Additionally it's important to split new failures from regression. New failures we just mark as expected on day of release and ignore. At the
Well, this doesn't quite work for me. If a new library can't pass tests on the 'primary platform list' then it needs to be removed from the release because it's not ready.
The definition of "New Failure" might be problematic. E.g. If a test is added in 1.35, due to a bug found (and not fixed) in 1.34, a new failure occurs in the test output. If functionality with poor quality is added to an old library, then the code should not be accepted, not just marked as an expected failure.
Why?
What if I add a feature that works on gcc 4.0, but do not have time to port it to VC 7.1? I've added a corresponding test. No regressions appear. IMO what should be done is that this test should be marked as expected to fail everywhere it fails
Why? Just to get a green field of tests? There's a difference between features that can't be made to work due to compiler bugs and those that you just haven't had the time to implement portably. The former is not expected to ever pass for that platform unless someone discovers new hacks. The latter is essentially in a (hopefully temporarily) broken state, and shouldn't look like a healthy test.
and next release I'll try to port it to VC 7.1. Next to CW, and so on.
If we have a set of primary release platforms, I want to be able to claim that Boost is portable to those environments, not that some features work here and there. If you can't get the feature working on all the release platforms, it should be considered "not yet portable" and held back from the release. -- Dave Abrahams Boost Consulting http://www.boost-consulting.com

Jeff Garland wrote:
Gennadiy Rozental wrote:
I must say I feel very similar.
I believe Beman's approach is way too pessimistic, and kind of diminishing to the configurations he has no direct access to ;) How is a VC 8.0 failure more critical than, let's say, gcc on Sparc Solaris? Assuming that both have similar tester resources.
They haven't typically had similar resources. We've usually had 2-3 VC 8 testers and maybe one gcc on solaris.
What I believe we need is more objective criteria for what is a "release compiler". Specific requirements on the number of testers and testing turnaround time come to mind.
Instead of getting bogged down in trying to define requirements, I think we should simply agree on the 'primary platform list' (essentially what Beman is doing). Don't think of this list as release compilers, just a good cross section of the platforms that provide a) good testing support and b) high standards compliance -- thus minimizing the typical hackery needed to port to these compilers. Again, no compiler/platform is being excluded from testing, and we want library authors and people with an interest in a platform to continue to port at their discretion -- but I hope by now we'd all agree that bugging authors of new libs to port to VC6 and holding the release is a bad policy because it harms more users than it helps.
From an end user's perspective I think it is really important, for a given release and library of Boost, to know whether or not a particular compiler is supposed to work for a particular library. The regression tests do not, unfortunately, always give the end user that information. While I understand the concern for getting out timely releases of Boost which are guaranteed to support a subset of highly conformant compilers, this way of doing things will only make it even more difficult for end-users of those compilers which are not part of that subset to determine whether the latest release of a particular Boost library is supported when using their compiler.

Edward Diener wrote:
From an end user's perspective I think it is really important, for a given release and library of Boost, to know whether or not a particular compiler is supposed to work for a particular library. The regression tests do not, unfortunately, always give the end user that information. While I understand the concern for getting out timely releases of Boost which are guaranteed to support a subset of highly conformant compilers, this way of doing things will only make it even more difficult for end-users of those compilers which are not part of that subset to determine whether the latest release of a particular Boost library is supported when using their compiler.
While I agree this might make understanding the status a tiny bit harder, it's worth it for the majority of users to actually have access to the new libraries that have been accepted into boost -- even if regression status is a bit harder to understand. asio was reviewed in Jan of 2006 and if we don't take some radical steps it won't be in a Boost release in 2007 -- almost 2 years later -- the current state of affairs is simply unacceptable. Jeff

I think the whole idea of providing the test matrix as a guide to which compilers work with which libraries is misguided. The installation procedure should include a local test procedure which builds the test matrix for the user's local environment. This was my motivation for the "library_status" executable and script. This would give a couple of advantages over the current approach:
a) Users would be able to verify that the installation is correct.
b) Users would have a test matrix which is applicable to their particular environment. This can be referred to in the future to verify that the Boost facilities that they want do or don't work in their environment.
c) It would permit more testing - release as well as debug, static as well as shared libraries, etc.
d) It would make support of user problems much easier. If someone says X in library Y doesn't work, someone who wants to help can respond "what are the results of test X on library Y?".
e) If someone were really ambitious, he could arrange to have the test results shipped to a central location for display - to maintain a more complete test matrix. But this is just a refinement, not essential.
Robert Ramey
Jeff Garland wrote:
Edward Diener wrote:
From an end user's perspective I think it is really important, for a given release and library of Boost, to know whether or not a particular compiler is supposed to work for a particular library. The regression tests do not, unfortunately, always give the end user that information. While I understand the concern for getting out timely releases of Boost which are guaranteed to support a subset of highly conformant compilers, this way of doing things will only make it even more difficult for end-users of those compilers which are not part of that subset to determine whether the latest release of a particular Boost library is supported when using their compiler.
While I agree this might make understanding the status a tiny bit harder, it's worth it for the majority of users to actually have access to the new libraries that have been accepted into boost -- even if regression status is a bit harder to understand. asio was reviewed in Jan of 2006 and if we don't take some radical steps it won't be in a Boost release in 2007 -- almost 2 years later -- the current state of affairs is simply unacceptable.
Jeff _______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Robert Ramey wrote:
I think the whole idea of providing the test matrix as a guide to which compilers work with which libraries is misguided.
The installation procedure should include a local test procedure which builds the test matrix for the user's local environment. This was my motivation for the "library_status" executable and script. This would give a couple of advantages over the current approach.
I support such an idea also, but I do not think that installation should automatically do anything, since building a test matrix, even for a single compiler, appears to take quite a bit of time, and the end user may not want, or need, to determine every library's workability with a particular compiler. I would love, however, to be able to run some program, just after installation, to conclusively determine whether the core functionality of a particular library works with a particular compiler. Even a program which just spit out "Yes" or "No" would be better than the current hit-or-miss guessing which the regression tests often currently supply for libraries and compilers. Of course that "program" would be different for each library, unless one could run a test matrix for all libraries in some batch way if one desired. It would nevertheless be immensely helpful, since it is often a real PITA to determine whether or not a particular library in a given release of Boost works with a particular compiler.

"Robert Ramey" <ramey@rrsd.com> wrote in message news:fee6uc$5au$1@sea.gmane.org...
I think the whole idea of providing the test matrix as a guide to which compilers work with which libraries is misguided.
The installation procedure should include a local test procedure which builds the test matrix for the user's local environment. This was my motivation for the "library_status" executable and script. This would give a couple of advantages over the current approach.
I am not sure I agree with that. I would expect at least 8 out of 10 Boost users not to be willing to spend hours, after downloading and unpacking the library (which already takes quite some time), to see how Boost fares for their compilers. After all, we don't expect significant differences between closely matching configurations. It does indeed look like a nice-to-have feature, but definitely not a replacement for compiler status pages for the configurations we test against. Gennadiy

on Mon Oct 08 2007, "Robert Ramey" <ramey-AT-rrsd.com> wrote:
I think the whole idea of providing the test matrix as a guide to which compilers work with which libraries is misguided.
I agree. The test matrix as it currently stands isn't even really useful to developers, though that's a different issue (there's just too much information there).
The installation procedure should include a local test procedure which builds the test matrix for the user's local environment. This was my motivation for the "library_status" executable and script. This would give a couple of advantages over the current approach.
a) Users would be able to verify that the installation is correct.
Yeah, one of our problems currently is that our testing system doesn't install the libraries and then test the libraries as installed. CTest works that way. Our current tools could theoretically also be changed to do that.
b) Users would have a test matrix which is applicable to their particular environment. This can be referred to in the future to verify that the Boost facilities that they want do or don't work in their environment. c) It would permit more testing - release as well as debug, static as well as shared libraries, etc. d) It would make support of user problems much easier. If someone says X in library Y doesn't work, someone who wants to help can respond "what are the results of test X on library Y?". e) If someone were really ambitious, he could arrange to have the test results shipped to a central location for display - to maintain a more complete test matrix. But this is just a refinement, not essential.
I agree that something along those lines is a good approach. However, I do think that Boost needs to make claims about what will work that are useful to a large proportion of our users, so they can decide whether to make an investment without building and running tests themselves. -- Dave Abrahams Boost Consulting http://www.boost-consulting.com

on Sun Oct 07 2007, Edward Diener <eldiener-AT-tropicsoft.com> wrote:
Jeff Garland wrote:
Gennadiy Rozental wrote:
I must say I feel very similar.
I believe Beman's approach is way too pessimistic, and kind of diminishing to the configurations he has no direct access to ;) How is a VC 8.0 failure more critical than, let's say, gcc on Sparc Solaris? Assuming that both have similar tester resources.
They haven't typically had similar resources. We've usually had 2-3 VC 8 testers and maybe one gcc on solaris.
What I believe we need is more objective criteria for what is a "release compiler". Specific requirements on the number of testers and testing turnaround time come to mind.
Instead of getting bogged down in trying to define requirements, I think we should simply agree on the 'primary platform list' (essentially what Beman is doing). Don't think of this list as release compilers, just a good cross section of the platforms that provide a) good testing support and b) high standards compliance -- thus minimizing the typical hackery needed to port to these compilers. Again, no compiler/platform is being excluded from testing, and we want library authors and people with an interest in a platform to continue to port at their discretion -- but I hope by now we'd all agree that bugging authors of new libs to port to VC6 and holding the release is a bad policy because it harms more users than it helps.
From an end user's perspective I think it is really important, for a given release and library of Boost, to know whether or not a particular compiler is supposed to work for a particular library. The regression tests do not, unfortunately, always give the end user that information.
I agree with Edward that that information is crucial and currently lacking. -- Dave Abrahams Boost Consulting http://www.boost-consulting.com

Jeff Garland <jeff <at> crystalclearsoftware.com> writes:
Gennadiy Rozental wrote:
I must say I feel very similar.
I believe Beman's approach is way too pessimistic, and kind of diminishing to the configurations he has no direct access to ;) How is a VC 8.0 failure more critical than, let's say, gcc on Sparc Solaris? Assuming that both have similar tester resources.
They haven't typically had similar resources. We've usually had 2-3 VC 8 testers and maybe one gcc on solaris.
What I believe we need is more objective criteria for what is a "release compiler". Specific requirements on the number of testers and testing turnaround time come to mind.
If one tester with daily turnaround time is agreed upon as the "release compiler" requirement, they both match the criteria and are in equal positions. IMO the more platforms we can report as being tested on - the better. You may argue that we test on platform P anyway, but this is not a big help for many guys trying to push for Boost acceptance, unless this configuration is mentioned in bold on the release notes page. We just need to send a clear message that it's not promised that all the libraries work on all tested configurations, and that the compiler status page gives the full picture.
Instead of getting bogged down in trying to define requirements, I think we should simply agree on the 'primary platform list' (essentially what Beman is doing). Don't think of this list as release compilers, just a good cross section of the platforms that provide a) good testing support and
I agree it's simpler. I don't agree it's correct. This list needs constant revision and a collective decision on which compilers are "more important". This is too subjective IMO. For someone, the most important compiler is Intel on AIX and all others are irrelevant. And percentages won't help here either. It's just a straight road to holy wars.
b) high standards compliance -- thus minimizing the typical hackery needed to port to this compiler. Again, no compiler/platform is being excluded from testing and we want library authors and people with an interest in a platform to continue to port at their discretion -- but
I don't believe this is the correct criterion at all. The fact that compiler ABC is bad doesn't stop us from saying "we tested the release against compiler ABC and here are the results". In my opinion it's completely orthogonal. From where I stand, negative results are as good as positive.
I hope by now we'd all agree that bugging authors of new libs to port to VC6 and holding the release is a bad policy because it harms more users than it helps.
Yes. And I never suggested that. My position is that, following XP guidelines, any new library is expected to fail everywhere :0), until it starts working somewhere. It's regressions we need to pay some attention to. And to some degree as well. If it becomes unreasonably difficult because failures are frequent, we drop the configuration even if it has enough testing resources. Let's say compiler A on platform P fails more than 50% of tests (including expected failures); then we might decide that regressions detected at the release date aren't worth fixing and we drop the configuration instead.
Additionally, it's important to split new failures from regressions. New failures we just mark as expected on the day of release and ignore. At the
Well, this doesn't quite work for me. If a new library can't pass tests on the 'primary platform list' then it needs to be removed from the release because it's not ready.
beginning of each release we can clean up the "expected failure" status from the test and it becomes a new failure again (unless it is fixed, of course). Regressions do need to be fixed before release. And here is the only place where we may go into a grey area. If library A is failing on platform P, while it was working before, what should we do? I personally believe we don't have many cases like this. What I can propose is some kind of tolerance-based approach: 1) If platform P has more than V1% of failures - we name platform P as unsupported. 2) If a library fails on more than V2% of platforms - we revert the library. 3) Otherwise we mark the test as expected and add it into a special section of the release notes (saying this test for library A is now failing on platform P).
Values V1 and V2 we can agree upon.
As I mentioned before, in my opinion it's completely irrelevant. The library is accepted during review, not by the release manager. It's not his call to decide which compilers the library should support. Let's say tomorrow we accept a library that employs rvalue references. It passes tests only on some experimental compiler we don't test against for release. I don't see that as a showstopper. We mark the library tests as expected to fail on all compilers for now and move on. The release manager should stop fighting for "world peace". In other words, it's not his job to make sure a library passes its tests. That responsibility lies with the library author. His job is to ensure backward compatibility.
I don't really object to the idea except that we all know it's more complicated because of library dependencies. For example, date-time is failing on the trunk right now -- it's source is unchanged since 1.34 -- so it's some other change that has broken some of the regressions. You can't revert date-time because it hasn't changed. We need to track down the change that's breaking it and revert...of course, that might break some new feature some other lib depends on. The bottom line here is that check-ins that break other libraries won't be tolerated for long...
It's all true, but IMO irrelevant to the subject. Dependency handling and deciding what needs to be rolled back is not simple (at best). But this is an issue we are facing irrespective of which compilers are used for release.
Note that all this can and should be done in one shot on the day of release. No second chances, guys ;)
I don't think it can work like that, sorry.
I do not see why yet. We do need to establish a mechanism for rolling back a library (along with all the libraries it depends on and that depend on it), but I think it should be a rare event, which, however complicated, can and should be done in one shot.
Jeff
Gennadiy
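The objective "release compiler" criterion Gennadiy keeps asking for (a minimum number of testers with a maximum turnaround time) can be sketched as a simple predicate. The function name and threshold values below are illustrative placeholders, not anything the thread settled on:

```python
def is_release_toolset(num_testers, turnaround_hours,
                       min_testers=1, max_turnaround_hours=24):
    """A configuration qualifies if it has enough testers reporting
    results quickly enough. Thresholds would need to be agreed upon;
    the defaults here (1 tester, daily turnaround) are placeholders."""
    return (num_testers >= min_testers
            and turnaround_hours <= max_turnaround_hours)

# Under this criterion, VC 8 (2-3 testers) and gcc on Solaris (1 tester)
# are in equal positions as long as both turn results around daily:
print(is_release_toolset(3, 8))   # True
print(is_release_toolset(1, 12))  # True
print(is_release_toolset(1, 72))  # False: several-day turnaround
```

This is exactly the disagreement in the thread: such a rule treats any adequately tested configuration as a release toolset, whereas the "primary platform list" approach picks a fixed, curated set.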

on Sun Oct 07 2007, Gennadiy Rozental <rogeeff-AT-gmail.com> wrote:
Jeff Garland <jeff <at> crystalclearsoftware.com> writes:
Gennadiy Rozental wrote:
I must say I feel very similar.
I believe Beman's approach is way too pessimistic, and kind of diminishing to the configurations he has no direct access to ;) How is a VC 8.0 failure more critical than, let's say, gcc on Sparc Solaris? Assuming that both have similar tester resources.
They haven't typically had similar resources. We've usually had 2-3 VC 8 testers and maybe one gcc on solaris.
What I believe we need is more objective criteria for what is a "release compiler". Specific requirements on the number of testers and testing turnaround time come to mind.
If one tester with daily turnaround time is agreed upon as the "release compiler" requirement, they both match the criteria and are in equal positions. IMO the more platforms we can report as being tested on - the better.
Sure, but there should be a set of platforms that constitute a standard on which all Boost libraries are required to be functional (modulo compiler bugs that can't be worked around). Otherwise we'll have hugely frustrated users who can't find a single compiler on which the two libraries they care about using will work together.
You may argue that we test on platform P anyway, but this is not a big help for many guys trying to push for Boost acceptance, unless this configuration is mentioned in bold on the release notes page. We just need to send a clear message that it's not promised that all the libraries work on all tested configurations, and that the compiler status page gives the full picture.
It doesn't give a useful picture for users, especially with the expected failure markups turning squares green.
Instead of getting bogged down in trying to define requirements, I think we should simply agree on the 'primary platform list' (essentially what Beman is doing). Don't think of this list as release compilers, just a good cross section of the platforms that provide a) good testing support and
I agree it's simpler. I don't agree it's correct. This list needs constant revision and a collective decision on which compilers are "more important". This is too subjective IMO. For someone, the most important compiler is Intel on AIX and all others are irrelevant. And percentages won't help here either. It's just a straight road to holy wars.
b) high standards compliance -- thus minimizing the typical hackery needed to port to this compiler. Again, no compiler/platform is being excluded from testing and we want library authors and people with an interest in a platform to continue to port at their discretion -- but
I don't believe this is the correct criterion at all. The fact that compiler ABC is bad doesn't stop us from saying "we tested the release against compiler ABC and here are the results". In my opinion it's completely orthogonal. From where I stand, negative results are as good as positive.
Sure, but we also need an objective standard by which to judge whether a library or feature's code is ready for release. "There is a complete suite of tests, and all tests pass on the official set of release compilers" is a good criterion.
I hope by now we'd all agree that bugging authors of new libs to port to VC6 and holding the release is a bad policy because it harms more users than it helps.
Yes. And I never suggested that. My position is that, following XP guidelines, any new library is expected to fail everywhere :0), until it starts working somewhere.
This approach won't work for Boost, for three reasons:
1) We need some baseline criterion by which to determine that a library or feature is ready for release. The above allows arbitrarily broken code to be released.
2) Library authors who want to support the full set of release compilers need to have some way to decide which other libraries to depend on. If they can't rely on dependency libraries working on the compilers they want to support, they will end up reimplementing all the functionality themselves.
3) The trunk (or our primary integration branch) needs to be a valid environment for determining whether code is releasable. If dependency libraries are constantly breaking on the trunk, we won't get useful information about the releasability of libraries that depend on them.
Furthermore, and I apologize in advance because I know this is a religious issue for some, but it's just plain wrongheaded. It's one of the false claims of some XP sects that if code passes all its tests on some platform, it works. Tests are good insurance, but they don't check everything, and one needs to be able to reason about the code and understand why it should work portably (essentially, prove to oneself that it is correct). If you've done that properly, the code will work except where platform bugs interfere. The idea is to choose the release platforms so that these events are sufficiently rare that we can write portable C++ code without expressive restriction.
It's regressions we need to pay some attention to. And to some degree as well. If it becomes unreasonably difficult because failures are frequent, we drop the configuration even if it has enough testing resources. Let's say compiler A on platform P fails more than 50% of tests (including expected failures); then we might decide that regressions detected at the release date aren't worth fixing and we drop the configuration instead.
What does "drop the configuration" mean?
Additionally, it's important to split new failures from regressions. New failures we just mark as expected on the day of release and ignore. At the
Well, this doesn't quite work for me. If a new library can't pass tests on the 'primary platform list' then it needs to be removed from the release because it's not ready.
As I mentioned before, in my opinion it's completely irrelevant. The library is accepted during review, not by the release manager. It's not his call to decide which compilers the library should support.
I agree, but whose call is it? Especially as Boost grows larger, users need to have a simple story about whether Boost works or not on a given platform, so they can understand whether it's worth the trouble, in broad strokes, and whether they can expect support from Boost if they find problems. If we leave it up to every individual library author, we will not be able to make any simple claims about functionality or portability.
Let's say tomorrow we accept a library that employs rvalue references. It passes tests only on some experimental compiler we don't test against for release. I don't see that as a showstopper. We mark the library tests as expected to fail on all compilers for now and move on. The release manager should stop fighting for "world peace". In other words, it's not his job to make sure a library passes its tests. That responsibility lies with the library author. His job is to ensure backward compatibility.
And how will he do that? Run the previous version's tests against this version of Boost?
beginning of each release we can clean up the "expected failure" status from the test and it becomes a new failure again (unless it is fixed, of course). Regressions do need to be fixed before release. And here is the only place where we may go into a grey area. If library A is failing on platform P, while it was working before, what should we do? I personally believe we don't have many cases like this. What I can propose is some kind of tolerance-based approach: 1) If platform P has more than V1% of failures - we name platform P as unsupported. 2) If a library fails on more than V2% of platforms - we revert the library. 3) Otherwise we mark the test as expected and add it into a special section of the release notes (saying this test for library A is now failing on platform P).
Values V1 and V2 we can agree upon.
IMO that's much too fuzzy to allow us to make any broad portability claims.
I don't really object to the idea except that we all know it's more complicated because of library dependencies. For example, date-time is failing on the trunk right now -- its source is unchanged since 1.34 -- so it's some other change that has broken some of the regressions. You can't revert date-time because it hasn't changed. We need to track down the change that's breaking it and revert... of course, that might break some new feature some other lib depends on. The bottom line here is that check-ins that break other libraries won't be tolerated for long...
It's all true, but IMO irrelevant to the subject. Dependency handling and deciding what needs to be rolled back is not simple (at best). But this is an issue we are facing irrespective of which compilers are used for release.
Note that all this can and should be done in one shot on the day of release. No second chances, guys ;)
I don't think it can work like that, sorry.
I do not see why yet. We do need to establish a mechanism for rolling back a library (along with all the libraries it depends on and that depend on it), but I think it should be a rare event, which, however complicated, can and should be done in one shot.
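A rollback mechanism like the one described here amounts to computing the transitive closure of a library over both its dependencies and its dependents. A minimal sketch, using a hypothetical dependency graph (the library names and edges below are made up for illustration):

```python
from collections import deque

# Hypothetical dependency graph: library -> set of libraries it depends on.
deps = {
    "serialization": {"spirit"},
    "date_time": {"serialization"},
    "spirit": set(),
    "regex": set(),
}

def rollback_set(graph, start):
    """All libraries that must be rolled back together with `start`:
    the transitive closure over dependencies AND dependents."""
    # Build the reverse edges once so we can walk dependents as well.
    rdeps = {lib: set() for lib in graph}
    for lib, ds in graph.items():
        for d in ds:
            rdeps[d].add(lib)
    # Breadth-first walk in both directions from the starting library.
    seen, queue = {start}, deque([start])
    while queue:
        lib = queue.popleft()
        for nxt in graph[lib] | rdeps[lib]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```

With the graph above, rolling back "serialization" would also pull in "spirit" (a dependency) and "date_time" (a dependent), while "regex" is untouched; whether dependencies really need to be rolled back along with dependents is a policy question the sketch leaves open.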
It should be a rare event because library authors should never check in code they expect to break on the trunk (or whatever our primary integration branch is). Anyway, what would your criteria for rollback be, if a library is expected to be broken everywhere except where it works? -- Dave Abrahams Boost Consulting http://www.boost-consulting.com

Jeff Garland schrieb:
Let me put it another way. We will continue testing and fixing on all compilers/platforms for which we have volunteers. The difference is that we aren't going to hold the release up to ensure that less standard compliant compilers are all green.
Do you have any evidence that the last release was held up because of less standard compliant compilers? (Note that I already mentioned the following in another mail which didn't make it to the list up to now, so I'm going to repeat it here.) My impression as a long-time regression tester has been that the last release was simply held up by a lack of time and/or interest in getting the release out of the door. I know some people have put a lot of time and effort into the release, but I also noticed (in the first nine months after the release branch was created) that nothing visible happened for weeks (which allowed me to fix a lot of things for the platform I care about), only to witness some small or not so small modification causing major breakage for many platforms. And things would stay like this for a few weeks, before someone seemed to care.
By reducing the number of platforms we can reduce the cycle time for developers to be sure changes are working. This allows us to get new libraries and fixes released to the whole of the Boost community in a more timely fashion. Right now we are essentially delaying releases to provide for a minority of our users.
Because of the things I have written above, I challenge the statement that it's the support for the exotic or less compliant platforms that has held up the release.
And ultimately, these compilers/platforms aren't left out in any way because fixes/ports can go into the next quarterly release. The main thing we need to do at this juncture is 'whatever it takes' to break out of the current release quagmire -- we have important libraries and fixes that have been backed up for over a year now.
I understand that you don't want to repeat the fiasco of the last release, but I think right now you're throwing the baby out with the bath-water. Markus

Markus Schöpflin wrote:
Jeff Garland schrieb:
Let me put it another way. We will continue testing and fixing on all compilers/platforms for which we have volunteers. The difference is that we aren't going to hold the release up to ensure that less standard compliant compilers are all green.
Do you have any evidence that the last release was held up because of less standard compliant compilers?
See my response to Peter -- simply fewer things to get right before the release ships.
(Note that I already mentioned the following in another mail which didn't make it to the list up to now, so I'm going to repeat it here.)
My impression as a long-time regression tester has been that the last release was simply held up by a lack of time and/or interest in getting the release out of the door. I know some people have put a lot of time and effort into the release, but I also noticed (in the first nine months after the release branch was created) that nothing visible happened for weeks (which allowed me to fix a lot of things for the platform I care about), only to witness some small or not so small modification causing major breakage for many platforms. And things would stay like this for a few weeks, before someone seemed to care.
There are many, many reasons behind the 1.34 release delay. Nothing visibly happening while something major is broken simply won't be allowed for 1.35 -- we will revert changes if need be to get this release done in a reasonable time. That includes excluding new libraries that can't pass with the core set of compilers.
By reducing the number of platforms we can reduce the cycle time for developers to be sure changes are working. This allows us to get new libraries and fixes released to the whole of the Boost community in a more timely fashion. Right now we are essentially delaying releases to provide for a minority of our users.
Because of the things I have written above, I challenge the statement that it's the support for the exotic or less compliant platforms that has held up the release.
Keep in mind we have something like 7 or 8 libraries to add into 1.35, since library additions have essentially been disallowed for 1.5 years. We can't let this continue -- we have to catch up and get these libraries available for Boost users. Any failures on 'exotic' platforms slow that process down and ultimately delay the library availability for everyone. If we get quarterly releases going correctly, this whole issue will go away, because the fixes for the platform you care about can be in a release in a few months instead of years.
And ultimately, these compilers/platforms aren't left out in any way because fixes/ports can go into the next quarterly release. The main thing we need to do at this juncture is 'whatever it takes' to break out of the current release quagmire -- we have important libraries and fixes that have been backed up for over a year now.
I understand that you don't want to repeat the fiasco of the last release, but I think right now you're throwing the baby out with the bath-water.
I don't believe so. I think we are making precisely the attitude change we need to get the release process working. The bottom line is that we need to take decisive action with the release process or Boost as a project will fail under its own weight. I'd ask that you keep an open mind and give this process a chance -- if it fails we will try other things, because this really, really needs to be fixed. If the process succeeds I really believe everyone will be better off -- including the folks on exotic platforms. Jeff

On 10/4/07, Beman Dawes <bdawes@acm.org> wrote:
Comments?
GCC on Solaris/x86 and possibly SPARC. I run Solaris/x86 tests daily and they have a very low "platform" failure rate. Tests that fail here generally fail on other platforms as well. I can crank up the frequency if that's a requirement. -- Caleb Epstein

Beman Dawes schrieb: [...]
My candidates for the release criteria compilers are:
* Microsoft VC++ 8.0 on Win32 * Intel 10.0 on Win32 * GCC on Linux * GCC on Darwin
[...] AFAICT, there is no 64-bit platform in this list. IMO, at the very least a 64-bit platform should be included. Markus

Beman Dawes <bdawes@acm.org> writes:
My candidates for the release criteria compilers are:
* Microsoft VC++ 8.0 on Win32 * Intel 10.0 on Win32 * GCC on Linux * GCC on Darwin
Comments?
Can we test both debug and release variants this time round? Anthony -- Anthony Williams Just Software Solutions Ltd - http://www.justsoftwaresolutions.co.uk Registered in England, Company Number 5478976. Registered Office: 15 Carrallack Mews, St Just, Cornwall, TR19 7UL

On 10/4/07, Beman Dawes <bdawes@acm.org> wrote:
[snip]
My candidates for the release criteria compilers are:
* Microsoft VC++ 8.0 on Win32 * Intel 10.0 on Win32 * GCC on Linux * GCC on Darwin
[snip]
Comments?
I believe we should drop the release criteria compilers concept and have recommended compilers instead. Each library maintainer would then choose whether or not to maintain a particular compiler, and document all the compilers he actively maintains. If a new library comes up for review that runs only on some set of compilers, excluding maybe two release compilers, should we not accept it? The most important thing is documenting which compilers are supported by which libraries.
Regards, -- Felipe Magno de Almeida

on Mon Oct 08 2007, "Felipe Magno de Almeida" <felipe.m.almeida-AT-gmail.com> wrote:
On 10/4/07, Beman Dawes <bdawes@acm.org> wrote:
[snip]
My candidates for the release criteria compilers are:
* Microsoft VC++ 8.0 on Win32 * Intel 10.0 on Win32 * GCC on Linux * GCC on Darwin
[snip]
Comments?
I believe we should drop the release criteria compilers concept and have recommended compilers instead. Each library maintainer would then choose whether or not to maintain a particular compiler, and document all the compilers he actively maintains. If a new library comes up for review that runs only on some set of compilers, excluding maybe two release compilers, should we not accept it?
Maybe only conditionally on support of those compilers. If we choose the release criteria platforms properly, they shouldn't present serious portability problems for any library. -- Dave Abrahams Boost Consulting http://www.boost-consulting.com

On 10/16/07, David Abrahams <dave@boost-consulting.com> wrote:
on Mon Oct 08 2007, "Felipe Magno de Almeida" <felipe.m.almeida-AT-gmail.com> wrote:
[snip]
I believe we should drop the release criteria compilers concept and have recommended compilers instead. Each library maintainer would then choose whether or not to maintain a particular compiler, and document all the compilers he actively maintains. If a new library comes up for review that runs only on some set of compilers, excluding maybe two release compilers, should we not accept it?
Maybe only conditionally on support of those compilers. If we choose the release criteria platforms properly, they shouldn't present serious portability problems for any library.
What if the library isn't supposed to work on those? It may be useful anyway.
-- Dave Abrahams Boost Consulting http://www.boost-consulting.com
Best regards, -- Felipe Magno de Almeida

on Tue Oct 16 2007, "Felipe Magno de Almeida" <felipe.m.almeida-AT-gmail.com> wrote:
On 10/16/07, David Abrahams <dave@boost-consulting.com> wrote:
on Mon Oct 08 2007, "Felipe Magno de Almeida" <felipe.m.almeida-AT-gmail.com> wrote:
[snip]
I believe we should drop the release criteria compilers concept and have recommended compilers instead. Each library maintainer would then choose whether or not to maintain a particular compiler, and document all the compilers he actively maintains. If a new library comes up for review that runs only on some set of compilers, excluding maybe two release compilers, should we not accept it?
Maybe only conditionally on support of those compilers. If we choose the release criteria platforms properly, they shouldn't present serious portability problems for any library.
What if the library isn't supposed to work on those? It may be useful anyway.
Sure, but usefulness is not the only criterion we use to decide that a library is ready to be part of a Boost release. A library could even be badly broken on all platforms and still useful. Let's stop dealing in abstracts. Are there any Boost libraries that would have problems passing all tests (except those like the native typeof tests that use nonstandard features) cleanly on all the proposed release platforms? -- Dave Abrahams Boost Consulting http://www.boost-consulting.com

On 10/16/07, David Abrahams <dave@boost-consulting.com> wrote:
[snip]
Sure, but usefulness is not the only criterion we use to decide that a library is ready to be part of a Boost release. A library could even be badly broken on all platforms and still useful.
Let's stop dealing in abstracts. Are there any Boost libraries that would have problems passing all tests (except those like the native typeof tests that use nonstandard features) cleanly on all the proposed release platforms?
Not completely concrete, but I'm thinking about libraries specific to mobiles, and what you already mentioned: ones using nonstandard features. It looks like we are agreeing that libraries that fail on the proposed release platforms could be added to Boost anyway. Isn't it?
-- Dave Abrahams Boost Consulting http://www.boost-consulting.com
Regards, -- Felipe Magno de Almeida

on Tue Oct 16 2007, "Felipe Magno de Almeida" <felipe.m.almeida-AT-gmail.com> wrote:
It looks like we are agreeing that libraries that fail on the proposed release platforms could be added to Boost anyway. Isn't it?
Sorry, I don't understand what you are saying we're agreeing to. -- Dave Abrahams Boost Consulting http://www.boost-consulting.com

On 10/16/07, David Abrahams <dave@boost-consulting.com> wrote:
on Tue Oct 16 2007, "Felipe Magno de Almeida" <felipe.m.almeida-AT-gmail.com> wrote:
It looks like we are agreeing that libraries that fail on the proposed release platforms could be added to Boost anyway. Isn't it?
Sorry, I don't understand what you are saying we're agreeing to.
Sorry, I completely misread what you wrote about usefulness not being the only criterion for inclusion.
-- Dave Abrahams Boost Consulting http://www.boost-consulting.com
Regards, -- Felipe Magno de Almeida
participants (24)
- Anthony Williams
- Beman Dawes
- Boris Gubenko
- Caleb Epstein
- David Abrahams
- Edward Diener
- Felipe Magno de Almeida
- Gennadiy Rozental
- Henrik Sundberg
- Jeff Garland
- Johan Nilsson
- Marcus Lindblom
- Markus Schöpflin
- Markus Schöpflin
- Marshall Clow
- Martin Wille
- Mat Marcus
- Matthias Troyer
- Nicola Musatti
- Peter Dimov
- Prashant Thakre
- Rene Rivera
- Robert Ramey
- Vladimir Prus