
on Sun Oct 07 2007, Gennadiy Rozental <rogeeff-AT-gmail.com> wrote:
Jeff Garland <jeff <at> crystalclearsoftware.com> writes:
Gennadiy Rozental wrote:
I must say I feel very similar.
I believe Beman's approach is way too pessimistic, and kind of diminishing to the configurations he has no direct access to ;) How is a VC 8.0 failure more critical than, say, gcc on SPARC Solaris, assuming both have similar tester resources?
They haven't typically had similar resources. We've usually had 2-3 VC 8 testers and maybe one for gcc on Solaris.
What I believe we need are more objective criteria for what counts as a "release compiler". Specific requirements on the number of testers and on testing turnaround time come to mind.
If one tester with daily turnaround time is agreed upon as the "release compiler" requirement, then both match the criteria and are in equal positions. IMO, the more platforms we can report as being tested on, the better.
Sure, but there should be a set of platforms that constitute a standard on which all Boost libraries are required to be functional (modulo compiler bugs that can't be worked around). Otherwise we'll have hugely frustrated users who can't find a single compiler on which the two libraries they care about using will work together.
You may argue that we test on platform P anyway, but this is not a big help for the many people trying to push for Boost acceptance, unless this configuration is mentioned in bold on the release notes page. We just need to send a clear message that it is not promised that all the libraries work on all tested configurations, and that the compiler status page gives the full picture.
It doesn't give a useful picture for users, especially with the expected failure markups turning squares green.
Instead of getting bogged down in trying to define requirements, I think we should simply agree on the 'primary platform list' (essentially what Beman is doing). Don't think of this list as release compilers, just a good cross-section of the platforms that provide a) good testing support and
I agree it's simpler. I don't agree it's correct. This list needs constant revision and a collective decision on which compilers are "more important". That is too subjective, IMO. For someone, the most important compiler is Intel on AIX and all others are irrelevant. And percentages won't help here either. It's just a straight road to holy wars.
b) high standards compliance -- thus minimizing the typical hackery needed to port to these compilers. Again, no compiler/platform is being excluded from testing, and we want library authors and people with an interest in a platform to continue to port at their discretion -- but
I don't believe this is the correct criterion at all. The fact that compiler ABC is bad doesn't stop us from saying "we tested the release against compiler ABC and here are the results". In my opinion it's completely orthogonal. From where I stand, negative results are as good as positive ones.
Sure, but we also need an objective standard by which to judge whether a library's or feature's code is ready for release. "There is a complete suite of tests, and all tests pass on the official set of release compilers" is a good criterion.
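For concreteness, that criterion can be sketched as a simple predicate. This is only an illustration: the compiler names and the test-result layout below are placeholders, not any actual Boost tooling.

    # Sketch of the "ready for release" criterion: a library is
    # releasable only if every test passes on every compiler in the
    # official release set. The compiler names and the shape of
    # `test_results` are assumptions made for illustration.
    RELEASE_COMPILERS = ["vc-8.0", "gcc-4.1-linux", "gcc-4.0-darwin"]

    def is_releasable(library, test_results):
        # test_results[(library, compiler)] -> list of per-test pass
        # flags; a missing entry counts as a failure.
        return all(
            all(test_results.get((library, compiler), [False]))
            for compiler in RELEASE_COMPILERS
        )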
I hope by now we'd all agree that bugging authors of new libs to port to VC6 and holding the release is a bad policy because it harms more users than it helps.
Yes. And I never suggested that. My position is that, following XP guidelines, any new library is expected to fail everywhere :0) until it starts working somewhere.
This approach won't work for Boost, for three reasons:
1) We need some baseline criterion by which to determine that a library or feature is ready for release. The above allows arbitrarily broken code to be released.
2) Library authors who want to support the full set of release compilers need some way to decide which other libraries to depend on. If they can't rely on dependency libraries working on the compilers they want to support, they will end up reimplementing all the functionality themselves.
3) The trunk (or our primary integration branch) needs to be a valid environment for determining whether code is releasable. If dependency libraries are constantly breaking on the trunk, we won't get useful information about the releasability of libraries that depend on them.
Furthermore, and I apologize in advance because I know this is a religious issue for some, but it's just plain wrongheaded. It's one of the false claims of some XP sects that if code passes all its tests on some platform, it works. Tests are good insurance, but they don't check everything, and one needs to be able to reason about the code and understand why it should work portably (essentially, prove to oneself that it is correct). If you've done that properly, the code will work except where platform bugs interfere. The idea is to choose the release platforms so that these events are sufficiently rare that we can write portable C++ code without expressive restriction.
It's regressions we need to pay some attention to, and even then only to some degree. If a configuration becomes unreasonably difficult, so that failures are frequent, we drop it even if it has enough testing resources. Let's say compiler A on platform P fails more than 50% of the tests (including expected failures); then we might decide that regressions detected at release date aren't worth fixing, and we drop the configuration instead.
What does "drop the configuration" mean?
Additionally, it's important to separate new failures from regressions. New failures we just mark as expected on the day of release and ignore. At the
Well, this doesn't quite work for me. If a new library can't pass tests on the 'primary platform list' then it needs to be removed from the release because it's not ready.
As I mentioned before, in my opinion it's completely irrelevant. A library is accepted during review, not by the release manager. It's not his call to decide which compilers the library should support.
I agree, but whose call is it? Especially as Boost grows larger, users need to have a simple story about whether Boost works or not on a given platform, so they can understand whether it's worth the trouble, in broad strokes, and whether they can expect support from Boost if they find problems. If we leave it up to every individual library author, we will not be able to make any simple claims about functionality or portability.
Let's say tomorrow we accept a library that employs rvalue references. It passes tests only on some experimental compiler we don't test against for release. I don't see that as a showstopper. We mark the library's tests as expected to fail on all compilers for now and move on. The release manager should stop fighting for "world peace". In other words, it's not his job to make sure a library passes its tests. That responsibility lies with the library author. His job is to ensure backward compatibility.
And how will he do that? Run the previous version's tests against this version of Boost?
beginning of each release we can clean up the "expected failure" status from a test, so it becomes a new failure again (unless it's fixed, of course). Regressions do need to be fixed before release, and this is the only place where we may get into a grey area. If library A is failing on platform P, while it was working before, what should we do? I personally believe we don't have many cases like this. What I can propose is some kind of tolerance-based approach:
1) If platform P has more than V1% of failures, we declare platform P unsupported.
2) If a library fails on more than V2% of platforms, we revert the library.
3) Otherwise we mark the test as expected and add it to a special section of the release notes (saying that this test for library A is now failing on platform P).
Values V1 and V2 we can agree upon.
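A sketch of how those tolerance rules might be applied mechanically follows; it also subsumes the earlier "more than 50% of tests fail, drop the configuration" rule, with V1 = 50. The data layout and the threshold values are assumptions made for illustration, not any existing Boost tooling.

    # Tolerance-based triage, as proposed above.
    # `results[platform][library]` holds (failed, total) test counts;
    # the layout and the threshold values are assumed.
    V1 = 50.0  # % of failures above which a platform is unsupported
    V2 = 30.0  # % of platforms above which a failing library is reverted

    def failure_rate(failed, total):
        return 100.0 * failed / total if total else 0.0

    def triage(results):
        unsupported, reverted, expected = [], [], []
        # Rule 1: drop platforms whose overall failure rate exceeds V1.
        for platform, libs in results.items():
            failed = sum(f for f, _ in libs.values())
            total = sum(t for _, t in libs.values())
            if failure_rate(failed, total) > V1:
                unsupported.append(platform)
        kept = {p: libs for p, libs in results.items()
                if p not in unsupported}
        # Rule 2: revert libraries failing on more than V2% of the
        # remaining platforms.
        for lib in {lib for libs in kept.values() for lib in libs}:
            failing = [p for p, libs in kept.items()
                       if lib in libs and libs[lib][0] > 0]
            if failure_rate(len(failing), len(kept)) > V2:
                reverted.append(lib)
            else:
                # Rule 3: remaining failures are marked "expected" and
                # listed in a special section of the release notes.
                expected.extend((lib, p) for p in failing)
        return unsupported, reverted, expected

Whether expected failures count against a platform, as suggested above, is then just a question of which counts go into `results`.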
IMO that's much too fuzzy to allow us to make any broad portability claims.
I don't really object to the idea, except that we all know it's more complicated because of library dependencies. For example, date-time is failing on the trunk right now -- its source is unchanged since 1.34 -- so it's some other change that has broken some of the regressions. You can't revert date-time because it hasn't changed. We need to track down the change that's breaking it and revert it... of course, that might break some new feature some other lib depends on. The bottom line here is that check-ins that break other libraries won't be tolerated for long...
It's all true, but IMO irrelevant to the subject. Dependency handling and deciding what needs to be rolled back is not simple (at best). But this is an issue we face irrespective of which compilers are used for the release.
Note that all this can and should be done in one shot on the day of release. No second chances, guys ;)
I don't think it can work like that, sorry.
I do not yet see why. We do need to establish a mechanism for rolling back a library (along with all the libraries it depends on, and those that depend on it), but I think that should be a rare event which, however complicated, can and should be done in one shot.
It should be a rare event because library authors should never check in code they expect to break on the trunk (or whatever our primary integration branch is). Anyway, what would your criteria for rollback be, if a library is expected to be broken everywhere except where it works?

--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com