
on Sun Oct 07 2007, Gennadiy Rozental <rogeeff-AT-gmail.com> wrote:
Jeff Garland <jeff <at> crystalclearsoftware.com> writes:
Gennadiy Rozental wrote:
I must say I feel very similar.
I believe Beman's approach is way too pessimistic, and kind of diminishing to the configurations he has no direct access to ;) How is a VC 8.0 failure more critical than, say, gcc on SPARC Solaris, assuming both have similar tester resources?
They haven't typically had similar resources. We've usually had 2-3 VC 8 testers and maybe one for gcc on Solaris.
What I believe we need are more objective criteria for what counts as a "release compiler". Specific requirements on the number of testers and on testing turnaround time come to mind.
If one tester with daily turnaround time is agreed upon as the "release compiler" requirement, then both match the criteria and are in equal positions. IMO, the more platforms we can report as being tested on, the better.
Sure, but there should be a set of platforms that constitute a standard on which all Boost libraries are required to be functional (modulo compiler bugs that can't be worked around). Otherwise we'll have hugely frustrated users who can't find a single compiler on which the two libraries they care about using will work together.
You may argue that we test on platform P anyway, but this is not a big help for the many people trying to push for Boost acceptance, unless this configuration is mentioned in bold on the release notes page. We just need to send a clear message that it is not promised that all the libraries work on all tested configurations, and that the compiler status page gives the full picture.
It doesn't give a useful picture for users, especially with the expected failure markups turning squares green.
Instead of getting bogged down in trying to define requirements, I think we should simply agree on the 'primary platform list' (essentially what Beman is doing). Don't think of this list as release compilers, just a good cross-section of the platforms that provide a) good testing support and
I agree it's simpler. I don't agree it's correct. This list needs constant revision and a collective decision on which compilers are "more important". That is too subjective, IMO. For someone, the most important compiler is Intel on AIX and all others are irrelevant. And percentages won't help here either. It's just a straight road to holy wars.
b) high standards compliance -- thus minimizing the typical hackery needed to port to these compilers. Again, no compiler/platform is being excluded from testing, and we want library authors and people with an interest in a platform to continue to port at their discretion -- but
I don't believe this is the correct criterion at all. The fact that compiler ABC is bad doesn't stop us from saying "we tested the release against compiler ABC and here are the results". In my opinion it's completely orthogonal. From where I stand, negative results are as good as positive ones.
Sure, but we also need an objective standard by which to judge whether a library's or feature's code is ready for release. "There is a complete suite of tests, and all tests pass on the official set of release compilers" is a good criterion.
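For concreteness, that criterion can be sketched as a simple predicate. This is only an illustration: the compiler names and the test-result layout below are placeholders, not any actual Boost tooling.

    # Sketch of the "ready for release" criterion: a library is
    # releasable only if every test passes on every compiler in the
    # official release set. The compiler names and the shape of
    # `test_results` are assumptions made for illustration.
    RELEASE_COMPILERS = ["vc-8.0", "gcc-4.1-linux", "gcc-4.0-darwin"]

    def is_releasable(library, test_results):
        # test_results[(library, compiler)] -> list of per-test pass
        # flags; a missing entry counts as a failure.
        return all(
            all(test_results.get((library, compiler), [False]))
            for compiler in RELEASE_COMPILERS
        )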
I hope by now we'd all agree that bugging authors of new libs to port to VC6 and holding the release is a bad policy because it harms more users than it helps.
Yes. And I never suggested that. My position is that, following XP guidelines, any new library is expected to fail everywhere :0) until it starts working somewhere.
This approach won't work for Boost, for three reasons:
1) We need some baseline criterion by which to determine that a library or feature is ready for release. The above allows arbitrarily broken code to be released.
2) Library authors who want to support the full set of release compilers need some way to decide which other libraries to depend on. If they can't rely on dependency libraries working on the compilers they want to support, they will end up reimplementing all the functionality themselves.
3) The trunk (or our primary integration branch) needs to be a valid environment for determining whether code is releasable. If dependency libraries are constantly breaking on the trunk, we won't get useful information about the releasability of libraries that depend on them.
Furthermore, and I apologize in advance because I know this is a religious issue for some, but it's just plain wrongheaded. It's one of the false claims of some XP sects that if code passes all its tests on some platform, it works. Tests are good insurance, but they don't check everything, and one needs to be able to reason about the code and understand why it should work portably (essentially, prove to oneself that it is correct). If you've done that properly, the code will work except where platform bugs interfere. The idea is to choose the release platforms so that these events are sufficiently rare that we can write portable C++ code without expressive restriction.
It's regressions we need to pay some attention to, and even then only to some degree. If a configuration becomes unreasonably difficult, so that failures are frequent, we drop it even if it has enough testing resources. Let's say compiler A on platform P fails more than 50% of the tests (including expected failures); then we might decide that regressions detected at release date aren't worth fixing, and we drop the configuration instead.
What does "drop the configuration" mean?
Additionally, it's important to separate new failures from regressions. New failures we just mark as expected on the day of release and ignore. At the
Well, this doesn't quite work for me. If a new library can't pass tests on the 'primary platform list' then it needs to be removed from the release because it's not ready.
As I mentioned before, in my opinion it's completely irrelevant. A library is accepted during review, not by the release manager. It's not his call to decide which compilers the library should support.
I agree, but whose call is it? Especially as Boost grows larger, users need to have a simple story about whether Boost works or not on a given platform, so they can understand whether it's worth the trouble, in broad strokes, and whether they can expect support from Boost if they find problems. If we leave it up to every individual library author, we will not be able to make any simple claims about functionality or portability.
Let's say tomorrow we accept a library that employs rvalue references. It passes tests only on some experimental compiler we don't test against for release. I don't see that as a showstopper. We mark the library's tests as expected to fail on all compilers for now and move on. The release manager should stop fighting for "world peace". In other words, it's not his job to make sure a library passes its tests. That responsibility lies with the library author. His job is to ensure backward compatibility.
And how will he do that? Run the previous version's tests against this version of Boost?
beginning of each release we can clean up the "expected failure" status from a test, so it becomes a new failure again (unless it's fixed, of course). Regressions do need to be fixed before release, and this is the only place where we may get into a grey area. If library A is failing on platform P, while it was working before, what should we do? I personally believe we don't have many cases like this. What I can propose is some kind of tolerance-based approach:
1) If platform P has more than V1% of failures, we declare platform P unsupported.
2) If a library fails on more than V2% of platforms, we revert the library.
3) Otherwise we mark the test as expected and add it to a special section of the release notes (saying that this test for library A is now failing on platform P).
Values V1 and V2 we can agree upon.
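A sketch of how those tolerance rules might be applied mechanically follows; it also subsumes the earlier "more than 50% of tests fail, drop the configuration" rule, with V1 = 50. The data layout and the threshold values are assumptions made for illustration, not any existing Boost tooling.

    # Tolerance-based triage, as proposed above.
    # `results[platform][library]` holds (failed, total) test counts;
    # the layout and the threshold values are assumed.
    V1 = 50.0  # % of failures above which a platform is unsupported
    V2 = 30.0  # % of platforms above which a failing library is reverted

    def failure_rate(failed, total):
        return 100.0 * failed / total if total else 0.0

    def triage(results):
        unsupported, reverted, expected = [], [], []
        # Rule 1: drop platforms whose overall failure rate exceeds V1.
        for platform, libs in results.items():
            failed = sum(f for f, _ in libs.values())
            total = sum(t for _, t in libs.values())
            if failure_rate(failed, total) > V1:
                unsupported.append(platform)
        kept = {p: libs for p, libs in results.items()
                if p not in unsupported}
        # Rule 2: revert libraries failing on more than V2% of the
        # remaining platforms.
        for lib in {lib for libs in kept.values() for lib in libs}:
            failing = [p for p, libs in kept.items()
                       if lib in libs and libs[lib][0] > 0]
            if failure_rate(len(failing), len(kept)) > V2:
                reverted.append(lib)
            else:
                # Rule 3: remaining failures are marked "expected" and
                # listed in a special section of the release notes.
                expected.extend((lib, p) for p in failing)
        return unsupported, reverted, expected

Whether expected failures count against a platform, as suggested above, is then just a question of which counts go into `results`.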
IMO that's much too fuzzy to allow us to make any broad portability claims.
I don't really object to the idea, except that we all know it's more complicated because of library dependencies. For example, date-time is failing on the trunk right now -- its source is unchanged since 1.34 -- so it's some other change that has broken some of the regressions. You can't revert date-time because it hasn't changed. We need to track down the change that's breaking it and revert it... of course, that might break some new feature some other lib depends on. The bottom line here is that check-ins that break other libraries won't be tolerated for long...
It's all true, but IMO irrelevant to the subject. Dependency handling and deciding what needs to be rolled back is not simple (at best). But this is an issue we face irrespective of which compilers are used for the release.
Note that all this can and should be done in one shot on the day of release. No second chances, guys ;)
I don't think it can work like that, sorry.
I do not yet see why. We do need to establish a mechanism for rolling back a library (along with all the libraries it depends on, and those that depend on it), but I think that should be a rare event which, however complicated, can and should be done in one shot.
It should be a rare event because library authors should never check in code they expect to break on the trunk (or whatever our primary integration branch is). Anyway, what would your criteria for rollback be, if a library is expected to be broken everywhere except where it works?

--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com