
On Sat, 22 May 2004 17:18:04 -0500, Aleksey Gurtovoy wrote:
Anyway, I remember others (Beman) have previously expressed concern about the length of the test cycle.
It is a problem if you are running them on your "primary" machine during the day. I don't think we can do much about it -- just compiling the tests takes about half of the whole cycle's time, and personally I see little value in regression runs that don't at least compile every test.
Well, I think there is. The additional value of compiling and running the exact same test for date_time in the dll and static link version is exactly the sort of thing that could reduce the compile and run times for 'primary machine' testers. I suppose I could start customizing my Jamfile to only run multi-threaded dll on Windows, but now I'm deciding what the regression testers can afford and it stops you (Meta-Comm) from running all the variations -- which I still want to see.
Jeff wrote: Serialization tests alone dramatically increase the length of time to run the regression if we always run the full test. ...snip... Sure, I was just saying that the library author can deal with it on their own -- just make several sections in the bjam file and enable/disable them depending on your current needs.
Sure, but really I'm proposing we turn that around. If the regression tester has the hardware resources to run a torture test with 3 different linking variations then they should be able to do that. As soon as Robert enables the 2.5-hour torture test, regression testers might suddenly have an objection to 'author only' control.
Perhaps. From my view things seem pretty thin already.
If we provide a documented way to set up the whole thing, and post "A Call for Regression Runners", I am sure we'll get some response.
There was some discussion during the last release that some testers had removed the python tests because they were taking too long.
Well, you are right that right now the resources are a little sparse, but IMO it's just because we didn't work on it.
You could be right.
BTW, just to pile on, wouldn't it be nice if we had testing of the sandbox libraries as well? This would really help those new libraries get ported sooner rather than later...
IMO that's asking too much. Many of them never get submitted.
Many are extensions to existing libraries under active development -- likely to get moved to final CVS. I think it would be nice for libraries coming up for review to get the benefit of the regression system. Clearly we might need to subset what gets run and clean out the old stuff, but I think this would smooth the integration of new libraries.
The distinction between "basic" (supposedly what we have now) and "drastic" (supposedly what's coming with serialization) definitely makes sense. I am not arguing against this one, rather against lowering our current standards.
I don't want to lower the current standard either. With the Basic option, however, some current libraries might define a smaller test suite, speeding up the core tests. Of course, if it is impossible to subset, then fine, they could stay where they are now. Those regression sites that have the horsepower to run the torture test with all variations can still go for that option. Of course we will prefer that, but some might choose to run the torture test once per week (say over a weekend) and the regular tests during the week.
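As a rough illustration of the weekday/weekend split described above, here is a minimal Python sketch of how a regression runner might choose a test level. The names `BASIC`, `TORTURE`, `pick_test_level` and the `has_horsepower` flag are hypothetical and not part of any existing Boost tooling; how the chosen level would actually be passed to bjam is left open.

```python
import datetime

# Hypothetical level names, borrowed from the "basic" vs "torture" wording in this thread.
BASIC = "basic"
TORTURE = "torture"


def pick_test_level(day, has_horsepower):
    """Return the test level a regression runner might choose for a given date.

    Assumption: sites with spare resources run the torture test on weekends
    and the basic suite on weekdays; all other sites always run the basic suite.
    """
    if not has_horsepower:
        return BASIC
    # date.weekday(): Monday is 0, Saturday is 5, Sunday is 6.
    return TORTURE if day.weekday() >= 5 else BASIC


if __name__ == "__main__":
    today = datetime.date.today()
    print(today.isoformat(), pick_test_level(today, has_horsepower=True))
```

The point of the sketch is only that the decision sits with the regression runner rather than in each library's Jamfile.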
If the test takes 5 to 6 hours to run for a single compiler we might lose the one contributor we have.
True, if they are forced to run the drastic test, which IMO shouldn't be the case -- it should be entirely up to the regression runner to decide when and if they have the resources to do that.
Well, as soon as Robert wants to run the torture test he's going to get it at all sites if he controls it via his Jamfile. So we need some Boost-wide option to define these variations. Hopefully my other email clarifies the idea.
For a while I'm likely to set up only a single compiler (gcc 3.3.1) on my Mandrake 9 machine. With that approach I should be able to cycle more frequently. Incremental testing is probably a good thing to try out as well.
It produces less reliable results, but the root causes of that need to be tracked down and fixed, so yes, it would be good to start looking into it.
Ok will do... Jeff