Feedback requested: Demo build server

Hi all,
You may have noticed the discussion going on re: developer feedback systems. As part of this I have just started setting up a demo build server at:
http://pulse.zutubi.com/
At this stage it is an early demonstration of the server (Pulse) building the boost trunk. There is only one machine, but for illustration purposes I am running both the master and one agent on it. This shows the ability to build on multiple agents in parallel (to test the different OS/compiler combos). In this case I am testing two different gcc versions (the easiest to setup for the demo).
You will notice that the build fails, with both a bunch of errors coming out and test failures. This isn't so bad for the demo since it shows how these things are reported. In reality work would need to be done to either fix the problems or have Pulse recognise them as expected failures.
You might also notice that Pulse is kicking off a build when it detects any change, and shows the change information (also linked to Trac for diff views etc). This should keep the machine busy, since a build takes over 2 hours (partly because two builds are running in parallel, but mostly just because the build takes that long). Perhaps there is a shorter build/test cycle that should be run on every change for faster feedback.
On the subject of feedback, you may also want to try creating an account so you can log in. Just click the login link (top right corner) and you will see a link to sign up. It is best to choose your user name to match your Subversion user name, as then Pulse can tell which changes are yours. Once signed up you get a dashboard view along with preferences that allow you to sign up for email notifications.
It would be great if people could take a look and let me know:
1) If you think this is useful and worth continuing with.
2) What important features you think are currently missing.
3) How some of the errors/failing tests can be resolved.
Thanks,
Jason

Jason Sankey wrote:
Hi all,
You may have noticed the discussion going on re: developer feedback systems. As part of this I have just started setting up a demo build server at:
At this stage it is an early demonstration of the server (Pulse) building the boost trunk. There is only one machine, but for illustration purposes I am running both the master and one agent on it. This shows the ability to build on multiple agents in parallel (to test the different OS/compiler combos). In this case I am testing two different gcc versions (the easiest to setup for the demo).
You will notice that the build fails, with both a bunch of errors coming out and test failures. This isn't so bad for the demo since it shows how these things are reported. In reality work would need to be done to either fix the problems or have Pulse recognise them as expected failures.
You might also notice that Pulse is kicking off a build when it detects any change, and shows the change information (also linked to Trac for diff views etc). This should keep the machine busy, since a build takes over 2 hours (partly because two builds are running in parallel, but mostly just because the build takes that long). Perhaps there is a shorter build/test cycle that should be run on every change for faster feedback.
On the subject of feedback, you may also want to try creating an account so you can log in. Just click the login link (top right corner) and you will see a link to sign up. It is best to choose your user name to match your Subversion user name, as then Pulse can tell which changes are yours. Once signed up you get a dashboard view along with preferences that allow you to sign up for email notifications.
It would be great if people could take a look and let me know:
1) If you think this is useful and worth continuing with.
Definitely.
2) What important features you think are currently missing.
I found navigation a tad tricky at first: it would be useful if I could get to "all the failures for library X" in the last build directly and easily. As it is I had to drill down into the detail report and then scroll through until I found what I was after. Or maybe I'm missing something obvious? :-) Can you explain more about the "My Builds" page and what it offers, as it looks interesting? Nice work, John Maddock.

John Maddock wrote:
Jason Sankey wrote:
Hi all,
You may have noticed the discussion going on re: developer feedback systems. As part of this I have just started setting up a demo build server at:
<snip>
1) If you think this is useful and worth continuing with.
Definitely.
Thanks for the vote of confidence :).
2) What important features you think are currently missing.
I found navigation a tad tricky at first: it would be useful if I could get to "all the failures for library X" in the last build directly and easily. As it is I had to drill down into the detail report and then scroll through until I found what I was after. Or maybe I'm missing something obvious? :-)
You are not missing anything, the drilling is necessary at the moment. This is largely because of the structure of boost, where all libraries are built in one hit. Hence they are all currently in one Pulse project, and the result set is large. One possible solution is to have a view of tests organised per suite (in Pulse terminology: each library is grouped into one suite) rather than the current view which is organised by stage (another Pulse term which refers to a single run on a specific agent). It is also possible to capture HTML reports and serve them up via Pulse, so the build process itself could generate different types of reports as necessary. Pulse supports permalinks to such artifacts so you could always find the latest report at a known URL.
Can you explain more about the "My Builds" page and what it offers, as it looks interesting?
This page shows you the results of your "personal" builds. Personal builds are a way for you to test outstanding changes before you commit them to subversion. Basically, you install a Pulse command line client on your dev box. Then, when you have a change ready to commit, instead of running "svn commit" you run "pulse personal" and your change is packed into a zip and uploaded to the Pulse server. Pulse checks out the latest source, applies the changes and runs a regular build and test (possibly on multiple agents). If the build passes you can commit with confidence; if it fails you can fix it and no other developers have been affected. This is a feature that really comes into its own when you need to test on multiple environments. Developers usually don't have an easy way to test all environments, so would usually just commit and find they have broken some other platform later. IMO it would be a great tool for boost developers, but there would be challenges: 1) Having enough machines available to cope with the load of both regular and personal builds. 2) The long build time. This could be mitigated if just the library of interest was built, which is possible if we configure recipes in Pulse for each library.
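For illustration, here is a minimal Python sketch of that pre-commit workflow (not an official Pulse script; it assumes the Pulse command line client is installed as 'pulse' and that you check the personal build result on the "My Builds" page before answering the prompt):
#!/usr/bin/env python
"""Sketch only: submit a personal build, then commit once it is green."""
import subprocess
import sys

def run(cmd):
    print(">", " ".join(cmd))
    return subprocess.call(cmd)

def main():
    # 1. Package the outstanding working-copy changes and upload them to the
    #    Pulse server, which applies them to a fresh checkout and builds.
    if run(["pulse", "personal"]) != 0:
        sys.exit("personal build submission failed; not committing")
    # 2. Commit only once the personal build has come back green
    #    (check the result on the "My Builds" page or via notifications).
    if input("Did the personal build pass? [y/N] ").lower().startswith("y"):
        run(["svn", "commit"])
    else:
        print("Fix the failures and resubmit before committing.")

if __name__ == "__main__":
    main()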
Nice work, John Maddock.
Cheers, Jason

Jason Sankey wrote:
John Maddock wrote:
Jason Sankey wrote:
Hi all,
You may have noticed the discussion going on re: developer feedback systems. As part of this I have just started setting up a demo build server at:
<snip>
1) If you think this is useful and worth continuing with. Definitely.
Thanks for the vote of confidence :).
+1 -- this looks very cool.
2) What important features you think are currently missing. I found navigation a tad tricky at first: it would be useful if I could get to "all the failures for library X" in the last build directly and easily. As it is I had to drill down into the detail report and then scroll through until I found what I was after. Or maybe I'm missing something obvious? :-)
You are not missing anything, the drilling is necessary at the moment. This is largely because of the structure of boost, where all libraries are built in one hit. Hence they are all currently in one Pulse project, and the result set is large. One possible solution is to have a view of tests organised per suite (in Pulse terminology: each library is grouped into one suite) rather than the current view which is organised by stage (another Pulse term which refers to a single run on a specific agent). It is also possible to capture HTML reports and serve them up via Pulse, so the build process itself could generate different types of reports as necessary. Pulse supports permalinks to such artifacts so you could always find the latest report at a known URL.
Maybe there's a way the tests could be broken down library by library? In case you don't already know, you can test a single library by going to the subdirectory (e.g. libs/date_time/test) and running bjam. I expect it might take some scripting, but you're obviously doing some to set all this up anyway. That way each library could show up as a 'success/fail' line in the table, and we'd also get a measurement of how long it took to build/run the tests for each library. This would be huge...
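As a rough sketch of the kind of scripting that might involve (assuming bjam is on the PATH and the script is run from the root of a Boost checkout; the toolset and output format are just examples):
#!/usr/bin/env python
"""Sketch only: run bjam in each libs/<name>/test directory and print a
per-library pass/fail line plus the elapsed time."""
import os
import subprocess
import time

BOOST_ROOT = os.getcwd()
TOOLSET = "gcc"  # example; use whatever the test machine has installed

def libraries_with_tests():
    libs_dir = os.path.join(BOOST_ROOT, "libs")
    for name in sorted(os.listdir(libs_dir)):
        test_dir = os.path.join(libs_dir, name, "test")
        if os.path.isdir(test_dir):
            yield name, test_dir

def main():
    for name, test_dir in libraries_with_tests():
        start = time.time()
        status = subprocess.call(["bjam", "toolset=" + TOOLSET], cwd=test_dir)
        elapsed = time.time() - start
        print("%-24s %s  (%.0fs)" % (name, "pass" if status == 0 else "FAIL", elapsed))

if __name__ == "__main__":
    main()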
Can you explain more about the "My Builds" page and what it offers, as it looks interesting?
This page shows you the results of your "personal" builds. Personal builds are a way for you to test outstanding changes before you commit them to subversion. Basically, you install a Pulse command line client ...snip..
This is a feature that really comes into its own when you need to test on multiple environments. Developers usually don't have an easy way to test all environments, so would usually just commit and find they have broken some other platform later. IMO it would be a great tool for boost developers, but there would be challenges:
1) Having enough machines available to cope with the load of both regular and personal builds. 2) The long build time. This could be mitigated if just the library of interest was built, which is possible if we configure recipes in Pulse for each library.
We've been in need of such a mechanism for a very long time. We're trying to move to a place where developers only check in 'release-ready' code -- even though most of us have been trying to do that, the reality is that with so many platforms it's really easy to break something. We're also getting more and more libs that provide a layer over OS services -- so we need more platforms to support developers trying to port these libs. The bottom line is that it would be a major breakthrough for boost to have an 'on-demand' ability to run a test on a particular platform with developer code before it gets checked in. Ideally we'd be able to do this on a single library -- that might also help with the resources. Jeff

Jeff Garland wrote:
Jason Sankey wrote:
John Maddock wrote:
Jason Sankey wrote:
Hi all,
You may have noticed the discussion going on re: developer feedback systems. As part of this I have just started setting up a demo build server at: <snip>
1) If you think this is useful and worth continuing with. Definitely. Thanks for the vote of confidence :).
+1 -- this looks very cool.
Excellent!
2) What important features you think are currently missing. I found navigation a tad tricky at first: it would be useful if I could get to "all the failures for library X" in the last build directly and easily. As it is I had to drill down into the detail report and then scroll through until I found what I was after. Or maybe I'm missing something obvious? :-) You are not missing anything, the drilling is necessary at the moment. This is largely because of the structure of boost, where all libraries are built in one hit. Hence they are all currently in one Pulse project, and the result set is large. One possible solution is to have a view of tests organised per suite (in Pulse terminology: each library is grouped into one suite) rather than the current view which is organised by stage (another Pulse term which refers to a single run on a specific agent). It is also possible to capture HTML reports and serve them up via Pulse, so the build process itself could generate different types of reports as necessary. Pulse supports permalinks to such artifacts so you could always find the latest report at a known URL.
Maybe there's a way the tests could be broken down library by library? In case you don't already know, you can test a single library by going to the subdirectory (e.g. libs/date_time/test) and running bjam. I expect it might take some scripting, but you're obviously doing some to set all this up anyway. That way each library could show up as a 'success/fail' line in the table, and we'd also get a measurement of how long it took to build/run the tests for each library. This would be huge...
It had occurred to me that this may be a better approach, although I have not experimented with it yet. One thing I am uncertain of is the dependencies between libraries and what it would mean for building them separately. Overall I think breaking it down this way would be much better if dependencies are manageable.
Can you explain more about the "My Builds" page and what it offers, as it looks interesting? This page shows you the results of your "personal" builds. Personal builds are a way for you to test outstanding changes before you commit them to subversion. Basically, you install a Pulse command line client ...snip..
This is a feature that really comes into its own when you need to test on multiple environments. Developers usually don't have an easy way to test all environments, so would usually just commit and find they have broken some other platform later. IMO it would be a great tool for boost developers, but there would be challenges:
1) Having enough machines available to cope with the load of both regular and personal builds. 2) The long build time. This could be mitigated if just the library of interest was built, which is possible if we configure recipes in Pulse for each library.
We've been in need of such a mechanism for a very long time. We're trying to move to a place where developers only check in 'release-ready' code -- even though most of us have been trying to do that, the reality is that with so many platforms it's really easy to break something. We're also getting more and more libs that provide a layer over OS services -- so we need more platforms to support developers trying to port these libs. The bottom line is that it would be a major breakthrough for boost to have an 'on-demand' ability to run a test on a particular platform with developer code before it gets checked in. Ideally we'd be able to do this on a single library -- that might also help with the resources.
Right, this is exactly what I am talking about. In my experience without this facility the less popular platforms end up being constantly broken which is very frustrating for the people working on those platforms. Cheers, Jason

on Fri Sep 14 2007, Jeff Garland <jeff-AT-crystalclearsoftware.com> wrote:
Maybe there's a way the tests could be broken down library by library? In case you don't already know, you can test a single library by going to the subdirectory (e.g. libs/date_time/test) and running bjam. I expect it might take some scripting
Yes... specifically, the current regression.py is set up to test everything, and running bjam by itself doesn't generate any XML. But I don't think it should be a huge challenge to stitch together a bjam invocation and process_jam_log (which generates the XML) yourself. I'm not sure of the current status, but we *do* have a project in place that will have bjam invocations generating XML directly. Rene? -- Dave Abrahams Boost Consulting http://www.boost-consulting.com The Astoria Seminar ==> http://www.astoriaseminar.com
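As a sketch of that stitching (the exact flags and the location of the process_jam_log binary vary between Boost versions, so treat the command lines below as assumptions to check against tools/regression in the tree being tested):
#!/usr/bin/env python
"""Sketch only: run bjam for one library, capture its output, and feed the
log to process_jam_log so it can write the test_log.xml files the reporting
tools consume."""
import subprocess

TEST_DIR = "libs/date_time/test"     # example library under test
LOG_FILE = "bjam.log"
PROCESS_JAM_LOG = "process_jam_log"  # assumed to be built and on the PATH

def main():
    # Capture the jam output that process_jam_log parses; --dump-tests makes
    # bjam emit the test declarations the log processor needs.
    with open(LOG_FILE, "w") as log:
        subprocess.call(["bjam", "--dump-tests", "toolset=gcc"],
                        cwd=TEST_DIR, stdout=log, stderr=subprocess.STDOUT)
    # Feed the captured log to process_jam_log on stdin.
    with open(LOG_FILE) as log:
        subprocess.call([PROCESS_JAM_LOG], stdin=log)

if __name__ == "__main__":
    main()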

Jason - Couple other thoughts. If you haven't already you should probably subscribe to the boost-testing list -- much of the discussion of testing tools goes on there. Rene Rivera is also working in this area -- see: http://beta.boost.org:8081/Boost_HEAD/Dashboard/ Don't know if you should combine efforts...or try different things, but in any case you should be aware that there is other activity on the build/test/dashboard front... Jeff

Jeff Garland wrote:
Couple other thoughts. If you haven't already you should probably subscribe to the boost-testing list -- much of the discussion of testing tools goes on there.
Thanks for the tip, I will subscribe to this list also.
Rene Rivera is also working in this area -- see:
http://beta.boost.org:8081/Boost_HEAD/Dashboard/
Don't know if you should combine efforts...or try different things, but in any case you should be aware that there is other activity on the build/test/dashboard front...
I had noticed from Rene's recent mail that there was some more active work going on. I had not realised this originally, so perhaps I am duplicating effort here. At this early stage I guess it is fair enough to experiment and see what works. Depending on how much overlap there is it may make sense to combine forces. Cheers, Jason

on Fri Sep 14 2007, Jason Sankey <jason-AT-zutubi.com> wrote:
Hi all,
You may have noticed the discussion going on re: developer feedback systems. As part of this I have just started setting up a demo build server at:
At this stage it is an early demonstration of the server (Pulse) building the boost trunk. There is only one machine, but for illustration purposes I am running both the master and one agent on it. This shows the ability to build on multiple agents in parallel (to test the different OS/compiler combos). In this case I am testing two different gcc versions (the easiest to setup for the demo).
Very cool to see it working. Sorry it's taken me so long to respond.
You will notice that the build fails, with both a bunch of errors coming out and test failures. This isn't so bad for the demo since it shows how these things are reported. In reality work would need to be done to either fix the problems or have Pulse recognise them as expected failures.
Right.
You might also notice that Pulse is kicking off a build when it detects any change, and shows the change information (also linked to Trac for diff views etc). This should keep the machine busy, since a build takes over 2 hours (partly because two builds are running in parallel, but mostly just because the build takes that long). Perhaps there is a shorter build/test cycle that should be run on every change for faster feedback.
I don't know how you're invoking the build, but if you're using regression.py, there is an --incremental flag you can pass that avoids rebuilding things whose dependencies haven't changed.
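For example, an invocation along these lines might work (only --incremental is taken from the message above; the other options shown are assumptions to check against the regression.py documentation):
#!/usr/bin/env python
"""Sketch only: drive regression.py with incremental rebuilds enabled."""
import subprocess

subprocess.call([
    "python", "regression.py",  # the regression driver; location per the runner docs
    "--runner=pulse-demo",      # assumed option: a name identifying this test runner
    "--toolsets=gcc",           # assumed option: which compilers to exercise
    "--incremental",            # skip rebuilding targets whose dependencies are unchanged
])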
On the subject of feedback, you may also want to try creating an account so you can log in. Just click the login link (top right corner) and you will see a link to sign up. It is best to choose your user name to match your Subversion user name, as then Pulse can tell which changes are yours. Once signed up you get a dashboard view along with preferences that allow you to sign up for email notifications.
Oh, that is awesome! I chose a different user name from my svn name, but then added an alias. Will that work?
It would be great if people could take a look and let me know:
1) If you think this is useful and worth continuing with.
Definitely worth continuing with. I don't think it's useful yet, but if you continue it will be.
2) What important features you think are currently missing.
Integration with the XML failure markup is the most crucial thing.
3) How some of the errors/failing tests can be resolved.
Not connected to the 'net as I write this; I might be able to look later. -- Dave Abrahams Boost Consulting http://www.boost-consulting.com The Astoria Seminar ==> http://www.astoriaseminar.com

David Abrahams wrote:
on Fri Sep 14 2007, Jason Sankey <jason-AT-zutubi.com> wrote:
You may have noticed the discussion going on re: developer feedback systems. As part of this I have just started setting up a demo build server at:
At this stage it is an early demonstration of the server (Pulse) building the boost trunk. There is only one machine, but for illustration purposes I am running both the master and one agent on it. This shows the ability to build on multiple agents in parallel (to test the different OS/compiler combos). In this case I am testing two different gcc versions (the easiest to setup for the demo).
Very cool to see it working. Sorry it's taken me so long to respond.
OK, I thought for a bit that enthusiasm had been lost. There were a couple of quick and positive responses, though, and I'm glad you got a chance to take a look too. <snip>
You might also notice that Pulse is kicking off a build when it detects any change, and shows the change information (also linked to Trac for diff views etc). This should keep the machine busy, since a build takes over 2 hours (partly because two builds are running in parallel, but mostly just because the build takes that long). Perhaps there is a shorter build/test cycle that should be run on every change for faster feedback.
I don't know how you're invoking the build, but if you're using regression.py, there is an --incremental flag you can pass that avoids rebuilding things whose dependencies haven't changed.
I am actually invoking things directly using Pulse. Pulse checks out the source from svn and I use Pulse commands to run the build, in a similar way to how other testing scripts appear to work:
http://pulse.zutubi.com/viewBuildFile.action?id=1015903
I had some trouble figuring out the latest and best way to run tests, but this seems to work.
The default Pulse behaviour is to do a full clean checkout and build. However, there is an option to switch to incremental builds, where the same working copy is used for every build after an svn update to the desired revision. The reason I steered clear is that I noticed a mention somewhere in the regression testing documentation that incremental builds were not 100% reliable.
As suggested elsewhere, breaking things down library by library would also help. I have noticed a bit of discussion going around about this lately, and have to say that I think it would be very helpful for integration with Pulse. Apart from faster builds, it would also make it easier to see the status of each library if it were a top-level Pulse project, and developers could then subscribe to email/jabber/RSS notifications for just the libraries they are interested in.
On the subject of feedback, you may also want to try creating an account so you can log in. Just click the login link (top right corner) and you will see a link to sign up. It is best to choose your user name to match your Subversion user name, as then Pulse can tell which changes are yours. Once signed up you get a dashboard view along with preferences that allow you to sign up for email notifications.
Oh, that is awesome! I chose a different user name from my svn name, but then added an alias. Will that work?
Glad you like it :). Using an alias will work fine, that is what aliases were added for.
It would be great if people could take a look and let me know:
1) If you think this is useful and worth continuing with.
Definitely worth continuing with. I don't think it's useful yet, but if you continue it will be.
OK, sounds good.
2) What important features you think are currently missing.
Integration with the XML failure markup is the most crucial thing.
OK. I need to understand these a bit better before I continue. I am not sure at what stage in the process these normally take effect. I guess a lot of the failures I am getting now are actually known and included in this markup? I need to find some time to dig into this.
3) How some of the errors/failing tests can be resolved.
Not connected to the 'net as I write this; I might be able to look later.
OK, thanks. Getting to a state where a normal build is green will make things a whole lot more useful. Cheers, Jason

on Wed Sep 26 2007, Jason Sankey <jason-AT-zutubi.com> wrote:
David Abrahams wrote:
on Fri Sep 14 2007, Jason Sankey <jason-AT-zutubi.com> wrote:
You may have noticed the discussion going on re: developer feedback systems. As part of this I have just started setting up a demo build server at:
At this stage it is an early demonstration of the server (Pulse) building the boost trunk. There is only one machine, but for illustration purposes I am running both the master and one agent on it. This shows the ability to build on multiple agents in parallel (to test the different OS/compiler combos). In this case I am testing two different gcc versions (the easiest to setup for the demo).
Very cool to see it working. Sorry it's taken me so long to respond.
OK, I thought for a bit that enthusiasm had been lost. There were a couple of quick and positive responses, though, and I'm glad you got a chance to take a look too.
Yeah, sorry -- things have been a little crazy over here.
You might also notice that Pulse is kicking off a build when it detects any change, and shows the change information (also linked to Trac for diff views etc). This should keep the machine busy, since a build takes over 2 hours (partly because two builds are running in parallel, but mostly just because the build takes that long). Perhaps there is a shorter build/test cycle that should be run on every change for faster feedback.
I don't know how you're invoking the build, but if you're using regression.py, there is an --incremental flag you can pass that avoids rebuilding things whose dependencies haven't changed.
I am actually invoking things directly using Pulse. Pulse checks out the source from svn and I use Pulse commands to run the build, in a similar way to how other testing scripts appear to work:
http://pulse.zutubi.com/viewBuildFile.action?id=1015903
I had some trouble figuring out the latest and best way to run tests, but this seems to work.
Seems OK.
The default Pulse behaviour is to do a full clean checkout and build. However, there is an option to switch to incremental builds, where the same working copy is used for every build after an svn update to the desired revision. The reason I steered clear is that I noticed a mention somewhere in the regression testing documentation that incremental builds were not 100% reliable.
It has the same unreliability that most projects' builds do: the #include dependency checkers can be fooled by directives of the form
#include MACRO_INVOCATION()
It's still very useful to do incremental builds, but it makes sense to build from scratch once a day.
As suggested elsewhere, breaking things down library by library would also help. I have noticed a bit of discussion going around about this lately, and have to say that I think it would be very helpful for integration with Pulse.
That's good to know.
Apart from faster builds, it would also make it easier to see the status of each library if it were a top-level Pulse project, and developers could then subscribe to email/jabber/RSS notifications for just the libraries they are interested in.
Interesting. So what, exactly, does Pulse need in order to achieve these benefits? Reorganization of SVN? Separate build commands for each library?
2) What important features you think are currently missing.
Integration with the XML failure markup is the most crucial thing.
OK. I need to understand these a bit better before I continue. I am not sure at what stage in the process these normally take effect.
IIUC they are processed by the code in tools/regression/xsl_reports/, which currently runs on the servers that display our test results.
I guess a lot of the failures I am getting now are actually known and included in this markup?
You can check by looking at status/explicit-failures-markup.xml in whatever SVN subtree you're testing.
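A small sketch of checking that file for a given test (the element and attribute names below -- library/test with name attributes -- are assumptions to verify against the actual markup):
#!/usr/bin/env python
"""Sketch only: look up a failing test in explicit-failures-markup.xml to see
whether it is already marked as an expected failure."""
import sys
import xml.etree.ElementTree as ET

def libraries_marking(markup_path, test_name):
    tree = ET.parse(markup_path)
    for library in tree.iter("library"):   # assumed element name
        for test in library.iter("test"):  # assumed element name
            if test.get("name") == test_name:
                yield library.get("name")

if __name__ == "__main__":
    path, name = sys.argv[1], sys.argv[2]  # e.g. status/explicit-failures-markup.xml my_test
    hits = list(libraries_marking(path, name))
    if hits:
        print("marked under:", ", ".join(hits))
    else:
        print("no explicit markup found for", name)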
I need to find some time to dig into this.
3) How some of the errors/failing tests can be resolved.
Not connected to the 'net as I write this; I might be able to look later.
OK, thanks. Getting to a state where a normal build is green will make things a whole lot more useful.
If you're testing trunk, you may never get there because IIUC it isn't very stable. I suggest you run your tests on the 1.34.1 release tag at least until you see all green. -- Dave Abrahams Boost Consulting http://www.boost-consulting.com

David Abrahams wrote:
on Wed Sep 26 2007, Jason Sankey <jason-AT-zutubi.com> wrote:
David Abrahams wrote:
on Fri Sep 14 2007, Jason Sankey <jason-AT-zutubi.com> wrote:
You may have noticed the discussion going on re: developer feedback systems. As part of this I have just started setting up a demo build server at:
At this stage it is an early demonstration of the server (Pulse) building the boost trunk. There is only one machine, but for illustration purposes I am running both the master and one agent on it. This shows the ability to build on multiple agents in parallel (to test the different OS/compiler combos). In this case I am testing two different gcc versions (the easiest to setup for the demo). Very cool to see it working. Sorry it's taken me so long to respond. OK, I thought for a bit that enthusiasm had been lost. There were a couple of quick and positive responses, though, and I'm glad you got a chance to take a look too.
Yeah, sorry -- things have been a little crazy over here.
No problem! We all know what it's like. <snip>
The default Pulse behaviour is to do a full clean checkout and build. However, there is an option to switch to incremental builds, where the same working copy is used for every build after an svn update to the desired revision. The reason I steered clear is that I noticed a mention somewhere in the regression testing documentation that incremental builds were not 100% reliable.
It has the same unreliability that most projects' builds do: the #include dependency checkers can be fooled by directives of the form
#include MACRO_INVOCATION()
It's still very useful to do incremental builds, but it makes sense to build from scratch once a day.
I see. I guess the question is: if incremental builds are known to have issues, will the attitude towards them be different? If people are used to builds failing due to incremental problems they may begin to ignore failures. This can be especially true of people who have previously wasted time tracking down a failure that turned out to be due to incremental issues. If the problems are extremely rare this might not be an issue, and I can definitely set it up to see what happens. The potential benefits of incremental builds are certainly worth a try.
As suggested elsewhere, breaking things down library by library would also help. I have noticed a bit of discussion going around about this lately, and have to say that I think it would be very helpful for integration with Pulse.
That's good to know.
Apart from faster builds, it would also make it easier to see the status of each library if it were a top-level Pulse project, and developers could then subscribe to email/jabber/RSS notifications for just the libraries they are interested in.
Interesting. So what, exactly, does Pulse need in order to achieve these benefits? Reorganization of SVN? Separate build commands for each library?
The most important thing would be the ability to build and test a single library.
In the simplest case this could involve checking out all of boost and having all dependent libraries built on demand when building the library of interest.
Then the tests for the library of interest could be executed and the results output in some readable format (like the current test_log.xml files). This wouldn't necessarily require any reorganisation of Boost: I guess that building a library independently is already possible, I'm just not sure about running the tests.
Further down the track, more optimisations could be done. Reorganising Subversion to allow just the library of interest to be checked out could help a little (although this won't save much real time). More important would be allowing pre-built versions of the dependencies of the library to be picked up so that the build time is reduced.
2) What important features you think are currently missing. Integration with the XML failure markup is the most crucial thing. OK. I need to understand these a bit better before I continue. I am not sure at what stage in the process these normally take effect.
IIUC they are processed by the code in tools/regression/xsl_reports/, which currently runs on the servers that display our test results.
I guess a lot of the failures I am getting now are actually known and included in this markup?
You can check by looking at status/explicit-failures-markup.xml in whatever SVN subtree you're testing.
OK, thanks for the pointers. Hopefully I will have a chance this week to take a look.
I need to find some time to dig into this.
3) How some of the errors/failing tests can be resolved. Not connected to the 'net as I write this; I might be able to look later. OK, thanks. Getting to a state where a normal build is green will make things a whole lot more useful.
If you're testing trunk, you may never get there because IIUC it isn't very stable. I suggest you run your tests on the 1.34.1 release tag at least until you see all green.
OK. This is interesting, because in my experience this will greatly reduce the value of automated builds of the trunk.
The problem is basically broken window syndrome: if it is normal for the build to be broken people care less about breaking it even further.
Perhaps it is expected that the trunk is unstable and other branches are used to stabilise releases. Even then, though, if people are not taking care then the trunk can drift further and further from stable making it a real pain to bring everything up to scratch for a release. For this reason my personal preference is to have the trunk (as the main development branch) be stable and green at all times and for any unstable work to happen on isolated branches. Of course all of this is just my opinion so feel free to ignore me :).
This may also be another argument for splitting things up by library. At least that way libraries that do keep a green trunk can get the benefits without the noise of failures in other libraries.
Cheers, Jason

on Sat Sep 29 2007, Jason Sankey <jason-AT-zutubi.com> wrote:
The default Pulse behaviour is to do a full clean checkout and build. However, there is an option to switch to incremental builds, where the same working copy is used for every build after an svn update to the desired revision. The reason I steered clear is that I noticed a mention somewhere in the regression testing documentation that incremental builds were not 100% reliable.
It has the same unreliability that most projects' builds do: the #include dependency checkers can be fooled by directives of the form
#include MACRO_INVOCATION()
It's still very useful to do incremental builds, but it makes sense to build from scratch once a day.
I see. I guess the question is: if incremental builds are known to have issues, will the attitude towards them be different? If people are used to builds failing due to incremental problems they may begin to ignore failures.
I think such failures are relatively rare. It's more likely to mask a failure than to report a false positive, because a test that should be re-run will be skipped when its dependency on a changed header is missed.
As suggested elsewhere, breaking things down library by library would also help. I have noticed a bit of discussion going around about this lately, and have to say that I think it would be very helpful for integration with Pulse.
That's good to know.
Apart from faster builds, it would also make it easier to see the status of each library if it were a top-level Pulse project, and developers could then subscribe to email/jabber/RSS notifications for just the libraries they are interested in.
Interesting. So what, exactly, does Pulse need in order to achieve these benefits? Reorganization of SVN? Separate build commands for each library?
The most important thing would be the ability to build and test a single library.
We already have that for nearly all libraries -- just go into libs/<libraryname>/test and run bjam there. Getting it for the rest is a fairly trivial change.
In the simplest case this could involve checking out all of boost and having all dependent libraries built on demand when building the library of interest.
That happens automatically when you run bjam.
Then the tests for the library of interest could be executed and the results output in some readable format (like the current test_log.xml files). This wouldn't necessarily require any reorganisation of Boost: I guess that building a library independently is already possible, I'm just not sure about running the tests.
It's possible as described above.
Further down the track, more optimisations could be done. Reorganising Subversion to allow just the library of interest to be checked out could help a little (although this won't save much real time). More important would be allowing pre-built versions of the dependencies of the library to be picked up so that the build time is reduced.
If you use the same build directory for all the libraries (without cleaning it out between tests), you'll get that automatically.
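Concretely, that might look something like the sketch below: every per-library bjam run is pointed at one shared --build-dir (a Boost.Build option) so dependencies built for one library are reused by the next; the library names are examples only.
#!/usr/bin/env python
"""Sketch only: reuse one build directory across per-library test runs."""
import os
import subprocess

BOOST_ROOT = os.getcwd()
SHARED_BUILD_DIR = os.path.join(BOOST_ROOT, "bin.shared")  # not cleaned between runs

for lib in ["date_time", "filesystem", "regex"]:  # examples
    test_dir = os.path.join(BOOST_ROOT, "libs", lib, "test")
    subprocess.call(["bjam", "--build-dir=" + SHARED_BUILD_DIR, "toolset=gcc"],
                    cwd=test_dir)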
3) How some of the errors/failing tests can be resolved. Not connected to the 'net as I write this; I might be able to look later. OK, thanks. Getting to a state where a normal build is green will make things a whole lot more useful.
If you're testing trunk, you may never get there because IIUC it isn't very stable. I suggest you run your tests on the 1.34.1 release tag at least until you see all green.
OK. This is interesting, because in my experience this will greatly reduce the value of automated builds of the trunk.
I know.
The problem is basically broken window syndrome: if it is normal for the build to be broken people care less about breaking it even further.
Yep.
Perhaps it is expected that the trunk is unstable and other branches are used to stabilise releases. Even then, though, if people are not taking care then the trunk can drift further and further from stable making it a real pain to bring everything up to scratch for a release. For this reason my personal preference is to have the trunk (as the main development branch) be stable and green at all times and for any unstable work to happen on isolated branches. Of course all of this is just my opinion so feel free to ignore me :).
Some of us are very motivated to fix that for Boost. But until it's fixed, you'll be better off testing your work against a known-good release branch.
This may also be another argument for splitting things up by library. At least that way libraries that do keep a green trunk can get the benefits without the noise of failures in other libraries.
Unfortunately, such dependencies as exist in Boost cause that noise anyway. -- Dave Abrahams Boost Consulting http://www.boost-consulting.com

David Abrahams wrote:
on Sat Sep 29 2007, Jason Sankey <jason-AT-zutubi.com> wrote:
The default Pulse behaviour is to do a full clean checkout and build. However, there is an option to switch to incremental builds, where the same working copy is used for every build after an svn update to the desired revision. The reason I steered clear is that I noticed a mention somewhere in the regression testing documentation that incremental builds were not 100% reliable. It has the same unreliability that most projects' builds do: the #include dependency checkers can be fooled by directives of the form
#include MACRO_INVOCATION()
It's still very useful to do incremental builds, but it makes sense to build from scratch once a day. I see. I guess the question is: if incremental builds are known to have issues, will the attitude towards them be different? If people are used to builds failing due to incremental problems they may begin to ignore failures.
I think such failures are relatively rare. It's more likely to mask a failure than to report a false positive, because a test that should be re-run will be skipped when its dependency on a changed header is missed.
OK. I will just flick the incremental switch and we can worry about problems if they happen :).
As suggested elsewhere, breaking things down library by library would also help. I have noticed a bit of discussion going around about this lately, and have to say that I think it would be very helpful for integration with Pulse. That's good to know.
Apart from faster builds, it would also make it easier to see the status of each library if it were a top-level Pulse project, and developers could then subscribe to email/jabber/RSS notifications for just the libraries they are interested in. Interesting. So what, exactly, does Pulse need in order to achieve these benefits? Reorganization of SVN? Separate build commands for each library? The most important thing would be the ability to build and test a single library.
We already have that for nearly all libraries -- just go into libs/<libraryname>/test and run bjam there. Getting it for the rest is a fairly trivial change.
Ah - OK. Simple. I guess I should really just get on and try this!
In the simplest case this could involve checking out all of boost and having all dependent libraries built on demand when building the library of interest.
That happens automatically when you run bjam.
Then the tests for the library of interest could be executed and the results output in some readable format (like the current test_log.xml files). This wouldn't necessarily require any reorganisation of Boost: I guess that building a library independently is already possible, I'm just not sure about running the tests.
It's possible as described above.
Further down the track, more optimisations could be done. Reorganising Subversion to allow just the library of interest to be checked out could help a little (although this won't save much real time). More important would be allowing pre-built versions of the dependencies of the library to be picked up so that the build time is reduced.
If you use the same build directory for all the libraries (without cleaning it out between tests), you'll get that automatically.
True. Using the same build directory for different Pulse projects is not possible "out-of-the-box", though I think I should be able to find a way of achieving a similar effect.
3) How some of the errors/failing tests can be resolved. Not connected to the 'net as I write this; I might be able to look later. OK, thanks. Getting to a state where a normal build is green will make things a whole lot more useful. If you're testing trunk, you may never get there because IIUC it isn't very stable. I suggest you run your tests on the 1.34.1 release tag at least until you see all green. OK. This is interesting, because in my experience this will greatly reduce the value of automated builds of the trunk.
I know.
The problem is basically broken window syndrome: if it is normal for the build to be broken people care less about breaking it even further.
Yep.
Perhaps it is expected that the trunk is unstable and other branches are used to stabilise releases. Even then, though, if people are not taking care then the trunk can drift further and further from stable making it a real pain to bring everything up to scratch for a release. For this reason my personal preference is to have the trunk (as the main development branch) be stable and green at all times and for any unstable work to happen on isolated branches. Of course all of this is just my opinion so feel free to ignore me :).
Some of us are very motivated to fix that for Boost. But until it's fixed, you'll be better off testing your work against a known-good release branch.
Ah - I see. I can get down off my soapbox then, as obviously I'm not saying anything you don't already know. I will start playing with individual libraries off the 1.34.1 branch as suggested.
This may also be another argument for splitting things up by library. At least that way libraries that do keep a green trunk can get the benefits without the noise of failures in other libraries.
Unfortunately, such dependencies as exist in Boost cause that noise anyway.
This is true if the dependent libraries remain in a broken state. However, if a point was reached where green builds happened reasonably frequently then it would be good to be able to build against the "last known good" version of the dependent libraries. That way a breakage in one library does not cascade down the dependency chain: the libraries that depend on it can continue to build and test against the last stable version. Something to keep in mind as a possibility down the track anyhow. Cheers, Jason
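Purely as an illustration of that "last known good" idea (nothing in Pulse or Boost does this today), a build could record the last green revision of each dependency and pin the working copy back to it before testing:
#!/usr/bin/env python
"""Sketch only: pin dependency directories to their last known good revision.
Assumes svn is on the PATH, a full Boost working copy, and a small JSON file
maintained by the build server, e.g. {"config": 38950, "smart_ptr": 39012}."""
import json
import subprocess

LAST_GOOD_FILE = "last-known-good.json"

def pin_dependencies(dependencies):
    with open(LAST_GOOD_FILE) as f:
        last_good = json.load(f)
    for dep in dependencies:
        rev = last_good.get(dep)
        if rev is None:
            continue  # no green build recorded yet; leave the dependency at HEAD
        # Headers under boost/ would need similar treatment; only libs/ shown here.
        subprocess.call(["svn", "update", "-r", str(rev), "libs/" + dep])

if __name__ == "__main__":
    pin_dependencies(["config", "smart_ptr"])  # hypothetical dependency list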
participants (4): David Abrahams, Jason Sankey, Jeff Garland, John Maddock