[Git] Regression testing modular Boost

Starting a new thread - this is far too important to bury in the long documentation thread! On Tue, Dec 11, 2012 at 10:52 AM, Rene Rivera <grafikrobot@gmail.com> wrote:
What is the procedure that testers will follow?
I've been assuming testers will follow the same general procedure. But other than verifying that hand-invoked b2 runs tests as expected, nothing else has been done that I'm aware of.
Is there documentation equivalent to < http://www.boost.org/development/running_regression_tests.html> for the new setup?
None yet. I've been assuming we will just update www.boost.org/development/running_regression_tests.html
What changes to the current tools need to happen to adjust to the new setup?
As a first cut, branch the current tools and convert any code that currently uses svn commands to use git commands. But before any testing is done, it would be helpful if Boost.Build was updated to handle the generation of boost-root/boost header file links, rather than relying on the workaround cmake script. Do you feel comfortable enough with git to handle the conversion? --Beman
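For concreteness, the header-link generation mentioned above could be sketched as a small Python script along the following lines. This is only a rough illustration, not what Boost.Build or the existing cmake workaround actually does; it assumes the modularized layout keeps each library's headers under libs/<name>/include/boost, and it glosses over subdirectories that more than one library contributes to, which a real implementation would have to merge.

    # Rough illustration only: build boost-root/boost as links into the modular
    # library trees. Assumes headers live under libs/<lib>/include/boost.
    import os

    def link_headers(boost_root):
        dest = os.path.join(boost_root, "boost")
        os.makedirs(dest, exist_ok=True)
        libs_dir = os.path.join(boost_root, "libs")
        for lib in sorted(os.listdir(libs_dir)):
            inc = os.path.join(libs_dir, lib, "include", "boost")
            if not os.path.isdir(inc):
                continue
            for entry in os.listdir(inc):
                src = os.path.join(inc, entry)
                dst = os.path.join(dest, entry)
                if os.path.lexists(dst):
                    continue  # shared subdirectories would need real merging
                os.symlink(os.path.relpath(src, dest), dst)

    if __name__ == "__main__":
        link_headers(os.getcwd())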

On Wed, Dec 12, 2012 at 9:49 AM, Beman Dawes <bdawes@acm.org> wrote:
Starting a new thread - this is far too important to bury in the long documentation thread!
On Tue, Dec 11, 2012 at 10:52 AM, Rene Rivera <grafikrobot@gmail.com> wrote:
What is the procedure that testers will follow?
I've been assuming testers will follow the same general procedure. But other than verifying that hand-invoked b2 runs tests as expected, nothing else has been done that I'm aware of.
By that you mean invoking b2 on what project? Did you run anything before that to set up the tree? Did you run b2 in the boost-root/status directory? Or to simplify the question.. What precisely has been done already? As I don't want to duplicate effort.
Is there documentation equivalent to < http://www.boost.org/development/running_regression_tests.html> for the new setup?
None yet. I've been assuming we will just update www.boost.org/development/running_regression_tests.html
What changes to the current tools need to happen to adjust to the new setup?
As a first cut, branch the current tools and convert any code that currently uses svn commands to use git commands.
Hm.. That's barely a step :-\ ..And there's no need to branch. The tools already support multiple transport methods so we can just add another. Which brings me to one of the transport methods regression testing currently supports.. Downloading the current trunk/release as a ZIP archive. I was hoping to use the github facility that exists for downloading ZIPs of the repos. But unfortunately I couldn't make it attach the contents of the indirect references to the library subrepos. Hence the complexity of supporting testing with ZIPs is now a magnitude larger as it means dealing with fetching more than a hundred individual repos :-(
But before any testing is done, it would be helpful if Boost.Build was updated to handle the generation of boost-root/boost header file links, rather than relying on the workaround cmake script.
Well.. In an ideal world it would be possible to have a fully integrated "monolithic" repo that the testers can just use as that is the simplest and likely most reliable path. But, alas, this hope of mine was essentially dismissed during the DVCS/git discussions.
Do you feel comfortable enough with git to handle the conversion?
I am familiar with DVCS concepts.. And intimately versed in VCS systems. But, no, I am not comfortable with git command specifics. And frankly.. I don't want to be. But also the requirements of regression testing are minimal enough when it comes to interfacing with the repo that the non-expertise on my part doesn't really matter. And if it starts to matter then I would consider it a failure of the system.. And would look for ways to make the testing simpler at the cost of management at the github level. Or to put it another way.. I will avoid complex testing tools at considerable cost as they make for fragile testing. -- -- -- Grafik - Don't Assume Anything -- Redshift Software, Inc. - http://redshift-software.com -- rrivera/acm.org - grafik/redshift-software.com -- 102708583/icq - grafikrobot/aim - grafikrobot/yahoo

Hi, On Mon, Dec 17, 2012 at 1:43 PM, Rene Rivera <grafikrobot@gmail.com> wrote:
Downloading the current trunk/release as a ZIP archive. I was hoping to use the github facility that exists for downloading ZIPs of the repos. But unfortunately I couldn't make it attach the contents of the indirect references to the library subrepos. Hence the complexity of supporting testing with ZIPs is now a magnitude larger as it means dealing with fetching more than a hundred individual repos :-(
FYI (you might know already), github recently announced that manual uploading of files to github is deprecated [1]. If github's 'downloadable source code archives' does not allow zipping all subrepos (as you found), boost.org server may need to run some cron script to create and host the daily snapshot of 'monolithic' source code. [1] https://github.com/blog/1302-goodbye-uploads Best regards, -- Ryo IGARASHI, Ph.D. rigarash@gmail.com
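If it comes to that, the snapshot job itself need not be elaborate. A rough sketch, assuming git is available on the server and using the boost-lib/boost super-project URL that appears later in this thread (the output location is a placeholder):

    # Rough sketch of a server-side snapshot job: clone the super-project plus
    # all submodules, strip the git metadata, and publish one ZIP for testers.
    import datetime, os, shutil, subprocess, tempfile

    REPO = "https://github.com/boost-lib/boost.git"   # super-project, per this thread
    BRANCH = "master"
    OUTDIR = "/var/www/boost-snapshots"                # placeholder publish location

    def make_snapshot():
        stamp = datetime.date.today().isoformat()
        work = tempfile.mkdtemp()
        tree = os.path.join(work, "boost")
        subprocess.check_call(["git", "clone", "--depth", "1", "--recursive",
                               "-b", BRANCH, REPO, tree])
        for root, dirs, files in os.walk(tree):
            if ".git" in dirs:                         # drop repo metadata
                shutil.rmtree(os.path.join(root, ".git"))
                dirs.remove(".git")
            elif ".git" in files:                      # submodules may use a .git file
                os.remove(os.path.join(root, ".git"))
        shutil.make_archive(os.path.join(OUTDIR, "boost-%s-%s" % (BRANCH, stamp)),
                            "zip", work, "boost")

    if __name__ == "__main__":
        make_snapshot()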

on Sun Dec 16 2012, Rene Rivera <grafikrobot-AT-gmail.com> wrote:
On Wed, Dec 12, 2012 at 9:49 AM, Beman Dawes <bdawes@acm.org> wrote:
Starting a new thread - this is far too important to bury in the long documentation thread!
On Tue, Dec 11, 2012 at 10:52 AM, Rene Rivera <grafikrobot@gmail.com> wrote:
What is the procedure that testers will follow?
I've been assuming testers will follow the same general procedure. But other than verifying that hand-invoked b2 runs tests as expected, nothing else has been done that I'm aware of.
By that you mean invoking b2 on what project? Did you run anything before that to set up the tree? Did you run b2 in the boost-root/status directory? Or to simplify the question.. What precisely has been done already? As I don't want to duplicate effort.
Is there documentation equivalent to < http://www.boost.org/development/running_regression_tests.html> for the new setup?
None yet. I've been assuming we will just update www.boost.org/development/running_regression_tests.html
What changes to the current tools need to happen to adjust to the new setup?
As a first cut, branch the current tools and convert any code that currently uses svn commands to use git commands.
Hm.. That's barely a step :-\ ..And there's no need to branch. The tools already support multiple transport methods so we can just add another. Which brings me to one of the transport methods regression testing currently supports.. Downloading the current trunk/release as a ZIP archive. I was hoping to use the github facility that exists for downloading ZIPs of the repos. But unfortunately I couldn't make it attach the contents of the indirect references to the library subrepos.
That's right, unfortunately. However, we can get the exact URLs of the ZIP files from the GitHub API. I've recently done some scripting with that, e.g. https://github.com/ryppl/ryppl/blob/develop/scripts/github2bitbucket.py#L40 In fact, I think someone has coded up what's needed to make a monolithic zip here: https://github.com/quarnster/sublime_package_control/commit/9fe2fc2cad9bd2e7...
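For what it's worth, a minimal sketch of fetching one repository's ZIP through the API's zipball endpoint (repository names are placeholders; the endpoint redirects to the actual archive):

    # Minimal sketch: download the ZIP of a single repository at a given ref
    # via the GitHub API. urllib follows the redirect to the archive itself.
    import urllib.request

    def fetch_zip(owner, repo, ref, dest):
        url = "https://api.github.com/repos/%s/%s/zipball/%s" % (owner, repo, ref)
        with urllib.request.urlopen(url) as resp, open(dest, "wb") as out:
            out.write(resp.read())

    # e.g. fetch_zip("boost-lib", "config", "master", "config.zip")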
Hence the complexity of supporting testing with ZIPs is now a magnitude larger as it means dealing with fetching more than a hundred individual repos :-(
But before any testing is done, it would be helpful if Boost.Build was updated to handle the generation of boost-root/boost header file links, rather than relying on the workaround cmake script.
Well.. In an ideal world it would be possible to have a fully integrated "monolithic" repo that the testers can just use as that is the simplest and likely most reliable path. But, alas, this hope of mine was essentially dismissed during the DVCS/git discussions.
This isn't about DVCS but about whether we're going to have real modularity.
Do you feel comfortable enough with git to handle the conversion?
I am familiar with DVCS concepts.. And intimately versed in VCS systems. But, no, I am not comfortable with git command specifics. And frankly.. I don't want to be.
Let's work together; I can help with that stuff if necessary. -- Dave Abrahams BoostPro Computing Software Development Training http://www.boostpro.com Clang/LLVM/EDG Compilers C++ Boost

On 12/17/2012 12:25 PM, Dave Abrahams wrote:
on Sun Dec 16 2012, Rene Rivera <grafikrobot-AT-gmail.com> wrote:
On Wed, Dec 12, 2012 at 9:49 AM, Beman Dawes <bdawes@acm.org> wrote:
On Tue, Dec 11, 2012 at 10:52 AM, Rene Rivera <grafikrobot@gmail.com> wrote:
Hm.. That's barely a step :-\ ..And there's no need to branch. The tools already support multiple transport methods so we can just add another. Which brings me to one of the transport methods regression testing currently supports.. Downloading the current trunk/release as a ZIP archive. I was hoping to use the github facility that exists for downloading ZIPs of the repos. But unfortunately I couldn't make it attach the contents of the indirect references to the library subrepos.
That's right, unfortunately. However, we can get the exact URLs of the ZIP files from the GitHub API. I've recently done some scripting with that, e.g. https://github.com/ryppl/ryppl/blob/develop/scripts/github2bitbucket.py#L40
In fact, I think someone has coded up what's needed to make a monolithic zip here: https://github.com/quarnster/sublime_package_control/commit/9fe2fc2cad9bd2e7...
After looking at both of those I see no point in using the github api (or additional structure data from sublime -- not totally sure where the submodule info comes from in this case though) for this as it provides no additional information than one can get from just parsing the ".gitmodules" file.
Hence the complexity of supporting testing with ZIPs is now a magnitude larger as it means dealing with fetching more than a hundred individual repos :-(
Which now seems the only choice. At the tester side I will have to get the boost-master archive. Then parse out the ".gitmodules" file. And get each subrepo archive individually. Which increases the likelihood of failure considerably. And of course after all that, even for direct git access, recreate a boost-header tree (either moving files or symlinks). I repeat.. More testing complexity :-(
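To make the shape of that concrete, here is a rough sketch of the tester-side sequence just described (parse .gitmodules out of the unpacked super-project archive, then fetch one archive per submodule). It assumes GitHub-style per-branch archive URLs, downloads branch heads rather than the exact commits the super-project pins (which is the gap discussed further down), and would need extra handling if the .gitmodules URLs are relative:

    # Rough sketch only: enumerate submodules from .gitmodules and fetch one ZIP
    # per submodule. Branch-based, so it does not pin exact submodule commits.
    import configparser, os, urllib.request

    def submodules(gitmodules_path):
        cfg = configparser.ConfigParser()
        cfg.read(gitmodules_path)
        for section in cfg.sections():          # e.g. submodule "libs/config"
            yield cfg.get(section, "path"), cfg.get(section, "url")

    def fetch_all(boost_root, branch="master"):
        for path, url in submodules(os.path.join(boost_root, ".gitmodules")):
            base = url[:-4] if url.endswith(".git") else url
            archive = base + "/archive/%s.zip" % branch     # GitHub archive URL
            dest = os.path.join(boost_root, path + ".zip")
            urllib.request.urlretrieve(archive, dest)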
But before any testing is done, it would be helpful if Boost.Build was updated to handle the generation of boost-root/boost header file links, rather than relying on the workaround cmake script.
Well.. In an ideal world it would be possible to have a fully integrated "monolithic" repo that the testers can just use as that is the simplest and likely most reliable path. But, alas, this hope of mine was essentially dismissed during the DVCS/git discussions.
This isn't about DVCS but about whether we're going to have real modularity.
I don't know what you mean by "real modularity". But the testers *must* test what Boost delivers as a package. At some point end users get a Boost installed. And that's what we have to test. If we don't test that we will have unknown issues to deal with. -- -- Grafik - Don't Assume Anything -- Redshift Software, Inc. - http://redshift-software.com -- rrivera/acm.org (msn) - grafik/redshift-software.com -- 102708583/icq - grafikrobot/aim,yahoo,skype,efnet,gmail

2012/12/25 Rene Rivera <grafikrobot@gmail.com>
On 12/17/2012 12:25 PM, Dave Abrahams wrote:
on Sun Dec 16 2012, Rene Rivera <grafikrobot-AT-gmail.com> wrote:
On Wed, Dec 12, 2012 at 9:49 AM, Beman Dawes <bdawes@acm.org> wrote:
On Tue, Dec 11, 2012 at 10:52 AM, Rene Rivera <grafikrobot@gmail.com>
wrote:
Hm.. That's barely a step :-\ ..And there's no need to branch. The tools already support multiple transport methods so we can just add another. Which brings me to one of the transport methods regression testing currently supports.. Downloading the current trunk/release as a ZIP archive. I was hoping to use the github facility that exists for downloading ZIPs of the repos. But unfortunately I couldn't make it attach the contents of the indirect references to the library subrepos.
That's right, unfortunately. However, we can get the exact URLs of the ZIP files from the GitHub API. I've recently done some scripting with that, e.g. https://github.com/ryppl/ryppl/blob/develop/scripts/github2bitbucket.py#L40
In fact, I think someone has coded up what's needed to make a monolithic zip here: https://github.com/quarnster/sublime_package_control/commit/9fe2fc2cad9bd2e7e1a38d7e5d4aaa02fb2b4aea
After looking at both of those I see no point in using the github api (or additional structure data from sublime -- not totally sure where the submodule info comes from in this case though) for this as it provides no additional information than one can get from just parsing the ".gitmodules" file.
Hence the complexity of supporting testing with ZIPs is now a magnitude larger as it means dealing with fetching more than a hundred individual repos :-(
Which now seems the only choice. At the tester side I will have to get the boost-master archive. Then parse out the ".gitmodules" file. And get each subrepo archive individually. Which increases the likelihood of failure considerably.
If you do it manually, yes.
And of course after all that, even for direct git access, recreate a boost-header tree (either moving files or symlinks).
I repeat.. More testing complexity :-(
Again: if you do it manually.
But before any testing is done, it would be helpful if Boost.Build was updated to handle the generation of boost-root/boost header file links, rather than relying on the workaround cmake script.
Well.. In an ideal world it would be possible to have a fully integrated "monolithic" repo that the testers can just use as that is the simplest and likely most reliable path. But, alas, this hope of mine was essentially dismissed during the DVCS/git discussions.
This isn't about DVCS but about whether we're going to have real modularity.
I don't know what you mean by "real modularity".
Monolithic development (currently): There is one repository, one release cycle.
Modularized development (proposed): Each module has its own repository and release cycle.
Optional: Multiple release cycles may be synced. Multiple modules may be delivered as one package.
Is there room for misunderstanding? Maybe it is unclear what Boost's future development/test/release process will be like. But the meaning of "real modularity" should be clear, no?
But the testers *must* test what Boost delivers as a package. At some point end users get a Boost installed. And that's what we have to test. If we don't test that we will have unknown issues to deal with.
Absolutely! Boost should continue to provide monolithic packages. And these packages need to be tested. What we want to modularize is the development. Not the package that we provide to end users. However, if we want to provide a monolithic release from modularized sources, we can not simply "not modularize the release". We need to "put it back together" instead. Of course, there is a slight increase in complexity. We try to keep it minimal, but can not avoid it completely. cheers, Daniel

Daniel Pfeifer wrote:
2012/12/25 Rene Rivera <grafikrobot@gmail.com>
On 12/17/2012 12:25 PM, Dave Abrahams wrote:
I don't know what you mean by "real modularity".
Monolithic development (currently): There is one repository, one release cycle.
Modularized development (proposed): Each module has its own repository and release cycle.
This would suggest that each library have its own versioning sequence. This in turn would suggest that each library have a list of dependencies. Each entry in this list would be the prerequisite library along with the minimum version number required.
Optional: Multiple release cycles may be synced. Multiple modules may be delivered as one package.
Is there room for misunderstanding? Maybe it is unclear what Boost's future development/test/release process will be like. But the meaning of "real modularity" should be clear, no?
lol - maybe - but I think we'll see otherwise. FWIW I agree with your concept of "real modularity" - but that would be a big step for us and we're not currently prepared for this.
But the testers *must* test what Boost delivers as a package. At some point end users get a Boost installed. And that's what we have to test. If we don't test that we will have unknown issues to deal with.
Hmmm - I would say at some point the package to be deployed by the boost organization should be tested. This should mean that each library is tested against all the other libraries in the package. But for developers and "testers" there is no reason to require that they be testing the whole of boost. If a particular library is tested against the current (or next) release deployment, that should be enough.
Absolutely! Boost should continue to provide monolithic packages. And these packages need to be tested. What we want to modularize is the development. Not the package that we provide to end users.
If each library has been tested against the release, there is no reason that the deployment need be complete. Anyone could deploy a subset if they wanted. We even have BCP to help that along.
However, if we want to provide a monolithic release from modularized sources, we can not simply "not modularize the release". We need to "put it back together" instead.
Eventually, we'll get to the concept of a "deployment" package which would be a subset of the current release versions of the libraries. Robert Ramey

2012/12/26 Robert Ramey <ramey@rrsd.com>
Daniel Pfeifer wrote:
2012/12/25 Rene Rivera <grafikrobot@gmail.com>
On 12/17/2012 12:25 PM, Dave Abrahams wrote:
I don't know what you mean by "real modularity".
Monolithic development (currently): There is one repository, one release cycle.
Modularized development (proposed): Each module has its own repository and release cycle.
This would suggest that each library have its own versioning sequence.
This in turn would suggest that each library have a list of dependencies. Each entry in this list would be the prerequisite library along with the minimum version number required.
I think you are interpreting too much meaning into what I wrote. And I am afraid you missed the next line. But in principle, it could suggest that, yes.
Optional: Multiple release cycles may be synced. Multiple modules may be delivered as one package.
Is there room for misunderstanding? Maybe it is unclear what Boost's future development/test/release process will be like. But the meaning of "real modularity" should be clear, no?
lol - maybe - but I think we'll see otherwise.
Other than what?
FWIW I agree with your concept of "real modularity" - but that would be a big step for us and we're not currently prepared for this.
Nobody suggested making such a big step. But we can reach this point in multiple small steps if we keep "real modularity" in focus. -- Daniel

On 12/25/2012 5:32 PM, Robert Ramey wrote:
Daniel Pfeifer wrote:
2012/12/25 Rene Rivera <grafikrobot@gmail.com>
On 12/17/2012 12:25 PM, Dave Abrahams wrote:
I don't know what you mean by "real modularity".
Monolithic development (currently): There is one repository, one release cycle.
Modularized development (proposed): Each module has its own repository and release cycle.
This would suggest that each library have its own versioning sequence.
This in turn would suggest that each library have a list of dependencies. Each entry in this list would be the prerequisite library along with the minimum version number required.
It would also suggest that each library is *only* tested against those requirements. And that any other combination is not officially supported when it reaches the end users.
But the testers *must* test what Boost delivers as a package. At some point end users get a Boost installed. And that's what we have to test. If we don't test that we will have unknown issues to deal with.
Hmmm - I would say at some point the package to be deployed by the boost organization should be tested. This should mean that each library is tested against all the other libraries in the package.
Yes, that's what I'm saying.
But for developers and "testers" there is no reason to require that the be testing the whole of boost. If a particular library is tested against the current (or next) release deployment, that should be enough.
For developers perhaps that's enough. But not for testers. At some point someone has to test what is released. And the sooner that happens in the release cycle the better. Currently that testing happens continuously, in the form of the release branch testing. Which is ideal since you can't get any earlier. For github it means testing will happen on the master branch of the super-project and the sub-projects as a monolithic package. Along with this there's also the question of putting the release archive together. Seeing that if we had a working script that put the release archive together from the master branches it could possibly be adapted to work for testing. -- -- Grafik - Don't Assume Anything -- Redshift Software, Inc. - http://redshift-software.com -- rrivera/acm.org (msn) - grafik/redshift-software.com -- 102708583/icq - grafikrobot/aim,yahoo,skype,efnet,gmail

on Tue Dec 25 2012, "Robert Ramey" <ramey-AT-rrsd.com> wrote:
Daniel Pfeifer wrote:
2012/12/25 Rene Rivera <grafikrobot@gmail.com>
On 12/17/2012 12:25 PM, Dave Abrahams wrote:
I don't know what you mean by "real modularity".
Monolithic development (currently): There is one repository, one release cycle.
Modularized development (proposed): Each module has its own repository and release cycle.
This would suggest that each library have its own versioning sequence.
That is the plan
This in turn would suggest that each library have a list of dependencies. Each entry in this list would be the prerequisite library along with the minimum version number required.
It's not just about the minimum version number; version compatibility can be more complicated than that. Providing for that is also the plan but it's not part of the immediate Git transition.
Optional: Multiple release cycles may be synced. Multiple modules may be delivered as one package.
Is there room for misunderstanding? Maybe it is unclear what Boost's future development/test/release process will be like. But the meaning of "real modularity" should be clear, no?
lol - maybe - but I think we'll see otherwise. FWIW I agree with your concept of "real modularity" - but that would be a big step for us and we're not currently prepared for this.
Of course we don't think Boost is prepared today, but we've been planning this stuff for quite some time and have specific approaches and technology in place that will help Boost to become prepared. If you're interested in the details of medium-to-long-term plans, those should probably be discussed on the ryppl-dev mailing list (see Google Groups).
But the testers *must* test what Boost delivers as a package. At some point end users get a Boost installed. And that's what we have to test. If we don't test that we will have unknown issues to deal with.
Hmmm - I would say at some point the package to be deployed by the boost organization should be tested. This should mean that each library is tested against all the other libraries in the package.
But for developers and "testers" there is no reason to require that the be testing the whole of boost. If a particular library is tested against the current (or next) release deployment, that should be enough.
You're mixing up the immediate steps with things that can/should happen down the road. Immediately the goal is to make the transition to modularized Git repositories without changing other procedures more than necessary. That means maintaining the current testing protocols and results. We have plans to improve testing in bigger ways, but that's a separate step.
Absolutely! Boost should continue to provide monolithic packages. And these packages need to be tested. What we want to modularize is the development. Not the package that we provide to end users.
If each library has been tested against the release, there is no reason that the deployment need be complete. Anyone could deploy a subset if they wanted.
Anyone could, but AFAICT Boost has no incentive to do so.
We even have BCP to help that along.
We won't need BCP much longer. Once library dependencies are declared, as part of build system files and/or ryppl feeds, it will be obsolete. -- Dave Abrahams BoostPro Computing Software Development Training http://www.boostpro.com Clang/LLVM/EDG Compilers C++ Boost

On 12/25/2012 12:54 PM, Daniel Pfeifer wrote:
2012/12/25 Rene Rivera <grafikrobot@gmail.com>
On 12/17/2012 12:25 PM, Dave Abrahams wrote:
on Sun Dec 16 2012, Rene Rivera <grafikrobot-AT-gmail.com> wrote:
On Wed, Dec 12, 2012 at 9:49 AM, Beman Dawes <bdawes@acm.org> wrote:
On Tue, Dec 11, 2012 at 10:52 AM, Rene Rivera <grafikrobot@gmail.com>
wrote:
Hm.. That's barely a step :-\ ..And there's no need to branch. The tools already support multiple transport methods so we can just add another. Which brings me to one of the transport methods regression testing currently supports.. Downloading the current trunk/release as a ZIP archive. I was hoping to use the github facility that exists for downloading ZIPs of the repos. But unfortunately I couldn't make it attach the contents of the indirect references to the library subrepos.
That's right, unfortunately. However, we can get the exact URLs of the ZIP files from the GitHub API. I've recently done some scripting with that, e.g. https://github.com/ryppl/ryppl/blob/develop/scripts/github2bitbucket.py#L40
In fact, I think someone has coded up what's needed to make a monolithic zip here: https://github.com/quarnster/sublime_package_control/commit/9fe2fc2cad9bd2e7e1a38d7e5d4aaa02fb2b4aea
After looking at both of those I see no point in using the github api (or additional structure data from sublime -- not totally sure where the submodule info comes from in this case though) for this as it provides no additional information than one can get from just parsing the ".gitmodules" file.
Hence the complexity of supporting testing with ZIPs is now a magnitude larger as it means dealing with fetching more than a hundred individual repos :-(
Which now seems the only choice. At the tester side I will have to get the boost-master archive. Then parse out the ".gitmodules" file. And get each subrepo archive individually. Which increases the likelihood of failure considerably.
If you do it manually, yes.
And of course after all that, even for direct git access, recreate a boost-header tree (either moving files or symlinks).
I repeat.. More testing complexity :-(
Again: if you do it manually.
OK.. What is the non-manual way to do this without having git?
Absolutely! Boost should continue to provide monolithic packages. And these packages need to be tested. What we want to modularize is the development. Not the package that we provide to end users.
Right.. And hence this thread about moving the testing infrastructure to the github base. Note.. At this point I only really care about the implementation practicalities. Because as it is, there's a rather small chance of getting all this working for the target dates. -- -- Grafik - Don't Assume Anything -- Redshift Software, Inc. - http://redshift-software.com -- rrivera/acm.org (msn) - grafik/redshift-software.com -- 102708583/icq - grafikrobot/aim,yahoo,skype,efnet,gmail

on Wed Dec 26 2012, Rene Rivera <grafikrobot-AT-gmail.com> wrote:
On 12/25/2012 12:54 PM, Daniel Pfeifer wrote:
2012/12/25 Rene Rivera <grafikrobot@gmail.com>
On 12/17/2012 12:25 PM, Dave Abrahams wrote:
on Sun Dec 16 2012, Rene Rivera <grafikrobot-AT-gmail.com> wrote:
On Wed, Dec 12, 2012 at 9:49 AM, Beman Dawes <bdawes@acm.org> wrote:
On Tue, Dec 11, 2012 at 10:52 AM, Rene Rivera <grafikrobot@gmail.com>
wrote:
Hm.. That's barely a step :-\ ..And there's no need to branch. The tools already support multiple transport methods so we can just add another. Which brings me to one of the transport methods regression testing currently supports.. Downloading the current trunk/release as a ZIP archive. I was hoping to use the github facility that exists for downloading ZIPs of the repos. But unfortunately I couldn't make it attach the contents of the indirect references to the library subrepos.
That's right, unfortunately. However, we can get the exact URLs of the ZIP files from the GitHub API. I've recently done some scripting with that, e.g. https://github.com/ryppl/ryppl/blob/develop/scripts/github2bitbucket.py#L40
In fact, I think someone has coded up what's needed to make a monolithic zip here: https://github.com/quarnster/sublime_package_control/commit/9fe2fc2cad9bd2e7e1a38d7e5d4aaa02fb2b4aea
After looking at both of those I see no point in using the github api (or additional structure data from sublime -- not totally sure where the submodule info comes from in this case though) for this as it provides no additional information than one can get from just parsing the ".gitmodules" file.
Hence the complexity of supporting testing with ZIPs is now a magnitude larger as it means dealing with fetching more than a hundred individual repos :-(
Which now seems the only choice. At the tester side I will have to get the boost-master archive. Then parse out the ".gitmodules" file. And get each subrepo archive individually. Which increases the likelihood of failure considerably.
If you do it manually, yes.
And of course after all that, even for direct git access, recreate a boost-header tree (either moving files or symlinks).
I repeat.. More testing complexity :-(
Again: if you do it manually.
OK.. What is the non-manual way to do this without having git?
http://www.samba.org/~jelmer/dulwich/ -- Dave Abrahams BoostPro Computing Software Development Training http://www.boostpro.com Clang/LLVM/EDG Compilers C++ Boost

Just getting back to this as the drive on my mac is now repaired.. In a totally empty state :-( On Wed, Dec 26, 2012 at 10:14 AM, Dave Abrahams <dave@boostpro.com> wrote:
on Wed Dec 26 2012, Rene Rivera <grafikrobot-AT-gmail.com> wrote:
OK.. What is the non-manual way to do this without having git?
OK.. That helps somewhat. It makes it possible to just write one piece of code for all testers (since we require python and we can add installing dulwich to that). But now I need to figure out what to write to use dulwich (the sample/test scripts don't implement all of the git front end commands fully). My goal is to have the equivalent of:
git clone -b <branch> --depth 1 --recursive https://github.com/boost-lib/boost.git <some-test-dir>
The first time, but with the shallow depth also applied recursively (something which seems to me to be a bug in git). And subsequent times doing:
git pull --recurse-submodules https://github.com/boost-lib/boost.git <branch>
Or at least that's what I understand will give me only the current revision/s the first time. And then get only the subsequent updates correctly applied. Help in verifying that those would be the correct base git commands to emulate is appreciated. For those that will question why I'm going to the trouble.. One of the goals of the testing scripts is to minimize disk space *and* network bandwidth. Hence the convoluted approach of fetching as minimal info as possible and storing as minimal info as possible. Which brings a question.. Is there a way to have the local repo only store the current HEAD revision files (i.e. minimize the contents of the .git dir)? And also.. Is it possible to only store the specific branch revisions in the git repo dir? -- -- -- Grafik - Don't Assume Anything -- Redshift Software, Inc. - http://redshift-software.com -- rrivera/acm.org - grafik/redshift-software.com -- 102708583/icq - grafikrobot/aim - grafikrobot/yahoo
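For reference, the update cycle described above, sketched with plain git subprocess calls (a dulwich-based version would replace those calls). One caveat worth hedging: in the git versions discussed here, git pull --recurse-submodules appears to only fetch the submodules' new commits, so the sketch follows the pull with a git submodule update to move the working trees.

    # Sketch of the two cases described above, via plain git; repository URL and
    # layout follow the commands quoted in this message.
    import os, subprocess

    REPO = "https://github.com/boost-lib/boost.git"

    def sync(branch, test_dir):
        if not os.path.isdir(os.path.join(test_dir, ".git")):
            # First run: shallow clone of the super-project; note the shallowness
            # is not applied to the submodules, as lamented above.
            subprocess.check_call(["git", "clone", "-b", branch, "--depth", "1",
                                   "--recursive", REPO, test_dir])
        else:
            # Later runs: update the super-project, then move each submodule's
            # working tree to the commit the super-project now records.
            subprocess.check_call(["git", "pull", REPO, branch], cwd=test_dir)
            subprocess.check_call(["git", "submodule", "update", "--init",
                                   "--recursive"], cwd=test_dir)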

on Fri Dec 28 2012, Rene Rivera <grafikrobot-AT-gmail.com> wrote:
Just getting back to this as the drive on my mac is now repaired.. In a totally empty state :-(
On Wed, Dec 26, 2012 at 10:14 AM, Dave Abrahams <dave@boostpro.com> wrote:
on Wed Dec 26 2012, Rene Rivera <grafikrobot-AT-gmail.com> wrote:
OK.. What is the non-manual way to do this without having git?
OK.. That helps somewhat. It makes it possible to just write one piece of code for all testers (since we require python and we can add installing dulwich to that).
It's even possible to write a script that creates a virtualenv and installs dulwich there on demand, so testers don't have to do it manually.
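A rough sketch of what that bootstrap could look like (it uses python -m venv, which needs Python 3.3 or later; with older Pythons the separate virtualenv package would play the same role, and the paths here are placeholders):

    # Rough sketch: create an isolated environment on demand and install dulwich
    # into it, so a tester only needs a base Python to start from.
    import os, subprocess, sys

    ENV = os.path.join(os.path.expanduser("~"), ".boost-test-env")   # placeholder

    def ensure_dulwich():
        if not os.path.isdir(ENV):
            subprocess.check_call([sys.executable, "-m", "venv", ENV])
        python = os.path.join(ENV, "bin", "python")   # Scripts\python.exe on Windows
        subprocess.check_call([python, "-m", "pip", "install", "dulwich"])
        return python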
But now I need to figure out what to write to use dulwich (the sample/test scripts don't implement all of the git front end commands fully).
Yeah, I know; dulwich is closer to the "plumbing" than to the "porcelain." John Wiegley is my personal best-person-to-ask about how to use the low-level git bits. John, could you help René with this?
My goal is to have the equivalent of:
git clone -b <branch> --depth 1 --recursive https://github.com/boost-lib/boost.git <some-test-dir>
The first time, but with the shallow depth also applied recursively (something which seems to me to be a bug in git). And subsequent times doing:
git pull --recurse-submodules https://github.com/boost-lib/boost.git <branch>
Or at least that's what I understand will give me only the current revision/s the first time. And then get only the subsequent updates correctly applied. Help in verifying that those would be the correct base git commands to emulate is appreciated. For those that will question why I'm going to the trouble.. One of the goals of the testing scripts is to minimize disk space *and* network bandwidth. Hence the convoluted approach of fetching as minimal info as possible and storing as minimal info as possible. Which brings a question..
Is there a way to have the local repo only store the current HEAD revision files (i.e. minimize the contents of the .git dir)?
I think that's the shallow clone technique you're using above (--depth 1). Do you have something else in mind?
And also..
Is it possible to only store the specific branch revisions in the git repo dir?
I don't know, but at this point you might consider whether it would be more efficient to simply get the information about submodule refs and then download/unpack all the appropriate .zip files -- Dave Abrahams BoostPro Computing Software Development Training http://www.boostpro.com Clang/LLVM/EDG Compilers C++ Boost

On Sat, Dec 29, 2012 at 2:15 PM, Dave Abrahams <dave@boostpro.com> wrote:
on Fri Dec 28 2012, Rene Rivera <grafikrobot-AT-gmail.com> wrote:
Just getting back to this as the drive on my mac is now repaired.. In a totally empty state :-(
On Wed, Dec 26, 2012 at 10:14 AM, Dave Abrahams <dave@boostpro.com> wrote:
on Wed Dec 26 2012, Rene Rivera <grafikrobot-AT-gmail.com> wrote:
OK.. What is the non-manual way to do this without having git?
OK.. That helps somewhat. It makes it possible to just write one piece of code for all testers (since we require python and we can add installing dulwich to that).
It's even possible to write a script that creates a virtualenv and installs dulwich there on demand, so testers don't have to do it manually.
Except that dulwich requires compiling a C module. So virtual installing wouldn't work.. Right?
My goal is to have the equivalent of:
git clone -b <branch> --depth 1 --recursive https://github.com/boost-lib/boost.git <some-test-dir>
The first time, but with the shallow depth also applied recursively (something which seems to me to be a bug in git). And subsequent times doing:
git pull --recurse-submodules https://github.com/boost-lib/boost.git <branch>
Or at least that's what I understand will give me only the current revision/s the first time. And then get only the subsequent updates correctly applied. Help in verifying that those would be the correct base git commands to emulate is appreciated. For those that will question why I'm going to the trouble.. One of the goals of the testing scripts is to minimize disk space *and* network bandwidth. Hence the convoluted approach of fetching as minimal info as possible and storing as minimal info as possible. Which brings a question..
Is there a way to have the local repo only store the current HEAD revision files (i.e. minimize the contents of the .git dir)?
I think that's the shallow clone technique you're using above (--depth 1). Do you have something else in mind?
I was asking both whether that is what #1 does initially and whether there is a way to make #2 not keep old history. And obviously how I would go about doing it with dulwich.
And also..
Is it possible to only store the specific branch revisions in the git repo dir?
I don't know, but at this point you might consider whether it would be more efficient to simply get the information about submodule refs and then download/unpack all the appropriate .zip files
But I know doing the zips would be less efficient as I would have to download them all, all the time. At which point I could just do #1 above each time. Which is certainly disk space efficient, but obviously not bandwidth efficient. And it's likely going to be that I just start with doing #1 only. Until I can figure out how to do the rest. -- -- -- Grafik - Don't Assume Anything -- Redshift Software, Inc. - http://redshift-software.com -- rrivera/acm.org - grafik/redshift-software.com -- 102708583/icq - grafikrobot/aim - grafikrobot/yahoo

on Sat Dec 29 2012, Rene Rivera <grafikrobot-AT-gmail.com> wrote:
On Sat, Dec 29, 2012 at 2:15 PM, Dave Abrahams <dave@boostpro.com> wrote:
on Fri Dec 28 2012, Rene Rivera <grafikrobot-AT-gmail.com> wrote:
Just getting back to this as the drive on my mac is now repaired.. In a totally empty state :-(
On Wed, Dec 26, 2012 at 10:14 AM, Dave Abrahams <dave@boostpro.com> wrote:
on Wed Dec 26 2012, Rene Rivera <grafikrobot-AT-gmail.com> wrote:
OK.. What is the non-manual way to do this without having git?
OK.. That helps somewhat. It makes it possible to just write one piece of code for all testers (since we require python and we can add installing dulwich to that).
It's even possible to write a script that creates a virtualenv and installs dulwich there on demand, so testers don't have to do it manually.
Except that dulwich requires compiling a C module.
It does? "Dulwich is a pure-Python implementation of the Git file formats and protocols." Is http://www.samba.org/~jelmer/dulwich/ lying?
So virtual installing wouldn't work.. Right?
My goal is to have the equivalent of:
git clone -b <branch> --depth 1 --recursive https://github.com/boost-lib/boost.git <some-test-dir>
The first time, but with the shallow depth also applied recursively (something which seems to me to be a bug in git). And subsequent times doing:
git pull --recurse-submodules https://github.com/boost-lib/boost.git <branch>
Or at least that's what I understand will give me only the current revision/s the first time. And then get only the subsequent updates correctly applied. Help in verifying that those would be the correct base git commands to emulate is appreciated. For those that will question why I'm going to the trouble.. One of the goals of the testing scripts is to minimize disk space *and* network bandwidth.
The disk space concern seems a bit misplaced given the size of the generated binaries. I mean, keep it from exploding, of course, but *minimizing* it seems like overkill.
Hence the convoluted fetch as minimal info as possible and store as minimal info as possible. Which brings a question..
Is there a way to have the local repo only store the current HEAD revision files (i.e. minimize the contents of the .git dir)?
I think that's the shallow clone technique you're using above (--depth 1). Do you have something else in mind?
I was asking both whether that is what #1 does initially and whether there is a way to make #2 not keep old history. And obviously how I would go about doing it with dulwich.
I *think* the answer to the first question is that there is no way to make #2 not keep old history.
And also..
Is it possible to only store the specific branch revisions in the git repo dir?
I don't know, but at this point you might consider whether it would be more efficient to simply get the information about submodule refs and then download/unpack all the appropriate .zip files
But I know doing the zips would be less efficient as I would have to download them all, all the time.
Only if the refs change. -- Dave Abrahams BoostPro Computing Software Development Training http://www.boostpro.com Clang/LLVM/EDG Compilers C++ Boost
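To illustrate "only if the refs change": the super-project's tree, as exposed by the GitHub API, lists each submodule as a "commit" entry carrying the SHA it is pinned to, so a script could cache those SHAs and only re-download the zipball of a submodule whose SHA has moved. A hedged sketch, with the super-project name as a placeholder:

    # Hedged sketch: find which submodules changed since the last run by comparing
    # the commit SHAs recorded in the super-project's tree on a given branch.
    import json, urllib.request

    API = "https://api.github.com/repos/boost-lib/boost"   # placeholder super-project

    def submodule_shas(branch):
        info = json.load(urllib.request.urlopen("%s/branches/%s" % (API, branch)))
        tree_sha = info["commit"]["commit"]["tree"]["sha"]
        tree = json.load(urllib.request.urlopen(
            "%s/git/trees/%s?recursive=1" % (API, tree_sha)))
        # submodules (gitlinks) appear as tree entries of type "commit"
        return {e["path"]: e["sha"] for e in tree["tree"] if e["type"] == "commit"}

    def changed(branch, cached):
        current = submodule_shas(branch)
        return {p: s for p, s in current.items() if cached.get(p) != s}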

On Sat, Dec 29, 2012 at 4:04 PM, Dave Abrahams <dave@boostpro.com> wrote:
on Sat Dec 29 2012, Rene Rivera <grafikrobot-AT-gmail.com> wrote:
On Sat, Dec 29, 2012 at 2:15 PM, Dave Abrahams <dave@boostpro.com> wrote:
on Fri Dec 28 2012, Rene Rivera <grafikrobot-AT-gmail.com> wrote:
Just getting back to this as the drive on my mac is now repaired.. In a totally empty state :-(
On Wed, Dec 26, 2012 at 10:14 AM, Dave Abrahams <dave@boostpro.com> wrote:
on Wed Dec 26 2012, Rene Rivera <grafikrobot-AT-gmail.com> wrote:
OK.. What is the non-manual way to do this without having git?
OK.. That helps somewhat. It makes it possible to just write one piece of code for all testers (since we require python and we can add installing dulwich to that).
It's even possible to write a script that creates a virtualenv and installs dulwich there on demand, so testers don't have to do it manually.
Except that dulwich requires compiling a C module.
It does?
"Dulwich is a pure-Python implementation of the Git file formats and protocols."
Is http://www.samba.org/~jelmer/dulwich/ lying?
Depends on your POV. It's pure in that it doesn't depend on the git sources or binaries. But <https://github.com/jelmer/dulwich/tree/master/dulwich> certainly has C sources to compile.
So virtual installing wouldn't work.. Right?
My goal is to have the equivalent of:
git clone -b <branch> --depth 1 --recursive https://github.com/boost-lib/boost.git <some-test-dir>
The first time, but with the shallow depth also applied recursively (something which seems to me to be a bug in git). And subsequent times doing:
git pull --recurse-submodules https://github.com/boost-lib/boost.git <branch>
Or at least that's what I understand will give me only the current revision/s the first time. And then get only the subsequent updates correctly applied. Help in verifying that those would be the correct base git commands to emulate is appreciated. For those that will question why I'm going to the trouble.. One of the goals of the testing scripts is to minimize disk space *and* network bandwidth.
The disk space concern seems a bit misplaced given the size of the generated binaries. I mean, keep it from exploding, of course, but *minimizing* it seems like overkill.
True.. But it's only an ideal. I'm not going to go to extremes to achieve the minimum. But certainly trying to conserve space can be helpful. Since the whole of Boost can be large. And I certainly hope we can find ways to reduce the binary footprint also moving forward.
Hence the convoluted fetch as minimal info as possible and store as minimal info as possible. Which brings a question..
Is there a way to have the local repo only store the current HEAD revision files (i.e. minimize the contents of the .git dir)?
I think that's the shallow clone technique you're using above (--depth 1). Do you have something else in mind?
I was asking both whether that is what #1 does initially and whether there is a way to make #2 not keep old history. And obviously how I would go about doing it with dulwich.
I *think* the answer to the first question is that there is no way to make #2 not keep old history.
I suspect as much also. But perhaps someone who knows the innards of git can give a definite answer. -- -- -- Grafik - Don't Assume Anything -- Redshift Software, Inc. - http://redshift-software.com -- rrivera/acm.org - grafik/redshift-software.com -- 102708583/icq - grafikrobot/aim - grafikrobot/yahoo

on Sat Dec 29 2012, Rene Rivera <grafikrobot-AT-gmail.com> wrote:
On Sat, Dec 29, 2012 at 4:04 PM, Dave Abrahams <dave@boostpro.com> wrote:
on Sat Dec 29 2012, Rene Rivera <grafikrobot-AT-gmail.com> wrote:
On Sat, Dec 29, 2012 at 2:15 PM, Dave Abrahams <dave@boostpro.com> wrote:
on Fri Dec 28 2012, Rene Rivera <grafikrobot-AT-gmail.com> wrote:
Just getting back to this as the drive on my mac is now repaired.. In a totally empty state :-(
On Wed, Dec 26, 2012 at 10:14 AM, Dave Abrahams <dave@boostpro.com> wrote:
on Wed Dec 26 2012, Rene Rivera <grafikrobot-AT-gmail.com> wrote:
OK.. What is the non-manual way to do this without having git?
OK.. That helps somewhat. It makes it possible to just write one piece of code for all testers (since we require python and we can add installing dulwich to that).
It's even possible to write a script that creates a virtualenv and installs dulwich there on demand, so testers don't have to do it manually.
Except that dulwich requires compiling a C module.
It does?
"Dulwich is a pure-Python implementation of the Git file formats and protocols."
Is http://www.samba.org/~jelmer/dulwich/ lying?
Depends on your POV. It's pure in that it doesn't depend on the git sources or binaries. But <https://github.com/jelmer/dulwich/tree/master/dulwich> certainly has C sources to compile.
I think https://github.com/jelmer/dulwich/blob/master/setup.py#L71 answers that question.
So virtual installing wouldn't work.. Right?
I think you can install extension modules in a virtualenv, but you might not want to ask people to compile them anyway. -- Dave Abrahams BoostPro Computing Software Development Training http://www.boostpro.com Clang/LLVM/EDG Compilers C++ Boost

On Sat, Dec 29, 2012 at 5:45 PM, Dave Abrahams <dave@boostpro.com> wrote:
on Sat Dec 29 2012, Rene Rivera <grafikrobot-AT-gmail.com> wrote:
Depends on your POV. It's pure in that it doesn't depend on the git sources or binaries. But <https://github.com/jelmer/dulwich/tree/master/dulwich> certainly has C sources to compile.
I think https://github.com/jelmer/dulwich/blob/master/setup.py#L71 answers that question.
Ah I see.. I'll need to figure out how to have it not ask for compiling then. Since when I installed, IIRC, it insisted on compiling them.
So virtual installing wouldn't work.. Right?
I think you can install extension modules in a virtualenv, but you might not want to ask people to compile them anyway.
My preference would be to avoid compiling. And ideally self-install stuff (this is what we already do with our own scripts). -- -- -- Grafik - Don't Assume Anything -- Redshift Software, Inc. - http://redshift-software.com -- rrivera/acm.org - grafik/redshift-software.com -- 102708583/icq - grafikrobot/aim - grafikrobot/yahoo

on Sat Dec 29 2012, Rene Rivera <grafikrobot-AT-gmail.com> wrote:
On Sat, Dec 29, 2012 at 5:45 PM, Dave Abrahams <dave@boostpro.com> wrote:
on Sat Dec 29 2012, Rene Rivera <grafikrobot-AT-gmail.com> wrote:
Depends on your POV. It's pure in that it doesn't depend on the git sources or binaries. But <https://github.com/jelmer/dulwich/tree/master/dulwich> certainly has C sources to compile.
I think https://github.com/jelmer/dulwich/blob/master/setup.py#L71 answers that question.
Ah I see.. I'll need to figure out how to have it not ask for compiling then. Since when I installed, IIRC, it insisted on compiling them.
So virtual installing wouldn't work.. Right?
I think you can install extension modules in a virtualenv, but you might not want to ask people to compile them anyway.
My preference would be to avoid compiling. And ideally self-install stuff (this is what we already do with our own scripts).
Actually, any tester has to have a C++ compiler anyhow, right? So maybe you'd be better off using the Python bindings to libgit2. Certainly if you care about performance this is likely to be a big advantage. -- Dave Abrahams BoostPro Computing Software Development Training http://www.boostpro.com Clang/LLVM/EDG Compilers C++ Boost

on Tue Dec 25 2012, Rene Rivera <grafikrobot-AT-gmail.com> wrote:
On 12/17/2012 12:25 PM, Dave Abrahams wrote:
on Sun Dec 16 2012, Rene Rivera <grafikrobot-AT-gmail.com> wrote:
On Wed, Dec 12, 2012 at 9:49 AM, Beman Dawes <bdawes@acm.org> wrote:
On Tue, Dec 11, 2012 at 10:52 AM, Rene Rivera <grafikrobot@gmail.com> wrote:
Hm.. That's barely a step :-\ ..And there's no need to branch. The tools already support multiple transport methods so we can just add another. Which brings me to one of the transport methods regression testing currently supports.. Downloading the current trunk/release as a ZIP archive. I was hoping to use the github facility that exists for downloading ZIPs of the repos. But unfortunately I couldn't make it attach the contents of the indirect references to the library subrepos.
That's right, unfortunately. However, we can get the exact URLs of the ZIP files from the GitHub API. I've recently done some scripting with that, e.g. https://github.com/ryppl/ryppl/blob/develop/scripts/github2bitbucket.py#L40
In fact, I think someone has coded up what's needed to make a monolithic zip here: https://github.com/quarnster/sublime_package_control/commit/9fe2fc2cad9bd2e7...
After looking at both of those I see no point in using the github api (or additional structure data from sublime -- not totally sure where the submodule info comes from in this case though) for this as it provides no additional information than one can get from just parsing the ".gitmodules" file.
I'm pretty sure that's not correct. The .gitmodules file doesn't contain information about which commit to check out for each submodule.
Hence the complexity of supporting testing with ZIPs is now a magnitude larger as it means dealing with fetching more than a hundred individual repos :-(
Which now seems the only choice. At the tester side I will have to get the boost-master archive. Then parse out the ".gitmodules" file. And get each subrepo archive individually. Which increases the likelihood of failure considerably.
I'm not sure. Isn't it true that shorter transfers are more likely to succeed than longer ones?
Well.. In an ideal world it would be possible to have a fully integrated "monolithic" repo that the testers can just use as that is the simplest and likely most reliable path. But, alas, this hope of mine was essentially dismissed during the DVCS/git discussions.
This isn't about DVCS but about whether we're going to have real modularity.
I don't know what you mean by "real modularity".
In this context I mean the ability to work on one part of a system without being encumbered by the other parts. -- Dave Abrahams BoostPro Computing Software Development Training http://www.boostpro.com Clang/LLVM/EDG Compilers C++ Boost

On 12/25/2012 3:08 PM, Dave Abrahams wrote:
on Tue Dec 25 2012, Rene Rivera <grafikrobot-AT-gmail.com> wrote:
On 12/17/2012 12:25 PM, Dave Abrahams wrote:
on Sun Dec 16 2012, Rene Rivera <grafikrobot-AT-gmail.com> wrote:
On Wed, Dec 12, 2012 at 9:49 AM, Beman Dawes <bdawes@acm.org> wrote:
On Tue, Dec 11, 2012 at 10:52 AM, Rene Rivera <grafikrobot@gmail.com> wrote:
Hm.. That's barely a step :-\ ..And there's no need to branch. The tools already support multiple transport methods so we can just add another. Which brings me to one of the transport methods regression testing currently supports.. Downloading the current trunk/release as a ZIP archive. I was hoping to use the github facility that exists for downloading ZIPs of the repos. But unfortunately I couldn't make it attach the contents of the indirect references to the library subrepos.
That's right, unfortunately. However, we can get the exact URLs of the ZIP files from the GitHub API. I've recently done some scripting with that, e.g. https://github.com/ryppl/ryppl/blob/develop/scripts/github2bitbucket.py#L40
In fact, I think someone has coded up what's needed to make a monolithic zip here: https://github.com/quarnster/sublime_package_control/commit/9fe2fc2cad9bd2e7...
After looking at both of those I see no point in using the github api (or additional structure data from sublime -- not totally sure where the submodule info comes from in this case though) for this as it provides no additional information than one can get from just parsing the ".gitmodules" file.
I'm pretty sure that's not correct. The .gitmodules file doesn't contain information about which commit to check out for each submodule.
Right it doesn't. But your ryppl code doesn't handle that either since it fetches the repos individually from the non-version-specific master branches (AFAICT). And the sublime code uses its own metadata files, ".sublime-package" and "package-metadata.json", to determine what to get. Although I can't tell if that contains specific version info. But since it also looks like it works with cloned repos perhaps it doesn't need to worry about that.
Hence the complexity of supporting testing with ZIPs is now a magnitude larger as it means dealing with fetching more than a hundred individual repos :-(
Which now seems the only choice. At the tester side I will have to get the boost-master archive. Then parse out the ".gitmodules" file. And get each subrepo archive individually. Which increases the likelihood of failure considerably.
I'm not sure. Isn't it true that shorter transfers are more likely to succeed than longer ones?
Perhaps, if one happens to have an unreliable internet connection. But I would expect testers to have reliable connections. But that's a minor unreliability.. The more likely problem is in code bugs in the testing script ;-) -- -- Grafik - Don't Assume Anything -- Redshift Software, Inc. - http://redshift-software.com -- rrivera/acm.org (msn) - grafik/redshift-software.com -- 102708583/icq - grafikrobot/aim,yahoo,skype,efnet,gmail

on Wed Dec 26 2012, Rene Rivera <grafikrobot-AT-gmail.com> wrote:
On 12/25/2012 3:08 PM, Dave Abrahams wrote:
on Tue Dec 25 2012, Rene Rivera <grafikrobot-AT-gmail.com> wrote:
After looking at both of those I see no point in using the github api (or additional structure data from sublime -- not totally sure where the submodule info comes from in this case though) for this as it provides no additional information than one can get from just parsing the ".gitmodules" file.
I'm pretty sure that's not correct. The .gitmodules file doesn't contain information about which commit to check out for each submodule.
Right it doesn't. But your ryppl code doesn't handle that either since it fetches the repos individually from the non-version-specific master branches (AFAICT).
No, that's also incorrect. Ryppl uses zeroinstall to fetch specific versions of dependencies based on the output of a satisfiability solver.
Hence the complexity of supporting testing with ZIPs is now a magnitude larger as it means dealing with fetching more than a hundred individual repos :-(
Which now seems the only choice. At the tester side I will have to get the boost-master archive. Then parse out the ".gitmodules" file. And get each subrepo archive individually. Which increases the likelihood of failure considerably.
I'm not sure. Isn't it true that shorter transfers are more likely to succeed than longer ones?
Perhaps, if one happens to have an unreliable internet connection. But I would expect testers to have reliable connections. But that's a minor unreliability.. The more likely problem is in code bugs in the testing script ;-)
OK, any new code comes with a risk of bugs, but seriously: this is not the kind of monstrous increase in complexity that will be difficult to code for, or to debug. -- Dave Abrahams BoostPro Computing Software Development Training http://www.boostpro.com Clang/LLVM/EDG Compilers C++ Boost
participants (6)
- Beman Dawes
- Daniel Pfeifer
- Dave Abrahams
- Rene Rivera
- Robert Ramey
- Ryo IGARASHI