Organization idea: separate libraries in our SVN

6 Aug 2007

      [I thought I wrote a message similar to this a few years ago, but I 
can't seem to find it on GMane.  Anyway, this is a good time to 
re-introduce it since Subversion allows directory manipulations.]

The current layout of the Boost material in our SCM system is the same 
as the final archives.  That does not necessarily have to be the case. 
Since we put emphasis on distinct libraries within Boost as far as 
maintenance is concerned, maybe we should enforce that in the SCM layout.

You know how, when we look at a potential library, the library is in an 
archived folder with sub-folders mirroring the files' places in the 
final layout.  The trial library would have its own root "boost" and 
"libs" (and any other) sub-directories.  I suggest that all the current 
libraries be set up like that, in a common directory holding all the 
libraries.  This common directory would be the trunk.  At release time, 
the contents of each library's directory are merged into a release 
directory.  If only one library directory has a particular file object, 
it's copied as-is.  If multiple libraries have a commonly-named file 
object, we have an error if at least one of them is a file.  Otherwise, 
the commonly-named directories are merged using this algorithm 
recursively.  (A particular leaf file should only be part of one library.)
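
The merge rule above can be sketched in a few lines of Python.  This is 
only a rough illustration, not an existing Boost tool: the names 
merge_library, build_release, and MergeConflict are made up for the 
example, and skipping ".svn" folders assumes the script runs over a 
checked-out working copy rather than an export.

```python
import shutil
from pathlib import Path


class MergeConflict(Exception):
    """Two libraries supply the same path and at least one is a file."""


def merge_library(src: Path, dest: Path) -> None:
    """Recursively merge one library's tree into the release tree."""
    for entry in sorted(src.iterdir()):
        if entry.name == ".svn":
            # Assumption: working-copy metadata is not part of a release.
            continue
        target = dest / entry.name
        if not target.exists():
            # Only this library has the object so far: copy it as-is.
            if entry.is_dir():
                shutil.copytree(entry, target,
                                ignore=shutil.ignore_patterns(".svn"))
            else:
                shutil.copy2(entry, target)
        elif entry.is_dir() and target.is_dir():
            # Commonly-named directories: apply the same rule one level down.
            merge_library(entry, target)
        else:
            # Same name in two libraries and at least one is a file: error.
            raise MergeConflict(f"{target} is supplied by more than one library")


def build_release(trunk: Path, release: Path) -> None:
    """Merge every library directory under trunk into one release tree."""
    release.mkdir(parents=True, exist_ok=True)
    for library in sorted(p for p in trunk.iterdir() if p.is_dir()):
        if library.name == ".svn":
            continue
        merge_library(library, release)
```

Calling build_release(Path("trunk"), Path("release")) would fuse 
trunk/any/boost, trunk/crc/boost, and so on into release/boost, and stop 
with MergeConflict as soon as two libraries claim the same leaf file.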

So the new trunk would be like:
* trunk/
** General/
** any/
...
** crc/
...
** filesystem/
...
** math-quaternion/
** math-common_factor/
...
** quickbook/
...
** utility-general/
** utility-enable_if/
...
** wave/
** wave-tool/
...

Note that the tools get parallel library directories.  The "General" 
folder holds our license, logos, and any other non-library common files. 
  Libraries of the same category don't have to have distinctive 
namespaces (and sub-folders of the boost/ header directory); they can be 
merged in the final product, but maintained separately in the SCM 
system.  (The existing I/O, math, and numeric namespaces/directories are 
sort of like this, but now we can refine them further, and do stuff like 
put all the Boost container classes under one sub-namespace/directory.)

While the final users get a merged archive, Boost developers would have 
to include each library's header separately in their compiler options. 
This could be useful because it will tell us about each library's 
dependencies on other Boost libraries.

I think Subversion has some sort of symbolic-file-link support.  Maybe 
with that, we could create a directory parallel to the trunk that 
represents an already merged archive.  (It would have to be read-only, 
of course.)  It could be updated with a commit script watching the trunk 
directory.

* master-level/
** trunk/
** synthetic-archive/  <- read-only; maybe at separate website
** branches/
** tags/

Something I like about this system is that it lets me know exactly how 
many libraries we have.  (There are so many that it's hard to keep track 
sometimes.)  Furthermore, being able to enumerate the libraries safely 
enables all sorts of automatic processing.  Using SVN properties on each 
library's folders, and various sub-objects, would enhance said 
processing.  (I think that properties and their changes generally can be 
version-controlled.)  We could have the commit script make sure that the 
appropriate file objects have the properties we need.

* The master list of libraries can be script-generated.  Descriptions of 
each library and their original authors can be stored as SVN properties 
in each library's folder.

* The current maintainers for each library can be stored as an SVN 
property.  A regression suite that accesses SVN can then send 
notifications to the right people.  (A master list of maintainers can 
also be generated for the final archive.)

* There can be a property listing which sibling libraries a particular 
library (directly) depends on.  This can be used by a script to isolate 
one library for extraction and take all it needs for the ride.  An 
extraction can be in my separate library directories form or in a fused 
archive form.  Furthermore, when actually built, you'll discover any 
missing sibling library dependencies.

* Did you know that you can mix-and-match various revisions of items in 
the same directory?  Exploiting this could reduce the need for branching 
in order to isolate breaking changes.  Let's say that we start out at 
revision r25.  The Filesystem library is changed and checked-in at r30. 
  All the other libraries stay at r25 until an explicit update bumps 
everything to r30.  (Subversion's commits do _not_ sneak in an update to 
uninvolved file objects.)  Unknowingly, r27 introduced a bug in the CRC 
library which breaks Filesystem.  The Filesystem maintainer could 
temporarily pin his/her working copy of the CRC library folder to r26 until the 
bug is fixed!  Taken to the extreme, the release manager can check out 
the copy of the trunk's HEAD, back-step any libraries as necessary to 
their individual last-known-good(-enough) revision, and then run the 
merge script for the release's archive.

* For each library's documentation directory, the directory could have a 
property describing the primary file.  Each documentation file could 
have a property describing build directions, if needed.

* For each library's testing directory, each test file could have a 
property describing which kind of test (compile, compile-fail, run, 
run-fail, etc.) it is.  Jamfiles don't necessarily have to be included; 
they could be synthesized from the properties while the merge script 
builds an archive.  And if we move to a different build system than Jam, 
then the properties could be used for the new system.

* For the mandatory source files, a property could be added for distinct 
execution closures.  As far as I know, this property wouldn't be needed 
except for three files in Boost.Test.  Those files each include a "main" 
function, and so can't be mixed with each other or with the other mandatory 
source.  Each one of those files could have a distinct execution closure 
tag, which puts it in a separate "src" subdirectory, while all the other 
mandatory source files leave that property blank and all go into a main 
"src" subdirectory when merged.  (I know we don't keep a root-level 
"src" directory; but with SCM separation of libraries and a merge 
script, we can!)  Any special compilation directions could be placed in 
a property to be transferred to a Jamfile (or successor) for our build 
system.  (Obviously this isn't 100% recommended since a user could use 
his/her own IDE to compile a source file and miss the property's 
directions.)
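
As a rough sketch of the dependency-property idea in the list above: if 
each library folder carried a property naming its direct sibling 
dependencies, a script could compute everything one library needs "for 
the ride".  The function name dependency_closure is invented for this 
example, and the dependency table is illustrative data only, not real 
Boost dependency information; in practice the table would be gathered 
from the properties (for instance with "svn propget" on each folder).

```python
from collections import deque


def dependency_closure(library, direct_deps):
    """Return the library plus every sibling it needs, directly or not.

    direct_deps maps a library name to the names listed in its
    (hypothetical) dependency property.
    """
    needed = {library}
    queue = deque([library])
    while queue:
        current = queue.popleft()
        for dep in direct_deps.get(current, ()):
            if dep not in needed:
                # First time we see this sibling: take it along and
                # look at its own dependencies later.
                needed.add(dep)
                queue.append(dep)
    return needed


# Illustrative data only; not the real dependencies of these libraries.
example_deps = {
    "filesystem": ["crc", "config"],
    "crc": ["config"],
    "config": [],
    "any": ["config"],
}

to_extract = dependency_closure("filesystem", example_deps)
```

Here to_extract holds "filesystem", "crc", and "config"; those folders 
could then be copied out separately or fed to the merge script to make a 
fused archive.  Tracking already-seen names also keeps the walk from 
looping if two libraries happen to depend on each other.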

There is a _one-time_ cost to separate each library/tool from the 
existing merged archive to distinct directories.  That might take a 
while, but the advantages of a separated system should be worth it.  (It can 
be done piecemeal: keep everything in the "General" directory and 
separate the libraries/tools out one at a time.  We should probably do 
the leaf libraries [0 sibling dependencies] first.  Actually, we should 
start with Boost.Config; it should have low dependencies, but a lot of 
other libraries depend on it.)

I've heard that Windows, when moving the contents of one directory into 
another, will (recursively?) merge the contents of commonly-named 
directories.  If that's accurate, then a Windows user could temporarily 
be a release manager until the merge script is made.  I don't really 
know how to script, but hopefully a merge would be simple enough to 
write that the Windows-user workaround wouldn't be needed.  (Macs use 
overwrite semantics, being consistent in not distinguishing between file 
objects based on the source's and destination's is-directory flags.)

I know that there's a lot of automation.  But, if it's possible to do 
this plan, we'll be working smarter, not harder, for maintenance and 
releases.

-- 
Daryle Walker
Mac, Internet, and Video Game Junkie
darylew AT hotmail DOT com
