
Ian Emmons <iemmons <at> bbn.com> writes:
First, __please__ pay attention to documentation. I know you have said that documentation isn't something you like/want to do
I do not believe I ever said anything like this (not that I am a big fan of it). And I do want to write docs for all the new features.
II. New "data driven test case" subsystem
<<snip>>
d) join - dataset constructed by joining 2 datasets of the same type
int a[] = {1,2,3}; int b[] = {7,8,9}; data::make(a) + data::make(b) - dataset with 6 integer values
This should be called "concatenation", not "joining". People with a database background will expect something called "join" to increase the arity of the result.
I believe join is used for this kind of operation quite commonly as well, but if there is a general consensus that concat is better I can "rename" it. Keep in mind that in practice it only affects the name of the header file for users of this feature (and maybe not even that if you include one of the "union" headers, which automatically include all the necessary headers).
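To make the semantics under discussion concrete, here is a minimal sketch in plain standard C++. It is not the data:: implementation; `concat` and `join_example` are hypothetical names introduced only for illustration, and std::vector stands in for a dataset. Whatever the operation ends up being called, the size grows while each sample stays a single int.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not part of the proposed data:: interface:
// appends the samples of b after the samples of a.
template <typename T>
std::vector<T> concat(const std::vector<T>& a, const std::vector<T>& b) {
    std::vector<T> out(a);
    out.insert(out.end(), b.begin(), b.end());
    return out;
}

// Mirrors the example above: {1,2,3} followed by {7,8,9}.
std::vector<int> join_example() {
    std::vector<int> a = {1, 2, 3};
    std::vector<int> b = {7, 8, 9};
    std::vector<int> joined = concat(a, b);
    assert(joined.size() == a.size() + b.size());  // sizes add up
    return joined;                                 // samples are still plain ints
}
```

This is the sense in which a database person would say the arity is unchanged: only the number of rows changes, not the number of columns.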
e) zip - dataset constructed by zipping 2 datasets of the same size, but not necessarily the same type
This dataset has an arity which is the sum of the argument dataset arities.
int a[] = {1,2,3}; const char* b[] = {"qwe", "asd", "zxc"};
data::make(a) ^ data::make(b) - dataset with 3 samples which are pairs of int and const char*.
Calling this "zipping" is odd (at least to me). Makes it sound like a compression facility. Perhaps "tupling" would be better.
Zipping is a well-established term for this kind of operation. Tupling does not sound good IMO.
I also think the choice of operator here is not ideal. How does the xor operator evoke any notion of this operation? I would choose bitwise-or "|" because that is sometimes used as a flat-file column delimiter. (Actually, my favorite choice would be the comma operator to go along with the "tupling" terminology, but who am I to defy Scott Meyers' More Effective C++, Item 7?)
My closest analogy is the zipper found on everyday things, which zips 2 sides together. The ^ operator resembles this most closely (it merges left and right together into something united).
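For readers less familiar with the term, the zip operation can be sketched in plain standard C++ as follows. This is an illustration only, not the data:: implementation: `zip` and `zip_example` are hypothetical names, std::vector stands in for a dataset, and std::string replaces the C strings of the example above for simplicity. The size check is also an assumption about how a size mismatch might be handled.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: pairs the i-th sample of a with the i-th sample of b.
template <typename A, typename B>
std::vector<std::pair<A, B>> zip(const std::vector<A>& a,
                                 const std::vector<B>& b) {
    // Assumption: mismatched sizes are treated as an error.
    if (a.size() != b.size())
        throw std::invalid_argument("zip: datasets must have the same size");
    std::vector<std::pair<A, B>> out;
    out.reserve(a.size());
    for (std::size_t i = 0; i < a.size(); ++i)
        out.emplace_back(a[i], b[i]);
    return out;
}

// Mirrors the example above: 3 ints zipped with 3 strings.
std::vector<std::pair<int, std::string>> zip_example() {
    std::vector<int> a = {1, 2, 3};
    std::vector<std::string> b = {"qwe", "asd", "zxc"};
    auto zipped = zip(a, b);
    assert(zipped.size() == a.size());  // size is unchanged, arity is now 2
    return zipped;
}
```

Here the number of samples stays at 3 while each sample gains a second component, which is the opposite trade-off from the join example.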
f) grid - dataset constructed by "multiplying" 2 datasets of different sizes and types
This dataset has an arity which is the sum of the argument dataset arities.
int a[] = {1,2,3}; const char* b[] = {"qwe", "asd"}; double c[] = {1.1, 2.2};
data::make(a) * data::make(b) * data::make(c) - dataset with 12 samples which are tuples of int, const char* and double.
For people with a database background, "cross product" is the obvious name for this. Calling it anything else is silly. Also, I think you mean that the arity is the *product* of the argument dataset arities.
By the arity of a dataset I mean the arity of the samples inside of it, and this arity is indeed the sum of the argument arities. The size of the grid dataset is indeed the product of the sizes.
Gennadiy
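The arity versus size distinction can be checked with a small sketch in plain standard C++. Again this is only an illustration under stated assumptions: `grid` and `grid_example` are hypothetical names, not the data:: interface, std::vector stands in for a dataset, and std::string replaces the C strings of the example above.

```cpp
#include <cassert>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical helper: every combination of one sample from each input.
template <typename A, typename B, typename C>
std::vector<std::tuple<A, B, C>> grid(const std::vector<A>& a,
                                      const std::vector<B>& b,
                                      const std::vector<C>& c) {
    std::vector<std::tuple<A, B, C>> out;
    out.reserve(a.size() * b.size() * c.size());
    for (const A& x : a)
        for (const B& y : b)
            for (const C& z : c)
                out.emplace_back(x, y, z);
    return out;
}

// Mirrors the example above: 3 ints, 2 strings, 2 doubles.
std::vector<std::tuple<int, std::string, double>> grid_example() {
    std::vector<int> a = {1, 2, 3};
    std::vector<std::string> b = {"qwe", "asd"};
    std::vector<double> c = {1.1, 2.2};
    auto g = grid(a, b, c);

    // Arity of each sample: 1 + 1 + 1 = 3 (a sum).
    static_assert(std::tuple_size<decltype(g)::value_type>::value == 3,
                  "each sample has three components");
    // Size of the dataset: 3 * 2 * 2 = 12 (a product).
    assert(g.size() == a.size() * b.size() * c.size());
    return g;
}
```

So both statements in the exchange hold at once: the number of components per sample adds, and the number of samples multiplies.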