On Mon, Aug 18, 2008 at 07:05:10PM -0400, Daryle Walker wrote:
implementing. This is one of those times.
Thanks for your deep analysis and your time. Unfortunately, it's based on the assumption that the task ID is opaque, which it is not.. Re-reading my original post, this could be blamed on me..
Integer types in C and C++ are a mess. For example, I have made a library where a task is identified by a unique unsigned integer. The extra information about the tasks is stored in a std::vector. When a new task is
[snip, rearrange]
You tasks IDs are conceptually opaque, why is any external code wanting to mess with them? The external code shouldn't be doing
There are TWO task IDs: one is used by the library (which I also made), and that one _is_ opaque (at least to library clients), it _is_ typedef, and it is not even an integer (it's a pointer). Now, to every created task, I need to assign a positive[*] index (in an application which just uses the library) which is _not_ opaque: the indices are vertex IDs which are in turn used to construct a CSR representation (edge-lists) of a certain graph which is in turn operated on by an external library. (You may remember my earlier post where I proposed that the CSR class from Boost.Graph provides a documented, public way of accessing the raw CSR vectors.) I will call these "other" IDs "task indices" to avoid further confusion. I could bundle the task index in the task structure and provide a function that takes the task ID and returns the index. This is not satisfactory for several reasons: - the task library should be general-purpose and lightweight (task indices might not be needed in certain applications) - task indices may be assigned to different policies, which I do not want to hard-code into the library - I do not want to complicate interfaces with optional information (e.g. extra index parameter to task creation -- see 1st point) Since the library is written in C (well defined ABI across compilers), I can't provide overloaded interfaces. Thus, I have chosen that the task library shall be minimalistic, and that extra information about tasks, where needed, shall be layered in external containers. In this particular application, the choice of the external container (vector) is dictated by a 3rd party library.
(The ID is a typedef and not a naked "unsigned," right?) Anyway, does it really matter; this ID generation code is only used during task construction, right?
The "true" task ID (provided by the "task library") is a typedef and opaque, yes. The task index (used to construct a graph) is a plain int, not even a typedef. Why? [*] Because the external library uses plain ints[**] in its interfaces. The library can be configured to use longs, but then, fortunately, a compile-time error happens when passing &v[0] of vector<int> to long*. [**] Now, the library itself does not use plain int, but a typedef. But since *all* data is passed in terms of the same typedef (vertex/edge IDs, vertex/edge weights), this implies that I cannot configure the library to use unsigned integers and expect it to work correctly (because it does more complicated operations with weights than just indexing arrays). Which leads me to conclusion that the library interface is not ideal; it should define at least index_type and weight_type instead of the single number_type. But fixing the library is out of my scope - I have to work with what I have.
Why are you using a number to refer to a container element in the first place? Then I realized that you can't use iterators because they're not stable with vector's element adds or removes. Then I
Correct.
wondered, why are you using a vector in the second place? Wouldn't a
To interface with the external library, which has a C interface and requires a bunch of int* (actually, number_type, see above) arguments, to which I pass &v[0]. Vector is the only standard container which guarantees contiguous storage. And for the kind of data processing the library is doing, it is actually good choice.
iterators, leaving them available to implement your ID type. And you don't seem to need random-access to various task elements. (A deque
I do indeed not need random-access to "real tasks". In the task library I have actually implemented something similar the design you have proposed here.
(I was going to suggest using Boost's pointer-containers, but then I realized that you really don't need containment at all.) Now you'll pass this class around instead of an integer type. The size may be higher though, two pointers (and an 'int' in debug-mode).
Indeed -- I do not need containment and the task library just returns opaque task IDs -- it is the task creator's responsibility to keep track of them. Now, it is almost certainly true that in this project I do not need the full integer<T> class that I proposed, and even if it were available, it'd be too much to convert the project at this point. I like to rant, but I'm also pragmatic :-) boost.integer and numeric_cast will do (thanks to you and Dave for pointing it out) I'll leave the implementation of the proposed simpler, non-ranged integer<T> template as a challenge for myself and as a material for learning about other boost libraries :-)
No, you should rethink your design on why you need integers in the first place.
Compact storage of graphs that enables fast algorithms (e.g. cache-friendly layout, avoids frequent memory rellocations (unlike linked lists), etc.). And before you ask why I don't use Boost.graph: because I'm working with hypergraphs. A hypergraph can be represented as an ordinary bipartite graph, but such representation is too awkward. (Actually... time to think a bit about this premise :-))