
----- Original Message ----- From: "Brock Peabody" <brock.peabody@npcinternational.com>
I think I understand why there are problems with any iterator more powerful than an input iterator and I agree. I don't understand what benefit there is to supporting anything besides input iterators though.
O.K., so if you agree that supporting forward-only cursors makes sense and that an input iterator has guarantees that are similar then you are ready to think about the next major decision in designing an input iterator which is how to implement the copy operation. When you implement copy I think there are two fundamental choices: 1. When you copy an iterator, continue to point to the same underlying cursor. There is precedent for this, the istream iterators do it. But I don't much like it since programmers may "surprised" that when they copy the iterator incrementing the copy affects the original. 2. When you copy the iterator, cache a copy of the current row value, on increment open a new cursor (COW pattern). This means that copies won't have "side effects" on the original iterator. However, opening a new cursor also can surprise programmers since they may expect the iterator to behave like a forward iterator and produce records in a particular order. This is what we did in DTL but it does very occasionally produce confusion when folks expect our iterator to act as a forward iterator and not just an input iterator. Below is a snippet from the DTL mailing list where I explain further. Anyway, to me this is a major design decision and I would be interested to hear what others think. discussion from http://groups.yahoo.com/group/DatabaseTemplateLibrary/message/1559
Paul Harris Wrote
here's what I was trying to do:
template <class In, class Out, class Compare> Out copy_duplicates(In begin, In end, Out out, Compare compare) { while (begin != end) { begin = adjacent_find(begin,end,compare); if (begin != end) { tcout << "Found duplicate " << *begin << endl; *out++ = *begin++; } } return out; }
I think you are expecting select_iterator to do something here that it does not provide a guarantee for. As we say in the docs, a select iterator is just an *Input Iterator*. For a definition look at the SGI STL webpage: http://www.sgi.com/tech/stl/InputIterator.html In particular if you look at the notes this means that [1] i == j does not imply ++i == ++j. [2] Every iterator in a valid range [i, j) is dereferenceable, and j is either dereferenceable or past-the-end. The fact that every iterator in the range is dereferenceable follows from the fact that incrementable iterators must be dereferenceable. [3] After executing ++i, it is not required that copies of the old value of i be dereferenceable or that they be in the domain of operator==. [4] It is not guaranteed that it is possible to pass through the same input iterator twice. So your algorithm is not following the requirements for an input iterator. Nor should you expect distance() to return something meaningful since the records are not in a particular order. So what does this mean in terms of STL - it means that a select_iterator will work with only a few of the STL algorithms, those that promise to need only an input iterator. Unfortunately there is not an easy way to enforce this check at compile time. The guys at boost did some work a while ago where you could do some meta checks for various iterator invariants but they could only detect crude gaffs. So then why isn't everything a RandomDBView that gives more guarantees? 1. Some ODBC drivers don't support scrollable recordsets or bulk fetches. 2. For the drivers that do support scrollable recordsets, sometimes getting a scrollable cursor extracts a huge penalty. When you ask for a scrollable cursor you are requesting a snapshot of a set of records in a particular order. On some databases, the way they handle this is to dump the entire set of records to a file on the client!! This can be very slow. Most modern databases are smarter, they track a version of the table and an internal list that is the order - this is still overhead but less. So, I think there is a place for select_iterator, it just needs to be clear what the guarantees are (which are few). I do also agree that having an incremented copy reset is not all that nice, but at least this guarantees that a copy of the iterator can give the full recordset. The other major alternative would be to have all copies of a select iterator point to the same recordset, but this gives the nastiness that incrementing a copy of an iterator can now affect the original. This may be acceptable, though, if we think about things like istream_iterator where this is exactly what happens. Anyway, I would want to think about it carefully before I went that route.