[tokenizer] is exception unaware

Hello, I've been writing a small program that expects a certain number of arguments from the command line for any given option. <getopt.h> supports only a single argument per option, while I needed to supply three arguments, like this:

./program -f foo bar baz

The easiest solution was to use the Boost.Tokenizer library, which is indeed a very small and nifty tool. However, I was surprised to discover that Boost.Tokenizer is exception unaware! Consider the following two examples. The first uses the current version of Boost.Tokenizer, which is exception unaware. Please note that the code looks a bit dirty, and an extra bool flag is needed:

#include <iostream>
#include <boost/tokenizer.hpp>
#include <string>
#include <cstdlib>

int main()
{
    std::string arguments = "foo bar"; // not receiving all three arguments is an error
    std::string arg1, arg2, arg3;
    boost::tokenizer<> tok(arguments);
    boost::tokenizer<>::iterator pos = tok.begin();
    bool error = false;
    if (pos != tok.end()) arg1 = *pos++; else error = true;
    if (pos != tok.end()) arg2 = *pos++; else error = true;
    if (pos != tok.end()) arg3 = *pos++; else error = true;
    if (error) {
        std::cout << "not enough arguments\n";
        exit(1);
    }
    std::cout << arg1 << " " << arg2 << " " << arg3 << "\n";
}

Without all those if (pos != tok.end()) checks, the program crashes on assert_fail(). This is how the code would look if Boost.Tokenizer correctly threw exceptions:

#include <iostream>
#include <boost/tokenizer.hpp>
#include <string>
#include <cstdlib>

int main()
{
    std::string arguments = "foo bar baz"; // not receiving all three arguments is an exception
    std::string arg1, arg2, arg3;
    boost::tokenizer<> tok(arguments);
    boost::tokenizer<>::iterator pos = tok.begin();
    try {
        arg1 = *pos++;
        arg2 = *pos++;
        arg3 = *pos++;
    } catch (boost::tokenizer_out_of_range&) {
        std::cout << "not enough arguments\n";
        exit(1);
    }
    std::cout << arg1 << " " << arg2 << " " << arg3 << "\n";
}

Questions:
- is this (IMO bad) design intentional?
- if it's not, I'll gladly make a patch for it - would there be a chance that it gets accepted?
;) -- Janek Kozicki

Janek Kozicki wrote:
- is this IMO bad design intentional?
Boost.Tokenizer is written so as to model an iterator. Have you ever seen an iterator throw an exception on reaching the end of the sequence? So yes, the design is probably intentional. Sebastian Redl

Sebastian Redl said (on Tue, 18 Jul 2006 14:06:59 +0200):
Janek Kozicki wrote:
- is this IMO bad design intentional?
Boost.Tokenizer is written so as to model an iterator. Have you ever seen an iterator throw an exception on reaching the end of the sequence?
The rationale for container iterators is that iterating over containers should not be slowed down by exceptions. Those iterators don't even check the iterator's validity! Trying to access beyond the container is not detected by the program; it simply causes a segmentation fault. That's fine for the usual containers.
So yes, the design is probably intentional.
However, tokenizer::iterator is neither a typical container nor a typical iterator. It is even unusual in that it checks the iterator's validity (trying to access beyond the end causes assert(valid_) to fail). The rationale for typical container iterators makes no sense when applied to tokenizer::iterator, because iterating over tokenizer items is not fast - it requires parsing the content. There are two ways to correct this:
- remove the assert(valid_) check from the tokenizer, so that it follows the standard more closely and causes a segmentation fault on failure;
- or do the opposite: accept the fact that the tokenizer is not a typical container, and replace the assertion failure with a throw.
-- Janek Kozicki

Janek Kozicki wrote:
However, tokenizer::iterator is neither a typical container nor a typical iterator. It is even unusual in that it checks the iterator's validity (trying to access beyond the end causes assert(valid_) to fail).
MS debug iterators check their constraints - in debug mode. assert() also only exists in debug mode.
The rationale for typical container iterators makes no sense when applied to tokenizer::iterator, because iterating over tokenizer items is not fast - it requires parsing the content.
The stream iterators don't throw either. (The underlying stream might throw, though.) Sebastian Redl

- is this IMO bad design intentional?
Does vector<T>::iterator throw if it goes past .end()? No. I think Boost simply models the standard here. What I agree it lacks is some kind of .count(), so you know how many tokens there are, but what you can do to get that information is:

if (std::distance(tok.begin(), tok.end()) < 3) {
    std::cout << "not enough arguments\n";
    exit(1);
}
arg1 = *pos++;
arg2 = *pos++;
arg3 = *pos++;

Philippe

Philippe Vaucher wrote:
but what you can do to get the info is :
if (std::distance(tok.begin(), tok.end()) < 3) {
    std::cout << "not enough arguments\n";
    exit(1);
}
arg1 = *pos++;
arg2 = *pos++;
arg3 = *pos++;
Alternatively, you could write a next() function:

template<typename It>
inline typename std::iterator_traits<It>::value_type
next(It &first, It &last)
{
    if (first == last) {
        throw something;
    }
    return *first++;
}

And then just do:

arg1 = next(pos, end);
arg2 = next(pos, end);
arg3 = next(pos, end);

Sebastian Redl
participants (3)
- Janek Kozicki
- Philippe Vaucher
- Sebastian Redl