[range] boost::end("hello") is sub-optimal

There is a bug in Boost.Range that is causing boost::end() to execute a slower code path when called with a string literal. The type of "hello" is char const [6], and end() should be O(1). Indeed, Boost.Range has code to handle just this case, but it never gets called. The problem is the way overload resolution happens between these two functions template< typename T, std::size_t sz > const T* boost_range_end( const T (&array)[sz] ) const char* boost_range_end( const char* s ) The first is O(1) and the second is O(N) becuase it calls strlen. But the second will always be chosen because non-templates are preferred over templates. -- Eric Niebler Boost Consulting www.boost-consulting.com

Eric Niebler wrote:
There is a bug in Boost.Range that is causing boost::end() to execute a slower code path when called with a string literal. The type of "hello" is char const [6], and end() should be O(1). Indeed, Boost.Range has code to handle just this case, but it never gets called. The problem is the way overload resolution happens between these two functions
template< typename T, std::size_t sz > const T* boost_range_end( const T (&array)[sz] )
const char* boost_range_end( const char* s )
I think something like template< typename T > enable_if< is_same< T, char >, const char* >::type boost_range_end( const T* s ) for the second overload should fix it. Regards, Daniel

Daniel Frey wrote:
Eric Niebler wrote:
There is a bug in Boost.Range that is causing boost::end() to execute a slower code path when called with a string literal. The type of "hello" is char const [6], and end() should be O(1). Indeed, Boost.Range has code to handle just this case, but it never gets called. The problem is the way overload resolution happens between these two functions
template< typename T, std::size_t sz > const T* boost_range_end( const T (&array)[sz] )
const char* boost_range_end( const char* s )
I think something like
template< typename T > enable_if< is_same< T, char >, const char* >::type boost_range_end( const T* s )
for the second overload should fix it.
That only works on compilers that handle SFINAE. Probably the easiest thing would be to remove the char* overloads and modify the "primary" boost_range_end(T const &) overload, which currently only works when T is an STL container, to use is_pointer<T> to dispatch to str_end(). -- Eric Niebler Boost Consulting www.boost-consulting.com

Eric Niebler wrote:
That only works on compilers that handle SFINAE. Probably the easiest thing would be to remove the char* overloads and modify the "primary" boost_range_end(T const &) overload, which currently only works when T is an STL container, to use is_pointer<T> to dispatch to str_end().
What if the non-template overload would be replaced by a template overload? template<> const char* boost_range_end( const char* s ) Wouldn't that work? I know of a similar (but much less complex) case where this solved the overload resolution problem. Sebastian Redl

Sebastian Redl wrote:
Eric Niebler wrote:
That only works on compilers that handle SFINAE. Probably the easiest thing would be to remove the char* overloads and modify the "primary" boost_range_end(T const &) overload, which currently only works when T is an STL container, to use is_pointer<T> to dispatch to str_end().
What if the non-template overload would be replaced by a template overload?
template<> const char* boost_range_end( const char* s )
Wouldn't that work? I know of a similar (but much less complex) case where this solved the overload resolution problem.
Did you try it? What you've shown is a function template specialization, not an overload, but it won't work, because it doesn't match any primary template. You could've written const char*&, but for completeness, you'd also have to provide overloads for char*&, char*const&, const char*const&, and all the wchar_t variants. That's a lot. Now that I think about it, we had discussed eliminating the inconsistent "boost::size(int[5])==5, but boost::size(char[5])==4" behavior. Bugs aside, why is it still there? Thorsten? -- Eric Niebler Boost Consulting www.boost-consulting.com

Eric Niebler wrote:
Now that I think about it, we had discussed eliminating the inconsistent "boost::size(int[5])==5, but boost::size(char[5])==4" behavior. Bugs aside, why is it still there? Thorsten?
well, I'm waiting for an update on a little. There are several other changes happining and I want to coordinate them with Pavol and the string library. -Thorsten

"Eric Niebler" <eric@boost-consulting.com> wrote in message news:43B6220B.4050408@boost-consulting.com...
There is a bug in Boost.Range that is causing boost::end() to execute a slower code path when called with a string literal. The type of "hello" is char const [6], and end() should be O(1). Indeed, Boost.Range has code to handle just this case, but it never gets called. The problem is the way overload resolution happens between these two functions
template< typename T, std::size_t sz > const T* boost_range_end( const T (&array)[sz] )
const char* boost_range_end( const char* s )
The first is O(1) and the second is O(N) becuase it calls strlen. But the second will always be chosen because non-templates are preferred over templates.
These two functions are not equivalent, since strlen("hello") equals 5 and not 6. Depending on what you are using the range for, you might or might not want the '\0' at the end of the string to be part of the range. Joe Gottman

Joe Gottman wrote:
"Eric Niebler" <eric@boost-consulting.com> wrote in message news:43B6220B.4050408@boost-consulting.com...
There is a bug in Boost.Range that is causing boost::end() to execute a slower code path when called with a string literal. The type of "hello" is char const [6], and end() should be O(1). Indeed, Boost.Range has code to handle just this case, but it never gets called. The problem is the way overload resolution happens between these two functions
template< typename T, std::size_t sz > const T* boost_range_end( const T (&array)[sz] )
const char* boost_range_end( const char* s )
...
These two functions are not equivalent, since strlen("hello") equals 5 and not 6. Depending on what you are using the range for, you might or might not want the '\0' at the end of the string to be part of the range.
When I said, "Boost.Range has code to handle just this case" I meant that it handles arrays of char and wchar_t to account for the null termination. So these two functions *are* equivalent in that they both return the same value. The only difference is one is O(1) and the other is O(N). -- Eric Niebler Boost Consulting www.boost-consulting.com

On Sat, 31 Dec 2005 09:15:39 +0300, Eric Niebler <eric@boost-consulting.com> wrote: []
template< typename T, std::size_t sz > const T* boost_range_end( const T (&array)[sz] )
const char* boost_range_end( const char* s )
The first is O(1) and the second is O(N) becuase it calls strlen. But the second will always be chosen because non-templates are preferred over templates.
It depends on a compiler. gcc replaces strlen() calls with a constant when called with a string literal.

Eric Niebler wrote:
There is a bug in Boost.Range that is causing boost::end() to execute a slower code path when called with a string literal. The type of "hello" is char const [6], and end() should be O(1). Indeed, Boost.Range has code to handle just this case, but it never gets called. The problem is the way overload resolution happens between these two functions
template< typename T, std::size_t sz > const T* boost_range_end( const T (&array)[sz] )
const char* boost_range_end( const char* s )
The first is O(1) and the second is O(N) becuase it calls strlen. But the second will always be chosen because non-templates are preferred over templates.
I guess that is intentional. See: boost::end("hello\0secret") BTW, '#define foreach BOOST_FOREACH' causes name conflict with Boost.Foreach. :-) Regards, MB p-stade.sourceforge.net

MB wrote:
Eric Niebler wrote:
There is a bug in Boost.Range that is causing boost::end() to execute a slower code path when called with a string literal. The type of "hello" is char const [6], and end() should be O(1). Indeed, Boost.Range has code to handle just this case, but it never gets called. The problem is the way overload resolution happens between these two functions
template< typename T, std::size_t sz > const T* boost_range_end( const T (&array)[sz] )
const char* boost_range_end( const char* s )
The first is O(1) and the second is O(N) becuase it calls strlen. But the second will always be chosen because non-templates are preferred over templates.
I guess that is intentional. See: boost::end("hello\0secret")
I don't think it's intentional. There is explicit code for the handling of arrays of char that is never getting called. Presumably, it's not there just because Thorsten likes to type. :-) Your point is valid, though -- it would lead to inconsistent behavior. We'll have to wait for Thorsten to answer. This may all be moot. I recall Thorsten agreeing to remove direct support for null-terminated strings from Boost.Range and adding a utility for creating a range from a char*.
BTW, '#define foreach BOOST_FOREACH' causes name conflict with Boost.Foreach. :-)
Yup, fixed. Thanks. -- Eric Niebler Boost Consulting www.boost-consulting.com
participants (7)
-
Daniel Frey
-
Eric Niebler
-
Joe Gottman
-
Maxim Yegorushkin
-
MB
-
Sebastian Redl
-
Thorsten Ottosen