[asio] some stuff about ipv4::address

Hi. These are minor things I noticed with, and was bothered by, the asio::ipv4::address class. This is by no means an asio review (although we desperately need some networking library, that's for sure).
What I dislike is the string-accepting constructor and the to_string() method. My personal taste is that textual conversions (and the ip::address class is a very good candidate for this) should be made through the standard ostream/istream operators. This has the usual advantages of using an idiom. It also provides a standard means to detect parsing errors using ios_base::failbit. Currently the only option ip::address offers is throwing from the string-accepting constructor; however, this is not intuitive, IMHO (and not documented). String conversions can be easily performed using boost::lexical_cast.
Another thing is the plethora of assignment operators. Wouldn't a single copy-assignment be enough?
And one last suggestion: A constructor taking 4 unsigned chars, allowing this:
asio::ipv4::address my_address(192, 168, 0, 1);
Thanks for listening, Yuval

Hi Yuval, --- Yuval Ronen <ronen_yuval@yahoo.com> wrote:
What I dislike is the string-accepting constructor and the to_string() method. My personal taste is that textual conversions (and the ip::address class is a very good candidate for this) should be made through the standard ostream/istream operators. This has the usual advantages of using an idiom. It also provides a standard means to detect parsing errors using ios_base::failbit. Currently the only option ip::address offers is throwing from the string-accepting constructor; however, this is not intuitive, IMHO (and not documented). String conversions can be easily performed using boost::lexical_cast.
I think the current string conversion functions are convenient and natural, and I see them not being coupled to iostreams as a feature :) However I take your point about non-throwing parsing, so I will add both input and output iostream operations for the address class (and the endpoint classes too).
Another thing is the plethora of assignment operators. Wouldn't a single copy-assignment be enough?
For the current implementation, probably yes. But I don't want the interface to assume that the implementation will always be so cheap as to not worry about extra temporaries.
And one last suggestion: A constructor taking 4 unsigned chars, allowing this: asio::ipv4::address my_address(192, 168, 0, 1);
No problem, I can add that. Cheers, Chris
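For readers following along, here is a minimal sketch of what non-throwing parsing through stream operators could look like. It is not asio's implementation: the type ipv4_address and the helper try_parse are hypothetical names invented for this illustration, and the parser is deliberately lenient (it accepts a leading '+' and whitespace between tokens, for example).

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for an IPv4 address type; this is NOT asio's class.
struct ipv4_address {
    std::uint8_t octets[4] = {0, 0, 0, 0};
};

// Writes the address in dotted-decimal form, e.g. "192.168.0.1".
std::ostream& operator<<(std::ostream& os, const ipv4_address& a) {
    return os << int(a.octets[0]) << '.' << int(a.octets[1]) << '.'
              << int(a.octets[2]) << '.' << int(a.octets[3]);
}

// Reads a dotted-decimal address. On malformed input it sets failbit
// instead of throwing, and leaves the target object untouched.
std::istream& operator>>(std::istream& is, ipv4_address& a) {
    ipv4_address tmp;
    for (int i = 0; i < 4; ++i) {
        if (i > 0) {
            char dot = 0;
            if (!(is >> dot) || dot != '.') {
                is.setstate(std::ios_base::failbit);
                return is;
            }
        }
        int value = 0;
        if (!(is >> value) || value < 0 || value > 255) {
            is.setstate(std::ios_base::failbit);
            return is;
        }
        tmp.octets[i] = static_cast<std::uint8_t>(value);
    }
    a = tmp;
    return is;
}

// Hypothetical helper: whole-string conversion that reports failure through
// its return value rather than an exception.
bool try_parse(const std::string& text, ipv4_address& out) {
    std::istringstream in(text);
    ipv4_address tmp;
    if (!(in >> tmp)) return false;
    // Reject trailing characters such as the ".5" in "1.2.3.4.5".
    if (in.peek() != std::char_traits<char>::eof()) return false;
    out = tmp;
    return true;
}
```

A caller checks the stream state (or the bool returned by try_parse) to detect a parse error, which is the failbit idiom Yuval describes.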

Christopher Kohlhoff wrote:
Hi Yuval,
--- Yuval Ronen <ronen_yuval@yahoo.com> wrote:
What I dislike is the string-accepting constructor and the to_string() method. My personal taste is that textual conversions (and the ip::address class is a very good candidate for this) should be made through the standard ostream/istream operators. This has the usual advantages of using an idiom. It also provides a standard means to detect parsing errors using ios_base::failbit. Currently the only option ip::address offers is throwing from the string-accepting constructor; however, this is not intuitive, IMHO (and not documented). String conversions can be easily performed using boost::lexical_cast.
I think the current string conversion functions are convenient and natural, and I see them not being coupled to iostreams as a feature :)
However I take your point about non-throwing parsing, so I will add both input and output iostream operations for the address class (and the endpoint classes too).
But if you do supply the I/O operators, then you don't really need the string operations because you get those for free using lexical_cast, don't you? Wouldn't you want to save writing code that is already present in lexical_cast?
Another thing is the plethora of assignment operators. Wouldn't a single copy-assignment be enough?
For the current implementation, probably yes. But I don't want the interface to assume that the implementation will always be so cheap as to not worry about extra temporaries.
I don't think there is any such assumption in what I said. You should be saying to the user: "you can assign an ip::address from X, Y, and Z", and that's all he needs to know. There's a difference between this and "there is an assignment operator from X, Y and Z". The latter is less flexible than the former, and for no good reason. Stating the former allows you to include or exclude these assignment operators as you see fit, without changing the user's interface. But this is really a minor issue...

Hi Yuval, --- Yuval Ronen <ronen_yuval@yahoo.com> wrote:
But if you do supply the I/O operators, then you don't really need the string operations because you get those for free using lexical_cast, don't you? Wouldn't you want to save writing code that is already present in lexical_cast?
Particularly for construction, I think it is worth allowing direct conversion from a string, even if it is just syntactic sugar. Writing:
asio::ipv4::endpoint ep(1234, "127.0.0.1");
is a bit clearer than:
asio::ipv4::endpoint ep(1234, boost::lexical_cast<asio::ipv4::address>("127.0.0.1"));
and it is such a common requirement that it's worth supporting in the library.
I don't think there is any such assumption in what I said. You should be saying to the user: "you can assign an ip::address from X, Y, and Z", and that's all he needs to know. There's a difference between this and "there is an assignment operator from X, Y and Z". The latter is less flexible than the former, and for no good reason. Stating the former allows you to include or exclude these assignment operators as you see fit, without changing the user's interface.
Although it may not crop up much in practice, changing from one assignment operator to multiple (or vice versa) does change the behaviour of the interface, and saying that "you can assign an address from X, Y or Z" doesn't fully document the behaviour. The first option involves a user-defined conversion for assignment from non-address types, whereas the second does not. Cheers, Chris
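The behavioural difference Chris describes can be made visible with a small sketch. The two types below are hypothetical (address_a, address_b and the counting helpers are invented for this illustration, not taken from asio); they store the text only so that construction can be counted.

```cpp
#include <cassert>
#include <string>

// Variant A: a single copy-assignment operator plus a converting constructor.
// Assigning a std::string goes through a user-defined conversion, which
// constructs a temporary address_a first.
struct address_a {
    static int constructed;  // counts every constructor call
    std::string text;
    address_a() { ++constructed; }
    address_a(const std::string& s) : text(s) { ++constructed; }
    address_a(const address_a& other) : text(other.text) { ++constructed; }
    address_a& operator=(const address_a& other) {
        text = other.text;
        return *this;
    }
};
int address_a::constructed = 0;

// Variant B: a dedicated assignment operator for strings, so no temporary
// address object is needed.
struct address_b {
    static int constructed;  // counts every constructor call
    std::string text;
    address_b() { ++constructed; }
    explicit address_b(const std::string& s) : text(s) { ++constructed; }
    address_b(const address_b& other) : text(other.text) { ++constructed; }
    address_b& operator=(const address_b& other) {
        text = other.text;
        return *this;
    }
    address_b& operator=(const std::string& s) {
        text = s;
        return *this;
    }
};
int address_b::constructed = 0;

// Number of extra address objects created by one assignment from a string.
int temporaries_with_single_assignment() {
    address_a::constructed = 0;
    address_a a;                     // 1 object: the target itself
    a = std::string("127.0.0.1");    // converts, creating a temporary
    return address_a::constructed - 1;
}

int temporaries_with_dedicated_assignment() {
    address_b::constructed = 0;
    address_b b;                     // 1 object: the target itself
    b = std::string("127.0.0.1");    // calls operator=(const std::string&)
    return address_b::constructed - 1;
}
```

Both variants satisfy "you can assign an address from a string", yet only variant A constructs a temporary, which is the distinction the documentation would need to capture.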

Yuval Ronen wrote: [SNIP]
And one last suggestion: A constructor taking 4 unsigned chars, allowing this: asio::ipv4::address my_address(192, 168, 0, 1);
I suppose you meant four unsigned shorts? What kind of use case do you have in mind for such a constructor? Where do you obtain IP address information as four distinct integers instead of a 32 bit integer as is standard? -- Pedro Lamarão Desenvolvimento Intersix Technologies S.A. SP: (55 11 3803-9300) RJ: (55 21 3852-3240) www.intersix.com.br Your Security is our Business

On Thu, 5 Jan 2006, Pedro Lamarão wrote:
Yuval Ronen wrote:
[SNIP]
And one last suggestion: A constructor taking 4 unsigned chars, allowing this: asio::ipv4::address my_address(192, 168, 0, 1);
I suppose you meant four unsigned shorts?
An IPv4 address is comprised of four octets, and an unsigned char sufficiently represents an octet. A short would be overkill, no? -- Chris Cleeland, cleeland_c @ ociweb.com, http://www.milodesigns.com/~chris Principal Software Engineer, Object Computing, Inc., +1 314 579 0066 Support Me Supporting Cancer Survivors in Ride for the Roses 2005 >>>>>>>>> Donate at http://www.milodesigns.com/donate <<<<<<<<<

Chris Cleeland wrote:
On Thu, 5 Jan 2006, Pedro Lamarão wrote:
Yuval Ronen wrote:
[SNIP]
And one last suggestion: A constructor taking 4 unsigned chars, allowing this: asio::ipv4::address my_address(192, 168, 0, 1);
I suppose you meant four unsigned shorts?
An IPv4 address is comprised of four octets, and an unsigned char sufficiently represents an octet. A short would be overkill, no?
I see, you meant an implicit conversion to char? I still fail to see a use case for this. Help me here. :) -- Pedro Lamarão

On Fri, 6 Jan 2006, Pedro Lamarão wrote:
Chris Cleeland wrote:
I suppose you meant four unsigned shorts?
An IPv4 address is comprised of four octets, and an unsigned char sufficiently represents an octet. A short would be overkill, no?
I see, you meant an implicit conversion to char?
Whatever is necessary. If the args were shorts (signed or unsigned), the interface is representing that the following would be legal:
asio::ipv4::address my_address(322, 798, 0, 1024);
And we know that's not correct. If the caller trusts implicit conversions rather than explicitly passing unsigned chars, then that's their problem.
I still fail to see a use case for this. Help me here. :)
I can't help you with that, as I did not originate the suggestion. The only thing I can conjure in my mind is when parsing a string representation of an address, particularly a dotted-decimal representation. Typically this sort of stuff is handled by functions like inet_aton, but I can see where one might prefer to not rely on external, non-standard functions like that. -- Chris Cleeland

Chris Cleeland wrote:
Whatever is necessary. If the args were shorts (signed or unsigned), the interface is representing that the following would be legal:
asio::ipv4::address my_address(322, 798, 0, 1024);
And we know that's not correct.
If the arguments are unsigned chars, you (the author of the function) have no way of knowing whether it's correct, because you won't see 322 or 798.
If the caller trusts implicit conversions rather than explicitly passing unsigned chars, then that's their problem.
Callers that explicitly pass unsigned char don't care about the argument type; their code will work with either.

On Fri, 6 Jan 2006, Peter Dimov wrote:
Chris Cleeland wrote:
Whatever is necessary. If the args were shorts (signed or unsigned), the interface is representing that the following would be legal:
asio::ipv4::address my_address(322, 798, 0, 1024);
And we know that's not correct.
If the arguments are unsigned chars, you (the author of the function) have no way of knowing whether it's correct, because you won't see 322 or 798.
I'm missing the point of what you're saying. If the args are unsigned chars, there is no need for the author of the function to check the range b/c the range is implicit in the type. Hopefully pedantic compilers will issue warnings regarding loss of precision via implicit conversion catching errors like this at compile time rather than at execution time. If, otoh, the args are *shorts*, though, then there becomes a need to check validity at execution time.
If the caller trusts implicit conversions rather than explicitly passing unsigned chars, then that's their problem.
Callers that explicitly pass unsigned char don't care about the argument type; their code will work with either.
Of course they will. I can't tell: are you arguing that the interface should use unsigned short or unsigned char? -- Chris Cleeland

Chris Cleeland wrote:
On Fri, 6 Jan 2006, Peter Dimov wrote:
Chris Cleeland wrote:
Whatever is necessary. If the args were shorts (signed or unsigned), the interface is representing that the following would be legal:
asio::ipv4::address my_address(322, 798, 0, 1024);
And we know that's not correct.
If the arguments are unsigned chars, you (the author of the function) have no way of knowing whether it's correct, because you won't see 322 or 798.
I'm missing the point of what you're saying. If the args are unsigned chars, there is no need for the author of the function to check the range b/c the range is implicit in the type. Hopefully pedantic compilers will issue warnings regarding loss of precision via implicit conversion catching errors like this at compile time rather than at execution time.
If the interface takes unsigned char, passing 322 will result in the function receiving 66. This may generate a compiler warning, or it may not. Even if it does generate a warning, an explicit static_cast to unsigned char (because the programmer thinks that the value is in range) will silence it. If the interface takes an int, passing 322 will result in the function receiving 322. The function can now handle this error in an appropriate manner. Callers that pass an unsigned char are neither helped nor harmed by the interface taking an int. Callers that pass an int that is out of range potentially lose an easy to ignore warning and gain runtime error detection. So it's a set of tradeoffs. If the interface takes shorts, there is both a warning and a possibility of runtime error detection. In summary, what I'm saying is that in C/C++, using unsigned char for an argument with a range 0..255 is not inherently superior to the alternatives. (It would be in a range-checked language.)
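The tradeoff Peter describes can be sketched with two hypothetical address types (octet_address and checked_address are invented names, not asio's). The sketch assumes 8-bit chars, which is what makes 322 arrive as 66; the explicit static_cast stands in for a caller silencing the compiler warning.

```cpp
#include <cassert>
#include <climits>
#include <stdexcept>

static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit chars");

// Interface 1: the parameters are unsigned chars, so an out-of-range int is
// reduced modulo 256 before the constructor ever sees it.
struct octet_address {
    unsigned char bytes[4];
    octet_address(unsigned char b0, unsigned char b1,
                  unsigned char b2, unsigned char b3)
        : bytes{b0, b1, b2, b3} {}
};

// Interface 2: the parameters are ints, so the constructor receives the
// original values and can reject the ones that are out of range.
struct checked_address {
    unsigned char bytes[4];
    checked_address(int b0, int b1, int b2, int b3)
        : bytes{to_octet(b0), to_octet(b1), to_octet(b2), to_octet(b3)} {}

    static unsigned char to_octet(int value) {
        if (value < 0 || value > 255)
            throw std::out_of_range("octet must be in 0..255");
        return static_cast<unsigned char>(value);
    }
};

// Builds an octet_address the way a caller who silenced the warning would.
octet_address make_truncated(int b0, int b1, int b2, int b3) {
    return octet_address(static_cast<unsigned char>(b0),
                         static_cast<unsigned char>(b1),
                         static_cast<unsigned char>(b2),
                         static_cast<unsigned char>(b3));
}

// Reports whether the int-taking interface rejects the given values.
bool is_rejected(int b0, int b1, int b2, int b3) {
    try {
        checked_address a(b0, b1, b2, b3);
        (void)a;
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

With these definitions, make_truncated(322, 798, 0, 1024) silently yields the address 66.30.0.0, while is_rejected(322, 798, 0, 1024) reports the error at run time.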

Peter Dimov wrote:
If the interface takes unsigned char, passing 322 will result in the function receiving 66. This may generate a compiler warning, or it may not. Even if it does generate a warning, an explicit static_cast to unsigned char (because the programmer thinks that the value is in range) will silence it.
I don't get it. It won't surprise anyone if I say that the whole point of xxx_cast is to signal to the programmer who writes this code that he's doing something fishy, and he should carefully check whether this is OK. The best thing a library writer can do is to provide the maximum compiler checks that will produce either an error or a warning in case of suspicious usage. Forcing the user to use a red-flagged cast is exactly such an example. If the user chose to suppress the warning using a cast, then we can only assume he knows what he's doing. Runtime checks are inferior because they A) hurt performance and B) make code cumbersome, so compile-time checks are preferred. Was there anything new and surprising in what I just said? I think not...

--- Yuval Ronen <ronen_yuval@yahoo.com> wrote:
Peter Dimov wrote:
If the interface takes unsigned char, passing 322 will result in the function receiving 66. This may generate a compiler warning, or it may not. Even if it does generate a warning, an explicit static_cast to unsigned char (because the programmer thinks that the value is in range) will silence it.
I don't get it. It won't surprise anyone if I say that the whole point of xxx_cast is to signal to the programmer who writes this code that he's doing something fishy, and he should carefully check whether this is OK. The best thing a library writer can do is to provide the maximum compiler checks that will produce either an error or a warning in case of suspicious usage. Forcing the user to use a red-flagged cast is exactly such an example. If the user chose to suppress the warning using a cast, then we can only assume he knows what he's doing.
Runtime checks are inferior because they A) hurt performance and B) make code cumbersome, so compile-time checks are preferred.
Was there anything new and surprising in what I just said? I think not...
After reading this discussion I'm undecided about the best course of action :)
Some thoughts, both for and against:
- Does unsigned char always imply 0..255? Might there be a standards-conforming C++ implementation where char is not 8 bits? (Admittedly porting a sockets library, which inherently deals in sequences of octets, to this architecture could be rather difficult.)
- If the motivation for this constructor is literal IP addresses, is the use case sufficiently common? Well known literals should be created some other way (e.g. by calling ipv4::address::loopback()).
- Let's assume you are using it for literal IP addresses, and the constructor takes 4 ints and throws an exception when out of range. If you do not pass out of range values the optimiser can determine that the values are valid, so there would be no performance impact.
- If the address was implemented as a C structure containing 4 unsigned chars it could be initialised using { 1, 2, 3, 4 }. This constructor, if it took unsigned chars, would be the equivalent of that.
- Using unsigned chars doesn't just document the range of valid values, but also documents that an IP address is a 32-bit value composed of four 8-bit values.
- If you want to use this constructor, you must know what you're doing, so be it on your own head ;)
Cheers, Chris

On Fri, 13 Jan 2006, Christopher Kohlhoff wrote:
Some thoughts, both for and against:
- Does unsigned char always imply 0..255? Might there be a standards-conforming C++ implementation where char is not 8 bits? (Admittedly porting a sockets library, which inherently deals in sequences of octets, to this architecture could be rather difficult.)
Good question; I know that sizeof(char) == sizeof(unsigned char) == 1, but I don't know that it's guaranteed anywhere that something that has a sizeof==1 is exactly 8 bits. That said, I can't think of any implementation anywhere that I've encountered otherwise, including some pretty arcane old hardware architectures. As you somewhat imply, porting code that deals in octets to such an architecture would be pretty difficult, to say the least.
- Let's assume you are using it for literal IP addresses, and the constructor takes 4 ints and throws an exception when out of range. If you do not pass out of range values the optimiser can determine that the values are valid, so there would be no performance impact.
I'm not sure I would agree that an exception is an appropriate response here, but that might get into religious wars. If the interface provides an easy opportunity to pass in out-of-range arguments, then somebody is going to do it.
- If the address was implemented as a C structure containing 4 unsigned chars it could be initialised using { 1, 2, 3, 4 }. This constructor, if it took unsigned chars, would be the equivalent of that.
True.
- Using unsigned chars doesn't just document the range of valid values, but also documents that an IP address is a 32-bit value composed of four 8-bit values.
I tried an experiment and, at least using g++ 4.0, I couldn't get the *compiler* to complain when I had a CTOR that took 4 unsigned chars as args and I passed in values that were out of range. So, maybe this idea is a nice dream that can't make it to reality just yet. -- Chris Cleeland

Chris Cleeland wrote:
Good question; I know that sizeof(char) == sizeof(unsigned char) == 1, but I don't know that it's guaranteed anywhere that something that has a sizeof==1 is exactly 8 bits.
No, that's not guaranteed. There are implementations with char types consisting of more than 8 bits (not necessarily old architectures). HTH, m

At 9:17 AM -0600 1/12/06, Chris Cleeland wrote:
On Fri, 13 Jan 2006, Christopher Kohlhoff wrote:
- Does unsigned char always imply 0..255?
Good question; I know that sizeof(char) == sizeof(unsigned char) == 1, but I don't know that it's guaranteed anywhere that something that has a sizeof==1 is exactly 8 bits.
That said, I can't think of any implementation anywhere that I've encountered otherwise, including some pretty arcane old hardware architectures. As you somewhat imply, porting code that deals in octets to such an architecture would be pretty difficult, to say the least.
No, CHAR_BIT is only required to be *at least* 8. It is pretty common for DSP platforms to have char, short, and int all be of the same size (either 16 or 32 bits, or perhaps even 64 bits for some these days, depending on the specific model of DSP). This is all driven by a combination of the smallest addressable unit and the native data unit (i.e. register) size. General purpose processors have pretty much settled on 8 bit addressable units (there used to be others, I'm pretty sure I remember 9, and I think 12), but the DSP world is quite different (though powers of 2 seem to have won there too).
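A short sketch of how code can state or test this assumption explicitly rather than rely on it. The helper names (char_is_octet, max_unsigned_char, fits_in_octet) are hypothetical and chosen only for this illustration; CHAR_BIT and std::numeric_limits are the standard facilities.

```cpp
#include <cassert>
#include <climits>
#include <limits>

// The standard only guarantees this lower bound on the width of char.
static_assert(CHAR_BIT >= 8, "CHAR_BIT is required to be at least 8");

// True only on platforms where unsigned char is exactly an octet.
constexpr bool char_is_octet() { return CHAR_BIT == 8; }

// Largest value an unsigned char can hold on this platform: 255 when
// CHAR_BIT == 8, larger on platforms with wider chars.
constexpr unsigned long max_unsigned_char() {
    return std::numeric_limits<unsigned char>::max();
}

// Range check for one octet that does not depend on the width of char.
constexpr bool fits_in_octet(unsigned long value) { return value <= 0xFFul; }
```

Code that genuinely requires octet-sized chars could instead write static_assert(CHAR_BIT == 8, "...") and refuse to compile on platforms such as the DSPs mentioned above.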

Pedro Lamarão wrote:
Yuval Ronen wrote:
And one last suggestion: A constructor taking 4 unsigned chars, allowing this: asio::ipv4::address my_address(192, 168, 0, 1);
I suppose you meant four unsigned shorts?
As Chris noted, unsigned shorts would imply (to users, and to the compiler) that the acceptable range is 16 bits and not 8 bits.
What kind of use case do you have in mind for such a constructor? Where do you obtain IP address information as four distinct integers instead of a 32 bit integer as is standard?
Chris was right about this one, too. The main idea was parsing IP addresses in a dot notation form. The Microsoft inet_addr() function I used to use, reserved one IP address (255.255.255.255 IIRC) to indicate a parse error, and that wasn't acceptable in my case. You might say that if asio's ip::address class provides dot notation parsing, then I don't have to worry about it any more, and also don't need this constructor, and you have a point there... but I also think that this constructor just makes the code look nicer when I have the IP address hard coded (which is not that often...) Yuval

Yuval Ronen wrote: [SNIP]
Chris was right about this one, too. The main idea was parsing IP addresses in a dot notation form. The Microsoft inet_addr() function I used to use, reserved one IP address (255.255.255.255 IIRC) to indicate a parse error, and that wasn't acceptable in my case.
inet_addr is deprecated (for exactly this reason) in favor of inet_pton, which is standard. If this is not present in win32 (I don't remember), getaddrinfo is. Both of the latter work transparently with IPv4 and IPv6. -- Pedro Lamarão
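The sentinel problem Yuval ran into can be seen by putting inet_addr next to inet_pton, the POSIX text-to-binary conversion function. This is a sketch assuming a POSIX system with <arpa/inet.h>; the wrapper name parse_ipv4 is hypothetical.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <cassert>

// Hypothetical wrapper: returns true when `text` is a valid dotted-decimal
// IPv4 address, storing the result (in network byte order) in `out`.
// inet_pton reports success through its return value, so no address value
// has to be reserved as an error marker.
bool parse_ipv4(const char* text, in_addr& out) {
    return inet_pton(AF_INET, text, &out) == 1;
}

// inet_addr returns INADDR_NONE on failure, which is the same bit pattern as
// the valid address 255.255.255.255, so the two cases look identical.
bool inet_addr_result_is_ambiguous() {
    return inet_addr("255.255.255.255") == INADDR_NONE &&
           inet_addr("not an address") == INADDR_NONE;
}
```

With parse_ipv4, "255.255.255.255" parses successfully and "not an address" is rejected, so the caller can tell the two apart.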
participants (8)
- Chris Cleeland
- Christopher Kohlhoff
- Kim Barrett
- Martin Wille
- Pedro Lamarão
- Pedro Lamarão
- Peter Dimov
- Yuval Ronen