Hi!
Library version 1.33.0.
The input string contains null characters, and I want to use the null
character as the separator. The following code produces only "X".
Why are "Y" and "Z" discarded, and how can I fix this code?
=========
#include<iostream>
#include
string str="X\0Y\0Z";
Take a closer look at that line. The tokenizer library isn't the problem. - me22
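The hint can be checked without the tokenizer at all. Below is a minimal sketch using only std::string; the function names are made up for illustration and appear nowhere in the original post. It compares the length of a string built from the literal alone with one built from the literal plus an explicit length.

```cpp
#include <cassert>
#include <string>

// Constructed from a C string: copying stops at the first '\0'.
inline std::string from_literal() { return std::string("X\0Y\0Z"); }

// Constructed with an explicit length: the embedded nulls are kept.
inline std::string from_literal_with_length() {
    return std::string("X\0Y\0Z", 5);
}

// Checks the two lengths; call from main() to run the comparison.
inline void check_lengths() {
    assert(from_literal().size() == 1);              // just "X"
    assert(from_literal_with_length().size() == 5);  // X, \0, Y, \0, Z
}
```

So the string handed to the tokenizer in the original code already holds only "X" before any separator is applied.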
On Sunday 16 October 2005 03.58, me22 wrote:
string str="X\0Y\0Z";
Take a closer look at that line. The tokenizer library isn't the problem.
Yes and no. Fixing the above to
str=string("X\0Y\0Z", 5);
will only partly solve the problem, since the char_separator ctor takes a
'const Char*'. In
char_separator<char> sep("\0");
the "\0" is treated as an empty string when the ctor initializes the private
member m_dropped_delims. It is not clear to me how best to work around this.
It is kind of a feature of the interface, plus the fact that C strings are
terminated by '\0' ;)
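The delimiter problem can be seen in isolation with the standard library alone. This sketch does not touch Boost, and the function names are hypothetical. It shows what any constructor that receives only a 'const char*' can learn about "\0", next to a std::string that carries its own length.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Length of the delimiter list "\0" as seen through a 'const char*':
// the first character is already the terminator, so the list is empty.
inline std::size_t delims_length_as_c_string() { return std::strlen("\0"); }

// A std::string stores its length, so '\0' is an ordinary character in it.
inline std::string null_delimiter() { return std::string(1, '\0'); }
```

The first function returns 0 and the second returns a string of size 1, which is why a string-typed delimiter argument can express '\0' while a C string cannot.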
One solution to support passing '\0' as a separator could be to add another
ctor to char_separator that can accept a 'const string &' for kept and
dropped delimiters; something like
class char_separator
{
public:
typedef std::basic_string
On second thought, there seems to be a way without changing
the interface of the library: use an escaped_list_separator instead.
A variation of your code would then become
#include
Hello,
it seems to me that boost::char_separator needs to have another ctor that can
accept delimiters that are string types. For example, given a
std::string("X\0Y\0\0Z", 6), it does not seem to be possible to use the
current ctor of boost::char_separator so that '\0' can be used as a
separator.
It is possible to use boost::escaped_list_separator, since *it* takes a string
type on construction, but on the other hand boost::escaped_list_separator
does not have an empty_token_policy. In summary, I suggest adding another
ctor to boost::char_separator. This will enable the parsing of the above
string as
<X><Y><><Z>
or
<X><Y><Z>
by the following:
#include
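To make the two requested outputs concrete, here is a stand-in written against the standard library only. It is not Boost code and not the proposed constructor: split_on and its keep_empty flag are hypothetical names chosen for this sketch. It only illustrates what string-typed delimiters combined with an empty-token policy would produce.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not part of Boost. Splits 'input' at every character
// that occurs in 'dropped_delims'. Both arguments are std::string, so '\0'
// may appear in either. 'keep_empty' plays the role of an empty-token policy.
inline std::vector<std::string> split_on(const std::string& input,
                                         const std::string& dropped_delims,
                                         bool keep_empty) {
    std::vector<std::string> tokens;
    std::string current;
    for (char c : input) {
        if (dropped_delims.find(c) != std::string::npos) {
            // Delimiter found: emit the token collected so far.
            if (keep_empty || !current.empty()) tokens.push_back(current);
            current.clear();
        } else {
            current += c;
        }
    }
    // Emit whatever follows the last delimiter.
    if (keep_empty || !current.empty()) tokens.push_back(current);
    return tokens;
}
```

Called with std::string("X\0Y\0\0Z", 6) and std::string(1, '\0') as the delimiters, keep_empty = true yields <X><Y><><Z> and keep_empty = false yields <X><Y><Z>.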
Fredrik Hedman wrote:
Hello,
it seems to me that boost::char_separator needs to have another ctor that can accept delimiters that are string types. For example, given a std::string("X\0Y\0\0Z", 6), it does not seem to be possible to use the current ctor of boost::char_separator so that '\0' can be used as a separator.
It is possible to use boost::escaped_list_separator, since *it* takes a string type on construction, but on the other hand boost::escaped_list_separator does not have an empty_token_policy. In summary, I suggest adding another ctor to boost::char_separator. This will enable the parsing of the above string as
I would prefer the more general:
template <class It> // where: typeof(*It) == Char, ++It, It == It
char_separator(It delims_begin, It delims_end,
               empty_token_policy empty_tokens = drop_empty_tokens)
Kept and dropped delims get a bit messy, though. Some sort of mask?
(vector<bool> kept_delims = vector<bool>())
As in valarray's mask_array.
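A rough sketch of the iterator-range idea, reduced to the delimiter storage only. The class delimiter_set and its member contains are hypothetical names and are not part of Boost. The kept/dropped distinction and the mask are left out, since the thread does not settle how they would look.

```cpp
#include <string>

// Hypothetical stand-in for a separator constructed from an iterator range.
class delimiter_set {
public:
    // Copies the delimiters from [first, last). No terminator is assumed,
    // so '\0' is just another element of the range.
    template <class It>
    delimiter_set(It first, It last) : delims_(first, last) {}

    // True if 'c' is one of the stored delimiters.
    bool contains(char c) const {
        return delims_.find(c) != std::string::npos;
    }

private:
    std::string delims_;
};
```

Given `const char d[] = {'\0', ','};`, constructing `delimiter_set s(d, d + 2);` makes both '\0' and ',' delimiters, something the 'const Char*' interface cannot express.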
On Thursday 20 October 2005 00.37, Simon Buchan wrote:
I would prefer the more general:
template <class It> // where: typeof(*It) == Char, ++It, It == It
char_separator(It delims_begin, It delims_end,
               empty_token_policy empty_tokens = drop_empty_tokens)
Kept and dropped delims get a bit messy, though. Some sort of mask?
(vector<bool> kept_delims = vector<bool>())
As in valarray's mask_array.
Hi Simon,
what you are suggesting is certainly possible, but it seems to imply another two arguments, so that the dropped delims can be passed into the ctor too. My view is that this makes the ctor less convenient to use. The intent of the arguments to the ctor is to tell the char_separator which delimiters to drop and which to keep. So the two arguments are basically two (disjoint?) sets, but with the current interface it does not seem to be possible to pass in a '\0' as a delimiter. Hence my suggestion to add a ctor that takes delimiters of string type.
--
Best Regards,
Fredrik Hedman
participants (4)
- CN
- Fredrik Hedman
- me22
- Simon Buchan