Re: [boost] [string] proposal

----- "Patrick Horgan" <phorgan1@gmail.com> wrote:
On 01/28/2011 09:59 AM, Dean Michael Berris wrote:
On Fri, Jan 28, 2011 at 7:20 PM, Dean Michael Berris <mikhailberis@gmail.com> wrote:
So the interface I was thinking about (and suggesting) is a lot more minimal than what rope or std::string have exposed. I think when I do finish that design document (with rationale) it would be clear why I would like to keep it immutable and why I would prefer it still be called a string.
Let me finish that document -- expect something over the weekend. :)
And I stopped before I write too much -- the initial version is already up: https://github.com/downloads/mikhailberis/cpp-string-theory/cpp-string-theor... -- I'll give it more information and the actual interfaces and implementation as soon as I get some Z's. :)
You mention that your string is thread safe by design, but you only solve the problem of mutating the data of a string, your references to the pieces from which you compose a string are not thread safe, since they are mutable, right?
I don't think so. Isn't the easiest way towards proper composition to consider that the whole is the same as the part? I see the "chain" (I voted for that one!) as an immutable tree of immutable leafs. I think this can be naïvely seen as the following recursive boost::variant:
    boost::make_recursive_variant<
        boost::shared_ptr<const std::basic_string<code_unit_type> >,
        boost::shared_ptr<const std::list<boost::recursive_variant_> >
    >::type
Ivan
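A minimal sketch of the "immutable tree of immutable leafs" idea, written against the standard library only (C++17 std::variant and std::shared_ptr) rather than Boost so that it stands alone. The names node, leaf, branch, make_leaf, make_branch and flatten are assumptions made for illustration and do not come from the thread; std::vector stands in for the std::list of the variant above.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical names throughout; this is not the proposed library's interface.
struct node;
using node_ptr = std::shared_ptr<const node>;      // shared, read-only subtree
using leaf = std::shared_ptr<const std::string>;   // shared, read-only text
using branch = std::vector<node_ptr>;              // ordered children

struct node {
    std::variant<leaf, branch> value;
};

// Build a leaf holding its own immutable copy of `text`.
node_ptr make_leaf(std::string text) {
    return std::make_shared<node>(
        node{leaf(std::make_shared<std::string>(std::move(text)))});
}

// Build an inner node; the children are shared, never copied or modified.
node_ptr make_branch(branch children) {
    return std::make_shared<node>(node{std::move(children)});
}

// Depth-first walk that appends every leaf's text to `out`.
void append_to(const node& n, std::string& out) {
    if (const leaf* text = std::get_if<leaf>(&n.value)) {
        out += **text;
    } else {
        for (const node_ptr& child : std::get<branch>(n.value)) {
            assert(child != nullptr);
            append_to(*child, out);
        }
    }
}

std::string flatten(const node_ptr& root) {
    std::string out;
    append_to(*root, out);
    return out;
}
```

Because every pointer is to const data, a larger tree can reuse an existing subtree as one of its children without the existing tree being able to observe any change, which is the sense in which the whole is the same as the part.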

On 01/29/2011 02:41 AM, Ivan Le Lann wrote:
----- "Patrick Horgan" <phorgan1@gmail.com> wrote:
On 01/28/2011 09:59 AM, Dean Michael Berris wrote:
On Fri, Jan 28, 2011 at 7:20 PM, Dean Michael Berris <mikhailberis@gmail.com> wrote:
So the interface I was thinking about (and suggesting) is a lot more minimal than what rope or std::string have exposed. I think when I do finish that design document (with rationale) it would be clear why I would like to keep it immutable and why I would prefer it still be called a string.
Let me finish that document -- expect something over the weekend. :) And I stopped before I write too much -- the initial version is already up: https://github.com/downloads/mikhailberis/cpp-string-theory/cpp-string-theor... -- I'll give it more information and the actual interfaces and implementation as soon as I get some Z's. :)
You mention that your string is thread safe by design, but you only solve the problem of mutating the data of a string, your references to the pieces from which you compose a string are not thread safe, since they are mutable, right?
I don't think so. Isn't the easiest way towards proper composition to consider that the whole is the same as the part? I see the "chain" (I voted for that one!) as an immutable tree of immutable leafs. I think this can be naïvely seen as the following recursive boost::variant:
But if everything is immutable, what if you add a phrase in the middle of a line.
chain thesentence=chain("I like bananas. Yes I do.")
becomes:
thesentence=thesentence.insert(atpos15, " all the time");
To create a sentence, "I like bananas all the time. Yes I do."
Originally the tree would have one element demarcating the beginning and end of the original string. After the addition, you could have a tree with three elements two pointing into the original string, "I like bananas" and ". Yes I do." and a middle one pointing at the beginning and end of " all the time". To insert that something had to change. A list or chain of 1 element became a list or chain of 3 elements. Whatever changed has to be thread safe. Of course you say leafs are immutable, so the original leaf that pointed at the beginning and end of the original string would still exist, but now be unused, right? Am I understanding this correctly?
Patrick

On Sun, Jan 30, 2011 at 7:22 AM, Patrick Horgan <phorgan1@gmail.com> wrote:
On 01/29/2011 02:41 AM, Ivan Le Lann wrote:
I don't think so. Isn't the easiest way towards proper composition to consider that the whole is the same as the part? I see the "chain" (I voted for that one!) as an immutable tree of immutable leafs. I think this can be naïvely seen as the following recursive boost::variant:
But if everything is immutable, what if you add a phrase in the middle of a line.
You're doing string manipulation. What you should be doing is building a string.
chain thesentence=chain("I like bananas. Yes I do.")
becomes:
thesentence=thesentence.insert(atpos15, " all the time");
Nope.
chain thesentence = "I like bananas. Yes I do.";
thesentence = substr(thesentence, 0, 14) ^ " all the time" ^ substr(thesentence, length(thesentence) - 11, 11);
To create a sentence, "I like bananas all the time. Yes I do."
Originally the tree would have one element demarcating the beginning and end of the original string. After the addition, you could have a tree with three elements two pointing into the original string, "I like bananas" and ". Yes I do." and a middle one pointing at the beginning and end of " all the time". To insert that something had to change. A list or chain of 1 element became a list or chain of 3 elements. Whatever changed has to be thread safe. Of course you say leafs are immutable, so the original leaf that pointed at the beginning and end of the original string would still exist, but now be unused, right? Am I understanding this correctly?
I think you're still thinking of string manipulation when you should have been thinking about string building. ;)
Because the original chain is still referenced in the new chain, the new chain still references the data in that same block -- you just have concatenation nodes that point to different segments of the same block, which is re-used. Then you can actually write the contents for the temporary chain built from " all the time" into that same block (maybe after the original sentence), and then you still just fit everything into as little memory as you possibly require and get the immutability guarantee. It's still thread-safe because you're not modifying anything that's already built; you're building something new. ;)
At the end of the assignment though, the original structure for the original string is actually freed -- that means the concatenation tree that used to be a single node will have a reference count that drops to 0 and is actually returned to the allocator used to allocate the node in the first place. Note that chain is designed to act like a shared_ptr in this regard.
HTH
PS. I really should just drop this, but the question was really interesting.
--
Dean Michael Berris
about.me/deanberris
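The building-not-manipulating idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the class layout, the piece struct, and the member names slice, concat, str and piece_count are all hypothetical; a flat list of pieces stands in for the concatenation tree; and the inserted text gets its own block instead of being written into the original one. The free functions substr, length and operator^ mirror the spelling used in this thread, with offsets 14 and 11 chosen so that the result matches the target sentence.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical sketch, not the proposed chain: every name here is an assumption.
class chain {
public:
    // Implicit on purpose so that `chain s = "text";` and `c ^ "text"` compile.
    chain(const char* text) {
        std::shared_ptr<const std::string> block = std::make_shared<std::string>(text);
        pieces_.push_back(piece{block, 0, block->size()});
    }

    std::size_t size() const {
        std::size_t total = 0;
        for (const piece& p : pieces_) total += p.count;
        return total;
    }

    std::size_t piece_count() const { return pieces_.size(); }

    // New chain viewing [pos, pos + count) of this one; no characters are copied.
    chain slice(std::size_t pos, std::size_t count) const {
        assert(pos + count <= size());
        chain out;
        for (const piece& p : pieces_) {
            if (count == 0) break;
            if (pos >= p.count) { pos -= p.count; continue; }
            const std::size_t take = std::min(count, p.count - pos);
            out.pieces_.push_back(piece{p.block, p.offset + pos, take});
            count -= take;
            pos = 0;
        }
        return out;
    }

    // New chain listing the pieces of both operands; neither operand changes.
    chain concat(const chain& rhs) const {
        chain out;
        out.pieces_ = pieces_;
        out.pieces_.insert(out.pieces_.end(), rhs.pieces_.begin(), rhs.pieces_.end());
        return out;
    }

    // Flatten into an ordinary string for inspection.
    std::string str() const {
        std::string out;
        for (const piece& p : pieces_) out.append(*p.block, p.offset, p.count);
        return out;
    }

private:
    struct piece {
        std::shared_ptr<const std::string> block;  // shared, never written after creation
        std::size_t offset;
        std::size_t count;
    };
    chain() = default;
    std::vector<piece> pieces_;
};

std::size_t length(const chain& c) { return c.size(); }
chain substr(const chain& c, std::size_t pos, std::size_t count) { return c.slice(pos, count); }
chain operator^(const chain& a, const chain& b) { return a.concat(b); }
```

With these definitions, the assignment thesentence = substr(thesentence, 0, 14) ^ " all the time" ^ substr(thesentence, length(thesentence) - 11, 11); leaves thesentence with three pieces, two of which still point into the original block. The block itself is released only when the last piece referring to it goes away, which is the shared_ptr-like behaviour described above.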

On 01/29/2011 03:39 PM, Dean Michael Berris wrote:
... elision by patrick of his own stuff! oh my! ...
Nope.
chain thesentence = "I like bananas. Yes I do.";
thesentence = substr(thesentence, 0, 14) ^ " all the time" ^ substr(thesentence, length(thesentence) - 11, 11);
To create a sentence, "I like bananas all the time. Yes I do."
Originally the tree would have one element demarcating the beginning and end of the original string. After the addition, you could have a tree with three elements two pointing into the original string, "I like bananas" and ". Yes I do." and a middle one pointing at the beginning and end of " all the time". To insert that something had to change. A list or chain of 1 element became a list or chain of 3 elements. Whatever changed has to be thread safe. Of course you say leafs are immutable, so the original leaf that pointed at the beginning and end of the original string would still exist, but now be unused, right? Am I understanding this correctly?
I think you're still thinking of string manipulation when you should have been thinking about string building. ;)
Because the original chain is still referenced in the new chain, the new chain still references the data in that same block -- you just have concatenation nodes that point to different segments of the same block, which is re-used. Then you can actually write the contents for the temporary chain built from " all the time" into that same block (maybe after the original sentence), and then you still just fit everything into as little memory as you possibly require and get the immutability guarantee. It's still thread-safe because you're not modifying anything that's already built; you're building something new. ;)
At the end of the assignment though, the original structure for the original string is actually freed -- that means the concatenation tree that used to be a single node will have a reference count that drops to 0 and is actually returned to the allocator used to allocate the node in the first place. Note that chain is designed to act like a shared_ptr in this regard.
HTH
PS. I really should just drop this, but the question was really interesting.
lol! That's exactly, in every detail, what I thought you meant, but you never clearly stated it. They're like Python strings. Now I've tricked you into doing a concrete description of what you mean :) lol! So no data structure will ever change once created, only one thread will create it, and all other references are read only. So the only thing to worry about is the reference counts to string data, which will need some form of concurrency control associated with them?
Patrick
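One way to picture the remaining concern, the reference counts, is with std::shared_ptr, whose control block updates its count atomically when distinct shared_ptr objects that share ownership are copied or destroyed on different threads. The sketch below is only an illustration of that property, not the proposed chain; the function name read_from_many_threads and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: every worker repeatedly takes and drops its own
// reference to the same immutable string and only ever reads the characters.
// Returns the total number of characters seen across all workers.
std::size_t read_from_many_threads(const std::shared_ptr<const std::string>& data,
                                   int workers, int copies_per_worker) {
    assert(data != nullptr && workers >= 0 && copies_per_worker >= 0);
    std::vector<std::size_t> seen(static_cast<std::size_t>(workers), 0);
    std::vector<std::thread> pool;
    for (int w = 0; w < workers; ++w) {
        // Each thread gets its own shared_ptr (captured by value) and writes
        // only to its own slot in `seen`, so the string data is never modified.
        pool.emplace_back([&seen, data, w, copies_per_worker] {
            for (int i = 0; i < copies_per_worker; ++i) {
                std::shared_ptr<const std::string> local = data;  // count goes up
                seen[static_cast<std::size_t>(w)] += local->size();  // read-only use
            }  // `local` is destroyed here, so the count goes back down
        });
    }
    for (std::thread& t : pool) t.join();

    std::size_t total = 0;
    for (std::size_t s : seen) total += s;
    return total;
}
```

Under these assumptions the only shared state that is ever written concurrently is the count itself, which matches the summary above: the data is created once by one thread and everything else is a read-only reference.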
participants (3)
- Dean Michael Berris
- Ivan Le Lann
- Patrick Horgan