On Sun, Mar 13, 2016 at 1:02 PM, Phil Bouchard wrote:
Thanks for your help.
So block_ptr<> wouldn't be a drop-in replacement for shared_ptr<> but an add-on for:
- complex containers,
- neural networks,
- C# / Java / Javascript engines,
- etc.
No problem. Also renaming the thread to reflect the current benchmarks (since it doesn't appear that block_ptr is faster than shared_ptr where creation is involved).

1. Gathering these benchmarks is always useful, even if block_ptr was not designed to be a drop-in replacement for shared_ptr. So you had the right idea, even if the execution needed some work. It's important to get the execution correct, though, otherwise the results are not meaningful - which is where I (and I believe Rob, also) was trying to steer you.

2. Looking at the numbers you obtained after modifying my benchmark example, they are close enough that at least you could say "Using block_ptr over shared_ptr won't be a significant performance loss":
unique_ptr (new): 47.7686
unique_ptr (make_unique): 46.8545
shared_ptr (new): 77.8261
shared_ptr (make_shared): 50.8072
shared_ptr (allocate_shared_noinit): 33.021
block_ptr (new): 69.6554
3. There are more benchmarks beyond just creation (though creation is likely the most meaningfully expensive one). Copying overhead might be one: I haven't looked at what is involved in copying a block_ptr, but some work happens when you copy a shared_ptr.

4. shared_ptr by way of allocate_shared allows creation to not involve 'new' expressions, or even a call to '::operator new(std::size_t)' at all. i.e. For those C++ projects that require all of their dynamic allocation to go through stateful instances of some custom allocator type, shared_ptr can still be used (with allocate_shared). Is this possible with block_ptr?

I hope to take a look at the block_ptr motivation, design, and implementation when I have some time next week. The title of the thread caught my attention; "X is 600% faster than Y" always has a high excitement potential.

Glen
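For reference, a minimal sketch of what point 4 looks like with the standard std::allocate_shared; the PoolAllocator type below is only an illustrative stand-in for a project's stateful allocator, not something from Boost or block_ptr:

    #include <cstdlib>
    #include <memory>

    // Illustrative stateful allocator; a real one would carry a pool handle.
    template <typename T>
    struct PoolAllocator
    {
        using value_type = T;

        PoolAllocator() = default;
        template <typename U> PoolAllocator(PoolAllocator<U> const &) {}

        T * allocate(std::size_t n)
        {
            return static_cast<T *>(std::malloc(n * sizeof(T)));
        }
        void deallocate(T * p, std::size_t) { std::free(p); }
    };

    template <typename T, typename U>
    bool operator==(PoolAllocator<T> const &, PoolAllocator<U> const &) { return true; }
    template <typename T, typename U>
    bool operator!=(PoolAllocator<T> const &, PoolAllocator<U> const &) { return false; }

    int main()
    {
        PoolAllocator<int> a;
        // Both the int and the shared_ptr control block are obtained from 'a';
        // there is no 'new' expression and no call to ::operator new.
        std::shared_ptr<int> p = std::allocate_shared<int>(a, 42);
    }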
On 03/13/2016 01:28 PM, Glen Fernandes wrote:
1. Gathering these benchmarks is always useful, even if block_ptr was not designed to be a drop-in replacement for shared_ptr. So you had the right idea, even if the execution needed some work. It's important to get the execution correct, though, otherwise the results are not meaningful - which is where I (and I believe Rob, also) was trying to steer you.
You and Rob pointed out the errors rather quickly ;)
2. Looking at the numbers you obtained after modifying my benchmark example, they are close enough that at least you could say "Using block_ptr over shared_ptr won't be a significant performance loss":
unique_ptr (new): 47.7686
unique_ptr (make_unique): 46.8545
shared_ptr (new): 77.8261
shared_ptr (make_shared): 50.8072
shared_ptr (allocate_shared_noinit): 33.021
block_ptr (new): 69.6554
It would be interesting to see the performance of block_ptr_base<> as well. Perhaps block_ptr<> could eventually be made to select any smart pointer via a template parameter, but that's just a thought, not a necessity for now.
3. There are more benchmarks beyond just creation (though creation is likely the most meaningfully expensive one). Copying overhead might be one: I haven't looked at what is involved in copying a block_ptr, but some work happens when you copy a shared_ptr.
Copying is another story because a copied block_proxy would still refer to the same pointers; i.e. it wouldn't copy the nodes of a container, for example. Perhaps I can work on that.
4. shared_ptr by way of allocate_shared allows creation to not involve 'new' expressions, or even a call to '::operator new(std::size_t)' at all.
i.e. For those C++ projects that require all of their dynamic allocation to go through stateful instances of some custom allocator type, shared_ptr can still be used (with allocate_shared). Is this possible with block_ptr?
An easy way to use a custom allocator is by deriving from block<>:
template <typename T>
struct userblock : public block
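Presumably the base would be block<T> and the rest of the class would override the class-level allocation functions; a rough, untested sketch under that assumption (my_pool_allocate / my_pool_deallocate are placeholder names for whatever custom allocator hooks the project provides):

    template <typename T>
    struct userblock : public block<T>
    {
        using block<T>::block;   // forward block<T>'s constructors

        // Route this node's storage through the custom allocator instead of
        // the global operator new / operator delete.
        static void * operator new(std::size_t n)
        {
            return my_pool_allocate(n);     // placeholder allocation hook
        }
        static void operator delete(void * p)
        {
            my_pool_deallocate(p);          // placeholder deallocation hook
        }
    };

    // Usage would then be along the lines of:
    //     root_ptr<int> q(new userblock<int>(10));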
I hope to take a look at the block_ptr motivation, design, and implementation when I have some time next week. The title of the thread caught my attention; "X is 600% faster than Y" always has a high excitement potential.
The motivation is mainly to get rid of garbage collectors once and for all. I am working on WebKit and, in a brave new world, its memory manager would be deterministic; unfortunately the garbage collector is deeply embedded in WebKit, so it's not an easy task to replace. But I am looking at alternatives like Duktape, etc.

Also the documentation is outdated as proxies are now explicit, but it'll give you the general idea.

Thanks again for all your help!

Regards,

-Phil
Thus whether the /set/ was composed of memory blocks referencing each other in a cyclic way or not, all of them will be subject to destruction and deallocation, regardless of the cyclic-reference problem presented by the reference counters. As we can see in the following example,
Phil,

[I'm sorry, I can't seem to find the root of this thread (I might have deleted it).]

I've been reading the documentation, especially the presentation (PPT), rationale + tutorials. In the presentation/rationale you state that a downside of refcounted pointers is the responsibility to manually break cycles (the samples show using weak_ptr to do it). In contrast you state that block_ptr doesn't require cycle tracking; e.g. in the rationale: "the /set/ counter reaches 0 and consequently every element composing the /set/ will be destructed and deallocated". To my surprise, though, in your tutorial I spot this https://github.com/philippeb8/block_ptr/blob/master/doc/tutorial.html#L113-L...
Secondly, in the case where a cyclic set is being destroyed, in order to prevent block_ptr's member pointers from accessing an object that has already been destroyed, the function cyclic() is provided to explicitly check the state of the pointee object.
To me it looks like this comprehensively refutes the purpose of block_ptr as a ref-cycle mitigation scheme: the programmer still seems to be responsible for anticipating cycles and special-casing destructors for them. Worse, instead of "just leaking" resources, it looks like failure to do so leads to undefined behaviour.

Can you clarify this situation? Maybe I'm understanding this wrong.

Seth
Seth wrote:
To me it looks like this comprehensively refutes the purpose of block_ptr as a ref-cycle mitigation scheme: the programmer still seems to be responsible for anticipating cycles and special-casing destructors for them.
This problem can't be solved in general. When you have

    struct node
    {
        node_ptr<node> p_;
        ~node() { p_->something(); }
        void something();
    };

and you have a cycle with two nodes pointing at each other, it's simply not possible to destroy them properly.

The shared_ptr collectors with which I have experimented try to sidestep this by first zeroing all pointers that participate in a cycle, then running the destructors. So you'd need

    ~node() { if( p_ ) p_->something(); }

See for example

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2232.html#cycles

I abandoned this proposal because I could think of no way to make it work with multiple threads and weak_ptr.
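To make the "zero first, then destroy" step concrete, here is a toy version that drives by hand what such a collector would do automatically, with std::shared_ptr standing in for node_ptr:

    #include <memory>

    struct node
    {
        std::shared_ptr<node> p_;                // stand-in for node_ptr<node>
        ~node() { if( p_ ) p_->something(); }    // guard: p_ may have been zeroed
        void something() {}
    };

    int main()
    {
        // Two nodes pointing at each other.
        auto a = std::make_shared<node>();
        auto b = std::make_shared<node>();
        a->p_ = b;
        b->p_ = a;

        // What the collector would do: first zero the pointers that
        // participate in the cycle...
        a->p_.reset();
        b->p_.reset();

        // ...then let the destructors run (here, when a and b go out of
        // scope). Without the if( p_ ) check, ~node would dereference a
        // pointer that was just zeroed.
    }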
On 03/13/2016 05:50 PM, Seth wrote:
To my surprise, though, in your tutorial I spot this https://github.com/philippeb8/block_ptr/blob/master/doc/tutorial.html#L113-L...
Secondly, in the case where a cyclic set is being destroyed, in order to prevent block_ptr's member pointers from accessing an object that has already been destroyed, the function cyclic() is provided to explicitly check the state of the pointee object.
To me it looks like this comprehensively refutes the purpose of block_ptr as a ref-cycle mitigation scheme: the programmer still seems to be responsible for anticipating cycles and special-casing destructors for them. Worse, instead of "just leaking" resources, it looks like failure to do so leads to undefined behaviour.
Can you clarify this situation? Maybe I'm understanding this wrong.
I'm happy to have faced this special case back in 2011 because it is not obvious. So you have a cycle being wiped out by the proxy:

    +-> a1 ---> a2 -+
    |               |
    +---------------+

1) a1.~A() will be called
2) a1.~A() will call a2.foo()
3) But a2 might have been wiped out already.

So if you try to access a member in the destructor without explicitly checking whether it was destroyed already, then this will lead to undefined behavior, just like dereferencing a null pointer.

I could throw an exception when operator -> is used while a cycle is being destroyed, but I don't think the overhead is worth it for this special case only.
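In code, the guarded destructor would look roughly as follows; the exact spelling of the cyclic() check (member of block_ptr vs. free function, and its precise semantics) should be taken from the tutorial, so treat the call below as an assumption:

    struct A
    {
        block_ptr<A> q_;    // points at the other member of the cycle

        ~A()
        {
            // Only touch the peer if it hasn't already been wiped out
            // together with the cycle (assumed usage of cyclic()).
            if( !q_.cyclic() )
                q_->foo();
        }

        void foo();
    };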
On March 13, 2016 6:36:49 PM EDT, Phil Bouchard
So you have a cycle being wiped out by the proxy:
+-> a1 ---> a2 -+
|               |
+---------------+
1) a1.~A() will be called
2) a1.~A() will call a2.foo()
3) But a2 might have been wiped out already.
So if you try to access a member in the destructor without explicitly checking whether it was destroyed already, then this will lead to undefined behavior, just like dereferencing a null pointer.
I could throw an exception when operator -> is used while a cycle is being destroyed but I don't think the overhead is worth it for this special case only.
Neither answer is great. The exception could occur during stack unwinding, so you couldn't use your pointers in very many contexts. No exception means UB, which means users have to remember a special case and thereby lose safety.

I fail to see how that is better than using weak_ptr. In that case, one must always be explicit rather than remember special cases.

___
Rob

(Sent from my portable computation engine)
On 03/14/2016 07:35 AM, Rob Stewart wrote:
Neither answer is great. The exception could occur during stack unwinding, so you couldn't use your pointers in very many contexts. No exception means UB, which means users have to remember a special case and thereby lose safety.
I fail to see how that is better than using weak_ptr. In that case, one must always be explicit rather than remember special cases.
It is much more complicated to use an explicit weak_ptr because you need to know where the cycle ends, etc. What if you have a neural network? How do you define where to use weak_ptr over shared_ptr, etc.?
On 15/03/2016 01:23, Phil Bouchard wrote:
On 03/14/2016 07:35 AM, Rob Stewart wrote:
Neither answer is great. The exception could occur during stack unwinding, so you couldn't use your pointers in very many contexts. No exception means UB, which means users have to remember a special case and thereby lose safety.
I fail to see how that is better than using weak_ptr. In that case, one must always be explicit rather than remember special cases.
It is much more complicated to use an explicit weak_ptr because you need to know where the cycle ends, etc.
What if you have a neural network? How do you define where to use weak_ptr over shared_ptr, etc.?
The basic rule for tree structures is to use shared_ptr when pointing "down" from parent to child and weak_ptr when pointing "up" from child to parent.

(The above assumes that you want the whole tree to stay alive when you have a pointer to the root, which is the common case. In some cases though it's more useful to reverse the tree, such that keeping a pointer to a child keeps all its parents alive. Either way works, as long as you don't mix the two.)

For more free-form graphs it's a bit harder to define, obviously. A simple solution (albeit one that requires a bit of double handling) is to have each node only use weak_ptrs to point at the others, and store the shared_ptrs for each node in a bag or map in the graph container object. It's typically only a slight increase in housekeeping work when adding/removing nodes.
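A minimal sketch of that free-form graph pattern with standard smart pointers (the names are just illustrative):

    #include <memory>
    #include <vector>

    struct Node
    {
        std::vector<std::weak_ptr<Node>> edges;    // non-owning links to other nodes
    };

    struct Graph
    {
        std::vector<std::shared_ptr<Node>> nodes;  // the graph is the sole owner

        std::shared_ptr<Node> add_node()
        {
            nodes.push_back(std::make_shared<Node>());
            return nodes.back();
        }

        static void add_edge(std::shared_ptr<Node> const & from,
                             std::shared_ptr<Node> const & to)
        {
            from->edges.push_back(to);             // stored as a weak_ptr
        }
    };

    int main()
    {
        Graph g;
        auto a = g.add_node();
        auto b = g.add_node();
        Graph::add_edge(a, b);
        Graph::add_edge(b, a);   // cyclic edges, but no ownership cycle
    }   // destroying g releases every node; nothing leaks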
On 03/13/2016 07:04 PM, Phil Bouchard wrote:
Also the documentation is outdated as proxies are now explicit but it'll give you the general idea.
Are proxies regions? (cf. your earlier reference to region-based memory management.)
On 03/13/2016 02:52 PM, Bjorn Reese wrote:
On 03/13/2016 07:04 PM, Phil Bouchard wrote:
Also the documentation is outdated as proxies are now explicit but it'll give you the general idea.
Are proxies regions? (cf. your earlier reference to region-based memory management.)
Yes, it marks a region.
On 03/13/2016 02:52 PM, Bjorn Reese wrote:
On 03/13/2016 07:04 PM, Phil Bouchard wrote:
Also the documentation is outdated as proxies are now explicit but it'll give you the general idea.
Are proxies regions? (cf. your earlier reference to region-based memory management.)
I could rename:
- proxy_ptr<> -> root_ptr<> and
- block_ptr<> -> leaf_ptr<>

I think it would be easier to understand.
On 03/13/2016 05:23 PM, Phil Bouchard wrote:
On 03/13/2016 02:52 PM, Bjorn Reese wrote:
On 03/13/2016 07:04 PM, Phil Bouchard wrote:
Also the documentation is outdated as proxies are now explicit but it'll give you the general idea.
Are proxies regions? (cf. your earlier reference to region-based memory management.)
I could rename:
- proxy_ptr<> -> root_ptr<> and
- block_ptr<> -> leaf_ptr<>
Or perhaps:
- proxy_ptr<> -> root_ptr<> and
- block_ptr<> -> node_ptr<>
2016-03-13 18:38 GMT-03:00 Phil Bouchard
Or perhaps:
- proxy_ptr<> -> root_ptr<> and
- block_ptr<> -> node_ptr<>
Or:
- proxy_ptr<> -> root_ptr<> and
- block_ptr<> -> child_ptr<>

--
Vinícius dos Santos Oliveira
https://vinipsmaker.github.io/
On 03/13/2016 05:41 PM, Vinícius dos Santos Oliveira wrote:
2016-03-13 18:38 GMT-03:00 Phil Bouchard
Or perhaps:
- proxy_ptr<> -> root_ptr<> and
- block_ptr<> -> node_ptr<>
Or:
proxy_ptr<> -> root_ptr<> and block_ptr<> -> child_ptr<>
Thanks but I just can't see function calls using:

    void foo(leaf_ptr<int> const & p);
    void foo(child_ptr<int> const & p);

I think the following is better:

    void foo(node_ptr<int> const & p);

Because you can still call the function using a root_ptr<>:

    int main()
    {
        root_ptr<int> p(new block<int>(10));
        foo(p);
    }

And a root is a node but a root cannot be a leaf or a child...
2016-03-13 19:01 GMT-03:00 Phil Bouchard
Thanks but I just can't see function calls using:
void foo(leaf_ptr<int> const & p);
void foo(child_ptr<int> const & p);
I think the following is better:
void foo(node_ptr<int> const & p);
Because you can still call the function using a root_ptr<>:
I wasn't aware of this behavior. I agree with `node` working better.

--
Vinícius dos Santos Oliveira
https://vinipsmaker.github.io/
On 03/13/2016 01:28 PM, Glen Fernandes wrote:
I hope to take a look at the block_ptr motivation, design, and implementation when I have some time next week. The title of the thread caught my attention; "X is 600% faster than Y" always has a high excitement potential.
I did some code cleanup again and now block_proxy cannot be instantiated (less error-prone).
The only thing left to be done is to "lambdafy" the following loop:
for (intrusive_list::iterator
participants (8):
- Bjorn Reese
- Gavin Lambert
- Glen Fernandes
- Peter Dimov
- Phil Bouchard
- Rob Stewart
- Seth
- Vinícius dos Santos Oliveira