Ovanes Markarian wrote:
> On Thu, Apr 30, 2009 at 7:12 AM, tom fogal wrote:
>> I find this very strange. Gcc accepted a parallelized STL
>> implementation in recent years too, and I always questioned it. For
>> example, 25.1.1/1 in C++98 (for_each def'n):
>>
>>   Effects: Applies f to the result of dereferencing every iterator
>>   in the range [first, last), starting from first and proceeding to
>>   last - 1.
>>
>> How could you possibly parallelize that? It's defined to operate in
>> series. I guess if the compiler could prove that `f' had no side
>> effects, and there were no volatiles involved, it might be possible
>> (is that strong enough to prove that the execution order is
>> irrelevant? hrm..). Proving the former seems to run up against the
>> aliasing problem, though. What am I missing here?
>>
>> (Granted, I can't recall the last time I wrote a for_each that
>> *relied* on that execution order. Further, I could probably be
>> convinced that requiring a particular ordering here was a
>> specification mistake. Still, a standard is a standard.)
> My example was referring to find. There you use a single value to be
> found in the range.
You'll note that I said `for example'. I did not claim that no std
algorithm could be parallelized; I intended to claim that there exist
some algorithms which can only be meaningfully parallelized if one
interprets them in a particular (what I would call the normal) manner,
as opposed to the way the standard dictates they must behave. Reading
through 25.1 now, I find very few functions which actually dictate an
ordering, which is promising. However, I *am* seeing a lot of `returns
the *first* <X> in the range [first, last) such that...'. This is
unfortunate :(
> Why should it be relevant in that context whether the iterator points
> to a volatile value, or is itself volatile? The find algorithm
> performs read-only operations on the container sequence.
The functor given to e.g. find_if ``shall not apply any non-constant
function *through the dereferenced iterator*'' (25/7, emph. added).
This does not preclude modifying the external environment (i.e. having
side effects). See the appended program for an example which does
terribly dirty things, yet which I would consider written to spec.
I'd be delighted if you or others could prove me wrong.
I didn't use volatile in the example program, but since a volatile
variable can be modified by an external entity, a `constant' function
which queries/depends on that mutable program state will see
differences in behavior for parallel vs. serial STL implementations.
Especially if that constant function then modifies the program state
itself (e.g. ++num_equal in the example).
Finally, note that the parallel STL implementation I referenced
([1] after a second of googling) reports parallelization of mutable
functions as well as non-mutable functions. However, upon further
examination I find we are in luck there! The standard explicitly
requires that the operation given to std::accumulate (for example) not
cause side effects. Hopefully that (stronger) requirement will be
extended to all of the sequence operations in the future.
Cheers,
-tom
[1] http://gcc.gnu.org/onlinedocs/libstdc++/manual/parallel_mode.html#manual.ext...
#include <algorithm>
#include <functional>
#include <iostream>
#include <utility>
static int x = 0;
struct MyClass {
  MyClass() : value(x++) {}
  MyClass(int v) : value(v) {}
  bool operator==(const MyClass &c) const {
    return this->value == c.value;
  }
  int value;
};
static MyClass initial(19), another(64);
static MyClass &aref = initial;
static int num_equal = 0;
void docrazythings(int val) {
  if(val == 42) {
    if(aref == another) {
      aref = initial;
    } else {
      aref = another;
    }
  }
}
/* `Constant' in the 25/7 sense: it never writes through the
 * dereferenced iterator, yet it mutates global state, so the order in
 * which elements are visited is observable. */
struct apply : public std::unary_function<int, bool> {
  bool operator()(int v) const {
    docrazythings(v);
    if(aref == another) { ++num_equal; }
    return aref == another;
  }
};
int main() {
  int data[] = { 10, 42, 23, 42, 64 };
  const int *hit = std::find_if(data, data + 5, apply());
  if(hit != data + 5) {
    std::cout << "found " << *hit << std::endl;
  }
  std::cout << "num_equal: " << num_equal << std::endl;
  return 0;
}