The first test measures time to insert N elements into a given container, the second measures the time taken to erase 25% of those same elements from the container, and the third test measures iteration performance after the second test has taken place.

I think that too much information is missing from the benchmark setup, which makes the results hard to interpret (see below for some of my questions). A good rule of thumb for how much information to include when describing the setup of a benchmark is that anybody should be able, based solely on the information provided, to re-implement the benchmarks and get the same (or very similar) results.
Note: because plf::colony is an unordered container and subsequently we are not concerned about insert order, push_front has been used with std::list instead of push_back, in order to provide a fair performance comparison.

Are you using "push_back" or "insert(end, N, T())" on vector and deque? Are you reserving memory up front? Is the final size greater than the capacity before the insertion? For example, if the containers initially have zero size and zero capacity, I cannot understand how anything can be faster than "std::vector::vector(N, T())". If the initial capacity is larger than the container size after the insertion, I cannot understand how anything can be faster than "std::vector::insert(end, N, T())". Both of these cases are very relevant to the game development application you mention, since there one typically has an upper bound on the number of elements within a container (like a maximum number of entities).
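To make the questions above concrete, here is a minimal sketch of the insertion variants I have in mind. The Element type and the function names are placeholders of mine, not taken from the benchmark code, and I am only guessing that the benchmark does something like variant (a).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type standing in for the benchmarked T.
struct Element { int value = 0; };

// (a) push_back with no reserve: the capacity grows geometrically, so the
//     vector reallocates and moves its elements several times along the way.
std::vector<Element> fill_push_back(std::size_t n) {
    std::vector<Element> v;
    for (std::size_t i = 0; i < n; ++i) v.push_back(Element{});
    assert(v.size() == n);
    return v;
}

// (b) reserve up front, then push_back: no reallocation during the loop.
std::vector<Element> fill_reserved_push_back(std::size_t n) {
    std::vector<Element> v;
    v.reserve(n);
    for (std::size_t i = 0; i < n; ++i) v.push_back(Element{});
    return v;
}

// (c) fill constructor, i.e. std::vector::vector(N, T()): zero size and zero
//     capacity beforehand, one allocation and one fill pass.
std::vector<Element> fill_constructor(std::size_t n) {
    return std::vector<Element>(n, Element{});
}

// (d) a single fill insert at the end, i.e. insert(end, N, T()), into a
//     vector whose capacity already covers the final size.
std::vector<Element> fill_insert_at_end(std::size_t n) {
    std::vector<Element> v;
    v.reserve(n);
    v.insert(v.end(), n, Element{});
    return v;
}
```

All four produce a vector of N elements, but they can differ a lot in cost, which is why the benchmark description should say which one was measured.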
Erase Performance
The curve for vector and deque looks like O(N^2): are you using erase-remove-if?
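For reference, this is the difference I am asking about, as a minimal sketch. The function names and the predicate in the usage note are placeholders of mine, since I do not know how the benchmark selects the 25% of elements to erase.

```cpp
#include <algorithm>
#include <vector>

// Erasing one element at a time: every erase shifts the whole tail of the
// vector down by one slot, so removing a fixed fraction of N elements costs
// O(N^2) element moves overall.
template <class Pred>
void erase_one_by_one(std::vector<int>& v, Pred pred) {
    for (auto it = v.begin(); it != v.end();) {
        if (pred(*it)) {
            it = v.erase(it);  // erase returns the iterator after the removed element
        } else {
            ++it;
        }
    }
}

// erase-remove-if: std::remove_if compacts the kept elements in a single
// pass, then one erase call drops the leftover tail, so the total is O(N).
template <class Pred>
void erase_remove_if(std::vector<int>& v, Pred pred) {
    v.erase(std::remove_if(v.begin(), v.end(), pred), v.end());
}
```

Calling either function with a predicate such as `[](int x) { return x % 4 == 0; }` leaves the same elements in the same order; only the running time differs. If the benchmark uses the first form for vector and deque, that would explain the shape of the curve.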