[fixed_string] Copying/comparing entire structure

Dear All,

Considering small-capacity fixed_strings like the example that I gave previously:

    struct NameAndAddress {
      static_string<14> firstname;
      static_string<14> surname;
      static_string<30> address_line_1;
      static_string<22> city;
      static_string<10> postcode;
    };

I have been looking at how best to copy and compare them.

For small-capacity strings, and also for large-capacity strings when their size approaches the capacity, it will be quicker to unconditionally copy/compare everything, including the unused bytes beyond the end, rather than looping over just the current size.

I've tried a very crude benchmark based on this:

    struct string {
      uint8_t size;
      std::array<char,N> data;
    };

    extern void copy_all(string& dest, const string& src)
    {
      dest = src;
    }

    extern void copy_size(string& dest, const string& src)
    {
      dest.size = src.size;
      std::copy(src.data.begin(), src.data.begin()+src.size, dest.data.begin());
    }

Results will depend on the architecture but I found that for N < 64 it was probably always best to copy_all(). For N==255, time for copy_all() and copy_size() is about equal when the container is half full.

Things are more complicated for operator== because of the unused bytes. It would be necessary to initialise everything to 0 and maintain that during e.g. erase() and assign(). I haven't tried to benchmark that but I suspect the trade-off would be similar.

operator< is a mess due to endianness. I don't know whether any compilers or memcmp() implementations are smart enough to use byte-swap instructions to process data 4 or 8 bytes at a time.

I don't think I'm suggesting that this should necessarily be implemented in fixed_string, at least not without lots more investigation, but it does illustrate more of the trade-offs that exist between different applications.

Regards, Phil.
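
For anyone who wants to try the two copy strategies themselves, here is a compilable sketch of the benchmark functions. It is only an illustration: the name small_string, the template parameter N, and the helper check_copies_agree() are assumptions made so that the snippet stands alone, and no timing code is included.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the benchmark struct; N is a template
// parameter here so that the snippet compiles on its own.
template <std::size_t N>
struct small_string {
    static_assert(N <= 255, "size is stored in a single byte");
    std::uint8_t size = 0;
    std::array<char, N> data{};  // value-initialised, so all bytes start as 0
};

// Copy the whole object, including the unused bytes beyond `size`.
template <std::size_t N>
void copy_all(small_string<N>& dest, const small_string<N>& src) {
    dest = src;
}

// Copy only the bytes currently in use.
template <std::size_t N>
void copy_size(small_string<N>& dest, const small_string<N>& src) {
    dest.size = src.size;
    std::copy(src.data.begin(), src.data.begin() + src.size, dest.data.begin());
}

// Both strategies must produce the same visible contents.
inline void check_copies_agree() {
    small_string<14> src;
    src.size = 4;
    std::copy_n("Phil", 4, src.data.begin());

    small_string<14> a, b;
    copy_all(a, src);
    copy_size(b, src);
    assert(a.size == 4 && b.size == 4);
    assert(std::equal(a.data.begin(), a.data.begin() + 4, b.data.begin()));
}
```

Note that copy_size() leaves whatever was previously in dest beyond the new size, which is exactly why a whole-object operator== would need the unused bytes to be kept at 0.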

Phil Endecott wrote:
> Results will depend on the architecture but I found that for N < 64 it was probably always best to copy_all(). For N==255, time for copy_all() and copy_size() is about equal when the container is half full.
>
> Things are more complicated for operator== because of the unused bytes. It would be necessary to initialise everything to 0 and maintain that during e.g. erase() and assign(). I haven't tried to benchmark that but I suspect the trade-off would be similar.

This is an interesting observation because constexpr support requires the array to be initialized to 0. So if we can justify this as actually being faster for N < some value, we might as well do it.
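
To make the zero-padding idea concrete, here is a minimal sketch of a string that keeps every byte beyond its size at 0, so that operator== can compare the whole array without looking at where the string ends. This is not fixed_string's actual implementation: the class name padded_string and its small interface are assumptions for illustration, and it assumes C++17 for std::string_view.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical class, for illustration only.
template <std::size_t N>
class padded_string {
    static_assert(N <= 255, "size is stored in a single byte");
    std::uint8_t size_ = 0;
    std::array<char, N> data_{};  // invariant: data_[i] == 0 for all i >= size_

public:
    constexpr padded_string() = default;

    void assign(std::string_view s) {
        assert(s.size() <= N);
        std::copy(s.begin(), s.end(), data_.begin());
        // Re-zero the tail so that no stale bytes survive a shorter assign.
        std::fill(data_.begin() + s.size(), data_.end(), '\0');
        size_ = static_cast<std::uint8_t>(s.size());
    }

    void erase(std::size_t pos, std::size_t count) {
        assert(pos <= size_);
        count = std::min<std::size_t>(count, size_ - pos);
        if (count == 0) return;
        // Shift the remainder left, then zero the bytes that were vacated.
        auto last = std::copy(data_.begin() + pos + count,
                              data_.begin() + size_,
                              data_.begin() + pos);
        std::fill(last, data_.begin() + size_, '\0');
        size_ = static_cast<std::uint8_t>(size_ - count);
    }

    std::string_view view() const { return {data_.data(), size_}; }

    // Compares all N bytes unconditionally; correct only because of the
    // zero-padding invariant. The size check handles embedded '\0' characters.
    friend bool operator==(const padded_string& a, const padded_string& b) {
        return a.size_ == b.size_ && a.data_ == b.data_;
    }
};
```

The cost of the invariant shows up in assign() and erase(), which now always write the tail; whether that pays for itself against a size-limited comparison is the part that would still need benchmarking.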
participants (2)
- Peter Dimov
- Phil Endecott