Serialisation: Is is_trivial<T> a sufficient precondition to bypass serialisation?
A library user has raised an interesting question at:
https://github.com/BoostGSoC13/boost.afio/commit/ac32ebfceb9a08b133e8cd47000d3b4807bc3e42#commitcomment-9574522
Basically, AFIO v1.3 has new pre-serialisation metaprogramming in response to Robert's comments on his Incubator about it being too hard to serialise and deserialise data with AFIO. AFIO will now consume any STL container and will implicitly auto expand to ASIO scatter gather buffers any:
1. Trivial type T
2. C array of trivial type T
3. STL container of trivial type T, including initializer_list.
A free function, to_asio_buffers(T), can be specialised by external code to extend this with custom ASIO scatter gather buffers generation. Note that AFIO explicitly and intentionally expects that anyone interested in async file i/o will be doing their serialisation and endian conversion far away from AFIO code, so to_asio_buffers(T) is really for marking extra types to be treated as implicit auto expand.
The user asks the question: Is is_trivial<T> a sufficient precondition for this auto expansion to be safe, or should is_standard_layout<T> also be required? The user notes that is_trivial<T> && is_standard_layout<T> == is_pod<T>, which seems a little overkill to me.
There is also the possibility that is_trivial<T> is too conservative. Some may argue that is_trivially_copyable<T> would be sufficient. Thoughts?
Niall
-- ned Productions Limited Consulting http://www.nedproductions.biz/ http://ie.linkedin.com/in/nialldouglas/
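To make the candidate preconditions concrete, here is a minimal C++11 sketch of how the traits named above differ. The types Pod, Defaulted, Mixed and Owner, the byte_view struct and the as_raw_bytes helper are all hypothetical illustrations; none of them is AFIO's API, and the gate is a template parameter precisely because which trait is the right one is the open question.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

struct Pod { int i; char c; };        // trivial and standard layout
struct Defaulted { int i = 42; };     // default member initialiser makes the default constructor non-trivial
class Mixed {                         // members with different access control: not standard layout
public:
    int a;
    int hidden() const { return b; }
private:
    int b;
};
struct Owner {                        // user-provided copy constructor: not trivially copyable
    int* p;
    Owner(const Owner& o) : p(o.p) {}
};

static_assert(std::is_trivial<Pod>::value && std::is_standard_layout<Pod>::value,
              "Pod passes both traits");
static_assert(!std::is_trivial<Defaulted>::value && std::is_trivially_copyable<Defaulted>::value,
              "is_trivial rejects a type that is_trivially_copyable accepts");
static_assert(std::is_trivial<Mixed>::value && !std::is_standard_layout<Mixed>::value,
              "is_trivial accepts a type that is_standard_layout rejects");
static_assert(!std::is_trivially_copyable<Owner>::value,
              "Owner must never be written out as raw bytes");

// Hypothetical stand-in for a single scatter gather buffer.
struct byte_view { const void* data; std::size_t size; };

// Hypothetical helper: exposes the object representation of value,
// but only if T satisfies the chosen gate trait (is_trivial by default).
template <class T, class Gate = std::is_trivial<T>>
byte_view as_raw_bytes(const T& value) {
    static_assert(Gate::value, "T does not satisfy the chosen gate trait");
    return byte_view{&value, sizeof(T)};
}
```

With the default gate, as_raw_bytes(some_pod) compiles while as_raw_bytes(some_defaulted) does not; passing std::is_trivially_copyable<Defaulted> as the second template argument relaxes the gate in the way the last paragraph above contemplates.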
2015-02-04 15:54 GMT+04:00 Niall Douglas
There is also the possibility that is_trivial<T> is too conservative. Some may argue that is_trivially_copyable<T> would be sufficient. Thoughts?
There could be alignment problems if serialized data is deserialized by code built with a different compiler. For example:
struct A { int i; char c; };
struct B : public A { char c2[3]; int i2; };
Here structure B would be serialized differently by different compilers: sizeof(B) == 12 on GCC, while some MSVC versions give 16. is_trivial<B> and is_trivially_copyable<B> would not be sufficient for that case; is_standard_layout<B> is required here.
-- Best regards, Antony Polukhin
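The example above can be checked at compile time. The sketch below reuses the two structs from the message; the passes_pod_style_gate trait and the report_layout helper are hypothetical names added for illustration. No particular sizeof value is asserted, because the result depends on the compiler and ABI in use, and even a standard-layout type can still differ in padding between ABIs.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

struct A { int i; char c; };
struct B : public A { char c2[3]; int i2; };

// B passes both of the weaker traits...
static_assert(std::is_trivial<B>::value, "B is trivial");
static_assert(std::is_trivially_copyable<B>::value, "B is trivially copyable");
// ...but it is not standard layout, because both the base class and the
// derived class declare non-static data members.
static_assert(std::is_standard_layout<A>::value, "A alone is standard layout");
static_assert(!std::is_standard_layout<B>::value, "B is not standard layout");

// Hypothetical stricter gate combining the two traits under discussion.
template <class T>
struct passes_pod_style_gate
    : std::integral_constant<bool, std::is_trivially_copyable<T>::value &&
                                   std::is_standard_layout<T>::value> {};

// Reports the sizes chosen by the current compiler so that they can be
// compared across toolchains.
struct layout_report { std::size_t size_of_a; std::size_t size_of_b; };

inline layout_report report_layout() {
    return layout_report{sizeof(A), sizeof(B)};
}
```

Printing report_layout() from builds made with different compilers is one way to see whether the base class tail padding is reused on a given platform.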
On 04.02.2015 12:54, "Niall Douglas" wrote:
Basically, AFIO v1.3 has new pre-serialisation metaprogramming in response to Robert's comments on his Incubator about it being too hard to serialise and deserialise data with AFIO. AFIO will now consume any STL container and will implicitly auto expand to ASIO scatter gather buffers any:
1. Trivial type T
2. C array of trivial type T
3. STL container of trivial type T, including initializer_list.
So you generate a list of N scatter/gather buffers for a container of size N? That sounds a tad too much. Doesn't that create an immense overhead for the network interface?
On 4 Feb 2015 at 21:46, Thomas Heller wrote:
So you generate a list of N scatter/gather buffers for a container of size N? That sounds a tad too much. Doesn't that create an immense overhead for the network interface?
Firstly, AFIO does file i/o, not network i/o. You'd use ASIO for
network i/o.
If the user feeds std::list<int>(1000) to AFIO write(), I don't think
you get a choice here: a gather write of 1000 individual items is
requested and that is what is issued.
If the user feeds std::vector<int>(1000) to AFIO write(), there is
already a specialisation of to_asio_buffers() for vector, array,
string and C arrays which recognises the fact they guarantee
contiguous storage of their contents. In that situation AFIO issues a
single gather buffer for the entire contents at once.
If the user feeds std::vector
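The difference between the std::list and std::vector cases described above can be sketched as follows. This is not AFIO's to_asio_buffers(); the const_buffer struct and the to_buffers overloads are hypothetical stand-ins, written without ASIO so that the example is self-contained. std::vector<bool> is not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for one scatter gather buffer.
struct const_buffer { const void* data; std::size_t size; };

// Contiguous storage: the whole container collapses into a single buffer.
template <class T>
std::vector<const_buffer> to_buffers(const std::vector<T>& v) {
    static_assert(std::is_trivial<T>::value, "element type must be trivial");
    return { const_buffer{v.data(), v.size() * sizeof(T)} };
}

// Node-based storage: nothing guarantees adjacency, so one buffer per element.
template <class T>
std::vector<const_buffer> to_buffers(const std::list<T>& l) {
    static_assert(std::is_trivial<T>::value, "element type must be trivial");
    std::vector<const_buffer> out;
    out.reserve(l.size());
    for (const T& item : l) {
        out.push_back(const_buffer{&item, sizeof(T)});
    }
    return out;
}
```

For std::vector<int>(1000) this yields one buffer of 1000 * sizeof(int) bytes, while std::list<int>(1000) yields 1000 buffers of sizeof(int) bytes each, matching the two cases in the reply.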
On 05/02/2015 01:55, Niall Douglas wrote:
As with network sockets, most DMA engines for disk i/o have hard limits on scatter gather buffer size. That isn't AFIO's problem.
So if I try to write a std::vector that is too large to fit in a scatter/gather buffer, my write will fail?
On 6 Feb 2015 at 11:07, Mathias Gaunard wrote:
So if I try to write a std::vector that is too large to fit in a scatter/gather buffer, my write will fail?
AFIO observes the IOV_MAX batch limit, so on POSIX with pwritev() support, no, it should never fail, though of course you lose atomicity between IOV_MAX batches. On POSIX without pwritev() support AFIO issues each buffer singly anyway, so your atomicity is per buffer.
On Windows, if buffered i/o is on then there is no limit and atomicity is per buffer (Windows has no scatter gather file i/o functions for buffered files). If buffered i/o is off, the WriteFileGather() API currently has an unofficial limit of 32MB on x64 operating systems due to NT kernel structure limits. Because this limit is not documented and not stable even across 32 bit vs 64 bit systems, never mind between Intel and ARM, AFIO passes through your request as-is, and returns an error if the WriteFileGather() API does.
Niall
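The IOV_MAX batching described above might look roughly like the sketch below. It is an illustration under stated assumptions, not AFIO's implementation: gather_write_batched is a hypothetical name, the limit is queried with sysconf(_SC_IOV_MAX), and pwritev() is assumed to be available (it is an extension provided by Linux and the BSDs rather than part of POSIX proper). Each pwritev() call is a separate write, which is why atomicity is lost between batches.

```cpp
#include <sys/types.h>
#include <sys/uio.h>   // pwritev, struct iovec
#include <unistd.h>    // sysconf, _SC_IOV_MAX
#include <algorithm>
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Writes all buffers to fd starting at offset, issuing at most IOV_MAX
// buffers per pwritev() call. Returns the number of bytes written,
// or -1 on error (errno is left as set by pwritev).
inline long long gather_write_batched(int fd, std::vector<iovec> bufs, off_t offset) {
    long limit = sysconf(_SC_IOV_MAX);
    if (limit <= 0) limit = 16;  // fall back to the POSIX minimum (_XOPEN_IOV_MAX)
    long long total = 0;
    std::size_t first = 0;
    while (first < bufs.size()) {
        if (bufs[first].iov_len == 0) { ++first; continue; }  // skip empty buffers
        const std::size_t count =
            std::min(bufs.size() - first, static_cast<std::size_t>(limit));
        const ssize_t n = pwritev(fd, &bufs[first], static_cast<int>(count), offset);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted before writing: retry
            return -1;
        }
        if (n == 0) return total;          // no progress possible: stop
        offset += n;
        total += n;
        // Advance past fully written buffers and trim a partially written one.
        std::size_t left = static_cast<std::size_t>(n);
        while (left > 0) {
            if (left >= bufs[first].iov_len) {
                left -= bufs[first].iov_len;
                ++first;
            } else {
                bufs[first].iov_base = static_cast<char*>(bufs[first].iov_base) + left;
                bufs[first].iov_len -= left;
                left = 0;
            }
        }
    }
    return total;
}
```

The vector of iovec is taken by value because a short write is handled by adjusting the entries in place before the next call.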
On Fri, Feb 6, 2015 at 1:20 PM, Niall Douglas
AFIO observes the IOV_MAX batch limit, so on POSIX with pwritev() support, no, it should never fail, though of course you lose atomicity between IOV_MAX batches. On POSIX without pwritev() support AFIO issues each buffer singly anyway, so your atomicity is per buffer.
On Windows, if buffered i/o is on then there is no limit and atomicity is per buffer (Windows has no scatter gather file i/o functions for buffered files). If buffered i/o is off, the WriteFileGather() API currently has an unofficial limit of 32MB on x64 operating systems due to NT kernel structure limits. Because this limit is not documented and not stable even across 32 bit vs 64 bit systems, never mind between Intel and ARM, AFIO passes through your request as-is, and returns an error if the WriteFileGather() API does.
What's the definition of atomicity here? -- Olaf
On 6 Feb 2015 at 13:52, Olaf van der Spek wrote:
What's the definition of atomicity here?
Atomicity on filing systems is that the whole of a write operation will be atomically seen as a single unit by all readers of that file, whether in other processes or on other machines [1]. So if you write 64KB of data, either none of the 64KB will appear to readers, or all of it will. You can't see it mid write.
This feature lets you do fun stuff like distributed mutual exclusion algorithms using atomic appends as the message channel, and extent zeroing on the front of the file to prevent the file growing in physical allocation. You can see an example of such an algorithm at https://boostgsoc13.github.io/boost.afio/doc/html/afio/quickstart/async_file_io/atomic_logging.html where performance, except on ZFS, is very respectable. Plus the code is completely platform independent.
[1]: Excluding mmaps.
Niall
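As a rough illustration of the "atomic appends as the message channel" idea, the sketch below appends each record with a single write() call on a descriptor opened with O_APPEND. It uses plain POSIX calls rather than AFIO, open_log and append_record are hypothetical names, and whether concurrent readers really see each record as a single unit is assumed to depend on the filing system, as the reply above describes.

```cpp
#include <fcntl.h>     // open, O_APPEND
#include <sys/types.h>
#include <unistd.h>    // write, close
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>

// Opens (creating if necessary) a log file in append mode, so that every
// write lands at the current end of the file.
inline int open_log(const char* path) {
    return ::open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
}

// Appends one record using exactly one write() call, so the record is
// submitted to the filing system as a single operation.
// Returns true only if the whole record was accepted.
inline bool append_record(int fd, const std::string& record) {
    const ssize_t n = ::write(fd, record.data(), record.size());
    return n >= 0 && static_cast<std::size_t>(n) == record.size();
}
```

A caller would treat a false return as a failed message rather than retrying the remainder, since a second write() would no longer belong to the same operation.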
participants (5): Antony Polukhin, Mathias Gaunard, Niall Douglas, Olaf van der Spek, Thomas Heller