Serialisation: Is is_trivial<T> a sufficient precondition to bypass serialisation?

Niall Douglas

4 Feb 2015

11:54 a.m.

A library user has raised an interesting question at:

https://github.com/BoostGSoC13/boost.afio/commit/ac32ebfceb9a08b133e8cd47000d3b4807bc3e42#commitcomment-9574522

Basically, AFIO v1.3 has new pre-serialisation metaprogramming in response to Robert's comments on his Incubator about it being too hard to serialise and deserialise data with AFIO. AFIO will now consume any STL container and will implicitly auto expand to ASIO scatter gather buffers any:

1. Trivial type T
2. C array of trivial type T
3. STL container of trivial type T, including initializer_list.

A free function, to_asio_buffers(T), can be specialised by external code to extend this with custom ASIO scatter gather buffers generation. Note that AFIO explicitly and intentionally expects that anyone interested in async file i/o will be doing their serialisation and endian conversion far away from AFIO code, so to_asio_buffers(T) is really for marking extra types to be treated as implicit auto expand.

The user asks the question: Is is_trivial<T> a sufficient precondition for this auto expansion to be safe, or should is_standard_layout<T> also be required? The user notes that is_trivial<T> && is_standard_layout<T> == is_pod<T>, which seems a little overkill to me.

There is also the possibility that is_trivial<T> is too conservative. Some may argue that is_trivially_copyable<T> would be sufficient. Thoughts?

Niall

--
ned Productions Limited Consulting
http://www.nedproductions.biz/ http://ie.linkedin.com/in/nialldouglas/
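To make the three candidate preconditions concrete, here is a minimal sketch comparing them on a few small types. The trait names (expand_if_trivial and friends) and the example structs are hypothetical and only for illustration; they are not AFIO's actual metaprogramming.

```cpp
#include <type_traits>

// Hypothetical gates, one per candidate precondition discussed above.
// None of these names come from AFIO.
template <class T>
struct expand_if_trivial : std::is_trivial<T> {};

template <class T>
struct expand_if_pod_like
    : std::integral_constant<bool, std::is_trivial<T>::value &&
                                       std::is_standard_layout<T>::value> {};

template <class T>
struct expand_if_trivially_copyable : std::is_trivially_copyable<T> {};

// Trivial and standard layout: passes every gate.
struct Plain { int a; double b; };

// User-provided default constructor: not trivial, but still trivially
// copyable, so only the loosest gate accepts it.
struct WithCtor {
    int a;
    WithCtor() : a(42) {}
};

// Mixed access control: trivial, but not standard layout.
struct MixedAccess {
    int a;
    int get_b() const { return b; }
private:
    int b;
};

static_assert(expand_if_trivial<Plain>::value, "Plain is trivial");
static_assert(expand_if_pod_like<Plain>::value, "Plain is POD-like");
static_assert(!expand_if_trivial<WithCtor>::value, "WithCtor is not trivial");
static_assert(expand_if_trivially_copyable<WithCtor>::value,
              "WithCtor is trivially copyable");
static_assert(expand_if_trivial<MixedAccess>::value, "MixedAccess is trivial");
static_assert(!expand_if_pod_like<MixedAccess>::value,
              "MixedAccess is not standard layout");
```

WithCtor is the kind of type that is_trivial<T> rejects and is_trivially_copyable<T> would admit; MixedAccess is the kind that is_trivial<T> admits and the additional is_standard_layout<T> requirement would reject.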

Antony Polukhin

4 Feb

5:02 p.m.

2015-02-04 15:54 GMT+04:00 Niall Douglas <s_sourceforge@nedprod.com>:

...

There is also the possibility that is_trivial<T> is too conservative. Some may argue that is_trivially_copyable<T> would be sufficient. Thoughts?

There could be problems with alignment if serialized data is deserialized by code built with another compiler. For example:

struct A { int i; char c; };
struct B : public A { char c2[3]; int i2; };

Here structure B would be serialized differently by different compilers: sizeof(B) == 12 on GCC, while some MSVC versions give 16. is_trivial<B> and is_trivially_copyable<B> would not be sufficient for that case; is_standard_layout<B> is required here.

--
Best regards, Antony Polukhin
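The trait results for this example can be checked at compile time. The sketch below only verifies the type categories; it deliberately asserts nothing about sizeof(B), since the reported sizes depend on the compiler and ABI in use.

```cpp
#include <type_traits>

struct A { int i; char c; };
struct B : public A { char c2[3]; int i2; };

// A on its own satisfies all three properties.
static_assert(std::is_trivial<A>::value, "A is trivial");
static_assert(std::is_standard_layout<A>::value, "A is standard layout");

// B passes the two "trivial" checks...
static_assert(std::is_trivial<B>::value, "B is trivial");
static_assert(std::is_trivially_copyable<B>::value,
              "B is trivially copyable");

// ...but is not standard layout, because both the base class and the
// derived class declare non-static data members.
static_assert(!std::is_standard_layout<B>::value,
              "B is not standard layout");
```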

Thomas Heller

8:46 p.m.

On 04.02.2015 12:54, "Niall Douglas" <s_sourceforge@nedprod.com> wrote:

Basically, AFIO v1.3 has new pre-serialisation metaprogramming in response to Robert's comments on his Incubator about it being too hard to serialise and deserialise data with AFIO. AFIO will now consume any STL container and will implicitly auto expand to ASIO scatter gather buffers any:

1. Trivial type T 2. C array of trivial type T 3. STL container of trivial type T, including initializer_list.

So you generate a list of N scatter/gather buffers for a container of size N? That sounds a tad too much. Doesn't that create an immense overhead for the network interface?

Niall Douglas

5 Feb

12:55 a.m.

On 4 Feb 2015 at 21:46, Thomas Heller wrote:

So you generate a list of N scatter/gather buffers for a container of size N? That sounds a tad too much. Doesn't that create an immense overhead for the network interface?

Firstly, AFIO does file i/o, not network i/o. You'd use ASIO for network i/o.

If the user feeds std::list<int>(1000) to AFIO write(), I don't think you get a choice here: a gather write of 1000 individual items is requested and that is what is issued.

If the user feeds std::vector<int>(1000) to AFIO write(), there is already a specialisation of to_asio_buffers() for vector, array, string and C arrays which recognises the fact they guarantee contiguous storage of their contents. In that situation AFIO issues a single gather buffer for the entire contents at once.

If the user feeds std::vector<std::vector<int>>(100, std::vector<int>(1000)), then AFIO will create an ASIO gather buffer sequence of 100 buffers, each pointing at a region of 1000 ints. You can nest your STL containers to any arbitrary depth - AFIO understands.

AFIO also understands you can gather write a const value_type container like unordered_map, but cannot scatter read into a const value_type, and a static_assert fires if you try. In other words, const containers or value_types always turn into asio::const_buffer, as indeed they should.

As with network sockets, most DMA engines for disk i/o have hard limits on scatter gather buffer size. That isn't AFIO's problem. POSIX also imposes a limit of IOV_MAX scatter gather buffers, which is AFIO's problem, and AFIO correctly issues blocks of IOV_MAX until the input is fully dispatched.

Niall

--
ned Productions Limited Consulting
http://www.nedproductions.biz/ http://ie.linkedin.com/in/nialldouglas/
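The buffer counts described above (1000 for a list, 1 for a vector, 100 for a vector of vectors) can be reproduced with a small sketch. This is not AFIO's to_asio_buffers(); the names buffer, gather_sketch and sketch_to_buffers are hypothetical, and a plain (pointer, length) pair stands in for asio::const_buffer so that the example needs no ASIO headers.

```cpp
#include <cstddef>
#include <list>
#include <memory>
#include <type_traits>
#include <utility>
#include <vector>

// Stand-in for asio::const_buffer: (start address, length in bytes).
using buffer = std::pair<const void*, std::size_t>;

// The overloads live in one struct so that each can see the others
// when recursing into nested containers.
struct gather_sketch {
    // A single trivial object becomes one buffer.
    template <class T>
    static typename std::enable_if<std::is_trivial<T>::value>::type
    append(std::vector<buffer>& out, const T& v) {
        out.emplace_back(static_cast<const void*>(std::addressof(v)),
                         sizeof(T));
    }

    // vector of trivial T: storage is contiguous, so one buffer covers it.
    template <class T>
    static typename std::enable_if<std::is_trivial<T>::value>::type
    append(std::vector<buffer>& out, const std::vector<T>& v) {
        if (!v.empty())
            out.emplace_back(static_cast<const void*>(v.data()),
                             v.size() * sizeof(T));
    }

    // vector of non-trivial T (for example a nested container): recurse.
    template <class T>
    static typename std::enable_if<!std::is_trivial<T>::value>::type
    append(std::vector<buffer>& out, const std::vector<T>& v) {
        for (const auto& e : v) append(out, e);
    }

    // list is node based, so every element needs its own buffer.
    template <class T>
    static void append(std::vector<buffer>& out, const std::list<T>& l) {
        for (const auto& e : l) append(out, e);
    }
};

// Flattens a value or container into a gather buffer sequence.
template <class C>
std::vector<buffer> sketch_to_buffers(const C& c) {
    std::vector<buffer> out;
    gather_sketch::append(out, c);
    return out;
}
```

With this sketch, sketch_to_buffers(std::list<int>(1000)) yields 1000 buffers, sketch_to_buffers(std::vector<int>(1000)) yields one, and a vector of 100 vectors of 1000 ints yields 100.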

Mathias Gaunard

6 Feb

10:07 a.m.

On 05/02/2015 01:55, Niall Douglas wrote:

...

As with network sockets, most DMA engines for disk i/o have hard limits on scatter gather buffer size. That isn't AFIO's problem.

So if I try to write a std::vector that is too large to fit in a scatter/gather buffer, my write will fail?

Niall Douglas

12:20 p.m.

On 6 Feb 2015 at 11:07, Mathias Gaunard wrote:

So if I try to write a std::vector that is too large to fit in a scatter/gather buffer, my write will fail?

AFIO observes the IOV_MAX batch limit, so on POSIX with pwritev() support, no, it should never fail, though of course you lose atomicity between IOV_MAX batches. On POSIX without pwritev() support AFIO issues each buffer singly anyway, so your atomicity is per buffer.

On Windows, if buffered i/o is on then there is no limit and atomicity is per buffer (Windows has no scatter gather file i/o functions for buffered files). If buffered i/o is off, the WriteFileGather() API currently has an unofficial limit of 32 MB on x64 operating systems due to NT kernel structure limits. Because this limit is not documented and not stable even across 32 bit vs 64 bit systems, never mind between Intel and ARM, AFIO passes through your request as-is, and returns an error if the WriteFileGather() API does.

Niall

--
ned Productions Limited Consulting
http://www.nedproductions.biz/ http://ie.linkedin.com/in/nialldouglas/

Olaf van der Spek

12:52 p.m.

On Fri, Feb 6, 2015 at 1:20 PM, Niall Douglas <s_sourceforge@nedprod.com> wrote:

AFIO observes the IOV_MAX batch limit, so on POSIX with pwritev() support no it should never fail, though of course you lose atomicity between IOV_MAX batches. On POSIX without pwritev() support AFIO issues each buffer singly anyway, so your atomicity is per buffer.

What's the definition of atomicity here? -- Olaf

Niall Douglas

2:49 p.m.

On 6 Feb 2015 at 13:52, Olaf van der Spek wrote:

What's the definition of atomicity here?

Atomicity on filing systems is that the whole of a write operation will be atomically seen as a single unit by all readers of that file, whether in other processes or other machines [1]. So if you write 64 KB of data, either none of the 64 KB will appear to readers, or all of it will. You can't see it mid write.

This feature lets you do fun stuff like distributed mutual exclusion algorithms using atomic appends as the message channel, and extent zeroing on the front of the file to prevent the file growing in physical allocation. You can see an example of such an algorithm at https://boostgsoc13.github.io/boost.afio/doc/html/afio/quickstart/async_file_io/atomic_logging.html where performance, except on ZFS, is very respectable. Plus the code is completely platform independent.

[1]: Excluding mmaps.

Niall

--
ned Productions Limited Consulting
http://www.nedproductions.biz/ http://ie.linkedin.com/in/nialldouglas/
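As a rough illustration of the append-as-message-channel idea, the sketch below appends one record per call using a single write() on a descriptor opened with O_APPEND. It uses plain POSIX calls rather than AFIO, append_record is a hypothetical helper, and whether other readers actually observe each record as a single unit depends on the filing system, as discussed above.

```cpp
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#include <string>

// Appends `record` to the file at `path` with one write() call.
// O_APPEND makes the kernel position the write at the current end of file.
// Returns true only if the whole record was written.
inline bool append_record(const char* path, const std::string& record) {
    const int fd = ::open(path, O_WRONLY | O_APPEND | O_CREAT, 0644);
    if (fd < 0) return false;
    const ssize_t w = ::write(fd, record.data(), record.size());
    ::close(fd);
    return w == static_cast<ssize_t>(record.size());
}
```

Each writer calls append_record() with a complete message, so a message is never split across several write() calls by the writer itself.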
