
On Thu, Jan 12, 2012 at 11:02 AM, Artyom Beilis <artyomtnk@yahoo.com> wrote:
> So if you are looking for vendor specific functions that allow you to do some kind of data transfer directly from socket to memory buffer it is not for you, especially when only one API around supports it (OCI).
You have to use a vendor-specific API in the back-end anyway. What matters is that the user-facing API does not prevent optimizations in the back-end when you want to insert, say, 100,000 rows into a table. If the library forces you to call stmt.execute() for each row, and thus pay a network round-trip to the server each time, you're far from high-performance.

I doubt Oracle OCI is the only vendor or DBMS API that offers ways to improve such a case (and I'm not talking about its Direct Path API, which bypasses the normal DML path). As an example, you showed binding a scalar to a prepared statement's placeholder, which Oracle supports of course, but it also allows binding an array of scalars almost as simply, and sends them all in a single round-trip. One could argue it's the job of the back-end to accumulate the scalar values into an array behind the scenes, similar to Oracle's prefetching but in reverse. But it could also be part of the API itself, simply converted to a loop for back-ends that do not support bulk array operations.

I don't dispute that your wrapper is 2x or 3x faster than existing wrappers (and I was in fact already taking prepared-statement reuse for granted); I was discussing the kind of large insert or select described above. SQLite doesn't have the kind of bulk operations I mention, simply because it's a server-less SQL engine talking directly to the filesystem, with no network overhead. For a client-server RDBMS, though, you simply cannot afford to ignore network round-trips if you want the best performance.

As you rightly point out, you don't aim to be an ORM (which, by its nature of traversing an object graph and doing small "scattered" scalar inserts, often can't leverage bulk operations and therefore can't achieve the best performance) and concentrate on just the DBMS connectivity. All I'm saying is that bulk insert and select are precisely what a targeted library like this should address, one way or another, to call itself high-performance.
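To make the "array in the API, loop as fallback" idea concrete, here is a rough sketch. Everything in it is hypothetical: Backend, CountingBackend and bulk_insert are names I made up for illustration, not cppdb's or OCI's actual API, and the "round-trip" is just a counter standing in for a real network call.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical back-end interface (an assumption, not a real driver API).
struct Backend {
    virtual ~Backend() = default;
    // True if the driver can ship an array of bound values in one execute.
    virtual bool supports_array_bind() const = 0;
    // Execute with one bound scalar: costs one round-trip.
    virtual void execute_scalar(int value) = 0;
    // Execute with an array of bound scalars: still one round-trip.
    virtual void execute_array(const std::vector<int>& values) = 0;
};

// Stand-in driver that only counts round-trips and rows instead of
// talking to a server.
struct CountingBackend : Backend {
    explicit CountingBackend(bool array_capable)
        : array_capable_(array_capable) {}

    bool supports_array_bind() const override { return array_capable_; }

    void execute_scalar(int) override {
        ++round_trips;
        ++rows;
    }

    void execute_array(const std::vector<int>& values) override {
        ++round_trips;
        rows += values.size();
    }

    std::size_t round_trips = 0;
    std::size_t rows = 0;

private:
    bool array_capable_;
};

// User-facing bulk call: one array execute where the back-end supports it,
// otherwise a plain loop of scalar executes.
void bulk_insert(Backend& backend, const std::vector<int>& values) {
    if (values.empty()) {
        return;
    }
    if (backend.supports_array_bind()) {
        backend.execute_array(values);
    } else {
        for (int v : values) {
            backend.execute_scalar(v);
        }
    }
}
```

The caller writes the same bulk_insert(backend, values) either way; only the back-end decides whether that costs one round-trip or one per row, so back-ends without array support lose nothing and the others are not held back by the API.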
Just my $0.02. --DD