New subject: Requests for comments on a (partly) hypothetical non-relational serialization library

19 Jun 2010

Hi all: I've been working for a while on a variety of tools to facilitate 
application development in normal (cross-platform) C++ and to avoid the 
byzantine dependency chains (including needing multiple boost versions) 
that so often creep in.  Real applications always seem to piece together 
disparate parts with different build systems, different requirements, even 
different ways to download the source code... pretty soon you're not 
programming C++ anymore, you're tinkering with Python make scripts, or Perl 
code generators, or learning Git or Subversion... know what I'm saying?  
Anyhow, I'm a fan of the Mongo database, but even its drivers are 
notoriously hard to build, and it is not really suited for simple 
SQLite-like object serialization for persistence between runs of an 
application (this is theoretically possible, but it is poorly documented 
and still requires linking against the entire Mongo system).

So I've decided to develop a serialization framework (not a database) with 
some "NoSQL" features modeled on Mongo, but a lot easier to use.  I believe 
this framework could provide a foundation upon which useful, moderately 
complex C++ applications could be designed.  The library would come with 
extensions which are optional to use but which incorporate my work (I hope 
that doesn't sound pedantic) on general application development, without 
extra external dependencies.  Specifically, these extensions would 
include:

1)  A tool for generating GUI code -- for wxWidgets, in particular -- from 
archives that could be edited with a simple textual front-end, vaguely like 
XAML;

2) A custom language based on Clojure -- a Lisp dialect originally 
implemented by Rich Hickey on the JVM -- for expressing queries and 
importing/exporting data from/to an archive;

3) Perl6-like regular expressions for matching against textual fields in an 
archive; 

4) AI-inspired algorithms for sorting, filtering, and otherwise operating 
on archives.

My academic background is in AI -- actually, to be precise, I wrote a 
doctoral dissertation in the philosophy of science, but I researched AI in 
that context -- and I'm especially interested in non-relational database 
theory because it better captures the process of modeling complex systems.  
In general, non-relational databases are more interesting from an AI 
perspective because the lack of a fixed schema means that operations like 
sorting and filtering can require some "reasoning".  I'm particularly 
interested in application development because I think one concrete 
application of AI research is to make tools like IDEs smarter.  A 
non-relational serialization library could serve the application 
development process not only by providing an easy way to persist data, but 
also through IDE extensions or project generators: store lists of debug 
breakpoints in an archive, or parse source code for namespaces, types, 
etc., and store the results in an archive, or use an archive to represent 
all the controls in a GUI...
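
To make the point about "reasoning" concrete, here is a minimal sketch of 
sorting and filtering records that share no fixed schema.  Everything in it 
is an assumption for illustration: the names Record, field, sort_by and 
having are hypothetical, fields are plain strings, and it needs a C++17 
compiler.  The only decision it encodes is where to put records that lack 
the sort key, which is exactly the kind of policy a relational schema would 
have settled in advance.

```cpp
#include <algorithm>
#include <iterator>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical schema-less record: any field may be absent.
using Record = std::map<std::string, std::string>;

// Look up a field, returning nothing when the record lacks it.
std::optional<std::string> field(const Record& r, const std::string& key) {
    auto it = r.find(key);
    if (it == r.end()) return std::nullopt;
    return it->second;
}

// Sort by `key`. Records without the field are placed after those that
// have it; this placement is a policy choice, not a given.
void sort_by(std::vector<Record>& records, const std::string& key) {
    std::stable_sort(records.begin(), records.end(),
        [&key](const Record& a, const Record& b) {
            auto fa = field(a, key);
            auto fb = field(b, key);
            if (fa && fb) return *fa < *fb;            // both present: compare values
            return fa.has_value() && !fb.has_value();  // present sorts before absent
        });
}

// Filter: keep only the records that have `key` at all.
std::vector<Record> having(const std::vector<Record>& records,
                           const std::string& key) {
    std::vector<Record> out;
    std::copy_if(records.begin(), records.end(), std::back_inserter(out),
        [&key](const Record& r) { return r.count(key) != 0; });
    return out;
}
```

A smarter version might guess a substitute key or infer a default value 
instead of pushing incomplete records to the end; the sketch only shows 
where such a decision would plug in.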

The library I have in mind would differ from boost.serialization by 
providing explicit support for non-relational functionality, and also by 
using a restricted type system along the lines of MongoDB and JSON: any 
persistable data field would have to be marshalled into one of a few 
predefined types, although users could explicitly extend the type system if 
desired.  Aside from writing persistence code directly in the C++ source 
(along the lines of, e.g., instantiating a serialize() template in 
namespace boost::serialization), the test or demo applications I've been 
writing use external files, written in the (currently very minimal) 
Clojure-like language I mentioned above, and an interpreter does the actual 
serialization -- so the persistence strategy could be altered without 
recompiling the application, even while it is running.  I think this offers 
new potential for using AI-style algorithms for things like tracking usage 
patterns, because all of that could be implemented fully orthogonally to 
the application itself.
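
As a rough illustration of the restricted type system, the sketch below 
marshals an application struct into a document whose fields are limited to 
a handful of predefined types.  It is not the library's actual interface 
and not the boost.serialization one: Value, Document, Breakpoint, marshal 
and type_name are hypothetical names, the set of types is my own guess at a 
JSON-like minimum, nested documents and arrays are left out, and it assumes 
C++17 for std::variant.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <variant>

// Hypothetical restricted type system: a field is null, bool, integer,
// floating point, or text. Nested documents and arrays are omitted.
using Value = std::variant<std::monostate, bool, std::int64_t, double, std::string>;
using Document = std::map<std::string, Value>;

// An application type that is not persistable as-is.
struct Breakpoint {
    std::string file;
    int line;
    bool enabled;
};

// Marshalling step: each field is mapped onto one of the predefined types.
Document marshal(const Breakpoint& bp) {
    Document doc;
    doc["file"] = bp.file;
    doc["line"] = static_cast<std::int64_t>(bp.line);  // int widened to the one integer type
    doc["enabled"] = bp.enabled;
    return doc;
}

// Name of the predefined type a value currently holds.
std::string type_name(const Value& v) {
    static const char* const names[] = {"null", "bool", "int", "double", "string"};
    if (v.valueless_by_exception()) return "invalid";
    return names[v.index()];
}
```

Calling marshal(Breakpoint{"main.cpp", 42, true}) yields a three-field 
document that an archive writer could store without knowing anything about 
Breakpoint.  Extending the type system would then amount to adding an 
alternative to Value, and an external script could choose which fields to 
marshal without touching the application code.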

So, that's the project I've sort of assigned myself, and I would appreciate 
any comments and ideas on what I could do to make this the kind of library 
C++ programmers would consider trying out.  Thanks in advance.

nathaniel＠photino.org

David Abrahams

participants (2)