
"Calum Grant" <calum@visula.org> wrote
Calum, is it possible to see the latest code of your example?
Here it is.
I see... A couple of questions: 1) You modify counter all the time -- does RML allow to do this with the fields on which the table is sorted? 2) When you say : select(m_locations, m_locations.x>=loc.x-m_cutoff && m_locations.x<=loc.x+m_cutoff && m_locations.y>=loc.y-m_cutoff && m_locations.y<=loc.y+m_cutoff && m_locations.z>=loc.z-m_cutoff && m_locations.z<=loc.z+m_cutoff) Do you use any kind of special index for this? In any case this seems to prove that RML does a good job when searching on multiple indexes -- you do quite a bit of that. Also, inserts/updates are probably very efficient. But, as I said before, the overall efficiency of this example is based on the fact that this is a very smart, custom, hand-coded solution (see how you maintain the counter). No library would be able to beat its performance using a higher-level abstraction (If RTL had a count() operator in addition to generic groupby(), we could be much smarter in maintaining it during updates). OTOH, RTL doesn't seem to have much visible overhead. What's happenning during the request is that for every record a range query is executed on the index sorted by lattitude to cut off most of the records (about 90%). This range query is then walked and the proper selection condition (based on distance) is applied, and the counter is calculated. This all gets indexed on the (counter, city, name). Everything except indexing is done lazily. The major performance overhead is introduced by indexing of the selection (about 2,000,000 records), but this also provides a major gain with updates, since selection is (at least at the moment) the most expensive part of this. Still, I can't do as good as you can, since I don't have a notion of "counter", so groupby has to be run over and over again (even if on a faster argument). I should be able to find some major inefficiencies in selection, and improve the performance. I don't think I'll be able to beat your result though (by the reasons I described above). But again, I don't believe this all proves RML is faster than RTL, since the coding approaches were so different. I don't even know if it's possible to compare the libraries at this point, since they have their strong points in different areas. RTL's strong point in the ability to build and run SQL-like views. This I tried to show in my example. RML seems to be very efficient at maintaining and using multiple indexes, but, despite its more SQL-like interface, can mimique only very simple SQL queries. Regards, Arkadiy