
Arkadiy Vertleyb wrote:
"Calum Grant" <calum@visula.org> wrote:
I frankly have no idea how I would implement that problem in SQL, but then you chose this problem to be very difficult to solve using SQL notation.
I am not an SQL expert, but it should be something like this (if I remember correctly):
SELECT a.number, a.name, a.city, COUNT(*) AS cnt
FROM mytable a, mytable b
WHERE [the long condition based on coordinates]
GROUP BY a.number, a.name, a.city
ORDER BY cnt DESC;
Then you hope that the optimizer figures out how to not compare every possible pair ;-)
Needless to say, implementing a query analyser to do all that purely in TMP would be rather difficult! But I don't believe there are any hidden costs in RML - what you see is what you get.
This exact statement I modeled in RTL, but RTL also specifies HOW to do things in addition to WHAT to do, so I was able to do a smarter (range-based) query.
Ah, but you can specify "how" using C++.
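To make the "how" concrete, here is a minimal sketch of a range-based neighbour count in plain C++. It is not RML or RTL code and not the solution benchmarked in this thread; the `Place` and `Count` types, the flat x/y coordinates, the `radius` parameter, and the choice to exclude a place from its own count are all assumptions made for illustration. The idea is to sort on one coordinate and then only examine the window of candidates whose x lies within the radius, instead of comparing every possible pair.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record type: the table in the thread also has name and city,
// and its real coordinate layout is not shown.
struct Place {
    int number;
    double x;
    double y;
};

// Hypothetical result row, mirroring "number, COUNT(*) AS cnt".
struct Count {
    int number;
    std::size_t neighbours;
};

// For each place, count the other places within `radius`, most crowded first.
std::vector<Count> count_neighbours(std::vector<Place> places, double radius) {
    assert(radius >= 0.0);

    // Sort by x so the candidates for any place form one contiguous range.
    std::sort(places.begin(), places.end(),
              [](const Place& a, const Place& b) { return a.x < b.x; });

    const double r2 = radius * radius;
    std::vector<Count> result;
    result.reserve(places.size());

    for (const Place& p : places) {
        // Only places with x in [p.x - radius, p.x + radius] can qualify.
        auto first = std::lower_bound(
            places.begin(), places.end(), p.x - radius,
            [](const Place& q, double x) { return q.x < x; });
        auto last = std::upper_bound(
            places.begin(), places.end(), p.x + radius,
            [](double x, const Place& q) { return x < q.x; });

        std::size_t n = 0;
        for (auto it = first; it != last; ++it) {
            if (it->number == p.number) continue;  // assumption: skip self
            const double dx = it->x - p.x;
            const double dy = it->y - p.y;
            if (dx * dx + dy * dy <= r2) ++n;      // squared distance, no sqrt
        }
        result.push_back({p.number, n});
    }

    // Equivalent of ORDER BY cnt DESC.
    std::sort(result.begin(), result.end(),
              [](const Count& a, const Count& b) {
                  return a.neighbours > b.neighbours;
              });
    return result;
}
```

The window only prunes on x, so the worst case is still quadratic when many places share similar x values; a second index on y or a grid would tighten it further.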
I am thrilled with your performance results, though (although I consider them the results of a by-hand solution). Did you completely avoid calculating the distance? If so, I think any such solution would provide only an approximate result (although it might happen to be a pretty good approximation).
I compare on Euclidean distance squared. It does not actually matter whether Euclidean distance, Euclidean distance squared, or surface distance is used: each is a monotonically increasing function of the others, so the ordering is preserved.

Best regards,
Calum
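As a rough check of the ordering claim, the sketch below assumes points given as latitude and longitude in radians on a unit sphere (the thread does not state the coordinate format, so `LatLon` and both function names are hypothetical). On a sphere the straight-line chord c and the great-circle angle theta are related by c = 2 sin(theta / 2), which is increasing for theta between 0 and pi, so sorting by the squared chord gives the same order as sorting by surface distance.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical coordinate type: latitude and longitude in radians.
struct LatLon {
    double lat;
    double lon;
};

// Squared straight-line (Euclidean) distance between two points on a unit
// sphere, computed from their 3D Cartesian coordinates. No sqrt is needed.
double chord_squared(LatLon a, LatLon b) {
    const double ax = std::cos(a.lat) * std::cos(a.lon);
    const double ay = std::cos(a.lat) * std::sin(a.lon);
    const double az = std::sin(a.lat);
    const double bx = std::cos(b.lat) * std::cos(b.lon);
    const double by = std::cos(b.lat) * std::sin(b.lon);
    const double bz = std::sin(b.lat);
    const double dx = ax - bx, dy = ay - by, dz = az - bz;
    return dx * dx + dy * dy + dz * dz;
}

// Great-circle (surface) distance on a unit sphere, recovered from the chord
// via theta = 2 * asin(c / 2). The clamp guards against rounding above 1.
double surface_distance(LatLon a, LatLon b) {
    const double c = std::sqrt(chord_squared(a, b));
    return 2.0 * std::asin(std::min(1.0, c / 2.0));
}
```

For ranking or thresholding, `chord_squared` alone is enough; `surface_distance` is only needed when the actual distance has to be reported.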