
The problem with this data is that it contains a lot of duplicates. If I cluster the cities into 5103 clusters, the run takes 47ms. If I don't cluster them, it takes 4.1s. The expensive part is building the index on distances. The results are rather odd: the 500 locations I get have 991 neighbours.
This sounds like a correct result -- can I see the first 50 items?
The output is at the bottom. I've managed to get my times down a bit by reimplementing the solution in a much more straightforward way. I can insert all 21421 items, calculate the number of neighbours, and display the top 500 in 235ms. Then I can remove 1982 items and redisplay the results in 47 milliseconds. This of course makes me suspicious that I'm doing something wrong...

The solution I have is not terribly elegant from RML's point of view since, as I said before, RML is designed for logical queries rather than numerical computation. It was not a terribly interesting problem because it needed just a single table with two indexes.

Regards,
Calum

===============================================
17882 ALTEC-AS 991
16918 BILIM-AS 991
16917 IXEUROPE-FR-ASN 991
8734 WISH-NOKNOK 991
16915 DHMS-NET 991
17700 IOMART-AS 991
17494 RAMSATCOM 991
17670 GETIT 991
17493 AS-IKSYS 991
16913 UNSPECIFIED 991
16912 ASN-MPLS 991
17491 ABB 991
17490 UNSPECIFIED 991
17698 MEGAPROVIDER-AS 991
18013 CITCO-AS 991
16911 YACAST-AS 991
17803 KHODA 991
....
17278 AS-PETERSTAR 991
6399 Novaxess 991
17277 UNSPECIFIED 991
17868 RTK-Primorye 991
17276 UNSPECIFIED 991
17275 AS-SYNCHROLINE 991
11605 UNSPECIFIED 991
18167 RIPE 991
17274 AS-SUNET2000 991
11599 Swisscom-NA 991
11598 HPPOLAND-AS 991
17273 STARTVAS 991
There are 21421 locations

17219 ASN-CEDECRA 990
17587 UNSPECIFIED 990
17778 KIWWI-HU-AS 990
17777 Actimage 990
17433 EBS-Europe 990
...
17280 LiberCom-AS 990
18207 RIPE 990
18206 RIPE 990
17279 UNIFFM-NET 990
17493 AS-IKSYS 990
There are 19439 locations

Insert time = 234 milliseconds
Delete and redisplay = 47 milliseconds
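
For readers following along, here is a minimal Python sketch of the workflow described above: collapse duplicate locations into weighted clusters, count neighbours, list the top entries, then delete some items and recount. It is not the RML solution from the thread. The thread does not say how a "neighbour" is defined, so the 2-D coordinates, the Euclidean `radius` test, and every function name here are assumptions made for illustration.

```python
from collections import Counter
from math import hypot


def cluster_duplicates(points):
    """Collapse identical coordinates into {point: multiplicity}."""
    return Counter(points)


def neighbour_counts(clusters, radius):
    """Count, for each distinct point, the other items within `radius`.

    Assumption: items sharing the same coordinates count as neighbours
    of one another, and distance is plain Euclidean distance.
    """
    distinct = list(clusters)
    counts = {}
    for p in distinct:
        # Co-located duplicates, excluding the item itself.
        total = clusters[p] - 1
        for q in distinct:
            if q != p and hypot(p[0] - q[0], p[1] - q[1]) <= radius:
                # One distance test covers every duplicate stored at q.
                total += clusters[q]
        counts[p] = total
    return counts


def top_n(counts, n):
    """Return the n points with the most neighbours, ties broken by point."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


def remove_items(clusters, points):
    """Delete one item per listed point, dropping clusters that become empty."""
    for p in points:
        if clusters[p] > 0:
            clusters[p] -= 1
            if clusters[p] == 0:
                del clusters[p]


# Toy data: three items share one location, so only three distinct
# points need pairwise distance tests instead of five.
items = [(0, 0), (0, 0), (0, 0), (1, 0), (5, 5)]
clusters = cluster_duplicates(items)
before = neighbour_counts(clusters, radius=1.5)

remove_items(clusters, [(0, 0)])
after = neighbour_counts(clusters, radius=1.5)
```

The pairwise loop is quadratic in the number of distinct points, so collapsing duplicates first shrinks the work to the number of clusters rather than the number of items. In this toy model, items at the same coordinates always share a neighbour count, which is one possible way to end up with long runs of identical values.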