
18 Oct
2005
midnight
"Calum Grant" <calum@visula.org> wrote
The problem with this data is that it contains a lot of duplicates. If I cluster the cities into 5103 clusters, I get 47 ms. On the other hand, if I don't cluster them, I get 4.1 s. The expensive part is building the index on distances. The results are rather odd - the 500 locations I get have 991 neighbours.
This sounds like a correct result -- can I see the first 50 items? Regards, Arkadiy
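
For readers following the thread, the clustering step mentioned in the quoted message can be sketched roughly as below. This is only an illustration, not the code either poster used: it assumes duplicates are exact coordinate matches, and the function name `cluster_duplicates` and the sample `cities` list are hypothetical.

```python
from collections import defaultdict


def cluster_duplicates(points):
    """Group the indices of points that share identical coordinates.

    Assumption: duplicates are exact matches, so the coordinate tuple
    itself can serve as the cluster key.
    """
    clusters = defaultdict(list)
    for i, p in enumerate(points):
        clusters[p].append(i)
    return dict(clusters)


# Hypothetical sample data: (latitude, longitude) pairs with repeats.
cities = [(51.5, -0.1), (48.9, 2.4), (51.5, -0.1), (40.7, -74.0), (48.9, 2.4)]

clusters = cluster_duplicates(cities)

# Build any distance index over the unique points only, then map each
# result back to its original rows through `clusters`.
unique_points = list(clusters)
```

Indexing only `unique_points` keeps the expensive distance index proportional to the number of clusters rather than the number of rows, which matches the kind of speed-up described in the quoted message.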