
"Jose" <jmalv04@gmail.com> wrote
Given the problems with the datasets, I would change the initial query to one that clusters the cities as you do and shows only the city + (lat, long) and the number of neighbours in the query results. Some cities will show with the right string and others will not, given that the data doesn't map all coordinates to the same city name.
The RTL result is also 991 neighbours. The problem is that the Amsterdam area has the 991 neighbours (most likely all AS with identical coordinates), so it is better to group the results by city, i.e.:
Amsterdam (lata, lonb) 991
city B (latc, lond) xyz
city C (late, lonf) abc
With these results we can compare both queries, and although the city name strings may differ, the numerical values should not.
Well, this may make more sense (depending on what you want to find out). However, this is a much simpler query -- just sort on the city, group by it, and then re-sort on the counter -- I am sure both libraries will be efficient enough. Also, I don't think one can specify (lat, lon) for a city. I would finish with the original example before we proceed to anything else.

Regards,
Arkadiy
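
For illustration only, the simpler query described above (sort on the city, group, then re-sort on the counter) might look like the following Python sketch. It uses neither of the two libraries under discussion and only the standard library; the function name count_by_city and the sample rows are made up for the example, not taken from the real dataset.

```python
from itertools import groupby


def count_by_city(rows):
    """Return (city, count) pairs, largest count first.

    rows is an iterable of (city, lat, lon) tuples, one per neighbour.
    """
    # Step 1: sort on the city so equal names become adjacent.
    by_city = sorted(rows, key=lambda row: row[0])
    # Step 2: group adjacent rows by city and count each group.
    counts = [
        (city, sum(1 for _ in group))
        for city, group in groupby(by_city, key=lambda row: row[0])
    ]
    # Step 3: re-sort on the counter (descending), city name as tie-breaker.
    counts.sort(key=lambda pair: (-pair[1], pair[0]))
    return counts


# Hypothetical sample rows, not real data.
sample_rows = [
    ("Amsterdam", 52.37, 4.89),
    ("London", 51.51, -0.13),
    ("Amsterdam", 52.37, 4.89),
    ("Paris", 48.86, 2.35),
    ("Amsterdam", 52.37, 4.89),
    ("London", 51.51, -0.13),
]

for city, count in count_by_city(sample_rows):
    print(city, count)
```

Grouping on the city name alone sidesteps the question of which (lat, lon) to attach to a city; grouping on the coordinates instead would only require changing the key functions.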