Finding a cheap calculation for the distance between two points in the United Kingdom
Posted : Friday, 02 September 2011 00:18:33
Recently I was working on a project whereby we had to be able to calculate the distance between two points (e.g. two sets of lat/lon coordinates) within the United Kingdom. The reason for doing so was so that we were able to rank a set of arbitrary search results by distance from a given location. This could potentially be very computationally expensive particularly when considering there could be hundreds or even thousands of results. By the time I was brought into the project a rough algorithm had been found to to do this but was as yet untested. The algorithm in question is known as the “Squared Euclidian” distance. It basically involves performing a Pythagoras equation to determine the square of a hypotenuse – given that in this scenario there is no requirement to calculate the actual distance as it was purely for ranking purposes there was no reason to find the square root of squared hypotenuse. As usual Wikipedia has information on the subject:
http://en.wikipedia.org/wiki/Euclidean_distance
So that’s the background. The equation was used to rank a set of search results which were then paginated and finally rendered onto a web page, at this point the actual distance was calculated in order to show the actual distance to the user. This was when some odd behaviour was observed. The ranked results (e.g. supposedly sorted by distance) were not appearing in the correct order on the page. For example the closest two results were 2.3 miles, these were then followed by one at 2.1 miles which was in turn followed by one 2.4 miles which was in turn followed by one at 2.2miles. Clearly they were not being ranked being distance! I spent a good deal of time investigating this issue and what follows is the process I used to solve the problem.
I began to run some calculations to try and find out the distance between a few sets of points and then compare the actual distance with the score that would have been generated by the Squared Euclidian equation to try and discover a flaw. Excel is great and I really like what it can do (more accurately I really what people far more skilled then me can do with it) but after a few hours staring at pages and pages of seemingly random numbers I decided on a different approach before my brain melted. I have a MCTS certification in Bing maps so it was a fairly straightforward task to build a little visualisation tool using the Bing Maps Ajax API.
I should point here that a huge amount of credit and thanks go to Ricky Brunditt who’s blog was a fantastic source of information and equations.
Firstly I plotted a ring of points all the same distance from a given location:
As you can see there is clearly a distortion whereby the further north a point is the larger it appears on the Mercator projection of the globe, also you can clearly see how the further north a point is the greater the horizontal distance appears between adjacent points. This is nothing groundbreaking but it did two things for me. First it helped me to visualise how the latitude and longitudes vary at different points on the globe. Second it made me think I really should have listened more in geography class – that would certainly have saved me some time in this instance.
So with a slightly clearer understanding I tried a different approach. I plotted a horizontal line of points and a vertical line of points all the same distance from the previous point:
Okay this was going somewhere. As we know that in the above picture all consecutive points along either axis are the same distance apart, the Squared Euclidian value for the first point on the latitudinal axis should be the same as that of the first point on the longitudinal axis. I took six points along each axis from a starting point just off Cornwall:
And then compared the values of the Squared Euclidian values:
Latitude  Longitude  Distance (miles)  Sq. Euclidian Distance 
51.5188  9.50419  100  2.097648419 
52.96713  9.50419  200  8.390593674 
54.41546  9.50419  300  18.87883577 
55.86378  9.50419  400  33.5623747 
57.31211  9.50419  500  52.44121046 
58.76044  9.50419  600  75.51534307 
50.07048  7.24837  100  5.088721445 
