<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>mobile geo social &#187; google app engine</title>
	<atom:link href="http://hitching.net/tag/google-app-engine/feed/" rel="self" type="application/rss+xml" />
	<link>http://hitching.net</link>
	<description>a blog by bob hitching</description>
	<lastBuildDate>Fri, 10 Feb 2012 09:33:12 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Scalable, fast, accurate geo apps using Google App Engine + geohash + faultline correction</title>
		<link>http://hitching.net/2009/11/10/scalable-fast-accurate-geo-apps-using-google-app-engine-geohash-faultline-correction/</link>
		<comments>http://hitching.net/2009/11/10/scalable-fast-accurate-geo-apps-using-google-app-engine-geohash-faultline-correction/#comments</comments>
		<pubDate>Tue, 10 Nov 2009 01:00:00 +0000</pubDate>
		<dc:creator>bob hitching</dc:creator>
				<category><![CDATA[mobile geo social]]></category>
		<category><![CDATA[geo]]></category>
		<category><![CDATA[geohash]]></category>
		<category><![CDATA[geolocation]]></category>
		<category><![CDATA[geomeme]]></category>
		<category><![CDATA[google app engine]]></category>
		<category><![CDATA[linkedin]]></category>

		<guid isPermaLink="false">http://hitching.net/?p=15953</guid>
		<description><![CDATA[GeoMeme is a web app (and also a mobile web app for iPhone and Android) that I recently developed as a pet project. It measures real-time local Twitter trends. Visitors to GeoMeme choose a location on the map, and two &#8230; <a href="http://hitching.net/2009/11/10/scalable-fast-accurate-geo-apps-using-google-app-engine-geohash-faultline-correction/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.geome.me/">GeoMeme</a> is a web app (and also a mobile web app for iPhone and Android) that I recently developed as a pet project. It measures real-time local Twitter trends.</p>
<p>Visitors to GeoMeme choose a location on the map, and two search terms to compare. GeoMeme then counts and compares the matching tweets within the bounds of the map, based on public data from a number of mobile Twitter apps.</p>
<p>As an example, GeoMeme can work out that <a href="http://www.geome.me/VrIXq"> <img src='http://hitching.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  beats <img src='http://hitching.net/wp-includes/images/smilies/icon_sad.gif' alt=':(' class='wp-smiley' />  in San Francisco</a>:</p>
<p><a href="http://www.geome.me/VrIXq"><img src="http://hitching.net/wp-content/uploads/2009/11/w_screenshot-291x300.png" /></a></p>
<p>GeoMeme generates a large amount of geo-data, which creates a need shared by many geo apps: scalable, fast, and accurate spatial queries that select a subset of geo-data for display as markers on a <a href="http://www.geome.me/">map</a>, or on <a href="http://earth.geome.me/">Google Earth</a>.</p>
<h3><img style="width: 21px; height: 34px; float: left; margin-left: 0pt; margin-right: 1em;" src="http://chart.apis.google.com/chart?chst=d_map_pin_letter&#038;chld=:)|FF0000|000000" alt=":)" />Google App Engine</h3>
<p><a href="http://appengine.google.com/">Google App Engine</a> is an obvious choice for hosting your geo app. The App Engine datastore is built on top of Google&#8217;s BigTable technology, which scales very well and is optimized for fast data retrieval. And it doesn&#8217;t cost the earth like some traditional GIS database solutions.</p>
<h3><img style="width: 21px; height: 34px; float: left; margin-left: 0pt; margin-right: 1em;" src="http://chart.apis.google.com/chart?chst=d_map_pin_letter&#038;chld=:(|FF0000|000000" alt=":(" /> Inequality constraint</h3>
<p>If you are coming from a background of relational databases, you might think the solution here would be to store the latitude and longitude of all your markers in a database table, and do a simple query to retrieve only those contained within the bounds of the map.</p>
<p>However, the flipside of being optimized for fast data retrieval is that the App Engine datastore only allows inequality filters on a single property per query, to avoid the burden of full table scans. For example, the following form of spatial query is not supported because it specifies inequality filters on both the latitude and the longitude properties:</p>
<p><span id="more-15953"></span></p>
<pre class="brush: python; title: ;">
SELECT latitude, longitude, title FROM myMarkers
WHERE latitude &gt;= :south AND latitude &lt;= :north
  AND longitude &gt;= :west AND longitude &lt;= :east

&gt;&gt;&gt; BadFilterError: invalid filter: Only one property per query may have inequality filters
</pre>
<h3><img style="width: 21px; height: 34px; float: left; margin-left: 0pt; margin-right: 1em;" src="http://chart.apis.google.com/chart?chst=d_map_pin_letter&#038;chld=:)|FF0000|000000" alt=":)" /> Geohash to the rescue</h3>
<p>Fortunately, there is an answer to this: <strong>geohash</strong>, the geocoding system invented by Gustavo Niemeyer.</p>
<p>My suspicion is that Niemeyer has travelled back in time after working out how to collapse two-dimensional space into a single dimension, but you might want to read the Wikipedia explanation of <a href="http://en.wikipedia.org/wiki/Geohash">the algorithm</a> instead.</p>
<p>An example: the location of the Sydney Opera House can be specified in two dimensions as {latitude: -33.858, longitude: 151.215}, or in a single dimension as {geohash: <a href="http://geohash.org/r3gx2ur29zzg7">r3gx2ur29zzg7</a>}.</p>
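<p>To make the idea of collapsing two dimensions into one more concrete, here is a minimal sketch of a geohash encoder in Python 3. It is not the reference implementation or any particular library, and the names <code>quantize</code> and <code>geohash_encode</code> are made up for this illustration. It assumes the standard scheme described on Wikipedia: longitude and latitude are each reduced to a string of bits, the bits are interleaved starting with longitude, and every five bits become one base-32 character.</p>
<pre class="brush: python; title: ;">
from itertools import zip_longest

# Geohash base-32 alphabet (digits plus letters, without a, i, l and o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def quantize(value, low, high, bits):
    """Return the index of the cell holding value when [low, high] is cut into 2**bits cells."""
    cells = 2 ** bits
    index = int((value - low) / (high - low) * cells)
    return min(index, cells - 1)  # the top edge belongs to the last cell


def geohash_encode(latitude, longitude, length=12):
    """Encode a point as a geohash string of the given length (illustrative only)."""
    total_bits = 5 * length
    lon_bits = (total_bits + 1) // 2  # longitude gets the extra bit when the total is odd
    lat_bits = total_bits // 2
    lon = format(quantize(longitude, -180.0, 180.0, lon_bits), "b").zfill(lon_bits)
    lat = format(quantize(latitude, -90.0, 90.0, lat_bits), "b").zfill(lat_bits)
    # Interleave the two bit strings: longitude bit first, then latitude.
    mixed = "".join(a + b for a, b in zip_longest(lon, lat, fillvalue=""))
    # Every group of five bits selects one character of the alphabet.
    groups = (mixed[i:i + 5] for i in range(0, total_bits, 5))
    return "".join(BASE32[int(group, 2)] for group in groups)


# The Sydney Opera House example from above, cut to four characters.
print(geohash_encode(-33.858, 151.215, 4))  # r3gx
</pre>
<p>The longer the hash, the smaller the cell it describes. Nearby points usually share a prefix, which is what makes a range query on the hash string useful.</p>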
<p>Schuyler Erle has written an open source geohash <a href="http://mappinghacks.com/code/geohash.py.txt">Python module</a>, which enables the following form of spatial query on Google App Engine, because the inequality filter is specified only on a single dimension:</p>
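<pre class="brush: python; title: ;">
from geohash import Geohash

# Python: hash the south-west and north-east corners, passed as (longitude, latitude)
sw_geohash = Geohash((west, south))
ne_geohash = Geohash((east, north))

# GQL: a single inequality range, on the geohash property only
SELECT latitude, longitude, title FROM myMarkers
WHERE geohash &gt;= :sw_geohash AND geohash &lt;= :ne_geohash
</pre>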
<h3><img style="width: 21px; height: 34px; float: left; margin-left: 0pt; margin-right: 1em;" src="http://chart.apis.google.com/chart?chst=d_map_pin_letter&#038;chld=;(|FF0000|000000" alt=";(" />Don&#8217;t hash me, coz I&#8217;m close to the edge</h3>
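<p>However&#8230;! We&#8217;re not done yet. An artifact of the geohash algorithm is that queries like the one above can often return rogue markers that lie outside the required bounds, when the map spans what we shall call a geohash &#8220;faultline&#8221;. A geohash range covers everything that sorts between the two corner hashes, and across a faultline that includes cells well outside the rectangle drawn on the map.</p>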
<p>This problem is particularly evident near the equator and the Greenwich Meridian, which are the biggest faultlines, but there are actually faultlines all over the place at every zoom level.</p>
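<p>A small simulation can show a faultline in action. The sketch below assumes that an index scan behaves like a range lookup in a sorted list, and uses the bisect module from the Python standard library to stand in for the datastore. The marker titles and the four-character geohashes are illustrative values worked out for this example: a tiny map around latitude 0, longitude 0 has corners that hash to 7zz6 and s00t, while Sydney hashes to r3gx.</p>
<pre class="brush: python; title: ;">
from bisect import bisect_left, bisect_right

# Hypothetical index of (geohash, title) pairs, kept sorted like a datastore index.
index = sorted([
    ("s006", "inside the map, near 0.5, 0.5"),
    ("r3gx", "Sydney, nowhere near the map"),
    ("u4pr", "northern Denmark, north of the map"),
])
hashes = [geohash for geohash, title in index]

# Corners of a map from (-1, -1) to (1, 1), straddling the equator and the meridian.
sw_geohash = "7zz6"
ne_geohash = "s00t"

# One range scan: every entry whose geohash sorts between the two corners.
start = bisect_left(hashes, sw_geohash)
stop = bisect_right(hashes, ne_geohash)
for geohash, title in index[start:stop]:
    print(geohash, title)
</pre>
<p>The scan returns the Sydney marker as well as the one inside the map, because r sorts between 7 and s. The Denmark marker is correctly left out.</p>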
<h3><img style="width: 21px; height: 34px; float: left; margin-left: 0pt; margin-right: 1em;" src="http://chart.apis.google.com/chart?chst=d_map_pin_letter&#038;chld=:D|FF0000|000000" alt=":D" />Faultline correction</h3>
<p>GeoMeme solves this problem using &#8220;faultline correction&#8221;, an approach that I would like to share here:</p>
<ul>
<li>If a spatial query is vulnerable to faultlines, it is split into multiple sub-queries that do not cross the faultline.</li>
<li>Sub-query limits are approximately weighted according to their relative size.</li>
<li>Sub-queries are executed in parallel, taking advantage of BigTable&#8217;s distributed goodness, and the results combined, so all this happens very fast.</li>
<li>Even though the sub-queries are executed in parallel without any significant impact on user experience, App Engine CPU costs can increase to around 2x the CPU cost of a single less accurate query. Sub-query results are memcached to reduce this CPU overhead.</li>
</ul>
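<p>The first three steps can be sketched roughly as follows. This is not the ffGeoSearch code. It is a simplified illustration that only splits at the two biggest faultlines (the equator and the Greenwich Meridian), it assumes the map does not cross the 180 degree meridian, and the names <code>cut</code>, <code>split_bounds</code>, <code>area</code>, <code>search</code> and <code>run_query</code> are hypothetical. Here <code>run_query</code> stands for whatever function executes one geohash range query for a sub-box and returns a list of markers.</p>
<pre class="brush: python; title: ;">
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def cut(low, high, faultline=0.0):
    """Split the interval [low, high] at the faultline if it lies inside it."""
    clamped = max(low, min(faultline, high))
    edges = sorted({low, clamped, high})
    return list(zip(edges, edges[1:]))


def split_bounds(south, west, north, east):
    """Return sub-boxes (south, west, north, east) that do not cross 0 degrees."""
    return [(s, w, n, e)
            for (s, n), (w, e) in product(cut(south, north), cut(west, east))]


def area(box):
    """Size of a box in square degrees, used only for weighting."""
    south, west, north, east = box
    return (north - south) * (east - west)


def search(bounds, limit, run_query):
    """Run one sub-query per sub-box in parallel and combine the results."""
    boxes = split_bounds(*bounds)
    total = sum(area(box) for box in boxes)
    # Weight each sub-query limit by the relative size of its sub-box.
    limits = [max(1, round(limit * area(box) / total)) for box in boxes]
    with ThreadPoolExecutor() as pool:
        pages = list(pool.map(run_query, boxes, limits))
    return [marker for page in pages for marker in page]
</pre>
<p>For a map from (-1, -1) to (1, 1) this produces four sub-boxes, each with a quarter of the overall limit. A map that touches neither faultline is passed through as a single query.</p>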
<p>Here&#8217;s a demo showing the effect of geohash faultlines, and the relative accuracy of spatial queries with or without faultline correction. Rogue markers are shown in the area surrounding the map. Switch between Correction: off / on / double and compare the accuracy.</p>
<p><iframe src="http://geohash-fcdemo.appspot.com/" width="600" height="450" frameborder="0"><a href="http://geohash-fcdemo.appspot.com/"><img src="http://geohash-fcdemo.googlecode.com/svn/trunk/screenshot.png" /></a></iframe></p>
<p>Note: &#8220;double&#8221; correction is an advanced option which splits the sub-queries into sub-sub-queries so that they do not cross any faultlines either. This can further increase accuracy, but with further CPU cost (up to 8x the CPU cost of a single less accurate query).</p>
<p>All the source code can be found <a href="http://code.google.com/p/geohash-fcdemo/">here</a>, including ffGeoSearch, a Python module that handles faultline-friendly geo search, if you want to use this technique in your own geo app.</p>
<p>Enjoy!</p>
<p><em>This post is one of a <a href="http://googlegeodevelopers.blogspot.com/2009/11/geolocation-mobile-web-apps-geo.html">series</a> that aims to share some of the technology innovation that can be found in <a href="http://www.geome.me/">GeoMeme</a>. Other posts cover topics such as <a href="http://hitching.net/2009/11/10/fast-map-re-location-using-google-static-maps-v2-geocoder/">fast map re-location</a> using Google Static Maps v2, and <a href="http://hitching.net/2009/11/10/location-aware-mobile-web-apps-using-google-maps-v3-geolocation/">location-aware mobile web apps</a> using Google Maps v3.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://hitching.net/2009/11/10/scalable-fast-accurate-geo-apps-using-google-app-engine-geohash-faultline-correction/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
	</channel>
</rss>

