Google Geospatial Infrastructure for Everyone

There was once a time when the organisations with the scale to serve and process global maps, imagery and car-mounted camera pictures for the world could be counted on one hand. Before that, niche providers of complicated mapping solutions guarded those capabilities behind an increasingly complex and expensive set of services that only government agencies could afford at scale. Things have changed rapidly over the last few years, powered by easy and inexpensive access to cloud-based servers and storage, which puts the creation and maintenance of a global base map within the reach of modestly sized companies. Rich base maps and image layers are now more of a commodity, and plotting large numbers of features on a map can be done with a little training and even fewer resources.

Whilst low-cost virtual machines and storage can reduce a company’s costs when lifting and shifting computing from on-premise to cloud-based services, they still function in the same way, often need the same method of scaling and require the same number of operations people to maintain. So whilst scalability might be within everyone’s reach, the speed of operations can slow the development of applications. In a world where mobile applications and viral games can require solutions to be developed and deployed within weeks, if not days, relying on the spatial systems of the past would mean that most applications had no spatial functionality at all.

Fortunately, cloud-based solutions have developed beyond emulating hardware and storing bits into higher-order services that are easier for developers to access. Because they operate via web services, they reduce the time needed for integration, and because they require little or no operational involvement, they reduce the dependency on manpower that previous systems required.

Using Google services such as Cloud Pub/Sub (scalable real-time messaging) and BigQuery (petabyte-scale storage and analytics), merged with data from the Maps APIs, it’s possible to build a highly scalable telemetry solution that can store, process and analyse data from hundreds of thousands of vehicles or devices without the need for any servers (virtual or otherwise) or even any geospatial software. Whilst these products may look as if they emerged fully formed, functionality such as this has been used internally within Google for many years. Google has often published papers describing this way of working: how data is processed (MapReduce), queried (Dremel) and managed on millions of servers (Borg). This has allowed other implementations to be created based on these architectures, giving more people access to solutions that were once only possible within Google (and even leading to improvements in Google’s own infrastructure). External implementations such as Apache Hadoop (MapReduce), Apache Drill (Dremel) and Kubernetes (Borg) can now be accessed by everyone and managed at a developer level through a web console or API, from both Google and third-party cloud providers. The popularity of the hashtag #GIFEE (Google Infrastructure for Everyone Else) shows that customers large and small can now get access to many of the infrastructure services that Google has used to scale multiple services to billions of users as it has grown.
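As a rough illustration of the telemetry flow described above, the sketch below packages vehicle position readings into the JSON body that the Cloud Pub/Sub REST publish method accepts, where each message's data field is a base64-encoded string. The field names inside each reading (vehicle_id, lat, lng, ts), the helper function names and the sample values are assumptions made for this example, not a schema from any Google product, and no network call is made.

```python
import base64
import json


def build_publish_body(readings):
    """Build a Pub/Sub REST publish request body from telemetry readings.

    Each reading is a dict with hypothetical keys: vehicle_id, lat, lng, ts.
    """
    messages = []
    for reading in readings:
        payload = json.dumps(reading, sort_keys=True).encode("utf-8")
        messages.append({
            # The REST API expects message data as a base64 string.
            "data": base64.b64encode(payload).decode("ascii"),
            # Attributes are optional string key/value pairs.
            "attributes": {"vehicle_id": str(reading["vehicle_id"])},
        })
    return {"messages": messages}


def decode_message(message):
    """Reverse the encoding, as a subscriber might before storing a row."""
    return json.loads(base64.b64decode(message["data"]).decode("utf-8"))


if __name__ == "__main__":
    # Illustrative values only.
    body = build_publish_body([
        {"vehicle_id": "van-001", "lat": 51.5007, "lng": -0.1246,
         "ts": "2016-09-01T08:30:00Z"},
    ])
    print(json.dumps(body, indent=2))
```

In a real deployment this body would be sent in an authenticated POST to the topic's publish endpoint, and a subscriber could decode each message and write it as a row to a BigQuery table for analysis.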

Looking back at the article I wrote last year, and at the companies that have continued to make waves in the utilisation of geospatial functions, you probably wouldn’t have thought that one of the biggest splashes would come from a game whose origins were in the 1990s. Whilst people might focus on the visual elements of the game, its real power is that, at its heart, it is based on millions of people playing in real time on a geospatial backend, something that would have been neither possible nor affordable for most companies even three years ago.

One of the most interesting by-products of this was the rise of helper apps built by people to share information about the game. Applications that can support 500,000 users, backed by thousands of geospatial queries per second, on a set of cloud services that cost around $100 were the stuff of fantasy a few years ago. The availability of these solutions makes it possible not only for a company to build a game that goes viral and scales to millions of users, but also for a cottage industry of developers to ride the wave of that success with applications costing a few hundred dollars. Such applications might not have a long shelf life, but they can capture ‘the moment’, and that kind of development is now both powerful and affordable for any company.

Google Geospatial Infrastructure for Everyone Else

In the same way that Google has released information about how it built a highly scalable infrastructure, it has also released information about how it processes and stores information about the real world. Implementations such as the S2 geometry library have been used by many companies (Foursquare, Yelp, Uber), as well as many parts of Google, to build and scale massive geospatial solutions. Publicly available solutions such as Google Earth Engine have been used to produce high-resolution, cloud-free images of the globe and are available to many academics around the world. Over time we have also seen products like Street View used to identify house numbers to better support geocoding or place searching within Google Search or the Maps APIs. Just as Google infrastructure has become increasingly available to anyone, so has the ability to perform spatial or imagery-related tasks that only Google or governments could have done in the past.

Getting meaning out of data has always been one of the key tasks of any analyst, whether they work with geospatial data or imagery. As the size, complexity and temporality of data have increased, we can now store the data but increasingly need help in deriving its meaning. Once that meaning has been derived, services can be provided to many users; the faster and more relevant the information, the more useful it is to people. Take the identification of house numbers as an example. In the past, this might have needed bespoke systems running on clusters of servers; now it is all available behind a single API. The Cloud Vision API provides easy access to this sort of analysis: from the API you can get many kinds of information about an image, from Optical Character Recognition to Landmark Detection, which has potential uses in a wide variety of applications, and all of it can be tagged with location.
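To make the single-API point a little more concrete, here is a minimal sketch of the JSON body for the Cloud Vision REST images:annotate method, requesting text detection (OCR) and landmark detection for one image, together with a helper that reads the matching fields from a response. The function names and the placeholder image bytes are assumptions for illustration; the sketch only builds and parses JSON and does not call the service.

```python
import base64
import json


def build_annotate_request(image_bytes):
    """Build an images:annotate body asking for OCR and landmark detection."""
    return {
        "requests": [{
            # Inline image content is sent as a base64 string.
            "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
            "features": [
                {"type": "TEXT_DETECTION"},
                {"type": "LANDMARK_DETECTION"},
            ],
        }]
    }


def summarise_response(response):
    """Pull detected text and located landmarks out of an annotate response."""
    responses = response.get("responses") or [{}]
    first = responses[0]
    texts = first.get("textAnnotations", [])
    landmarks = []
    for item in first.get("landmarkAnnotations", []):
        for location in item.get("locations", []):
            lat_lng = location.get("latLng", {})
            landmarks.append((item.get("description"),
                              lat_lng.get("latitude"),
                              lat_lng.get("longitude")))
    return {
        # The first text annotation carries the full detected text.
        "text": texts[0]["description"] if texts else None,
        "landmarks": landmarks,
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for a real photo, e.g. of a house number.
    print(json.dumps(build_annotate_request(b"placeholder"), indent=2))
```

The detected text from a house-number photo could then be stored alongside the position at which the picture was taken, which is one way of tagging the result with location.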

Building applications at the scale of Google is now within reach for many companies using cloud-based solutions, with the complexity of machine learning hidden behind simple APIs. The functionality once used to launch many of Google’s geospatial products can be enabled from web-based consoles and integrated into applications through a single platform. Google’s infrastructure is now available to everyone, and more and more of Google’s geospatial analytics infrastructure can be accessed through a web service by any developer for use in any application.