When web hosts go bad.

There are a number of reasons why this blog has taken a bit of a hiatus over the last few months. Probably the most important is the fact that, for once, the UK seems to actually be having a summer. Couple that with my daughter forcing me to watch all of the Doctor Who episodes since the reboot (honestly, she’s only 4, but she’s quite persuasive) and I spend a lot of my time either outside on the bike or explaining the fighting differences between Daleks, Cybermen and everything in between.

A slow spiral down the web plug

Secondly, there has been the slow demise of an excellent web host, Webhost4Life. Whilst never the fastest host, it was always reliable for everything I needed, especially for the price and features I paid for. Recently, though, they decided to move to a new platform; I think this was due to the company being sold to another, and a change of management went for a cheaper option. Initially it seemed good: the admin site was much improved and things appeared to be going tickety-boo. Unfortunately, as others have also found, the support and stability of the site have gone way down. This blog was up and down more times than the Grand Old Duke of York, and even when it was up it was slower than an England footballer in front of goal. Finally, having had enough after 3 days of downtime, I decided to pull the plug and move this blog over to a new host, Arvixe.

Not only has this host been a lot quicker for me (it obviously isn’t as contended), it also supports a variety of development platforms, from PHP to ASP.NET. Whilst the admin pages aren’t as nice, and I had to install WordPress 3 the manual way (actually, it wouldn’t even upgrade on Webhost4Life, so it was very broken), that took all of one hour, including getting the data loaded from the old blog.

I wonder if it’s like changing banks: hosting is one of the most painful things to change once you have it all set up, but once you’ve done it, you wonder why you never did it before.

Moving On

Finally, I had my final, final, final leaving do this week from the hallowed halls of ESRI(UK) esri(UK) ESRI UK. After 7 weeks of being at Google and one month of garden leave (people kept telling me there was no ‘–ing’), it was about time too. Moving on from a job where I talked to people about GIS technology to a company where I talk to people about GEO technology has been less of a shift than you might think. The complexity of the solutions might be less, and I haven’t yet touched SAP since I left (phew), but it’s still based around understanding how people’s workflows might fit and integrate with the respective technologies and APIs. It’s a lot more consumer-focused though, a bit like web mapping was with ESRI UK 8 years ago, and I like it.

There are so many geo-technologies to learn at Google, from the server-based Google Earth Enterprise to the cloud-based Fusion Tables, as well as the well-known Google Earth and Maps. For my first seven weeks I have been like a kid in a candy store, both metaphorically with the learning of new technology and physically in the micro kitchens (note to self: must do more exercise).

Expect to see many more Google Geo related tips and tricks around here as I work through the stack of technology in my sweetie bag. I won’t give up the esri thing just yet though; I hope to do some integration work between the various systems, and I’m giving a talk on integration next week.

Now with added location and mobile

On a side note, whilst updating the site I implemented two new features. Firstly, the move had broken my previous ‘where are you from’ section, so I thought I’d update it with some Google Maps code. Unlike the previous attempt, I decided to take the easy geolocation route and have the browser (or Gears) do it for me. I’ll discuss how this works in a subsequent post; it’s not tricky and takes about 10 lines of code, all of which you can get from the Google Maps v3 API page here.
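To give a flavour of it in the meantime, here is a minimal sketch of that easy geolocation route. It assumes the Google Maps JavaScript API v3 is loaded on the page and that a <div id="map"></div> exists; the element id and zoom level are my own illustrative choices, not taken from my actual widget.

```javascript
// Pure helper: turn coordinates into the map options we want (easy to test).
function mapOptionsFor(lat, lng) {
  return { center: { lat: lat, lng: lng }, zoom: 12 };
}

// Browser wiring: ask the browser (W3C Geolocation API) where the visitor
// is, then centre a Google Map on that position and drop a marker.
function showWhereYouAreFrom() {
  if (!navigator.geolocation) {
    return; // no geolocation support in this browser (no Gears fallback here)
  }
  navigator.geolocation.getCurrentPosition(function (pos) {
    var opts = mapOptionsFor(pos.coords.latitude, pos.coords.longitude);
    var map = new google.maps.Map(document.getElementById("map"), {
      center: new google.maps.LatLng(opts.center.lat, opts.center.lng),
      zoom: opts.zoom,
      mapTypeId: google.maps.MapTypeId.ROADMAP
    });
    new google.maps.Marker({ position: map.getCenter(), map: map });
  });
}
```

The visitor does get prompted for permission first, so the ‘where are you from’ map only shows people who opt in.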

The second feature was the addition of the WPTouch plug-in, which formats WordPress sites to work nicely on mobile devices like the iPhone and Android handsets. The free version works nicely for me; the pro version has some nice features that one day I might find I need. Give it a go, it looks nice.

The mysteries of SEO.

As you can see, I had this idea of hosting my own blog; in case you wondered, you’re here reading it. Now, I thought that would be a simple thing to do: just put a site onto the internet and eventually the magic that is the Google-bot or the Bing-bot (do we call it that?) would one day swoop down and make me part of the internet (I firmly believe, albeit slightly misguidedly, that if you’re on a public site and not in the index then you’re not actually on the internet). Now, I suppose before I go into my failings as a web developer, I feel that I need to justify myself.

An explanation.

I’m a fairly seasoned web developer (read: old). I understand the intricacies of JavaScript, Flash, Silverlight; you name it, I can probably work out how to code it eventually (come on, it’s not hard, well except WCF, but I’ll post more on that another time). But as you see, most of my time is spent developing sites and solutions for enterprise customers. They don’t need sites that are indexed or have a high PageRank, or that use AdWords or meta-tags. I now realise that some of the stuff I have done would have benefited from some or all of that! Getting your site indexed and high up the page for certain search terms is a whole industry in itself, and there are some helpful sites that can explain the process.

How did I get here?

With my host it was fairly easy to get my blog site up and running; it’s a classic ASP multi-tenant provider with a shared-server offering, with all the benefits (mostly cost) and shortcomings (mostly performance). It has an automatic deployment of WordPress at a certain patch level, 2.7.1, and it’s then simple to use the internal upgrade feature of WP to make sure the software is up to date. The impressive nature of WordPress deployment and its integration of new themes and components should be a model for all sorts of web-based applications, but I digress. Once up and running, I thought that was it: all I needed to do was add some useful posts and my site would become one with the collective, and all I would do would be to use Google Analytics to work out how many people were coming to the site and from where.

Now this is where it became interesting. Google Analytics is an excellent tool for finding out who’s accessing your site and from where; it even shows a map of the locations, around the world or within an individual country, that people are coming from. The question is, unless you have a lot of friends and colleagues who might want to read it (and be interested in what you’re writing), how do you get it out to a wider audience? Google has another set of tools to do this, a method of expediting the crawling of your site by Google; sort of telling the search engine that you’re ready for your close-up.

Where’s your Sitemap?

Now, before you think that registering your site with Google (or Bing for that matter, as it has a basic set of equivalent tools) will open the floodgates of people to be exposed to your pearls of wisdom, you should think again. Getting into the Google index isn’t that hard; getting listed high up for a particular search, or even on the magical first page, needs a lot of people to link to your site and add trackbacks and comments, to actually show Google that your site has value to other people over being just a repository of drivel (I leave you to decide which this is). It is at this point that the science of getting into search turns into the art of Search Engine Optimisation, or SEO. Wikipedia defines SEO as follows:

Search engine optimization (SEO) is the process of improving the volume or quality of traffic to a web site from search engines via “natural” or un-paid (“organic” or “algorithmic”) search results.

SEO is the process of improving your site’s position within the search index so that, for the right keywords, your site ranks in the top two pages at least. It relies on a number of factors, but it basically starts with two files: sitemap.xml and robots.txt. The first tells the search engine which pages to index specifically, and the second tells search engines what not to crawl. The difficulty is that, as time moves on, there are no hard and fast rules which determine what a particular bot finds important. As ‘blackhat’ techniques have been used to play the system and improve page rankings, so the algorithms have changed to sniff out people not playing fair. This means that whilst the sitemap is important for notifying crawlers about what to index, the robots file is equally important for telling them what not to index, so that you don’t get blacklisted by the crawler for trying to play the system.
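To make that pair of files concrete, here is a minimal example of each (the URLs and dates are illustrative only, not from this site):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/a-post-worth-indexing/</loc>
    <lastmod>2010-08-01</lastmod>
    <changefreq>monthly</changefreq>
  </url>
</urlset>
```

And the matching robots.txt, which keeps the bots out of the admin pages and points them at the sitemap:

```text
User-agent: *
Disallow: /wp-admin/
Sitemap: http://www.example.com/sitemap.xml
```

For a WordPress blog you don’t even need to write the sitemap by hand; there are plug-ins that generate and update it for you.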

Thus you have a never-ending merry-go-round of SEO optimisations becoming redundant and new techniques being developed to try and keep people’s sites near the top. This doesn’t even take into consideration your actual location, and how the search engine knows it and will send you to your ‘local’ sites based upon where you’re coming from. This sort of geo-targeting isn’t new, but it is increasingly being used to target people on all sorts of sites, from search to Twitter (see Trendsmap for an excellent example).

Now, in most of my web development life SEO isn’t a term that needs to rank highly in a solution to maintain pipes or manage gazetteer data, but as more and more sites are exposed to the web, and more companies see value in exposing their data to be used by everyone, mechanisms such as SEO and GIS will often need to be used together. This is especially important in sharing spatial metadata in a form that can be indexed easily by search engines and therefore more widely disseminated.

SEO and ArcGIS Server

Now, I was wondering how we can both use SEO to promote and share information from an ArcGIS Server implementation, and also how it might be used to protect services from being indexed when you don’t want them to be. The REST API has been around for a while now, and you can see how Google indexes ArcGIS sites ‘on the web’ by doing a search on “ArcGIS/rest/services”. You could block a service from being indexed by using a standard pattern within your robots exclusion file, such as:

Disallow: /ArcGIS/rest/services/mymapservice

This might be supplemented by more complicated patterns that use wildcards, although it should be understood that your mileage may vary according to the bot doing the crawl, as wildcard matching deviates from the standard.
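For example (the service names below are made up, and the wildcard syntax is a non-standard extension supported by Googlebot and some other major crawlers, so check each bot’s documentation):

```text
# Standard prefix matching - understood by all well-behaved bots
User-agent: *
Disallow: /ArcGIS/rest/services/internal/

# Wildcard extension - Googlebot and some others only: block the
# export operation on every map service, whatever the folder
User-agent: Googlebot
Disallow: /ArcGIS/rest/services/*/MapServer/export
```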

Of course, it is important to understand that only ‘good’ crawlers obey the robots.txt file; ‘bad’ robots will crawl anything. If you put data onto the internet, you have to assume it’s going to be used. It’s therefore important that, if applications and data need to be secured from unauthorised usage, you use the appropriate security measures for your application; more details about this can be obtained from the ESRI documentation here [Working with secure ArcGIS services].

Services are only ever one part of any application; it’s also important to make your user interface as SEO-friendly as possible, even with a mapping interface. This post (from SEOmoz.org) gives a good overview of how you can provide spatial information in a format that allows indexing. A lot of it is similar to providing accessible information: a bot often ‘sees’ a web page like a screen reader does, ignoring the image-based map information and concentrating on the links, the URLs and the text of the application. Creating an accessible version of a site often creates an SEO- and indexing-friendly version of it too.
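One simple way to sketch that idea: give the scripted map a plain-text fallback that both screen readers and bots can consume (the ids, headings and links here are made-up examples, not from any real site):

```html
<!-- The map script injects its tiles into this div; a crawler mostly
     ignores that, so the noscript fallback carries the indexable content -->
<div id="map">
  <noscript>
    <h2>Locations shown on this map</h2>
    <ul>
      <li><a href="/locations/london">London office</a></li>
      <li><a href="/locations/edinburgh">Edinburgh office</a></li>
    </ul>
  </noscript>
</div>
```

The crawler gets real links and text to follow, and users without JavaScript get something useful instead of an empty box.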

As GIS and spatial systems increasingly find themselves used for both commercial and public services, getting them indexed is only going to become more important; how that is done is still very much an art.