A rule of thumb.

There has been a long-standing rule of thumb for deciding how many instances to give a map service for optimal performance. Finding this information has sometimes been hard, though, and when asked for it the other day, and failing to find it, I decided to see if it was on the new resource centre. Fortunately there is a page on services performance:

http://resources.esri.com/enterprisegis/index.cfm?fa=performance.app.services

Here it not only gives the ‘rule of thumb’ for the number of instances for a map service (2.5 * #CPUs) but also a whole series of information about the relative performance of each service type and the factors that will specifically affect the performance of any map service.
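
As a quick back-of-the-envelope illustration of that rule of thumb, a minimal sketch might look like the snippet below. The 2.5 multiplier is the figure quoted above; the CPU count would be that of your server object container machine rather than wherever you happen to run the script.

```python
# Rough sketch of the 'rule of thumb' above: maximum instances ~ 2.5 * number of CPUs.
# The 2.5 multiplier comes from the guidance quoted in the post; treat it as a starting
# point to be refined by your own testing, not a hard rule.
import math
import multiprocessing


def recommended_max_instances(cpu_count, multiplier=2.5):
    """Return a starting point for the 'maximum instances' setting of a map service."""
    return math.ceil(cpu_count * multiplier)


if __name__ == "__main__":
    cpus = multiprocessing.cpu_count()  # use the CPU count of the SOC machine in practice
    print("CPUs: %d, suggested max instances: %d" % (cpus, recommended_max_instances(cpus)))
```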

With 9.3.1 it becomes a bit easier to determine automatically why a service might be slow, either through the new .MSD service type and the Map Service Publishing toolbar or through the old-school mxdperfstat script.

The Perils of Synthetic Tools

Of course, any synthetic tool will only give you a level of guidance; real proof has to come from actually performance testing a solution during development, preferably as early as possible. Such tests and examples are given in two recent ESRI whitepapers: High-Capacity Map Services: A Use Case with CORINE Land-Cover Data and Best Practices for Creating an ArcGIS Server Web Mapping Application for Municipal/Local Government.

Both documents cover the optimum use of data and its effect on how an application performs. The former does so in terms of a high-scalability site, but with information that can be applied to all sites, especially the recommendations about using file geodatabases for large performance gains. The latter is important because it shows how a workflow can be mapped to implementation choices in an ArcGIS Server architecture, with map and geoprocessing services, for a medium-sized authority.

A Good Guide

Guidance like that available in these two documents, and on the Enterprise Resource Centre in general, whilst not indicative of how every site will perform, gives a good grounding in the pitfalls to avoid when translating user requirements into any specific solution architecture. With any performance and architecture work, though, it’s important to think not only of performance now but also of the performance implications of the site growing over time. Without any analysis of the capacity requirements of your site, you really don’t know how long your current performance will hold. It should be remembered, though, as Ted Dziuba says so eloquently on his site, ‘unless you know what you need to scale to, you can’t even begin to talk about scalability.’

Understanding your current performance requirements, your short- to medium-term load requirements, and potential spike points means you can concentrate on the right parts of your application in terms of performance and stop worrying about those areas that might never become a problem. The book ‘The Art of Capacity Planning’ gives a good overview of how to tackle monitoring your site’s performance over time, what to worry about, and when.
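
By way of example, a very rough capacity estimate might start from expected concurrent users and think time and work back to the throughput the site has to sustain. The sketch below is only illustrative; the user numbers, think time and per-instance throughput are made-up placeholders you would replace with figures measured from your own site.

```python
# Back-of-the-envelope capacity estimate: concurrent users and think time -> requests per
# second at peak, then compare against the measured throughput of a single service instance.
# All figures here are illustrative placeholders, not benchmarks.
import math


def required_throughput(concurrent_users, think_time_s):
    """Requests per second the site must sustain at peak."""
    return concurrent_users / think_time_s


def instances_needed(required_rps, rps_per_instance):
    """How many service instances are needed, given the measured throughput of one instance."""
    return math.ceil(required_rps / rps_per_instance)


if __name__ == "__main__":
    peak_rps = required_throughput(concurrent_users=200, think_time_s=10.0)  # 20 req/s at peak
    print("Peak load: %.1f req/s" % peak_rps)
    print("Instances needed: %d" % instances_needed(peak_rps, rps_per_instance=4.0))
    # Re-run with projected growth figures to see when the current hardware runs out of headroom.
```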

97 Things

I’ve been reading 97 Things Every Software Architect Should Know on and off for the last few months. Not only does the book come with its own website where you can comment on the pearls of wisdom within its pages, it also helped me understand that the problems I’ve faced in architecture over the last year are very common, and that architecture, whilst not exactly an art, will never be a science either. In terms of content there are three main items that stand out for me within the book.

1) Don’t put your resume ahead of the requirements.

There is sometimes a relentless pressure towards the new in technology. The lure of ‘shiny shiny’ is sometimes too much for both developers and architects, and decisions about what to use are often made based upon what is cool rather than what is needed.

Often new technologies are there to plug known problems or gaps in existing technologies (ArcGIS caching is a good example of a technology introduced for scalability reasons). It is often the case, though, that there is no good reason to choose a new method over an old one beyond the urge to try something new.

When evaluating new technologies, especially cutting-edge ones, you should always check whether they really are the correct solution for your project; battling with new, poorly understood and poorly documented technologies might cause more problems than it solves in the long run.

If, though, a project is going to run for a number of years, or has a lifespan that doesn’t include a technical refresh within five years or so, then it is always good to evaluate the new, as newer technologies are more likely to still be supported into the future and most of their issues can be assumed to have been quashed in the meantime.

2) It’s never too early to think about performance.

I love this one. When developing software, even in a continuous integration environment, it’s all about functional requirements: does it do this, does it break when I do that. Often, though, when all is said and done, user acceptance comes down to how the application feels. How long does it take to load? Does the UI look right?

These non-functional requirements are often too poorly defined to be tested against. The result: major overruns in projects when the delivered application fails performance or scalability testing during user acceptance. The solution: test performance as near to the start of a project as possible, to determine whether the new geoprocessing task you have added to the application is quick enough under load or brings the whole server farm to a grinding halt.

There are a number of ways of doing this, but just as there is test-driven development to catch bugs, there can be performance-driven development to catch performance or architectural issues that might cause problems down the line when the system is delivered. This becomes increasingly important in a system that might be integrated into an enterprise workflow or service bus, where a performance issue in one place might degrade other services.
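
As a concrete, if simplified, illustration of what a performance-driven check might look like, the sketch below fires a handful of concurrent requests at a service endpoint and fails if the slowest one breaches a response-time budget. The URL, concurrency level and budget are placeholders for whatever your own non-functional requirements specify, and a real regime would use a proper load-testing tool rather than a few threads.

```python
# A minimal 'performance test as part of the build' sketch: hit an endpoint concurrently
# and fail if the slowest response breaches the agreed budget.
# SERVICE_URL, CONCURRENCY and BUDGET_S are placeholders - substitute your own service and targets.
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

SERVICE_URL = "http://server/arcgis/rest/services/MyMap/MapServer?f=json"  # hypothetical endpoint
CONCURRENCY = 10
BUDGET_S = 2.0  # example non-functional requirement: every request back inside 2 seconds


def timed_request(url):
    """Issue one request and return how long it took in seconds."""
    start = time.time()
    with urllib.request.urlopen(url, timeout=30) as response:
        response.read()
    return time.time() - start


def test_response_time_under_load():
    with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
        timings = list(pool.map(timed_request, [SERVICE_URL] * CONCURRENCY))
    worst = max(timings)
    assert worst <= BUDGET_S, "Slowest response %.2fs exceeds %.2fs budget" % (worst, BUDGET_S)


if __name__ == "__main__":
    test_response_time_under_load()
    print("All responses within budget")
```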

I can never stress enough the importance of performance testing in any project; I can also never stress enough how easily performance testing can become a fixation. Therein lies a future post, I’m afraid!

3) Chances are your biggest problem isn’t technical.

As we know, there are many ways to skin a cat (sorry to the cat lovers out there). There are also many ways to deliver technical solutions; there are so many technologies and architectures out there for solving all sorts of problems that there is usually no excuse for not overcoming a technical hitch. What you can’t do is solve all of the non-technical problems as easily.

I mentioned non-functional requirements in the last section. These are usually the key points that can make or break a solution, and they are thought up by real people! Sometimes they might be seen as impossible; more likely they are seen as not defined fully enough to deliver to. In the main, all sides of any solution want it to succeed, at the minimum cost in both money and time. If this understanding is the start of every conversation, then the non-technical or people problems ‘should’ be easier to solve.

Remember, if you see a requirement that says the system ‘should have pretty maps’, run for the hills.

The book has many more great points, many of which I’ve seen happen on projects or will now be especially aware of!

In defence of the Web ADF.

If, like me, you have been developing internet mapping and GIS applications for a few years (twelve-odd in my case), you can see how easy it can be to develop applications that do exactly what you want. You can design, architect and deploy all sorts of solutions to fit into your business or integrate with your enterprise systems.

It should be understood, though, that in the world of GIS (can we still call it that in this neo world?) most people aren’t developers or architects; they are analysts or planners or engineers who might want to share the information and maps they are creating in their desktop application or capturing in the field, and they want to do this with the minimum of fuss, with a set of tools that do a straightforward job.

They also want to do it now, not in six months’ time at the end of a project, and they might want to do it again tomorrow without the need for a change request to be placed in the system.

UX is an important thing, but it isn’t the only thing.

In the world of Flash, Silverlight and JavaScript libraries it is easy to forget that many people just want to get a job done. In the world of techno-geeks and geo-nerds (I’m a mixture of both, if you want to know) it is easy to overrate the shininess of a technology over the actual usefulness of a solution in getting a person’s job done.

Often development and implementation is a trade-off between available resources, time and knowledge. Given infinite time and resources it is of course possible to build anything (OK, almost anything, don’t get picky), but we often don’t have the luxury of either. A product and client like the Web Mapping Application, available in ArcGIS Server 9.3.1, is currently the only way for non-developers to get web-based applications out for use within an organisation. It is also currently the only option with a task framework that can easily consume geoprocessing tasks and edit data in the browser, using just standard out-of-the-box tasks. This is still as powerful a tool as it was when it was first released.

Although web development technologies have moved on since the Web ADF was first developed, its power and integration possibilities with ArcGIS are still unmatched. Whilst it might be possible to develop and architect a new solution platform using the REST API, the Web ADF currently does much of the heavy lifting for you. The framework it provides allows for the development of true Web GIS applications, the sort of applications that many professional GIS users actually want. In this new era of lightweight solutions, a Web GIS platform to succeed the Web ADF might be just around the corner.

With great power comes great responsibility

As with anything that can be tightly integrated with the server, the Web ADF has big implications for performance, especially given that you can tie the development to non-pooled services. This does throw up some architectural challenges when moving such applications from development to production, but certain design decisions can help with performance, and a testing regime throughout the project will allow any problems to be caught before they become an issue.

A starter for ten.

It seems in this increasingly Twitter-fuelled world that anyone starting a blog must be certified. Surely the world can’t read more than 140 characters anymore, so why bother writing them? Well, always one to buck the trend, I thought it might be a good time to start a blog.

Hmm, a blog, what should it be on? ArcGIS? Nope, plenty of those. Programming? Nope, loads of those too. GeoNerdRage? Nope, I can name a few of those as well 🙂

So to start this blog I decided on the theme of performance and scalability: the architecture behind them, and the technologies that can help people design and develop solutions for performance and analyse the problems when things go wrong.

The need for performance

To me this is an increasingly important topic, as a lot of spatial systems have moved beyond simple mapping sites to solutions that integrate with key business applications within an organisation. Microsoft, SAP and Oracle all provide the big systems that power enterprises, and GI is often stored within them or integrated alongside them to provide information that organisations use to make business decisions. In the past it has often been a challenge just to get the systems to work together at all, but with the move towards SOA and, more recently, REST-based services, it is increasingly easy to call systems and tie services together to improve the decision-making process. For this process to be seamless, the system must also be quick, perform under load, and continue to perform as it grows to greater capacity.

The quandary is that as systems become more complex and solutions integrate together, the design choices become increasingly important. The approaches we could get away with in the past with simple mapping applications become more challenging when integrating SOAP services with SAP, or scaling an editing application up to hundreds of concurrent web users. It can be done, and that is where a consistent approach comes in: working performance into any design and monitoring the performance of the system at a variety of levels throughout development and into deployment.
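
One lightweight way of getting that sort of multi-level visibility is simply to instrument the calls you care about and log the timings, so the same numbers can be compared during development and once the system is deployed. The sketch below is only one possible approach; the tier names and the stand-in functions are illustrative, not part of any real system.

```python
# A small timing decorator for instrumenting calls at different levels of a solution
# (e.g. the whole request, the map export, a downstream SOAP/REST call) and logging
# the results so they can be trended over time. Tier names and functions are illustrative.
import functools
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
log = logging.getLogger("perf")


def timed(tier):
    """Decorator that logs how long the wrapped call took, tagged with a tier name."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.time()
            try:
                return func(*args, **kwargs)
            finally:
                log.info("%s %s took %.3fs", tier, func.__name__, time.time() - start)
        return wrapper
    return decorator


@timed("map-export")
def export_map():
    time.sleep(0.1)  # stand-in for the real map service call


@timed("request")
def handle_request():
    export_map()  # in a real system this would also call the SAP/SOAP integration, etc.


if __name__ == "__main__":
    handle_request()
```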

How you do that and how you monitor it I’ll leave to future posts but much of the process can be found at a high level in the following documents:

Performance-Driven Software Development – Carey Schwaber (Forrester report can be found elsewhere on the internet for free if you register)

And the excellent

Performance Testing Guidance for Web Applications – Microsoft Patterns and Practices

Enjoy the ride…..