Andy Hedges' Blog

Why you don't need an Enterprise Service Bus (ESB)

ESBs irk me. Not the technology in and of itself, which can be useful, but the way they are used, mostly because every architect and his mentor seems to think you can’t have an architecture without one. I swear sometimes they must have the ESB icon pre-painted on their whiteboards because, splat, there it is, a hulking great rectangle in the middle of every systems diagram. They aren’t always called ESB; sometimes they sneak through as ‘Orchestration’ or ‘Integration Hub’ or a vendor product name.

I first came across the term ESB about 10 years ago: a colleague mentioned them to me and we discussed the concept over lunch. By the end of the lunch break we’d come to the conclusion that they weren’t necessary, or indeed desirable, in an SOA. I’ll put forward some of the classic reasons architects give for using an ESB and explain a better way to achieve each desired outcome or, in some cases, why that outcome isn’t desirable.

Before I go on I should explain the difference between the central ESB (bad) and using ESB technology in a more appropriate manner.

The big central ESB
Figure 1. An ESB used in an inappropriate way

Figure 1 shows an ESB through which all services communicate; they are essentially unaware of each other and may be in the dark over what each other’s interfaces are. In the worst case the central ESB grows a team to support it. This is, after all, the obvious thing to do: every service in the organisation wants changes all the time to what it uses from, and exposes to, the others, so to prevent everyone hacking away at the ESB and general chaos, the ESB team is created. All requests for change to the ESB go to the ESB team.

The ESB team are now furiously busy and considered heroes for helping everyone communicate, and they are the first port of call for each service team to get what they need from the rest of the organisation. Soon they are getting requests for new functionality. However, they don’t know how to change the services, and perhaps it’s a little tricky to work out which service should handle that functionality, so they slip a little bit of business logic into the ESB and call it orchestration; after all, orchestration is what ESBs are for. This continues for a number of years until it dawns on everyone that the ESB contains most, or at least significant portions, of the business logic and the services have become no more than CRUD layers over databases.

ESB technology used in a sensible way
Figure 2. ESB technology used in a sensible way

Figure 2 shows ESB technology used in a more appropriate way. Of course it is no longer a true ESB, because it isn’t “Enterprise”: it isn’t one service bus for the entire enterprise. Each service uses ESB technology within itself to keep its interface stable, and can use it to maintain multiple versions of that interface simultaneously, but all of that stays within the service. The service team now has complete autonomy over what happens with their service. In many cases they don’t need expensive software to do this; they can simply code it in the programming language of their choice or use a simple library to achieve things like mediation, routing, location, security etc. Once each service has this capability the requirement for a central ESB team falls away entirely. As the organisation has been sensible and assigned a team to each service, those teams speak to each other directly to get new functionality. The act of the teams talking directly to each other, face to face, person to person, about their requirements also improves overall understanding of the organisation’s software assets.

Below I address some of the common reasons people give for using central ESBs and suggest more appropriate patterns for achieving the desired outcome.

The ESB Protects Me From Change

The argument here is that if you want Service A to call Service B you’d better go through the ESB in case Service B’s interface changes.

What should happen is that Service B publishes an interface and guarantees it won’t change for a period of time (say 12 months). If changes are required, another version of just the interface is created and the old version is mediated on to that new version.

That’s quite a bit to take on so I’ll give a simple example. A company builds a service with an operation that exposes the price of various commodities, it returns a MoneyValue response:

MoneyValue getPriceOfGold()

This works perfectly until a functional requirement gets added to return the price of silver too. The architects, being smart, realise this is probably not the last metal they will be asked to add, and so they create a method that’s more flexible and looks like

MoneyValue getPriceOfMetal(Metal metal)

However, there are several consumers of the old getPriceOfGold method. In order to support these clients they leave the old service interface in place but redirect calls from getPriceOfGold to getPriceOfMetal(gold) within the service, and everyone is happy. Eventually they will ask consumers of getPriceOfGold() to upgrade to getPriceOfMetal(Metal metal).

Therefore you don’t need a central ESB to achieve this objective; you don’t need anything outside the service, just a bit of mediation logic inside it.
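As a rough sketch, the mediation described above could look something like this in Java. Only the two operation names come from the example; the Metal enum, the MoneyValue record, the MetalPriceService class and the in-memory price table are assumptions made up for illustration.

```java
import java.math.BigDecimal;
import java.util.Map;

// Hypothetical types for illustration; only the operation names come from the post.
enum Metal { GOLD, SILVER }

record MoneyValue(BigDecimal amount, String currency) {}

final class MetalPriceService {
    // Stand-in price table; a real service would consult its own data.
    private final Map<Metal, MoneyValue> prices;

    MetalPriceService(Map<Metal, MoneyValue> prices) {
        this.prices = Map.copyOf(prices);
    }

    // Version 2 of the interface: one operation for any metal.
    MoneyValue getPriceOfMetal(Metal metal) {
        MoneyValue price = prices.get(metal);
        if (price == null) {
            throw new IllegalArgumentException("No price available for " + metal);
        }
        return price;
    }

    // Version 1 of the interface, kept for existing consumers.
    // It holds no logic of its own and simply mediates on to version 2.
    @Deprecated
    MoneyValue getPriceOfGold() {
        return getPriceOfMetal(Metal.GOLD);
    }
}
```

Existing consumers keep calling getPriceOfGold() and get exactly what getPriceOfMetal(Metal.GOLD) returns, so the old operation can be retired on the service team’s own schedule.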

Extra Level of Protection

Just to quickly address the oft-heard retort to the above, that a central ESB gives another layer of protection: yes it does, but it comes with all the drawbacks of the central ESB, such as adding logic in the wrong places, becoming a bottleneck for functional enhancement and so forth.

The ESB Allows Me To Orchestrate Services

Orchestration is one of two things. Either it is business logic, which belongs in the service (this is the point of services after all: they contain your logic and data), or it is simply a case of moving a human through some process or other, in which case it belongs in the UI code (e.g. register for a website and then add something to your basket).

The ESB Can Mediate My Data

Yes it can, but see the section ‘The ESB Protects Me From Change’ above: the mediation can be achieved much more sensibly in your service.

The ESB Can Locate My Services

Each service should have a mechanism for locating any other service for the purposes of RPC calls. I prefer to use DNS in most cases (e.g. metal-exchange.example.com would resolve to the metal exchange service interface or API); DNS has huge power but can be used very simply too. Bonjour (aka Zeroconf) is often cited as a solution too, and it’s a good answer, but at the end of the day it is merely some extensions to DNS. Others suggest things like UDDI but I have never found the need myself.
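To show how little is needed, here is a minimal sketch of DNS-based location using nothing but the Java standard library. The ServiceLocator class and its locate method are hypothetical names for illustration; a real client would probably add caching and retry rules of its own.

```java
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: "locating" a service is just a DNS lookup on a well-known name.
final class ServiceLocator {
    private ServiceLocator() {}

    // Returns every address currently published for the given service name.
    static List<String> locate(String serviceHostName) {
        try {
            List<String> addresses = new ArrayList<>();
            for (InetAddress address : InetAddress.getAllByName(serviceHostName)) {
                addresses.add(address.getHostAddress());
            }
            return addresses;
        } catch (UnknownHostException e) {
            // The name is not in DNS, so the service cannot be located.
            throw new UncheckedIOException(e);
        }
    }
}
```

A consumer would call something like ServiceLocator.locate("metal-exchange.example.com") and connect to one of the returned addresses. Moving the service then becomes a DNS change rather than a change to every consumer.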

The ESB Can Do My Routing

For services that subscribe to events from other services, the same approach can be used for locating topics/queues: DNS to find the broker, and well-named, documented topics/queues on those brokers. Most message brokers provide a means of routing messages sensibly from location to location; if yours doesn’t, get another.

The ESB Can Monitor My Services

Your services should expose monitoring information over any of a number of technologies, which can in turn be watched by any number of monitoring tools. Examples of technologies that enable your services to be monitored are syslog, JMX, SNMP, Windows Events and simple log files; one or more of these is available in just about every language. Examples of tools that can monitor them are Nagios, OpenNMS and any number of commercial systems. You don’t need an ESB to do this.
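Taking JMX as one example from that list, a service can publish a counter with a few lines of standard-library Java. This is only a sketch: the ServiceMonitoring and RequestStats names, and the object name used in the usage note below, are assumptions made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

final class ServiceMonitoring {
    private ServiceMonitoring() {}

    // Standard MBean convention: the management interface is named <Class>MBean.
    public interface RequestStatsMBean {
        long getRequestCount();
    }

    public static final class RequestStats implements RequestStatsMBean {
        private final AtomicLong requests = new AtomicLong();

        // Called by the service each time it handles a request.
        public void recordRequest() {
            requests.incrementAndGet();
        }

        @Override
        public long getRequestCount() {
            return requests.get();
        }
    }

    // Registers a fresh counter with the JVM's platform MBean server.
    static RequestStats register(String objectName) {
        try {
            RequestStats stats = new RequestStats();
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            server.registerMBean(stats, new ObjectName(objectName));
            return stats;
        } catch (JMException e) {
            throw new IllegalStateException("Could not register " + objectName, e);
        }
    }

    // Reads the attribute back through the MBean server, as a JMX console would.
    static long readRequestCount(String objectName) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            return (Long) server.getAttribute(new ObjectName(objectName), "RequestCount");
        } catch (JMException e) {
            throw new IllegalStateException("Could not read " + objectName, e);
        }
    }
}
```

After ServiceMonitoring.register("com.example.metalexchange:type=RequestStats"), any JMX-capable monitoring tool attached to the JVM can read the RequestCount attribute, with no ESB involved.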

The ESB Provides Extra Security

No it doesn’t; it removes security by terminating it prematurely. It opens you up to man-in-the-middle attacks, whether deliberate or mistaken.

Andy Hedges

The Simplest Blog That Might Work

I’m aiming for a simple, fast and minimalist blog. A blog where writing posts is all I have to think about, not themes, fancy backgrounds, AJAX, hosting services, cloud APIs and well one starts to lose the will to live.

Raspberry Pi in a takeaway container on top of random network equipment
Figure 1. Raspberry Pi in a takeaway container

The design goals are:

  • simple
  • cheap
  • fast

I’m quite pleased with the way it works, it’s probably not for technophobes but then nor is blogging. In order to publish a post you create a new text file with your post in it in a simple directory structure, if you have any images, video etc then you drop them in a “resources” folder. If you’d like some formatting in your post you can use markdown. After that, run a simple script and everything is taken care of: it’s published to the web. So far my limited posts look like this in the directory structure:

$ tree --charset US-ASCII posts/
posts/
|-- 20120216
|   |-- article.yaml
|   `-- resources
|       |-- First-Project-Syndrome-Figure1.png
|       `-- First-Project-Syndrome-Figure2.png
`-- 20131230
    `-- article.yaml

3 directories, 4 files

How it works

The markdown from the YAML file is converted to HTML, which is minified (optimised to remove redundant spaces etc.) and then put into a folder which btsync is watching. Once a change is noticed it is synced to every computer that has btsync installed, including, most importantly, the web server.
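To give a feel for what the minification step does, here is a deliberately naive sketch in Java. It is not how htmlcompressor works; NaiveHtmlMinifier is a made-up name and the two regular expressions are a simplification that ignores whitespace-sensitive content such as pre blocks.

```java
// Hypothetical, deliberately naive minifier for illustration only.
final class NaiveHtmlMinifier {
    private NaiveHtmlMinifier() {}

    static String minify(String html) {
        return html
                .replaceAll(">\\s+<", "><") // drop whitespace between adjacent tags
                .replaceAll("\\s+", " ")    // collapse every remaining run of whitespace
                .trim();                    // strip leading and trailing whitespace
    }
}
```

For example, minify("<p>\n  Hello   world\n</p>\n<p>Bye</p>\n") returns "<p> Hello world </p><p>Bye</p>". A real minifier does considerably more and is careful about markup where whitespace matters.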

A Note On DNS

As I’m hosting this on my home broadband connection, which doesn’t have a static IP address, I needed a way to update my DNS record quickly every time my IP address changed. To do this I used a free service, DNSdynamic, which gives you a subdomain on one of their domains (e.g. example.dnsdynamic.com); I chose andyhedges.http01.com but anything would work. You then install a client which regularly checks your IP address and updates the DNS if need be. This is great, but I wanted to use my vanity domain name hedges.net, so I configured a CNAME with my DNS registrar to point blog.hedges.net to andyhedges.http01.com and I was in business, DNS-wise at least.
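For illustration, in standard zone-file notation that CNAME would look roughly like the line below. Most registrars expose this through a web form rather than a raw zone file, so treat the exact layout as an assumption.

```
blog.hedges.net.    IN    CNAME    andyhedges.http01.com.
```

The trailing dots mark both names as fully qualified. Lookups for blog.hedges.net then follow the alias to whatever address the dynamic DNS client last registered.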

Costs

The cost of my blogging platform breaks down as below:

  • software - £0 (all open source or freeware)
  • hosting - £0 (using my home fibre connection)
  • hardware
    • Raspberry Pi - £28.99
    • Power Supply - £0 (free with phone)
    • Network Cable - ~£1
    • SD Card - ~£4
  • DNS - £0 (using my DNS registrar and Dynamic DNS provider)

There we have it, a blogging platform for thirty-four quid that doesn’t require third-party hosting. It remains to be seen if my ISP gets cross.

Benchmarks

On the little R’berry Pi a quick benchmark with no optimisation shows it can handle 250 requests per second with a response time of 3ms (across my home gigabit network).

Todo

There are still a few more to-dos:

  • preview mode
  • comments (I think I’ll use disqus or maybe G+)
  • RSS
  • optimising Nginx (e.g. gzip or sdch, SPDY, threads and so forth)
  • smartypants-like substitution
  • Open Source it (tidy code, add licenses, put it on GitHub)
  • Set up a 301 Moved Permanently on a virtual host for the Dynamic DNS name

Full details

For those interested, here are the full details of the software, hardware and network.

The tools I’ve chosen for the client side are:

  • Notepad (although sublime, textpad, gedit or vi would do), this is for editing the posts
  • btsync this enables me to keep the webservers and any computer I use in sync with all of the generated content, the source information and templates, more on that later

The development tools:

The software libraries:

  • SnakeYaml this is a YAML binding for Java, YAML is basically a human friendly information format, similar to XML or JSON but unlike those two easy to read and write for us humans.
  • FreeMarker is a templating language similar to JSP or Razor and provides an easy library to integrate it with your projects
  • Actuarius a markdown to HTML converter
  • htmlcompressor an HTML minification library
  • YUI Compressor a CSS minification library

The server side software:

  • Nginx a nice, light, fast HTTP server
  • btsync see above
  • ddclient the Dynamic DNS client that keeps my DNS record updated when my ISP changes my IP address
  • Raspbian A Debian Linux variant for the Raspberry Pi

The hardware:

  • Raspberry Pi a circuit board sized computer
  • Indian takeaway container, no this isn’t the wacky name of some kickstarter project, I used one of the plastic tubs that takeaway curry comes in to provide a case for the Raspberry Pi, it keeps dust and/or water off it
  • Mini USB charger plug and cable from my Nexus 5 (I have so many of these and so I chose a smallish one)
  • Network cable from my man drawer

The network

  • Existing Fibre connection and associated routers and access points
  • Netgear Gigabit Switch
Andy Hedges

First Project Syndrome

I hold the opinion that Services (as in SOA) should, where at all possible, be delivered as separately budgeted and planned work from functional enhancement projects to avoid First Project Syndrome.

To understand what First Project Syndrome is let’s take a look at some graphs (bear with me…).

Capital Cost

Capital cost vs number of projects
Figure 1. Capital cost vs number of projects

As you can see from Figure 1, the assertion is that the cost of delivering the first project with a service is higher than for the first project using a point solution (by point solution, I mean grabbing data from source data stores or replicating data into your database or any number of data sync techniques). The reasons why this costs more for the first project are:

  • there are overheads with creating a service, such as following the prescribed best practice
  • infrastructure has to be created to run the service
  • some expert knowledge in SOA practices is needed

However, after the first project the savings are realised. Point solutions cost more from the second project onward because:

  • a point solution often starts from scratch each time, the work has to be redone and redone slightly differently for each specific scenario
  • point solutions layer complexity upon complexity (e.g. tables accessed by many unknown systems, various extract files created for many systems, data shunted to and from multiple undocumented systems)
  • it’s very hard not to make mistakes when syncing data
  • syncing data causes all sorts of edge cases when trying to modify it

Operational Cost

Operational cost vs number of projects
Figure 2. Operational cost vs number of projects

Point solutions are almost always more expensive to manage. With a service built correctly and to specification the operational costs are lower from day one. Services are easier to manage, monitor, fail over and so on. The significant reason they have these qualities is that they conform to a set of good practices that specifically give these qualities. They leverage the wealth of investment made on previous service developments within the organisation. Each service then stands as a container for future enhancement in its particular business domain, allowing functionality to be added while still providing the operational characteristics demanded.

More often than not the first project does not consider, in detail, the ongoing operational cost of managing the solution, or may not have the budget/resource/time to seriously take this into account. Typically the projects have the more immediate concern of getting the solution shipped to the business.

Other Common Pitfalls

There are a number of other pitfalls of delivering services as part of the first project. In no particular order:

  • the first project’s scope determines the scope of the service making it less suitable for other consumers
  • freezing of the project causes the service to also be frozen where that service would have had wider benefits to the business. The case for the project didn’t stack up and so the assumption would be that the service’s case didn’t either.
  • compromises in the design of the project solution force compromises in the service design

To all these points I would posit that the first business project that requires a service should not be the project that delivers that service. This does, of course, mean that the business case should stack up against more than one business facing project (and if it doesn’t then it probably isn’t worth building).

You wouldn’t try to create a power station as part of the build of a house but every house needs one to function.

Andy Hedges