Erlang – The CEO’s View
Gordon Guthrie is CEO/CTO of hypernumbers.com, an early stage semi-stealth start-up.
He has previously held a range of senior technical and business positions, including Chief Technical Architect at if.com and, more recently, Solutions Architect at BT/City of Edinburgh Council.
I bumped into Gordon at FOWA in London a few months ago, and asked him to write an article about Erlang, which he was extolling at the O’Reilly stand, and he came back with this piece about why Erlang should (and shouldn’t) be used in the workplace.
Erlang – The CEO’s View by Gordon Guthrie
This is a CEO’s opinion on Erlang – it is not a technical article, but it does require technical knowledge about programming and programming languages. The poet Robert Graves once said: “there is no money in poetry, and no poetry in money”. Readers seeking The Joy Of Erlang (and joy there is) will go away from this screed unsatisfied.
CEOs don’t have opinions on programming languages. Or they shouldn’t. Programming languages are merely an epiphenomenon of the business requirements – if a programming language can do ‘it’, whatsoever ‘it’ might be, then it’s in.
So why might a CEO, like myself, venture such an opinion – and an opinion attached to the new sexy language du jour at that? Oops. Old hands will immediately know where to pin the blame: a techie ‘playing up’ – for ‘up’ read ‘out of their league’.
Indeed, what opinion could a CEO offer on a programming language? It seems to me it would have to be one of two stark options: either you should use it, or you shouldn’t.
So before I tell you that you should use Erlang, as tell you I will, it is probably worth making the case against.
(Still) No Silver Bullet
The traditional response to this sort of article is to pull out a battered copy of Fred Brooks’ article No Silver Bullet – Essence And Accidents Of Software Engineering – and with good reason. No Silver Bullet asserts:
There is no single development, in either technology or management technique, which by itself, promises even one order of magnitude improvement within a decade in productivity, in reliability, in simplicity.
In business speak this translates into a firm injunction to the CEO: no make-or-break commercial decision turns on programming language alone, so the choice of language is not within their core remit. Certainly there are critical decisions about technology that will impact the performance of the company, but the job of the CEO is to get the best technologists in, build the best corporate culture and enable the troops to get on with it using the tools they think best fit the job. Decisions on technology are best left to the CTO and their team. In my other job as CTO I have written about Erlang from a technical perspective.
Brooks separates two different aspects of software – the essential and the accidental. The essential aspects are those that pertain to the business problem that the software is addressing, and the accidental aspects are those that are constraints placed on the outcomes by actually using real software on real machines with real limitations.
Brooks’ argument is that step-changes in software productivity can only come from addressing the essential aspects – there aren’t enough accidental constraints left for tackling them to deliver the goods.
There are two ways to promote Erlang in these circumstances. The first is as one of the group of functional programming languages (a ‘family’ that includes Lisp, Haskell and F# amongst others) which focuses on addressing the central question of state. Functional languages allow the decomposition of programmes into functions with side-effects (ie that depend on state) and those (the majority) that don’t. Side effect-free code is deterministic – the same inputs give the same outputs every time.
This addresses the core essential complexity of software programming. To quote NSB:
From the complexity comes the difficulty of enumerating, much less understanding, all the possible states of the program, and from that comes the unreliability. From complexity of function comes the difficulty of invoking function, which makes programs hard to use. From complexity of structure comes the difficulty of extending programs to new functions without creating side effects. From complexity of structure come the unvisualized states that constitute security trapdoors.
This line of argument focuses on using the differing semantics of individual programming languages (and families thereof) to partition the possible state space of the programme and thus reduce complexity dramatically. Functional languages are designed around side-effect-free functions, which are their essential (in the Brooks sense) nature. Functionality with side-effects (like writing to disk, for instance) is accidental in nature. The FP argument goes that if you don’t design a language such that side-effects are treated as constraints on it, then you are mixing accidental and essential problems willy-nilly in all code written in that language – so you should develop your systems in a functional language. As a techie I agree with this argument: you should have a bias to FP. But as a CEO I don’t think it is important enough to warrant my time.
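For the techies, the split between side-effect-free functions and those that touch state can be sketched in a few lines of Erlang. This is a minimal illustration and not taken from any real code base: the module name purity_sketch, the function names and the 20% rate are all invented for the example. The first function is side-effect free, so the same input gives the same output every time; the second writes to disk, so its outcome depends on the state of the file system.

```erlang
-module(purity_sketch).
-export([vat/1, log_vat/2]).

%% Side-effect free: the result depends only on the argument.
%% The 20% rate is an assumption made for the example.
vat(NetPence) when is_integer(NetPence), NetPence >= 0 ->
    NetPence * 20 div 100.

%% Has a side effect: appends a line to a file, so whether it
%% succeeds depends on the disk, permissions and so on.
log_vat(Path, NetPence) ->
    Line = io_lib:format("~p~n", [vat(NetPence)]),
    file:write_file(Path, Line, [append]).
```

Calling purity_sketch:vat(1000) returns 200 however often you call it; log_vat/2 may return ok or an error tuple for the very same arguments, which is exactly the kind of state the FP argument wants fenced off.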
The second line of argument is a sort of super-set of the first one which focuses on the Erlang concurrency model. Erlang uses ‘processes’ as basic units of concurrency – processes which are analogous to operating system processes. In Erlang even functional components that have side-effects have no shared state (at the conceptual programming language level). This can be made a convincing argument at a technical level.
But Brooks is right. There is no silver bullet and the Erlang programming language isn’t it, which would appear to leave this article somewhat holed below the waterline. Maybe it is time to make the positive case for Erlang.
Erlang – The Long View
NSB lists four Promising Attacks On The Conceptual Essence:
· buy not build
· requirements refining and rapid prototyping
· incremental development, grow, not build, software
· great designers
On this side of the house, Erlang appears to be a clear loser. There is no evidence that Erlang is superior to other languages in terms of rapid prototyping or incremental development. By definition great software designers are not tied to languages except in terms of the availability of experienced ones – and ‘small’ languages like Erlang naturally lose to ‘large’ ones like almost all others on that line anyway.
‘Buy not build’ really should be ‘procure not build’ with the rise of open source and free software. In language choice this is normally expressed in terms of library support – and on these grounds Erlang also loses. There are certainly ‘missing libraries’ in Erlang – particularly with respect to string management – and these are partly a consequence of limited uptake. The cussedness of Erlang and its obdurate refusal to ‘play nicely’ with other languages doesn’t help – no handy piggybacking.
But it is on this unpromising territory that I will stand my ground. The essence of this argument is that with Erlang you buy something more primitive than software libraries which you would otherwise have to build – and the cost-saving is so great that there is a compelling financial case for using Erlang almost without exception in certain classes of businesses.
In order to understand what ‘this primitive’ is, it is helpful to go back to the beginning of our trade, yes all the way back to the beginning, the “hello Mr Babbage, how’s it going with auld Ada Lovelace, nudge, nudge, wink, wink?” beginning.
In the beginning the computer was a hardware object sufficient to the task. Babbage conceives of a calculating machine in 1822 and Turing builds the first big computer farm during the Second World War to decode Enigma traffic.
These early computers are monolithic, physical implementations of the solution to the problem domain – a single physical design shown schematically to the right:
It is worth recapitulating this timescale. Babbage conceptualises, but does not build, the first computer. 120 years later the first proper one is built and the ‘burning platform’ for this implementation is the Nazi hordes at the gate – one might reasonably come to the conclusion that building monolithic computers is not really that cost effective.
The critical point about these Babbage-style computational machines was that a change to the business requirements meant a change to the physical design. The introduction of a 4-rotor Enigma machine in 1941 meant that a new version of the Bombe had to be designed and built.
Turing being the genius he was, was able to address the change management problems of computers. His Universal Turing Machine proved that any problem that was calculable could be decomposed into a hardware and a software element. There had previously been ‘software’ for ‘calculation-like’ machines (punch card driven looms and pianos, for instance) but Turing’s insight was that ‘everything’ was calculable, not just ‘somethings’. Turing postulated the ‘first great bifurcation’ – that between hardware and software, (figure to the left):
In a Turing machine multiple pieces of software can run on a single hardware platform and we see the birth of programming. Turing doesn’t stop there though, he also postulates the second great bifurcation – that between programme and data, shown to the right:
Now the hardest thing about the past is trying to ‘unknow’ things you take for granted – but once this was hard stuff. Again it is worth recapitulating the development cost of systems after the second bifurcation. Each ‘application’ was custom built in software – albeit running on a common hardware platform. In the epilogue to the 20th Anniversary Edition of The Mythical Man-Month, Fred Brooks talks about the start of his programming career in terms that are both comfortingly familiar and deeply alien:
The first computer I worked on, fresh out of Harvard, was the IBM 7030 Stretch computer. Stretch reigned as the world’s fastest computer from 1961 to 1964; nine copies were delivered.
Yes, that is 9. Fancy running an application? Well, let’s start by working out how we intend to lay out the files on the DASD; clearly you can’t begin coding until you know how you intend to write your IO subsystem, can you? Gradually coding teams built up libraries of software that they used on application after application. The developers slowly split into programmers and sysprogs – the sysprogs taking care of non-functional aspects of the application.
Voila! Operating systems. Again the boundary between the two seems immutable and clear, an obvious natural boundary; as clear as maths. The operating system was always out there just waiting to be discovered. Except of course it isn’t. A gentle reading of the Posix standards will soon disabuse you of any notion of clarity. The boundaries of the operating system are as definitive as a statement about the length of the coastline of Norway. Take the entry on vi. To comply with the Posix standard an operating system must provide a version of the vi editor, to wit:
The vi (visual) utility is a screen-oriented text editor. Only the open and visual modes of the editor are described in IEEE Std 1003.1-2001
It is pretty clear that you could exclude one of the modes of vi quite cheerfully without ‘breaking’ the operating system – it is more important that the boundaries of an operating system are drawn clearly than that they are drawn at a particular point. The operating system is just another convention; but the reason the convention thrives is because it is bloody useful.
They say that man is a pattern-matching animal, and by now a number of you will be thinking – “he’s about to make the case that Erlang is The Fourth Bifurcation”. Nearly right. I will now make the case that Erlang/OTP is the first implementation of an Application System – which is the product of the 4th bifurcation (right):
So what are the characteristics of an application system? Well, let’s turn to the maestro, the father of Erlang, Joe Armstrong. His Doctoral Thesis is called Making Reliable Distributed Systems In The Presence Of Software Errors. The salient point of the introduction is:
When we make a fault-tolerant system we need at least two physically separated computers.
An application system runs on more than one computer. Network up a Windows machine, a Linux machine, a Solaris machine and a FreeBSD one. Start an Erlang shell on each machine with a common shared cookie, a bit of the old ping-pang-pong in the shell and shazam! An operating multi-machine cluster – and not a line of code written – you have started up an Application System. Its unwritten lines of code sit suitably lightly on the balance sheet.
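As a rough sketch of that ping-pang-pong, assume two machines that can resolve each other’s host names. The node names, host names and cookie here are invented for illustration. Start a shell on the first machine with erl -name a@host1.example -setcookie sharedsecret and on the second with erl -name b@host2.example -setcookie sharedsecret, then type something like this into the first shell:

```erlang
%% 'b@host2.example' is a hypothetical node name; use whatever
%% name you gave the second shell.
net_adm:ping('b@host2.example').   %% pong if the nodes connect, pang if not
nodes().                           %% the nodes this shell is now connected to
rpc:call('b@host2.example', erlang, node, []).  %% evaluates erlang:node() on the other machine
```

Nothing here is application code: the cookie, the ping and the remote call all come with the Erlang runtime, which is the point being made about unwritten lines.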
Application systems address the essential complexity of the software environment by providing a set of ‘bought not built’ mechanisms for addressing the –ilities:
From a CEO’s perspective using commodity software to deal with the –ilities leaves the paid-for technical team free to concentrate on the –ality, the functionality. Given that under current circumstances only 10% of the cost of software is the development, with the rest being the post-live servicing, commodifying the bulk of the cost makes obvious sense.
This commodification needs to be seen in the context of the three previous bifurcations. Each bifurcation constitutes an episode of commodification. Each commodification causes a change in the underlying cost-structure of the software industry and each change in that cost base makes incumbents vulnerable and opens up opportunities for new companies.
The critical aspect of Erlang/OTP that makes it a suitable candidate to be an Application System turns out to be exactly the same as the key aspect of Operating Systems:
· concurrency is provided by processes
Processes are independent units of concurrency that don’t share state and which communicate by passing messages. In operating systems (which run on one computer) a process is only addressable local to the machine. In an application system (which runs on many computers) a process is addressable across many machines. It follows from this naturally that A/S processes can’t be O/S processes.
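A minimal sketch of such a process in Erlang may help the technically minded. The module counter_sketch and its message shapes are made up for this example. The counter’s state lives only in the arguments of its own loop, so the only way for anyone else to read or change it is to send the process a message.

```erlang
-module(counter_sketch).
-export([start/0, increment/1, value/1]).

%% Spawn a new Erlang process whose only state is the loop argument.
start() ->
    spawn(fun() -> loop(0) end).

%% Asynchronous message send: nothing is shared, nothing is locked.
increment(Pid) ->
    Pid ! increment,
    ok.

%% Ask for the current count and wait (up to a second) for the reply.
value(Pid) ->
    Ref = make_ref(),
    Pid ! {value, self(), Ref},
    receive
        {Ref, Count} -> Count
    after 1000 ->
        timeout
    end.

loop(Count) ->
    receive
        increment ->
            loop(Count + 1);
        {value, From, Ref} ->
            From ! {Ref, Count},
            loop(Count)
    end.
```

In a shell, P = counter_sketch:start(), followed by two calls to counter_sketch:increment(P) and then counter_sketch:value(P), gives 2. The same functions work unchanged if the process was started on another node of the cluster (spawn/2 takes a node name as its first argument), which is what ‘addressable across many machines’ means in practice.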
Historically the unit of concurrency in an operating system is the ‘user’ – a given process should think it is the only user on the computer. An O/S process should have access to the full range of functionality of the physical machine. The O/S process doesn’t actually talk to the hardware – it talks to the O/S which handles the hardware on its behalf.
In an A/S the application system itself runs as an O/S process. If an A/S process wants access to the full range of O/S resources it asks the A/S to access them on its behalf.
Given that O/S’s have a user (or an anthropomorphic batch user) as their basic unit of concurrency it comes as no surprise to find that the upper limits on O/S concurrency tend to mirror the upper limit of concurrent sign-ons for timesharing machines – 4,000 to 8,000 concurrent objects.
In order to get around this limitation many applications use an alternative mechanism of concurrency – the dreaded thread. Unlike processes, threads share state and software errors propagate between them. The threaded model in applications has its analogue in operating system design. There was a very popular operating system that didn’t impose memory page writing constraints between applications – they all shared state. MS-DOS was enormously successful, for a while, but it had, a-hem, stability issues.
By comparison with O/S processes, Erlang processes are a lot more lightweight, with an upper limit of tens or hundreds of thousands of concurrent processes – if they want heavy lifting they just ask the Erlang Virtual Machine. Critically the Erlang Virtual Machine doesn’t rely on the operating system scheduler to time share between its processes. The VM is just another anthropomorphic batch user receiving one time slice at a time from the Operating System. Its own time-slicer/scheduler parcels that clock-tick amongst its own internal A/S processes. And the VM underlying the Application System has to replicate a whole host of constructs familiar from Operating System design like spawning processes and loading code. This can be of great benefit, as when, for instance, the Erlang code loader is used to hot load changes – running processes are invited to swap themselves out for newer versions whilst executing.
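The ‘invitation’ to swap to a newer version can be sketched as follows. This is an illustrative pattern only: the module name hot_sketch and the message names are invented. It relies on the Erlang rule that a call written with the module name always goes to the most recently loaded version of that module, whereas a plain local call stays in the version that is already running.

```erlang
-module(hot_sketch).
-export([start/0, loop/1]).

start() ->
    spawn(fun() -> loop(0) end).

loop(N) ->
    receive
        tick ->
            %% Local call: carries on in the version already running.
            loop(N + 1);
        upgrade ->
            %% Fully qualified call: continues in the newest loaded
            %% version of this module, keeping the state N.
            ?MODULE:loop(N);
        {get, From} ->
            From ! {count, N},
            loop(N)
    end.
```

With a process P started from hot_sketch:start(), you would edit the file, recompile it in the shell with c(hot_sketch). and then send P ! upgrade. The process carries its count across into the new code without stopping. One caveat worth knowing: the VM keeps only two versions of a module, so a process that ignores the invitation through two reloads is killed.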
It may seem like this diversion into concurrency models shows a little too much interest in the technical side for a CEO, but it is critical to understanding how A/S programming and O/S programming should interact. It is worth showing this schematically, (right):
You can’t just yoke generic libraries written in an Operating System language directly into an Application System language, or more precisely you shouldn’t. It is entirely possible to do so. But mixing programming between abstractions is never going to be a sensible idea. There used to be a popular class of operating system that didn’t prevent the application from writing directly to the hardware. The Win3x class of operating system kernel lasted all the way up to Windows 95 and its ‘flexibility’ brought with it the thrice-blessèd BSOD. You could recompile the Linux kernel to allow your apps to write direct to hardware, and you could recompile Erlang to use your single- or multi-threaded libraries in some other language – but you really don’t want to.
For good reasons there is a clear distinction between user-land Linux developers and kernel developers (or sysprogs as they used to be known). Typically these two sets of coders don’t share libraries. Unless your (typically) C library is specifically intended to be implemented in the VM you really shouldn’t link it in – ie don’t mess with the VM unless you are an asprog – a contraction I fear that will either never catch on, or already means something beastly on the far shores of the internet.
In Erlang the Application System is provided by a series of discrete VMs running on a number of machines (typically more than one VM per physical machine). These VMs expose common cluster and communications semantics to each other. The ‘universal driver’ approach for encapsulating libraries and other non-Erlang add-ons is to wrap the hardware board or Java class or whatever in a set of libraries that expose those common semantics. In Luke Gorrie’s Distel, a popular IDE (some techie doo-dah called Extendible Macros something or other) is so enabled, giving you a clustered development environment and multi-tier debugger.
Naturally the emergence of Application Systems will not avoid meeting some resistance. But, again, ‘twas always so. The world once pullulated with shoals of ‘knit your own operating system’ men but they have gone the way of all flesh.
One may well ask why if Erlang/OTP is so good as an Application System it is not better known outside the telecoms industry. It has a long pedigree and a code base of 1.25 million lines of code. The answer I’m afraid is to be found in the capital requirements of telecoms and web start-ups. In capital intensive industries like telecoms the techies don’t become rich. And rich is the only way geeky stuff gets sexy.
That’s enough of this romp around the history of computing. All the experience of the history of computing tells us that at the point of a bifurcation the cost base of the industry is transformed by an order of magnitude or more. The arrival of commodity Google scalability will change the cost base of the industry and when cost bases change, commercial opportunities arrive.
Do I, as a CEO, recommend that you use Erlang? No, but I recommend that you use an application system. This particular Kool Aid currently only comes in one flavour, Erlang, but I’m sure new ones will be along soon…
Henceforth NSB http://www.cs.uu.nl/docs/vakken/pm/docs/no_silver_bullet.html.
There’s a good reason it is well worn and that’s because it is simply excellent. On re-reading it for this article one particular paragraph leapt out at me and made me laugh out loud. On trolling around various start-up offices in and furth of London I have observed the inexorable rise of the super large Apple monitor, 4, 5 or 6 times the screen acreage of a normal laptop screen. Back in 1986 Brooks wrote apropos of graphical programming:
…the screens of today are too small, in pixels, to show both the scope and resolution of any software diagram … the hardware technology will have to advance quite significantly before the scope of our scopes is sufficient to the software design task…
<autoblogger>Jeezo, he’s an idiot, in what way can IO be anything other than essential to a computer language – muppet.</autoblogger>
<wiseoldhand> In the olden days we had ‘core’ as in ‘core dump’ and it wasn’t volatile – when you switched the machine off and then on again it came back with the memory state restored. This old volatile memory malarkey is a new thing, and it won’t be around forever. Back in the day the Cray-XMP at London University was the only machine in the UK with 1Mb of Ram; so Ram is much, much more available. Mebbies soon we will have so much non-volatile core that you’ll no need to ship stuff in and out… Io can go back to being a lovely wee moon of Jupiter again.</wiseoldhand>
 Sounds like a fillum to me…
 Chapter 3 Section 3.1
An O/S language may be defined as one in which you know at write time which physical machine the concurrent units of the code will execute on at run time. Think C, C++, Java, Ruby, PHP, Perl, Python, Lisp, Fortran etc, etc, for each of which the code can only run on ‘this’ machine. You might call a library on ‘this’ machine that will talk to ‘that’ machine.
 In the olden days when I wrote the first undergraduate thesis at Bristol University to be ‘word processed’ the word processor I used was a printer. You wrote the text with embedded printer commands in and the ‘programme’ wrote natively to the peripheral hardware.