The voice of business architecture. And more.

Agile. Scale.

I hate to be the one to break the news, and please don’t shoot the messenger.

Complex and geometric relationships don’t scale well. Agile is no exception. As with economies of scale, where diseconomies set in rather quickly, so it goes for scalable Agile—enterprise-scale Agile.

In this segment I articulate why Agile is not scalable by referencing group intercommunication and organisational design models. But first, some background.

A core challenge with scaling Agile is that, in my experience, even old-school Agile isn’t practised competently, so why try to scale it? I believe the logic used to justify scaling Agile might have run along the lines of some well-meaning managers or opportunistic consultants saying, ‘This looks like a good idea, but I can tell it won’t scale well. Maybe if we add these elements, it will work for us’.

This parallels a story I shared about my ex-wife’s culinary exploits, where she substitutes recipe ingredients without having ever mastered the original recipe. I wouldn’t put much faith in arriving at a successful outcome. I get the same feeling with so-called enterprise and scalable Agile. I could be wrong. But my argument is not about some fallacy of composition. It’s a simple claim against scalability.

Let’s look at the scalability of communication networks first. In people networks, communication is optimal at about 7 nodes: 7 people, plus or minus a couple. In the case of Agile, not all the nodes have to be in constant synchronous communication, so we can expand our team size a bit, and we can employ a sort of hub and spoke model to our advantage. What we can’t do efficiently is have satellites on satellites on satellites, ad infinitum.

Illustrative Agile Team Structure

Even the most basic Agile model has these constituent roles:

A product owner, a scrum master, and a development team, even if all of these happen to be performed by a single person.

There will be at least one customer or end-user. There are probably multiple customers or end-users. In fact, there may be more than one customer cohort. These might be represented by their own personae.

There needs to be at least one stakeholder. Probably more than one.

And there will be dependencies, whether people, process, platform, or some other system or systems.

This is as barebones as it gets. I suppose theoretically, this could be a one-man band. An audience of one. But I am not wholly concerned with that case. I’m pretty sure that wouldn’t actually qualify as Agile.

And I’ve given the development team short shrift. It’s probably at least a developer or two, a quality assurance resource, perhaps someone operating as a business analyst in some capacity or another. Hopefully, there are design resources as well.

Some have argued that QA is not a scrum team role, but the function is definitely necessary because one of the primary tenets of Agile is working software—not simply apparently working software. If there are integration points and dependencies, these need to be accounted for and tested. Even if a team is employing automated testing and/or Test-Driven Development, the notion of a QA role is still evident. Moving on.

All these people need to understand on one level or another what’s going on. It’s best to be able to receive the information first-hand, though this may be impractical in many cases. Some people can be less connected to real-time crosstalk than others, whilst others need to be hyperconnected. Much of the communication can be asynchronous, and that helps. But this creates a communication latency factor.

Consider how a people communication network model scales. I’ve shared an image on the blog post. It illustrates Metcalfe’s law, which informs how quickly additional nodes add to the geometry.* Ostensibly, Metcalfe’s law can be expressed by the group intercommunication formula that shows that the number of channels necessary for some number of nodes is the number of nodes times the number of nodes minus one, all divided by 2.

Mathematically, this triangular number can be presented as n(n – 1) / 2.

Illustration of Metcalfe’s Law

For 3 nodes, with the nodes representing people, there are 3 lines, each representing a communication channel, connecting every node to every other node to facilitate full communication. Every person is able to communicate directly with every other person. There is no second-hand communication, so translation and latency concerns are minimal.

We can extend this model. Add a fourth node, and 6 lines are necessary—a square with an X drawn corner to corner. Add a fifth node, and 10 lines are needed. Imagine a star inscribed within a pentagon. I’ll go on record and say that 10 people would be the size of a typical Agile situation, which accounts for several developers, a product owner, a scrum master, and some stakeholders. I’ll even allow the product owner to serve as a proxy customer with no intermediate feedback loop connecting to the customer cohorts. A 10-person team—10 nodes—creates the need for 45 lines of communication. Let’s just say that it starts to resemble a mandala.

As I’ve mentioned, some of these lines don’t require frequent communication updates, but these 45 lines are needed for peak capacity. Now imagine 2 Agile teams of 10 and a scrum-of-scrums master to coordinate the teams. That requires 21 nodes. Using the group intercommunication formula, we find that we’d need 210 channels to handle maximum capacity.
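The arithmetic above can be checked with a minimal sketch of the group intercommunication formula. The function name channels is my own label for illustration; the node counts are the ones used in this segment.

```python
def channels(nodes: int) -> int:
    """Pairwise communication channels among `nodes` people: n(n - 1) / 2."""
    if nodes < 0:
        raise ValueError("nodes must be non-negative")
    # One of n and n - 1 is always even, so integer division is exact.
    return nodes * (nodes - 1) // 2


# Team sizes mentioned in this segment.
for n in (3, 4, 5, 10, 21):
    print(f"{n:>2} nodes -> {channels(n):>3} channels")
```

Going from 10 nodes to 21 roughly doubles the headcount, yet the channel count goes from 45 to 210, which is more than a fourfold increase.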

I am aware that this overstates the need, but allow me to temper that by pointing out that eliminating channels comes with a trade-off in latency and accuracy. When a communication is sent to all nodes, there is a redundancy in the received information. If some nodes are bypassed, then, besides latency, there may be challenges in decoding or interpretation. I bring your attention to the telephone game problem, which is also known by the less racially sensitive name of Chinese whispers. Let’s go with ‘telephone game’. To spare you from having to Google it, in this game, a person is whispered some phrase or sentence. This person whispers what they heard to another person, who repeats this process until the last person broadcasts what they heard. Spoiler alert: the probability of the message being identical to the original message is low. The probability that the message retains its context is somewhat higher, but it is also low.
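To put a rough number on the telephone game, here is a minimal sketch. It assumes, purely for illustration, that every hand-off preserves the message with the same probability, independently of the others. The function name intact_probability and the 0.9 figure are hypothetical, not measurements.

```python
def intact_probability(per_hop_fidelity: float, hops: int) -> float:
    """Chance a message survives `hops` hand-offs unchanged.

    Assumes each hand-off independently preserves the message with
    probability `per_hop_fidelity` (an illustrative simplification).
    """
    if not 0.0 <= per_hop_fidelity <= 1.0:
        raise ValueError("per_hop_fidelity must be between 0 and 1")
    if hops < 0:
        raise ValueError("hops must be non-negative")
    return per_hop_fidelity ** hops


# Hypothetical 90% fidelity per hand-off, relayed through 1 to 7 people.
for hops in range(1, 8):
    print(f"{hops} hops -> {intact_probability(0.9, hops):.2f}")
```

Under these assumptions, even a fairly reliable 90% per hand-off drops below even odds by the seventh relay. Real conversations are messier than this, so treat it as an intuition pump rather than a model.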

In addition to the constraints of Metcalfe’s Law, there is Brooks’s Law, popularised by the venerable classic publication, The Mythical Man-Month, wherein we learn that 9 women cannot expedite the gestation process to one month. Some things are simply boundary-constrained and cannot be scaled. Once a model has been optimised, say at 7 nodes, each additional node bogs the model down further.

Perhaps you are still unconvinced. Or perhaps, you agree with arguments that claim this relationship is better illustrated by a log function rather than a geometric function, and this gives you enough headroom to assuage your cognitive dissonance. Proceed at your own peril.
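For anyone weighing the log-function argument, here is a quick sketch comparing the two growth curves. I am taking the log function to mean n times log(n), the form usually cited against Metcalfe. The function names and the choice of logarithm base are assumptions on my part, since the argument as usually stated does not fix a base.

```python
import math


def pairwise_channels(n: int) -> int:
    """Group intercommunication formula: n(n - 1) / 2."""
    return n * (n - 1) // 2


def n_log_n(n: int, base: float = math.e) -> float:
    """Alternative growth estimate: n * log(n), natural log by default."""
    return n * math.log(n, base)


# Compare the two curves at team sizes used in this segment.
for n in (7, 10, 21):
    print(
        f"n={n:>2}  pairwise={pairwise_channels(n):>3}  "
        f"n*ln(n)={n_log_n(n):6.1f}  n*log2(n)={n_log_n(n, 2):6.1f}"
    )
```

At 21 nodes this gives roughly 64 with the natural log and roughly 92 with base 2, against 210 pairwise channels. The log curve is gentler, but it still grows faster than headcount.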

This portion of the segment pertained to how network intercommunication theory limits the size of teams in particular. At the very least, it demonstrates that team communication becomes increasingly suboptimal relative to scale.

Spans & Layers

Now, let’s consider spans and layers within organisational design model concepts. I’ll use this as an analogue for Agile.

Simply put, spans are the number of people a manager has reporting to them. So-called direct reports. A president might have 5 vice presidents. This is a span of 5.

Layers represent organisational strata. The number of levels in an organisation. The depth of the reporting structure. An organisation might be structured as follows, from the bottom up:

Workers report to supervisors, who report to managers, who report to directors, who report to vice presidents, who report to a president. This organisation has 5 layers of management above the workers.

Pardon a quick detour to the land of organisation theory, which happens to dovetail with communication network theory. A wider span affords a manager a greater span of control. When she speaks, she has a direct one-to-many relationship with direct-reports. There are no inherent filters or latencies. If one is seeking communication speed efficiencies, this is the top option. It breaks down where the span gets so great as to cover too many diverse functional areas, and the manager can’t keep track of all of the various functions.

Agile is a relatively flat structure, which is to say, it has few layers. As hard as it might protest, there are ostensibly at least 2 layers. Even if the team have a certain degree of autonomy in how they execute the work, it is assigned from above. A scrum master might be a servant leader, but the product owner is the king of the hill. These people may fall into certain organisational layers, but on the team, save for the product owner, all of these have an equal voice. At least in theory.

Because of this flat structure, the limit on layers constrains our model, bounding our span. This is where the 7 ± 2 comes into play.

To summarise: Agile is not scalable.

One last note on the scalability concept. Some might argue that the model doesn’t need to be optimal. It need only be better than the next best option. I would argue that the next best option would probably be Joint Application Development or Rapid Application Development. These methodologies predate Agile and share with it the iterative nature of delivery. Joint Application Development, typically abbreviated to JAD (pronounced Jad), is conceptually similar to Agile and might even benefit from some elements of Agile. The same holds true for Rapid Application Development, RAD or Rad. I won’t elaborate on these alternatives here, save to say that I’d be interested in seeing the results of studies comparing the development efficiency of one of these, JAD in particular, with that of Agile. I’m not convinced that Agile would be the clear winner. For now at least, Agile has won the hearts and minds. It’s won the popularity contest. I’ll be keeping an eye on this.


* Some have since countered Metcalfe’s maths, claiming that the equation should yield a smaller result by the log function n*log(n). However, this is a different context, so I’ll mention but otherwise ignore it, save to note that at 21 nodes the value, taking the natural log, would be about 64 instead of 210.
