Complexity in Software

In a discussion with a former colleague of mine on the organization of components and on system boundaries, we focused on the complexity inherent to software building. It hit me that we can learn a little from physics here.

The law

The first law of thermodynamics states that

energy can be transformed, i.e. changed from one form to another, but cannot be created or destroyed.

This law, in my mind, can be applied to software development quite generally:

Complexity in software can be transformed, i.e. changed from one form to another, but cannot be created or destroyed.

I was about to make this law my own, but while writing this I found that someone else had already named it back in 2003. Matt’s first law of software complexity states:

The underlying complexity of a given problem is constant. It can be hidden, but it does not go away.

Or:

Complexity is conserved by abstractions. In fact, apparent complexity can be increased by abstractions, but the underlying complexity can never be reduced.

Why manage complexity?

The way we construct software is influenced on all levels by complexity. Yes, there are user requirements, even functional requirements, that drive us towards a certain design when building software - but every requirement introduces further complexity that needs to be managed. A software developer’s primary role is to manage that complexity in order to deliver anything useful.

In short, complexity

makes software more difficult to write
makes software more difficult to maintain
makes it almost impossible to communicate intent to a fellow programmer (or the next)
makes it progressively more difficult to construct larger components out of smaller ones.

What is complexity in software development?

When looking at the complexity problem, it might be useful to identify categories of complexity by looking at components. I’m veering away from the academic classifications here to make this discussion a wee bit more practical.

For any working component in a system, that is, a subsection of a software application that is useful, we might look at

Internal complexity: complexity contained by the component, either caused by the implementation of the component itself or inherited through its integration with subcomponents.
External complexity: complexity leaked outside the component’s interface. Other components will inherit this complexity when interfacing with or containing this component.
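To make the distinction concrete, here is a minimal sketch. The sensor classes and the raw reading format ("21.5C", "212F") are hypothetical, invented purely for illustration. Both components deal with the same underlying complexity (mixed units in raw strings); the first leaks it to every caller, the second keeps it internal.

```python
class LeakySensor:
    """External complexity: callers receive raw strings and must each
    know the format and the unit conversion rules themselves."""

    def __init__(self, raw_readings):
        self.raw_readings = raw_readings

    def read_raw(self, index):
        return self.raw_readings[index]


class EncapsulatedSensor:
    """Internal complexity: the same parsing and conversion still exist,
    but they live behind the interface. Callers only see Celsius floats."""

    def __init__(self, raw_readings):
        self._raw_readings = raw_readings

    def read_celsius(self, index):
        raw = self._raw_readings[index]
        # Last character is the unit, everything before it is the number.
        value, unit = float(raw[:-1]), raw[-1].upper()
        if unit == "C":
            return value
        if unit == "F":
            return (value - 32.0) * 5.0 / 9.0
        raise ValueError(f"unknown unit in reading: {raw!r}")
```

Note that the complexity has not gone away in the second version; it has only been moved to a place where it is paid for once instead of by every caller.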

When looking at a system (a component in itself), we can also identify

Local complexity: the complexity in understanding a specific component or piece of functionality.
Global complexity: the complexity in understanding the system in its entirety - all the interactions between the different components within.

Techniques for managing complexity

Since software development clearly has a lot to do with managing complexity, several techniques have sprung up over the years to help with it. There are too many to list here; we could argue that even the move to higher-level languages had complexity as its primary motivation.

A helper in managing complexity in recent years has been Object Orientation (OO), especially with the concept of encapsulation stopping the spread of complexity outside of a component. Internally and externally, OO has taught us to minimize coupling, and maximize cohesion. These two principles are the underpinning of most (all?) complexity management techniques in this paradigm.

Closely related to encapsulation is the idea of service boundaries and a bounded context. Service boundaries are meant to cut off complexity at a specific point and provide a simple, well-thought-out contract for interacting with a module, thereby reducing the surface area of interactions with it and hiding the complexity within. A bounded context helps to identify models that should be kept separate in order to avoid complexity.
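As a rough sketch of what such a contract might look like, consider a hypothetical order module. All the names here (OrderService, place_order, OrderConfirmation) are assumptions made up for this example. The pricing and inventory models stay inside the boundary, and the only surface other modules can touch is one method and one small result type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderConfirmation:
    """The only data that crosses the boundary on the way out."""
    sku: str
    quantity: int
    total_cents: int


class OrderService:
    """The boundary: place_order is the whole public contract."""

    def __init__(self, prices_cents, stock):
        self._prices_cents = dict(prices_cents)  # internal pricing model
        self._stock = dict(stock)                # internal inventory model

    def place_order(self, sku, quantity):
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        if sku not in self._prices_cents:
            raise ValueError(f"unknown product: {sku!r}")
        if self._stock.get(sku, 0) < quantity:
            raise ValueError(f"insufficient stock for {sku!r}")
        self._stock[sku] -= quantity
        return OrderConfirmation(sku, quantity, self._prices_cents[sku] * quantity)
```

A caller only needs `OrderService(prices, stock).place_order("widget", 2)` and the returned confirmation; how stock is tracked or prices are stored can change without the caller noticing.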

A more technical way to reduce complexity in a codebase is to apply Inversion of Control / Dependency Injection (DI). It promotes simplicity in classes and reduces coupling between components. The former pushes down local complexity - we can zoom in and easily understand a single component, change it, or replace it - but the latter increases global complexity, since the system is more difficult to understand in its entirety and it becomes increasingly difficult to identify all the execution paths through it.
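The trade-off can be seen in a small constructor-injection sketch. The classes below (SignupService, ConsoleNotifier, RecordingNotifier, build_app) are hypothetical names chosen for illustration, not taken from any particular framework.

```python
class ConsoleNotifier:
    """One possible collaborator: writes messages to standard output."""

    def send(self, message):
        print(message)


class RecordingNotifier:
    """Another collaborator: remembers messages, handy in tests."""

    def __init__(self):
        self.messages = []

    def send(self, message):
        self.messages.append(message)


class SignupService:
    """Low local complexity: depends only on 'something with send()'."""

    def __init__(self, notifier):
        self._notifier = notifier  # injected, never constructed here

    def register(self, email):
        if "@" not in email:
            raise ValueError(f"not an email address: {email!r}")
        self._notifier.send(f"Welcome, {email}!")
        return email.lower()


def build_app():
    """Composition root: the only place that knows which notifier is used."""
    return SignupService(ConsoleNotifier())
```

SignupService is easy to read, test, and replace in isolation. The price is that reading it no longer tells you what `send` actually does at runtime; to answer that you have to find the composition root, which is the rise in global complexity described above.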

If this technique pushes up one type of complexity, why implement it? My theory is that trading local complexity for global complexity favours the maintenance phase over the development phase. In the development phase the focus is on the creation of code - major components are designed and created, and we are expected to have a clear vision of the system as a whole (we have to deliver this thing, don’t we?). In the maintenance phase, we generally make smaller changes, and the focus is on “zooming in” to an appropriate level to fix, refurbish, or replace components to serve a new purpose. DI makes sense since the maintenance phase accounts for the majority of the effort put into a software product.

Other techniques for managing complexity

In Matt’s post he rightly points out a strategy that is generally underutilized - if the problem is complex, it might be possible to redefine the problem and sidestep that complexity entirely.

Do the simplest thing possible. Your customers will love you for it.
