Programming for Maintainability 4, The Swiss Cheese Model

Many years ago, my flight instructor once described to me the “Swiss cheese model” in crash analysis and risk management. The thinking goes, you can think of each layer of defense against a bad outcome, such as a plane crash, as a slice of Swiss cheese. For a midair collision, maybe one slice is the pilot’s visual scan. A second slice is ATC’s radar screen. A third slice is a traffic display on a moving map. Each of these slices has holes: Humans have difficulty noticing small specks that stay still. ATC may be distracted by another area of the screen. You may have a moving map outage, or the other aircraft may not be ADS-B equipped. If all of these holes line up, then you have a crash.

We can think about software engineering in a similar way. We want to prevent bugs from entering production. So we have compilers that give us warnings. We have unit tests. The developer tests the code at their desk. We have peer review. The code doesn’t then go directly to master, it goes to a pre-release branch for a while so other developers can run the code. We have dedicated automated tests. Then sometimes there’s a manual testing phase. And only then, is the code released to the wild.

Any one of these layers can prevent a bug from being released. But, as happens too often, if the holes all line up a bug can get through into production.

And you can consider this when you’re programming. Next time you write code, ask yourself - what layers are in place to prevent me from making a mistake?

As is required for this blog, I have a code snippet. In this example, we have a piece of hardware which can operate in two directions - RX and TX - but the operation is symmetric, and simply uses some different constants (for register addresses and such). So, consider this code:

switch (direction)
{
case kDirectionRx:
    contextUid = kRxContextUid;
    currentVersion = kRxCurrentVersion;
    oldestCompatibleVersion = kRxOldestCompatibleVersion;
    break;

case kDirectionTx:
    contextUid = kTxContextUid;
    currentVersion = kTxCurrentVersion;
    break;

default:
    // Throw invalid direction exception
    throw ...;
}

Obviously, looking at this small snippet, it’s easy to see the bug. A simple fix might be:

 case kDirectionTx:
     contextUid = kTxContextUid;
     currentVersion = kTxCurrentVersion;
+    oldestCompatibleVersion = kRxOldestCompatibleVersion;
     break;

Errr…. oops, copy/paste error. The right fix is:

 case kDirectionTx:
     contextUid = kTxContextUid;
     currentVersion = kTxCurrentVersion;
+    oldestCompatibleVersion = kTxOldestCompatibleVersion;
     break;

When I (as well as a more senior tech lead) reviewed this it was embedded in a larger changelist. So we passed through a hole in the “reviewing” slice of cheese. The value of kTxOldestCompatibleVersion is zero, which means that when we tested the code it worked (the variable happened to be initialized to zero). So we pass through a hole in the “testing” slice as well.

So let’s ask: What slices of cheese can we add to this code? How could this code be safer in one way or another? One observation is that the issue is that we invalidated an invariant: That after this switch statement is executed, the variable will have been assigned to one of two values (or an exception will have been thrown). In the vein of DRY (don’t repeat yourself), we only want to do the assignment once:

if (directionIsNotRxOrTx) {
    // Throw exception
    throw ...;
}
oldestCompatibleVersion = <correct value>;

It’s clear from the above code that, unless an exception is thrown, oldestCompatibleVersion will always be assigned a value. There is no way to “forget” to assign a value.

Obviously, in C++, <correct value> could be a ternary:

oldestCompatibleVersion = direction == kDirectionTx ?
    kTxOldestCompatibleVersion :
    kRxOldestCompatibleVersion;

In Rust, you can directly use an if statement:

let oldestCompatibleVersion = if direction == kDirectionTx {
    kTxOldestCompatibleVersion
} else {
    kRxOldestCompatibleVersion
};

As a side note, in Rust, match statements check that their arms are complete, so if we defined the direction variable correctly, we don’t need to have the default case:

enum Direction {
    Rx,
    Tx,
}
...
let oldestCompatibleVersion = match direction {
    Direction::Tx => kTxOldestCompatibleVersion,
    Direction::Rx => kRxOldestCompatibleVersion,
};

Anyway. This is one layer of cheese. What’s another? Remember that there’s three variables, so this code will have that ternary repeated:

if (directionIsNotRxOrTx) {
    // Throw exception
    throw ...;
}
contextUid = direction == kDirectionTx ? kTxContextUid : kRxContextUid;
currentVersion = direction == kDirectionTx ? kTxCurrentVersion : kRxCurrentVersion;
oldestCompatibleVersion = direction == kDirectionTx ? kTxCurrentVersion : kRxOldestCompatibleVersion;

This is much better. There’s still some repetition, though, and it’s not very easy on the eyes. (did you see the bug?)

One final modification I’ll suggest is to make a class, DirectionalValue, which encapsulates this:

template <typename T>
struct DirectionalValue
{
    DirectionalValue(T rx, T tx) : _rx(rx), _tx(tx) {}

    T get(tDirection direction) {
        switch (direction) {
            case kDirectionTx: return _tx;
            case kDirectionRx: return _rx;
            default:
                // Throw exception
        }
    }

private:
    T _rx;
    T _tx;
};

With this, when you define the variables, you’re forced to specify both values:

static const DirectionalValue<uint64_t> kContextUid(1, 2);
static const DirectionalValue<uint64_t> kCurrentVersion(3, 4);
static const DirectionalValue<uint64_t> kOldestCompatibleVersion(0, 0);

And, retrieving the values is cleaner, without the caller having to remember to check direction:

const uint64_t contextUid = kContextUid.get(direction);
const uint64_t currentVersion = kCurrentVersion.get(direction);
const uint64_t oldestCompatibleVersion = kOldestCompatibleVersion.get(direction);

Note also that we can now make the variables const, adding a slice of cheese that prevents us from accidentally modifying the variables.

Also, the class DirectionalValue can now be unit tests, where the previous code wasn’t.

Of course, even this class isn’t perfect. You could pretty easily confuse RX and TX in the definition:

static const DirectionalValue<uint64_t> kCurrentVersion(3, 4); // Is 3 for RX or for TX?

Maybe you could mitigate that by naming the class RxTxDependentValue, although that’s a pretty tenuous connection to make.

Likewise, you can still simply use the wrong class when you’re retrieving a variable:

const uint64_t oldestCompatibleVersion = kCurrentVersion.get(direction);

This is mitigated somewhat by naming.

Additionally, one could argue that these values should be passed in from the constructor, and the person constructing the class containing this code should pass in the correct RX or TX constants. And, though true, that simply moves the burden of choosing the variables up a layer. I’m sure we’ll talk about dependency injection in some future post, so in case I don’t see ya’, good afternoon, good evening, and good night.

This blog series updates every week at Programming for Maintainability