So during the Second World War (the less great one), in the Pacific theater, the Americans and the Japanese fought largely over pacific islands (famously places like Okinawa, Guam[1], Palau, etcetera). This was almost entirely for logistical reasons - The US needed to control the area in order to launch bombers and resupply forces in the region, and they provided access to the Japanese homeland in the event that a direct invasion was required. The two atomic bombs were, for example, launched from North Field on Tinian - a small island in the Pacific approximately a thousand miles South of Japan.
But there were people who already lived on these islands. Disney made a whole movie about them. And when the warring powers landed on their islands and airdropped medical supplies and processed foods and such, it radically changed the lifestyle of the natives. Then the war ended and the powers went home. But some islands developed so-called “cargo cults”: Religions which worshipped the imperial powers as bringers of canned goods, adopting the rituals of the US Army much like a rain dance. The story goes, they would carve headphones out of wood and sit in wooden air control towers looking out over empty patches of grass, thinking that would make the planes come back.
At least that’s how the story goes.
Anyway, that’s just a fun little history lesson that came to my mind.
So. The other day I was watching a LinkedIn training on how to write “clean code,” inspired by the book of the same name, and they showed the following example for determining if an ASCII sentence was a “properly formed” sentence:
if (input[0] == Char.ToUpper(input[0]) && (
input.EndsWith(".") ||
input.EndsWith("?") ||
input.EndsWith("!")
))
(This is Java, by the way)
I understand that this is not going to check grammar or anything, and is a simplistic check. That isn’t the point. The point is, the presenter’s north star was:
if (IsValid(input)) {
…
}
private static bool IsValid(string input) {
return IsCorrectlyCapitalized(input) && EndsCorrectly(input);
}
private static bool IsCorrectlyCapitalized(string input) {
return input[0] == Char.ToUpper(input[0]);
}
private static bool EndsCorrectly(string input) {
return EndsWithStatementMark(input) ||
EndsWithQuestionMark(input) ||
EndsWithExclamationMark(input);
}
private static bool EndsWithStatementMark(string input) {
return input.EndsWith(".");
}
private static bool EndsWithQuestionMark(string input) {
return input.EndsWith("?");
}
private static bool EndsWithExclamationMark(string input) {
return input.EndsWith("!");
}
Creating six new methods to take the place of the one condition. Some of these make sense. IsCorrectlyCapitalized
makes sense, because it is not very clear at all what input[0] == Char.ToUpper(input[0])
does. Arguably, the method should be called either IsSentenceCorrectlyCapitalized
(to emphasize that the input should be a complete sentence, and not e.g. a single word) or IsUpperCase
(to avoid saying that Upper Case is the “correct” case in every situation). I dislike contrived examples, so I won’t go into which case makes sense here (since you could contrive a scenario where either makes sense), but I will mention these two options and their rationales. There may be others.
Likewise, EndsCorrectly
makes sense. It’s a combination of three conditions, so it’s just cleaner, and it says on the tin what it does. Of course, the same two changes apply: Do you want to say what ends correctly? (in which case, SentenceEndsCorrectly
), or do you want to say why this is correct? (in which case, something like EndsWithTerminatingPunctuation
).
IsValid
is reasonable too. If there’s a lot of places where sentences are checked for validity, the two conditions should obviously be wrapped up together. However, although I said that the two methods above made sense, it should be considered whether it makes sense for them to be separate. Does it ever make sense to check if a sentence is correctly capitalized, outside of checking if the sentence is valid? (or, equivalently, if the sentence ends in a proper punctuation?)
If not, providing such a method risks confusing the programmer. If they want to check a sentence, should they call IsValid
, EndsCorrectly
, and/or IsCorrectlyCapitalized
? Obviously, we don’t want IsValid
to contain just the original condition, but languages support a nice middle ground for naming values, called variables:
private static bool IsValid(string input) {
bool ends_correctly = EndsWithStatementMark(input) ||
EndsWithQuestionMark(input) ||
EndsWithExclamationMark(input);
bool is_upper_case = input[0] == Char.ToUpper(input[0]);
return is_upper_case && ends_correctly;
}
And then we get to the last three methods, EndsWith*
. Take a look at the statement input.EndsWith("?")
. Can you tell me what it does?
If you guessed “it returns true if input
ends with a ?
”, you would have guessed correctly. But by creating this function, we’ve converted input.EndsWith("?")
into EndsWithQuestionMark(input)
. It’s not necessarily worse per se, but it’s not better, and code costs cash. Somebody’s going to be real mad when EndsWithQuestionMark("❓")
returns false - that string clearly ends with a question mark, and also clearly does not .EndsWith("?")
. (if your browser doesn’t render it properly, note that is a U+2753 “Black Question Mark Ornament”)
The clean code credo says “add abstraction! make methods, not war!” but it’s easy to get wrapped up in thinking that adding another level of methods will make your code prettier. So, my proposed IsValidSentence
:
private static bool IsValid(string input) {
bool ends_correctly = input.EndsWith(".") ||
input.EndsWith("?") ||
input.EndsWith("!");
bool is_upper_case = input[0] == Char.ToUpper(input[0]);
return is_upper_case && ends_correctly;
}
Down from six methods to one, and still as readable, I would say.
One might wonder, how did we get here? Well, I noticed an interesting thing. The presenter was using IntelliJ, which provides a right-click shortcut to “factor code into a method.” You can highlight some code, right-click it, and click “move highlighted code into a method.” It’ll do all the typing for you, you never even have to look at the generated code, you just get a pretty function that says EndsWithQuestionMark(input)
.
But when you’re refactoring a large codebase and all you have are wooden headphones, it’s easy to say “this is clearly more readable than what we had before…”
[1] Fun fact, the Japanese invaded Guam the day after Pearl Harbor. They additionally invaded Hong Kong, Malaysia, and Thailand on December 8th, the latter of which only lasted 5 hours before Thailand capitulated.
This blog series updates at Programming for Maintainability