The astute reader may have noticed that I pulled a fast one in the last blog post. Last week, we replaced this code:
if (regex_match(portIterator->second, boost::regex("0*([0-9]{1,4}|[1-5][0-9]{4}|6[0-4][0-9]{3}|65[0-4][0-9]{2}|655[0-2][0-9]|6553[0-5]);"))) {
...
}
with this:
const int port_no = std::stoi(portIterator->second);
if (port_no <= 65535) {
...
}
and we sort of left it at that, and moved on to other things.
But these code pieces actually aren’t equivalent, for two reasons. First, stoi will accept a negative number, so the condition should be: port_no >= 0 && port_no <= 65535. But also, what happens if portIterator->second is something really weird, like E leke te eet epples end bennenes? It throws an invalid_argument exception.
And that’s what I want to talk about today, error handling. Every language has a different way of doing it. Probably the most common methods are
- no language support/integer return codes (e.g. C),
- exceptions (familiar from what I’ll term the “second gen” languages: C++, Python, Java, C#, etc.) and
- “status chaining” (which is really a way of working around method 1)
Obviously, you should most likely use the error conventions of the language you’re in. If you’re in Python, use exceptions, not return codes. If you’re in C, return an integer which is zero on success. But it is clear that, when you are choosing a language for a task, error handling will affect the maintainability of your code.
Let’s look at the “exceptions” method. A common operation in maintaining code is refactoring code - removing old unused functions, splitting up huge unreadable functions, moving code into a location where it can be reused. When you’re doing these things, you want to be sure that you did them correctly. 90% of internet advice for “how to refactor old code” is “first, write unit tests.” The reason for this is so that as you change the code, you can know that you didn’t break anything.
And that’s exactly what happened here, we broke something. The first hole in our Swiss Cheese model is: “does it look right to the programmer?”
const int port_no = std::stoi(portIterator->second);
if (port_no <= 65535) {
...
}
Nothing really says “oh hey, that function might throw.” If you’re a C++ programmer (which I only am at my day job), maybe you just happen to know that std::stoi throws. But what if it isn’t from the standard library?
const int port_no = my_parse_port_no(portIterator->second);
Does that throw? You have to look at the implementation of my_parse_port_no. And then, you get there and it looks like this:
int my_parse_port_no(std::string port) {
    // Special case because of a bug we found in 1976.
    // Here's a dead link to the bug: https://disabled-bug-tracker.example.com/bug5.aspx
    if (is_gobblygook_port(port)) {
        return 5;
    }
    const char* port_string = remove_possible_semicolon(port);
    return parse_standardized_portno(port_string);
}
Great! Now you have to look at the definitions of is_gobblygook_port, remove_possible_semicolon, and parse_standardized_portno. And then you have to read the definitions of the functions those functions call, and it turns out that remove_possible_semicolon isn’t even in your codebase; it’s linked from a static library last compiled in 2005, and the source code was only stored in Lotus Notes, which got burninated.
So, does it throw? Good question, go hire an intern and have them go scrounge around and figure it out for a summer project.
There are solutions to this problem. Java has a throws keyword, which indicates that a function may throw an exception. Transcribing this to the C++ above, it might look like:
int my_parse_port_no(std::string port) throws std::out_of_range, std::invalid_argument, my_exception::insufficient_gobblygook {
...
}
but as you’ve noticed, this can get quite long (disclaimer: I don’t have any personal experience with this scheme). Additionally, you have to manually list out every exception that your function could throw, for every function in your call chain.
The Java scheme does have one nice thing about it - this is a compile error:
const int port_no = std::stoi(portIterator->second);
if (port_no <= 65535) {
...
}
because the compiler knows that std::stoi throws, and it knows that I didn’t add a try/catch around it. This is wonderful! It means that we programmers, who charge by the hour, can spend our time thinking about more important things, like “is portIterator->second a valid pointer we can dereference?” and “where are we going for lunch today?”
But don’t stop reading here and switch all your codebases to Java. This blog isn’t an advertisement for Java. (Officially, this blog isn’t an advertisement for anything. I was not paid to place any products in this blog.)
Let me take a minute to talk about our sponsor, FlexPak Package Leak Detection Equipment. Tired of producing defective packaged goods, packages, packages with air, vacuum-packed packages and much, much more? The FlexPak box-with-handle can help you with all these things and more!
Also, Rust can help you with all these things and more!
Rust handles errors by returning a Result object, which is an enum (a tagged union, for you C folk) containing either an “ok” value or an error:
fn my_parse_port_no(port: &str) -> Result<i32, MyError> {
//
}
This has the property that errors are, like with the Java scheme, difficult to ignore if you care about the result:
let port: i32 = my_parse_port_no(port_string);
will fail to compile, because the left side is an i32 and the right side is a Result<i32, MyError>. You have to consciously choose a way to handle it:
// Will trigger a compiler warning, and also you don't get the value
my_parse_port_no(port_string);
// Explicitly ignore both the error and the value
let _ = my_parse_port_no(port_string);
my_parse_port_no(port_string).ok(); // These are similar
// Will panic if this is an error, which by default ends the program when it happens on the main thread. Use it when an error here is simply unrecoverable.
let port: i32 = my_parse_port_no(port_string).unwrap();
// If this was an error, return from this function with that error immediately.
let port: i32 = my_parse_port_no(port_string)?;
// Do something custom!
let port: i32 = match my_parse_port_no(port_string) {
    Ok(x) => x, // Forward the "Ok" (non-error) value on
    Err(e) => {
        // Handle the error our own custom way. Maybe error and return:
        log::error!("Error! I don't like gobblygook! Got error: {:?}", e);
        return Err(e);
    }
};
// Using the snafu crate: Add some extra information, and pass the error up
let port: i32 = my_parse_port_no(port_string).context(UserOptionParseError)?;
(snafu)
We’ll talk about both pattern matching and the log crate more in future posts, stay tuned.
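To make the fragments above something you can compile and poke at, here is a minimal sketch using only the standard library. The variants of MyError and the body of my_parse_port_no are assumptions made up for illustration (the post never spells them out), and describe_port and port_or_default are hypothetical callers.

```rust
use std::num::ParseIntError;

// Hypothetical error type: the variants are an assumption for this sketch.
#[derive(Debug, PartialEq)]
enum MyError {
    // The string was not an integer (or did not fit in an i32).
    NotANumber(ParseIntError),
    // The string was an integer, but outside 0..=65535.
    OutOfRange(i32),
}

// Assumed implementation: parse first, then range-check.
fn my_parse_port_no(port: &str) -> Result<i32, MyError> {
    let n = port.parse::<i32>().map_err(MyError::NotANumber)?;
    if (0..=65535).contains(&n) {
        Ok(n)
    } else {
        Err(MyError::OutOfRange(n))
    }
}

// `?` hands any error straight back to our caller; we continue only on success.
fn describe_port(port_string: &str) -> Result<String, MyError> {
    let port: i32 = my_parse_port_no(port_string)?;
    Ok(format!("listening on port {}", port))
}

// `match` lets us recover instead: fall back to a default port on any error.
fn port_or_default(port_string: &str, default: i32) -> i32 {
    match my_parse_port_no(port_string) {
        Ok(x) => x,
        Err(_) => default,
    }
}

fn main() {
    println!("{:?}", describe_port("8080"));
    println!("{:?}", describe_port("-1"));
    println!("{}", port_or_default("blah", 80));
}
```

Note that the negative-number case from the top of this post now has to be written down somewhere: it is a variant of the error type, and every caller sees Result in the signature.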
There’s also the implicit advantage that you will never get an unexpected error. Originally, when we rewrote that code that got us into this mess, we accidentally added a whole new type of error that our function could throw. Both the Java and the Rust methods would’ve caught that.
This method also allows you to do the common operation of capturing an error, and passing it along with context. Errors are intended to give information to the user, so that they can change their behaviour, so which would you like your user to see?
- Error: Could not parse option string “key1=foo,key2=bar,key3=baz,port=blah,key4=pawn,etcetcetc”
- Error: “blah” is an invalid number
- Error: Could not parse option string: Port “port=blah”: “blah” is an invalid number
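One way to build up the third message is to have each layer wrap the error from the layer below it. The sketch below does this by hand with the standard library only, rather than with snafu. All of the type and function names (InvalidNumber, PortError, OptionStringError, parse_number, parse_port_option, parse_option_string) are hypothetical, as is the assumption that options are comma-separated key=value pairs.

```rust
use std::fmt;

// Hypothetical layered error types, one per level of the call chain.
#[derive(Debug, PartialEq)]
struct InvalidNumber(String);

#[derive(Debug, PartialEq)]
struct PortError {
    option: String,
    source: InvalidNumber,
}

#[derive(Debug, PartialEq)]
struct OptionStringError {
    source: PortError,
}

impl fmt::Display for InvalidNumber {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "\"{}\" is an invalid number", self.0)
    }
}

impl fmt::Display for PortError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Add this layer's context, then defer to the wrapped error.
        write!(f, "Port \"{}\": {}", self.option, self.source)
    }
}

impl fmt::Display for OptionStringError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Could not parse option string: {}", self.source)
    }
}

// Lowest layer: only knows about the number itself.
fn parse_number(s: &str) -> Result<i32, InvalidNumber> {
    s.parse::<i32>().map_err(|_| InvalidNumber(s.to_string()))
}

// Middle layer: knows which option the number came from.
fn parse_port_option(option: &str) -> Result<i32, PortError> {
    let value = option.strip_prefix("port=").unwrap_or(option);
    parse_number(value).map_err(|source| PortError {
        option: option.to_string(),
        source,
    })
}

// Top layer: finds the port option (if any) in the whole option string.
fn parse_option_string(options: &str) -> Result<Option<i32>, OptionStringError> {
    match options.split(',').find(|o| o.starts_with("port=")) {
        Some(option) => parse_port_option(option)
            .map(Some)
            .map_err(|source| OptionStringError { source }),
        None => Ok(None),
    }
}

fn main() {
    let input = "key1=foo,key2=bar,key3=baz,port=blah,key4=pawn";
    if let Err(e) = parse_option_string(input) {
        println!("Error: {}", e);
    }
}
```

Each layer contributes only what it knows, and the user sees the whole chain. A fuller version would probably also implement std::error::Error on these types, and a crate like snafu generates much of this wrapping for you.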
And so, we arrive at our recurring moral: that the tools your language provides you are the bedrock of your ability to write maintainable programs, and that what you may have considered to be an inconsequential “opinion-based” choice (Alice likes exceptions, Bob likes status chaining, Charlize likes deleting the user’s Documents folder) can have major effects down the road, over the course of a codebase’s lifetime.
This blog series updates every week at Programming for Maintainability