There’s a debate going on in the world, a tension between the CS and the SWE majors. As a professional programmer, I will be the first to admit that it is exceedingly rare that I compute the big-O of a function I work on. It’s even rare that I work on a function that’s not O(1). So, one might argue, what is the point of a four year computer science degree? What advantages has it given me, or my employer?
So, there’s an argument to be had that maybe one needs less education. After all, in my four year degree, never did they teach me of git or Apache or Python. I did take a C++ class, and I can tell you for a fact that I have since forgotten every single lesson. But maybe, the argument goes, I would be more productive if I had taken classes on git and C++ and other such things.
But this is untrue. For several days ago, I had a need which called upon the very depths of my computer science knowledge.
I needed to compare two files. Well, strictly speaking, the output of a program with a file. And so, to compare the hashes, I pulled up my trusty terminal and ran:
$ ./myprogram | xxd | md5sum
Oops! I accidentally included xxd. Now I’d have to rerun the program (which takes approximately 200ms) without it!
Of course, I didn’t type this out originally. Originally it was:
$ ./myprogram | xxd | less
Only after manually inspecting the output did I decide to compare the two.
My hand hovered over the keyboard to rewrite the line, but then it occurred to me: xxd is an injective function. That is, distinct inputs always produce distinct outputs.
Additionally, md5sum is a hash function. That means that, for two distinct values b1 and b2, md5sum(b1) probably doesn’t equal md5sum(b2). After all, this is why people use it to compare two values in the first place.
Therefore, since xxd is an injective function, if b1 != b2 then xxd(b1) != xxd(b2). Thus, if b1 != b2 then md5sum(xxd(b1)) probably != md5sum(xxd(b2)).
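For the skeptical, the argument can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `binascii.hexlify` stands in for xxd (its output format differs, but it is likewise an injective encoding), `hashlib.md5` stands in for md5sum, and the byte strings are made-up placeholders for the program output and the file contents.

```python
import binascii
import hashlib


def hex_encode(data: bytes) -> bytes:
    """Stand-in for xxd (assumption): an injective encoding of the input."""
    return binascii.hexlify(data)


def md5_hex(data: bytes) -> str:
    """Stand-in for md5sum (assumption): the MD5 digest as a hex string."""
    return hashlib.md5(data).hexdigest()


def hash_of_dump(data: bytes) -> str:
    """Mirrors `... | xxd | md5sum`: hash the encoded form, not the raw bytes."""
    return md5_hex(hex_encode(data))


# Hypothetical data, invented for this example.
program_output = b"hello, world\n"
matching_file = b"hello, world\n"
different_file = b"hello, World\n"

# The encoding can be undone, which is exactly why it is injective.
assert binascii.unhexlify(hex_encode(program_output)) == program_output

print(hash_of_dump(program_output) == hash_of_dump(matching_file))   # True
print(hash_of_dump(program_output) == hash_of_dump(different_file))  # False
```

Equal inputs give equal hashes because both steps are deterministic. Distinct inputs stay distinct after encoding, so the hash step is no more likely to collide than it would be on the raw bytes.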
Having satisfied myself with this mathematical proof, I proceeded to get the file hash:
$ cat ./myfile | xxd | md5sum
and proceeded on my day, pleased at the amount of time I was able to save due to what I learned in my theoretical computer science classes.