Explaining Inheritance

I like to use metaphors to explain things. Often, they aren’t very good. But once in a while I come up with something that seems to work pretty well.

Inheritance in object-oriented design isn’t the easiest thing to explain, particularly the idea of an “abstract” class. But if we think about classes as literal “buckets” of state, then what’s the equivalent of an abstract class? What about a bucket with a hole in it: a leaky bucket?

A leaky bucket isn’t very useful on its own; you have to plug the hole in one way or another. Depending on how you plug the hole, however, you’ll get different outcomes. Plug the hole with a hose and it can help you water your garden. Plug the hole with a piece of screen or cloth and you’ve built yourself a filter. Plug the hole with a bit of waterproof tape and, well, now the leaky bucket is just a bucket.

Each of these tools involves holding or collecting water, like the shared data and/or behavior present in an abstract class. But each of them requires its own, separate implementation in order to work properly; this is the specialization of a concrete class.
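
To make the metaphor concrete, here is a minimal Dart sketch. The names and behaviors are my own invention; the point is only the shape of the relationship:

// A leaky bucket can hold water (shared state and behavior),
// but it isn't useful until the hole is plugged.
abstract class LeakyBucket {
    double liters = 0;

    void fill(double amount) {
        liters += amount; // shared behavior lives in the abstract class
    }

    void plugHole(); // the "hole": every concrete subclass must plug it
}

// Plug the hole with a hose and you can water the garden.
class GardenHose extends LeakyBucket {
    @override
    void plugHole() {
        print('Water flows out through the hose.');
    }
}

// Plug the hole with a screen and you've built a filter.
class WaterFilter extends LeakyBucket {
    @override
    void plugHole() {
        print('Water drains through the screen, leaving debris behind.');
    }
}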

Your mileage may vary, but I’ve found this to be a useful metaphor when I explain inheritance to students.

Why I like static types

I generally prefer statically typed languages. I just do. All else being equal, I will choose a statically typed language over a dynamically typed language every time. But why? And why did I qualify that statement?

Well, that’s a little more complicated.

Let’s take a look at a simple function that determines whether or not a value is prime. The version below, written in JavaScript, doesn’t technically contain any type information.

function isPrime(n) {
    let i = 2;
    while (i <= Math.sqrt(n)) {
        if (n % i === 0) {
            return false;
        }
        i++;
    }
    return true;
}

When I say that this code doesn’t “technically” contain any types, I mean that there are no types specified in the program text. The parameter n, the local variable i, and the return value could be anything, at least as far as JavaScript is concerned. There are, however, some implied types. For instance, the value we provide for n must be in the domain of the % (modulo) operator.

But, as programmers, we care about more than the language-level semantics of our programs.

The idea of a prime number itself only really makes sense for integers greater than 1. It wouldn’t make sense to ask whether the string “hello” is prime, for example. So while JavaScript doesn’t really care about types, we certainly do, because our problem domain (primality) does.

If we rewrite this function in Dart, a language that allows type annotations, we can capture at least some of the semantics of our problem domain within the text of our program.

import 'dart:math';

bool isPrime(int n) {
    int i = 2;
    while (i <= sqrt(n)) {
        if (n % i == 0) {
            return false;
        }
        i++;
    }
    return true;
}

Note that it is no longer possible to pass “hello” to this function (well, you can, but the program won’t run). This is helpful because inputs other than integers don’t make sense within the problem domain anyway. So rather than add code to handle such mistakes, we can change the program so that it will refuse to even compile or run.

The point here is that our problem (finding prime numbers) has types, so it makes sense for our program to have (the same) types.

However, you might have noticed that our second function doesn’t actually have “the same” types as our problem. Ideally, we would like to require that n be an integer greater than 1. Unfortunately, we can’t express this idea with Dart, at least not in a way that would be likely to result in a satisfying experience for users of our function.

While the types don’t get us quite to where we want to be, we can still use runtime checks to finish the job. In this case we can provide some pretty helpful error messages as well. We could also decide to just return false for integers that don’t make sense (this is also a nice example of how we can define errors away). That being said, something is better than nothing, at least in my opinion.

import 'dart:math';

bool isPrime(int n) {
    if (n < 2) {
        return false;
    }
    int i = 2;
    while (i <= sqrt(n)) {
        if (n % i == 0) {
            return false;
        }
        i++;
    }
    return true;
}
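
And if we preferred loud, helpful errors over silently returning false, a small variation could throw instead. This is just a sketch; the message wording is mine:

import 'dart:math';

bool isPrime(int n) {
    if (n < 2) {
        // Fail loudly, with a message drawn from the problem domain.
        throw ArgumentError.value(
            n, 'n', 'primality is only defined for integers greater than 1');
    }
    int i = 2;
    while (i <= sqrt(n)) {
        if (n % i == 0) {
            return false;
        }
        i++;
    }
    return true;
}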

Sometimes, the problem domain itself is more difficult to translate into a program and types can help smooth the way. For example, say we have a function that accepts a URL:

function sendRequest(url) {
  // ...
}

What does a URL look like? Do we need the leading “https://”, or is this a situation where we can use “//” to infer the protocol (like the href attribute on an HTML anchor tag)? Furthermore, how do we verify that what we were handed is a valid URL? That could be a lot of work. We could write a reusable function to validate a URL, but if we’re going to go down that road we might as well make it a type.

void sendRequest(Uri url) {
  // ...
}
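
Callers then construct the Uri up front, typically with Dart’s Uri.parse. A minimal usage sketch (the endpoint is made up):

void main() {
    sendRequest(Uri.parse('https://example.com/api/v1/ping'));
    // sendRequest('https://example.com'); // compile-time error: a String is not a Uri
}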

Once again, our problem domain has types, and by introducing those types into our program we can both simplify our code and make it easier for others to use.

Earlier, I implied that I might choose a dynamically typed language under certain circumstances. A programming language is just a tool, an abstraction over the machine to facilitate human interaction. The right tool for a job depends on the job.

Writing a browser extension, for example, is easily done in JavaScript (although today TypeScript is closing the gap). Elixir / Erlang can be a great choice for scalable server applications. Racket makes it easy to create DSLs. There is a seemingly infinite selection of machine learning and data analysis tools based on Python. In these cases, and others, the ecosystem that surrounds a language can be important enough that it outweighs other considerations, such as static types.

At the end of the day, I prefer statically typed languages because they allow me to represent more of my problem domain in the program itself. But I try hard to remember that the right tool for the job isn’t the one I like best, but the one that is most likely to result in a correct, useful piece of software.

Never Complains

Occasionally I hear someone compliment a software developer by observing that the individual never complains, even when things get ugly. Now, I realize that sometimes things happen that are beyond our control. I realize that complaining about things we can’t change, while sometimes cathartic, is almost never materially helpful. And I realize that too much negativity can reduce the effectiveness of a team and its members.

All that being said…

Another way to spell “complaint” is “feedback”, and we know that feedback can be helpful. We deliberately solicit feedback from users (and quite often what we get are actually complaints). So why wouldn’t we encourage feedback (even if it sometimes rises to the level of complaints) from our teammates?

When we encourage one another to keep silent about problems in order not to be seen as “complaining”, we can miss opportunities to improve our development processes and team dynamics. We may also miss out on potential process innovations that could improve life for ourselves and our users.

So, rather than encourage silent acceptance of whatever might occur, try to promote constructive complaints (feedback) among your teammates.

Compulsive automation

Programmers tend to have a disease: we compulsively automate. That is, no matter the task, we are always on the lookout for ways to automate it, regardless of how much (or little) we gain by doing so. The problem is that we too often end up with very small, or even negative, gains.

Automation can be viewed as a kind of optimization, and everyone knows that optimizing too early can cause problems. Certainly a task shouldn’t be automated unless it will need to be carried out repeatedly and doing it by hand will be costly. However, compulsive automation seems to come in a few other varieties as well.

The first is when so much time is spent on automation that it kills, or disproportionately hinders, the overall project. In this case, there might be very good reasons for automating, but the resources to actually carry it out may not exist.

This can happen at the very beginning of a project. Prematurely setting up continuous integration, version control, and a reproducible development environment can, in some cases, prevent a project from getting off the ground. Automation at the “end” of a project can also lead to problems. I personally struggle with this more than any of the others. Deploying an application is a great example.

You’ve got your snazzy new app (or whatever) working and you’re ready to show it to the world. You could set up a snowflake server, but everyone knows that’s a bad idea. So you decide to automate. You then proceed to fiddle around with Chef or Ansible until you run out of steam and never actually deploy anything, or you deploy but never actually make any updates (which would have justified the automation effort).

In the long run, automating deployments is the right thing to do. But when you’re deploying a prototype or a side project the extra time required up-front can hurt your momentum. It doesn’t matter how much theoretical time you’ll save in the future if no one ever sees your work.

A second variety of unwise automation is when automation reduces the burden on the person doing the automating but transfers it to others, sometimes even magnifying it in the process. The implementation of information systems tends to be an ugly business. We often forget that many of the ugliest systems actually seem clean and elegant to their users. Sometimes the price of this elegance is manual effort behind the scenes. This effort can often be eliminated, but doing so usually requires either significant technical investment or the imposition of constraints on end-users. I noticed a great example of this phenomenon on Hacker News the other day (which actually inspired this blog post).

It was revealed that the volunteer who has been (manually) aggregating hiring-related posts for the past four years has decided to step down. Shortly thereafter, a specification for hiring posts was proposed. The spec itself isn’t bad; it tries to split the difference between human- and machine-readability and does a decent job of it. However, it would require anyone who wanted to post a job to read, understand, and follow the spec.

This wouldn’t be a big deal if the same people posted jobs over and over again, but the community discourages posts from recruiters and HR employees. This means that most people who post will only post occasionally, increasing the odds of having to re-learn the spec every single time.

It seems reasonable, given that someone was willing to do the job manually for four years, to assume that the amount of effort involved in aggregating job posts is manageable. So a spec would save a relatively small amount of time behind the scenes, but at a large (total) cost on the part of the posters.

To be fair, a spec for hiring posts might make them easier to search, but a couple of bullet points with suggestions for how to write an effective job post would solve this problem just as well.

The final problematic form of automation is when the automation itself becomes a larger project than the original task. I think this usually happens because we delude ourselves into believing that the automation project will be “easy”. Even when the automation is fairly straightforward, feature creep can turn a 10-line shell script into a 10,000-line application before anyone even realizes what is happening.

However, this kind of automation isn’t always a bad idea. If the automation tool can be released for use by others, the total time saved across all users may be greater than the time it took to build the solution. We see this dynamic a lot with open source software. Of course the time must still be justified internally, perhaps trading time for goodwill from the community.

So what is to be done? Certainly we shouldn’t stop automating; the benefits are just too great. What we should do is always consider the context in which an automation project exists. We should think explicitly about the benefits of automating and when they will be realized, whether automating will actually put an additional burden on users, and whether the realistic cost of automating is actually worthwhile.

Image credit: XKCD: Automation

Maintenance

I started on a new project last week, the way I usually do: with a sketch of how it should work, the inputs, the outputs, and a general idea of the data flow. Then I did a rough prototype. Progress was pretty fast, partly because the concept is simple, and partly because I wrote something similar for a past employer. After hacking on it over the weekend, I spent yesterday doing “cleanup” to get it ready for actual use. Yesterday afternoon I realized that, despite getting quite a bit done, I felt I had hardly made a dent in my “todo” list for the project.

This made me ponder. I feel as though the speed at which I can complete a given project has fallen since I was a kid. I remember being 13 or so and starting something after dinner, staying up all night, and having a working application in the morning. Has something changed since then? Or is it just my perception or poor memory?

On the one hand, the “quality” of the code I write today is much higher. For instance, when I was a kid I saw no problem with using a text field as the canonical storage for a piece of data. I also remember a lot of deeply nested branching statements and extremely long functions. Code quality certainly explains some of the “slowdown” I perceive.

But there is more to it. It’s not that I wrote crappy code as a kid because I had no choice; in some cases I actually knew better, and I certainly had no shortage of books from which I could have learned the rest. As I thought about my younger self during my walk to the metro station last night, I realized that the greatest difference between my younger self and my present self is that my younger self didn’t expect to have to maintain any of the code he wrote. Once it was finished, it was finished; I moved on to the next project. (In a way, this reflected the prevailing software release cycle of the time, though I’m not sure whether that influenced me.)

Today, I expect to have to maintain the code I write. Every time I write a line I unconsciously consider whether I’ve just written a check I’ll be asked to cash later. This means I rewrite more lines than I used to, or take longer to write them the first time (more thinking, less coding). It also means that I test and document more.

Oh well, back to (slowly) writing code.