Tag Archives: testing

Write custom assertions whenever possible

I’ve been very interested in readable tests with great error messages recently. Mostly because they kept failing and I wanted the most information possible in order to quickly identify the cause. This is another reason why I like TDD: you see the test failing first, so if the error message isn’t great you’ll know straight away instead of months later.

The good testing frameworks provide a way of writing your own custom assertions. I’d never really looked into them that much before, but now I realize the error of my ways. Recently I wrote a test that contained this line:

fileName.exists.shouldBeTrue;

Readable, right? The problem is when it fails:

foo.d:42 - Expected: true
foo.d:42 -      Got: false

And now you have to go read the test and figure out what went wrong. It’s a lot better to get the information that a file was supposed to exist instead right away. So I wrote a custom assertion and was then ready to write this:

fileName.shouldExist;

With the corresponding failure message:

foo.d:42 - Expected /tmp/foo.txt to exist but it didn't

Now it’s a lot easier to pinpoint where the problem is. For starters, you would probably want to start checking the contents of the surrounding directory, having saved the time you would have had to spend figuring out what exactly was false.

Tagged ,

unit-threaded: now an executable library

It’s one of those ideas that seem obvious in retrospect, but somehow only ocurred to me last week. Let me explain.

I wrote a unit testing library in D called unit-threaded. It uses D’s compile-time reflection capabilities so that no test registration is required. You write your tests, they get found automatically and everything is good and nice. Except… you have to list the files you want to reflect on, explicitly. D’s compiler can’t go reading the filesystem for you while it compiles, so a pre-build step of generating the file list was needed. I wrote a program to do it, but for several reasons it wasn’t ideal.

Now, as someone who actually wants people to use my library (and also to make it easier for myself), I had to find a way so that it would be easy to opt-in to unit-threaded. This is especially important since D has built-in unit tests, so the barrier for entry is low (which is a good thing!). While working on a far crazier idea to make it a no-brainer to use unit-threaded, I stumbled across my current solution: run the library as an executable binary.

The secret sauce that makes this work is dub, D’s package manager. It can download dependencies to compile and even run them with “dub run”. That way, a user need not even have to download it. The other dub feature that makes this feasible is that it supports “configurations” in which a package is built differently. And using those, I can have a regular library configuration and an alternative executable one. Since dub run can take a configuration as an argument, unit-threaded can now be run as a program with “dub run unit-threaded -c gen_ut_main”. And when it is, it generates the file that’s needed to make it all work.

So now all a user need to is add a declaration to their project’s dub.json file and “dub test” works as intended, using unit-threaded underneath, with named unit tests and all of them running in threads by default. Happy days.

Tagged , , , , ,

The importance of making the test fail

TDD isn’t for everyone. People work in different ways and their brains even more so, and I think I agree with Bertrand Meyer in that whether you write the test first or last, the important thing is that the test gets written. Even for those of us for whom TDD works, it’s not always applicable. It takes experience to know when or not to do it. For me, whenever I’m not sure of exactly I want to do and am doing exploratory work, I reach for a REPL when I can and don’t even think of writing tests. Even then, by the time I’ve figured out what to do I usually write tests straight afterwards. But that’s me.

However, when fixing bugs I think “TDD” (there’s not any design going on, really) should be almost mandatory. Almost, because I thought of a way that works that doesn’t need the test to be written first, but it’s more cumbersome. More on that later.

Tests are code. Code is buggy. Ergo… tests will contain bugs. So can we trust our tests? Yes, and especially so if we’re careful. First of all, tests are usually a lot smaller than the code they test (they should be!). Less code means fewer bugs on average. If that doesn’t give you a sense of security, it shouldn’t. The important thing is making sure that it’s very difficult to introduce simultaneous bugs in the test and production code that cancel each other out. Unless the tests are tightly coupled with the production code, that comes essentially for free.

Writing the test to reproduce a bug is important because we get to see it go from red to green, which is what gives us confidence. I’ve lost count of how many fake greens I’ve had due to tests that weren’t part of the suite, code that wasn’t even compiled, bugs in the test code, and many other reasons. Making it fail is important. Making changes in a different codebase (the production code) and then the test passing means we’ve actually done something. If at any point things don’t work as they should (red -> green) then we’ve made a mistake somewhere. The fix is wrong, the test is buggy, or our understanding of the problem and what causes it might be completely off.

Reproducing the bug accurately also means that we don’t start with the wrong foot. You might think you know what’s causing the bug, but what better way than to write a failing test? Now, it’s true that one can fix the bug first, write the test later and use the VCS to go back in time and do the red/green dance. But to me that sounds like a lot more work.

Whether you write tests before of after the production code, make sure that at least one test fails without the bugfix. Even if by just changing a number in the production code. I get very suspicious when the test suite is green for a while. Nobody writes correct code that often. I know I don’t.

Tagged , , ,

To learn BDD with Cucumber, you must first learn BDD with Cucumber.

So I read about Cucumber a while back and was intrigued, but never had time to properly play with it. While writing my MQTT broker, however, I kept getting annoyed at breaking functionality that wasn’t caught by unit tests. The reason being that the internals were fine, the problems I was creating had to do with the actual business of sending packets. But I was busy so I just dealt with it.

A few weeks ago I read a book about BDD with Cucumber and RSpec but for me it was a bit confusing. The reason being that since the step definitions, unit tests and implementation were all written in Ruby, it was hard for me to distinguish which part was what in the whole BDD/TDD concentric cycles. Even then, I went back to that MQTT project and wrote two Cucumber features (it needs a lot more but since it works I stopped there). These were easy enough to get going: essentially the step definitions run the broker in another process, connect to it over TCP and send packets to it, evaluating if the response was the expected one or not. Pretty cool stuff, and it works! It’s what I should have been doing all along.

So then I started thinking about learning BDD (after all, I wrote the features for MQTT afterwards) by using it on a D project. So I investigated how I could call D code from my step definitions. After spending the better part of an afternoon playing with Thrift and binding Ruby to D, I decided that the best way to go about this was to implement the Cucumber wire protocol. That way a server would listen to JSON requests from Cucumber, call D functions and everything would work. Brilliant.

I was in for a surprise though, me who’s used to implementing protocols after reading an RFC or two. Instead of a usual protocol definition all I had to go on was… Cucumber features! How meta. So I’d use Cucumber to know how to implement my Cucumber server. A word to anyone wanting to do this in another language: there’s hardly any documentation on how to implement the wire protocol. Whenever I got lost and/or confused I just looked at the C++ implementation for guidance. It was there that I found a git submodule with all of Cucumber’s features. Basically, you need to implement all of the “core” features first (therefore ensuring that step definitions actually work), and only then do you get to implement the protocol server itself.

So I wanted to be able to write Cucumber step definitions in D so I could learn and apply BDD to my next project. As it turned out, I learned BDD implementing the wire protocol itself. It took a while to get the hang of transitioning from writing a step definition to unit testing but I think I’m there now. There might be a lot more Cucumber in my future. I might also implement the entirety of Cucumber’s features in D as well, I’m not sure yet.

My implementation is here.

Tagged , , , , , , , , , ,