Tag Archives: tdd

Commit failing tests if your framework allows it

In TDD, one is supposed to go through the 3-step cycle of:

  1. Write a failing test
  2. Make it pass
  3. Refactor

The common-sense approach is to not commit the failing test from the first step, since that would throw a spanner in the works when you inevitably have to bisect your commit DAG trying to figure out where a bug was introduced.

I’ve come to a realisation recently – failing tests should be committed, but only if the testing framework being used allows you to mark failures as successes. For instance, in my D testing framework unit-threaded, I’d commit this silly example:

@ShouldFail("WIP")
unittest {
    1.shouldEqual(2);
}

If you’re not familiar with D, it has built-in unit tests, and unittest is a keyword. @ShouldFail is a User Defined Attribute provided by the library. It indicates that the unit test it applies to is expected to fail, and takes an optional string describing why that’s the case. It could be a bug ID as well.

The test above passes if any of the code in the unittest block throws an exception, i.e. it passes if it fails. This way we can have a single commit of the failing test that motivated the code changes that follow it, and we can’t forget to remove @ShouldFail – in fact, if the programmer implements the feature / fixes the bug correctly, they should expect to see the test suite go red. If that doesn’t happen, either the production code or the test is buggy.
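To make the inversion concrete, here is a minimal, dependency-free sketch of the idea in plain D. It is not how unit-threaded implements @ShouldFail. The names checkEqual and failsAsExpected are hypothetical, invented for illustration, and stand in for shouldEqual and the attribute respectively.

```d
// Hypothetical stand-in for an assertion helper such as shouldEqual:
// it throws an Exception when the two values differ.
void checkEqual(int actual, int expected)
{
    if (actual != expected)
        throw new Exception("values differ");
}

// Hypothetical helper modelling "passes if it fails": returns true when
// the wrapped test body throws, false when it unexpectedly succeeds.
bool failsAsExpected(void delegate() testBody)
{
    try
    {
        testBody();
        return false; // body passed, so the inverted test goes red
    }
    catch (Exception e)
    {
        return true; // body failed, which is what was expected
    }
}

unittest
{
    // Work in progress: the body fails, so the inverted test is green.
    assert(failsAsExpected(delegate() { checkEqual(1, 2); }));

    // After the fix the body no longer throws; the inverted test would
    // now fail, reminding us to remove the marker.
    assert(!failsAsExpected(delegate() { checkEqual(2, 2); }));
}
```

Assuming the file is saved as sketch.d (an arbitrary name), the unittest block can be run with dmd -unittest -main -run sketch.d.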

I’m not aware of many frameworks that allow a programmer to do this; pytest has something similar in its xfail marker. If yours does, commit your failing tests.


The importance of making the test fail

TDD isn’t for everyone. People work in different ways and their brains even more so, and I think I agree with Bertrand Meyer in that whether you write the test first or last, the important thing is that the test gets written. Even for those of us for whom TDD works, it’s not always applicable. It takes experience to know when to do it and when not to. For me, whenever I’m not sure of exactly what I want to do and am doing exploratory work, I reach for a REPL when I can and don’t even think of writing tests. Even then, by the time I’ve figured out what to do I usually write tests straight afterwards. But that’s me.

However, when fixing bugs I think “TDD” (there’s not any design going on, really) should be almost mandatory. Almost, because I thought of a way that works that doesn’t need the test to be written first, but it’s more cumbersome. More on that later.

Tests are code. Code is buggy. Ergo… tests will contain bugs. So can we trust our tests? Yes, and especially so if we’re careful. First of all, tests are usually a lot smaller than the code they test (they should be!). Less code means fewer bugs on average. If that doesn’t give you a sense of security, it shouldn’t. The important thing is making sure that it’s very difficult to introduce simultaneous bugs in the test and production code that cancel each other out. Unless the tests are tightly coupled with the production code, that comes essentially for free.

Writing the test to reproduce a bug is important because we get to see it go from red to green, which is what gives us confidence. I’ve lost count of how many fake greens I’ve had due to tests that weren’t part of the suite, code that wasn’t even compiled, bugs in the test code, and many other reasons. Making it fail is important. Making changes in a different codebase (the production code) and then seeing the test pass means we’ve actually done something. If at any point things don’t work as they should (red -> green) then we’ve made a mistake somewhere. The fix is wrong, the test is buggy, or our understanding of the problem and what causes it might be completely off.

Reproducing the bug accurately also means that we don’t start off on the wrong foot. You might think you know what’s causing the bug, but what better way to find out than to write a failing test? Now, it’s true that one can fix the bug first, write the test later and use the VCS to go back in time and do the red/green dance. But to me that sounds like a lot more work.

Whether you write tests before or after the production code, make sure that at least one test fails without the bugfix, even if only by changing a number in the production code. I get very suspicious when the test suite is green for a while. Nobody writes correct code that often. I know I don’t.
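The red/green check can be sketched in plain D as follows. Everything here is hypothetical: the leap-year functions and the regressionTestPasses helper exist only to show the same test failing without the fix and passing with it. In practice the two variants would be two revisions of one function rather than two functions side by side.

```d
// Hypothetical production code before the fix: every fourth year is
// treated as a leap year, which is wrong for years such as 1900.
int daysInFebruaryBuggy(int year)
{
    return year % 4 == 0 ? 29 : 28;
}

// Hypothetical production code after the fix: full Gregorian rule.
int daysInFebruaryFixed(int year)
{
    const isLeap = (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    return isLeap ? 29 : 28;
}

// The regression test that reproduces the bug, written so it can be
// run against either revision of the production code.
bool regressionTestPasses(int function(int) daysInFebruary)
{
    return daysInFebruary(1900) == 28;
}

unittest
{
    // Red: without the bugfix the test must fail, which shows that the
    // test really exercises the bug.
    assert(!regressionTestPasses(&daysInFebruaryBuggy));

    // Green: with the bugfix the same test passes.
    assert(regressionTestPasses(&daysInFebruaryFixed));
}
```

If the first assertion did not hold, the test would be green both with and without the fix, and so would prove nothing about the bug.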


To learn BDD with Cucumber, you must first learn BDD with Cucumber.

So I read about Cucumber a while back and was intrigued, but never had time to properly play with it. While writing my MQTT broker, however, I kept getting annoyed at breaking functionality in ways that weren’t caught by unit tests. The internals were fine; the problems I was creating had to do with the actual business of sending packets. But I was busy so I just dealt with it.

A few weeks ago I read a book about BDD with Cucumber and RSpec, but I found it a bit confusing: since the step definitions, unit tests and implementation were all written in Ruby, it was hard for me to distinguish which part was what in the whole BDD/TDD concentric cycles. Even then, I went back to that MQTT project and wrote two Cucumber features (it needs a lot more but since it works I stopped there). These were easy enough to get going: essentially the step definitions run the broker in another process, connect to it over TCP and send packets to it, evaluating if the response was the expected one or not. Pretty cool stuff, and it works! It’s what I should have been doing all along.

So then I started thinking about learning BDD (after all, I wrote the features for MQTT afterwards) by using it on a D project. So I investigated how I could call D code from my step definitions. After spending the better part of an afternoon playing with Thrift and binding Ruby to D, I decided that the best way to go about this was to implement the Cucumber wire protocol. That way a server would listen to JSON requests from Cucumber, call D functions and everything would work. Brilliant.
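As a rough illustration of what such a server has to do, here is a minimal sketch in D of the request handling only, with no networking. The message shapes (step_matches, invoke, begin_scenario, end_scenario and the success/fail replies) are assumptions based on Cucumber’s own wire protocol features and should be checked against them. The single step, its id "1", and the names stepName, runStep and handleRequest are hypothetical and are not taken from the implementation discussed in this post.

```d
import std.json : JSONValue, parseJSON;

// Hypothetical registry holding a single step, known to Cucumber as id "1".
enum stepName = "we're all wired";

// Hypothetical step body; a real one would call the D code under test
// and throw to signal a failed step.
void runStep()
{
}

// Takes one request line (a JSON array) and returns the reply line.
string handleRequest(string line)
{
    auto request = parseJSON(line);
    switch (request[0].str)
    {
    case "begin_scenario", "end_scenario":
        return `["success"]`;
    case "step_matches":
        // Report the matching step id, or an empty list if nothing matches.
        auto wanted = request[1]["name_to_match"].str;
        return wanted == stepName
            ? `["success",[{"id":"1","args":[]}]]`
            : `["success",[]]`;
    case "invoke":
        try
        {
            runStep();
            return `["success"]`;
        }
        catch (Exception e)
        {
            // Build the reply with std.json so the message is escaped.
            auto failure = JSONValue([JSONValue("fail"),
                                      JSONValue(["message": e.msg])]);
            return failure.toString();
        }
    default:
        return `["fail",{"message":"unsupported request"}]`;
    }
}

unittest
{
    auto reply = parseJSON(
        handleRequest(`["step_matches",{"name_to_match":"we're all wired"}]`));
    assert(reply[0].str == "success");
    assert(reply[1][0]["id"].str == "1");
    assert(handleRequest(`["invoke",{"id":"1","args":[]}]`) == `["success"]`);
}
```

A real server would wrap handleRequest in a TCP loop that reads one line per request and writes one line per reply, and would look steps up by id instead of hard-coding a single one.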

I was in for a surprise though, me who’s used to implementing protocols after reading an RFC or two. Instead of a usual protocol definition all I had to go on was… Cucumber features! How meta. So I’d use Cucumber to know how to implement my Cucumber server. A word to anyone wanting to do this in another language: there’s hardly any documentation on how to implement the wire protocol. Whenever I got lost and/or confused I just looked at the C++ implementation for guidance. It was there that I found a git submodule with all of Cucumber’s features. Basically, you need to implement all of the “core” features first (therefore ensuring that step definitions actually work), and only then do you get to implement the protocol server itself.

So I wanted to be able to write Cucumber step definitions in D so I could learn and apply BDD to my next project. As it turned out, I learned BDD implementing the wire protocol itself. It took a while to get the hang of transitioning from writing a step definition to unit testing but I think I’m there now. There might be a lot more Cucumber in my future. I might also implement the entirety of Cucumber’s features in D; I’m not sure yet.

My implementation is here.
