TDD is not (necessarily) microdesign!

It's incredible how much people have been talking about TDD and its problems over the last few weeks. I thought it was a mostly settled topic: it is probably not ideal, but it is pretty much the best we can do for now in the long term. Yet many have suddenly (or not so suddenly) decided that TDD sucks, is dead and creates more problems than it solves. The one that I stumbled upon, and that made me write this post, is: The pitfalls of test driven development (http://beust.com/weblog/2014/05/11/the-pitfalls-of-test-driven-development/).

I'll touch on just one point now: “TDD Promotes Microdesign over Macrodesign”. Go on, read that paragraph, I have time. OK, now that you've read it – do you agree with it? I don't. From the section title onward it seems to be based on a very wrong idea about TDD – that TDD is the only approach you use to write your application. Let's clear this up. TDD is not (usually) a way to design your whole application from top to bottom. TDD is an approach to writing good, test-covered code. TDD helps design classes, components, maybe packages – but not applications or solutions to a problem.

Imagine you have a large business problem to solve. Would you really expect a TDD follower to just jump in and start writing tests and code? Come on, no sane software engineer would do that. You think about the problem, you try to understand the domain, all the details, all the caveats of this particular business problem. You are not thinking about the code at this moment at all. Maybe a solution can be created without any code at all? Are there other possible solutions to the problem? We don't always need to write new programs; we need to solve problems.

Then you start imagining your application. What would you need? What big parts (modules) seem useful? Do you need a distributed application to handle complex calculations? Will it be a client-server solution? How much data will it process? How often? Those are all requirements you have to think about before sitting down to code anything.

Only after you have this broad idea of what you are doing can you sit down and start writing code. And then TDD is what you follow. You write components and classes with tests that assert your invariants and provide an API for use by other parts of the system. And you are better off this way – TDD will help you create better, smaller classes, more practical solutions and so on. But at a lower level. You dealt with macrodesign before you even considered writing code in the first place. Now it is time for microdesign.
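To make that micro level concrete, here is a minimal sketch of the kind of class TDD tends to produce. Everything here is hypothetical – DiscountCalculator and its 10%-over-100 rule are made-up names for illustration, not from any real project – and plain exceptions stand in for a test framework to keep the sketch self-contained:

```csharp
using System;

// A small, single-purpose class – the kind of shape a test-first
// workflow pushes you towards at the microdesign level.
public class DiscountCalculator
{
    // Invariant asserted by the tests: input is never negative.
    // Rule (hypothetical): orders of 100 or more get a 10% discount.
    public decimal Apply(decimal total)
    {
        if (total < 0)
            throw new ArgumentOutOfRangeException(nameof(total));
        return total >= 100m ? total * 0.9m : total;
    }
}

public static class DiscountCalculatorTests
{
    public static void Main()
    {
        var calc = new DiscountCalculator();

        // The tests document the invariants for other parts of the system.
        if (calc.Apply(50m) != 50m) throw new Exception("small orders unchanged");
        if (calc.Apply(200m) != 180m) throw new Exception("large orders discounted");

        Console.WriteLine("ok");
    }
}
```

Note what the tests do not do: they say nothing about where this class sits in the application. That placement is the macrodesign you settled earlier.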

Yes, writing software is an iterative process. That's why having a set of unit tests to cover your back is so useful. You will need to make changes to your code, and it's better to have tests that assure you everything is still working fine afterwards. You can write them after you've created the code, or you can write them TDD style – it does not matter all that much. But don't throw away TDD because of some misconceptions. TDD is not a solution to all problems. It is a solution to one specific problem – low-level code testability. Not high-level application design.


Limit your abstractions

“You can solve every problem with another level of indirection,” they say. But don't be fooled! The full quote goes something like: “You can solve every problem with another level of indirection, except for the problem of too many levels of indirection.” Keep that in mind when writing your code.

I'm currently working on refactoring quite an important piece of code in a big, complex project. Fun thing to do – I enjoy it tremendously. But I'm new to the project and I was lost in the code. Thankfully I got some sequence diagrams along with a long explanation of what this part of the system does and why. Then I looked at the code, debugging it a few times to learn where I was and what I was working with.

What I saw there looked nothing like the diagrams I got. After a few hours (days?) I could find some relation between the code and the description I received, but it was way too hard. Even though the system is complex and the functionality of this part is not exactly a piece of cake, the whole thing had so much code and so many different layers that it is still nearly impossible for me to tell where I am and what the code does in any given place.

Interfaces are cool. Extracting functionality into small, well-defined classes is something I like. But once you start jumping between five or six different classes to complete one small action, you start feeling that something is definitely not right. At the moment I have something like 20 or 30 different classes open just to follow the workflow – and this is only the preparation for the actual calculations, which happen somewhere else entirely (thankfully that's the part I'm not going to touch this time).

It may feel better to extract a class from a piece of code. But I've noticed that the same type – Report in this case – is represented by four or five different classes with slightly different naming and functionality. Yet it is still exactly the same report in exactly the same system! Why not settle on one well-defined object? Need to get something from the database? Jump between three or four objects before you reach the actual query.

Flatten your abstractions. Extract classes where needed, but don't hide everything behind interfaces when it's not needed. These are your domain objects; there's no harm in your main object knowing the exact implementation of one of its sub-objects – the two are tightly coupled anyway. Sometimes we need to get our hands dirty in all those pesky details. That's OK; it needs to be done. Extract methods to make the code easier to understand. Keep the code short and concise. But fight the urge to extract every single detail into a new class hidden behind an interface just because it needs to be testable. That's not how tests are supposed to look most of the time.
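As a sketch of what "flattening" might look like – Report is the only name borrowed from the story above; ReportStorage, its in-memory dictionary and the property names are all invented for illustration – one domain type and one concrete storage class, with no interface until a second implementation actually exists:

```csharp
using System;
using System.Collections.Generic;

// One Report type for the whole system, instead of four or five
// slightly different classes all representing the same report.
public class Report
{
    public int Id { get; set; }
    public string Title { get; set; }
}

// A concrete class the caller is allowed to know about directly.
// An in-memory dictionary stands in for the real database here.
public class ReportStorage
{
    private readonly Dictionary<int, Report> _rows = new Dictionary<int, Report>();

    public void Save(Report report)
    {
        _rows[report.Id] = report;
    }

    public Report Load(int id)
    {
        return _rows[id];
    }
}
```

The caller reaches the "query" in one hop instead of three or four, and extracting an interface later – once there genuinely is a second storage backend – is a cheap, mechanical refactoring.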

So when should you abstract things into an interface, a different class, a different layer altogether? I'm not sure myself. Right now I'm just following my gut, making mistakes and fixing them later when I see that something is not right. One day, I believe, I will master this and create perfect code – but not today and not tomorrow. I'm getting there, though, you can be sure!

Composition vs. Inheritance

This is a pretty popular question – which is better, inheritance or composition? I bet you wouldn't need to look for long to find out that the popular opinion favors composition. Not surprisingly, I agree with that completely. Many computer science courses present inheritance as the way to go in an object-oriented language, and naturally many people (me included) assume that it's good, clear and easy. After all, we're reusing code, aren't we? But it won't take you much time in any non-Hello-World project to figure out that it causes more trouble than it brings benefits.

I've recently seen code that follows a pattern like this:

public class Document
{
    public int Id { get; set; }
    public string Content { get; set; }
}

public class DocumentSerializable : Document
{
    public void WriteDocument(Stream stream)
    {
        using (var writer = new BinaryWriter(stream, Encoding.UTF8, leaveOpen: true))
        {
            writer.Write(this.Id);
            writer.Write(this.Content);
        }
    }

    public void ReadDocument(Stream stream)
    {
        using (var reader = new BinaryReader(stream, Encoding.UTF8, leaveOpen: true))
        {
            this.Id = reader.ReadInt32();
            this.Content = reader.ReadString();
        }
    }
}

This is of course simplified code, but it gives the idea. Let's skip for now the question of why someone needed a custom serialization mechanism (assume it was needed for some reason) and concentrate on how it was developed. We have inheritance, the code is pretty simple, anyone can maintain it, add a new property, etc. No biggie. So why is this bad design?

The first of the famous SOLID principles is the Single Responsibility Principle – one object should have not two, not three, but exactly one purpose. In our case the Document object stores (and manages) some kind of document content. That is the reason this class was introduced. Then we add another responsibility to it – it needs to know how to save and load itself from a stream of data. Why? We don't exactly know, but it keeps the code in one place, so we go with it. The second responsibility isn't that bad after all; it's just one more.

Now notice that if you are going to work with Document, you are in fact going to work with DocumentSerializable – there is not much point in using a plain Document, since you cannot save it for later. Still, you are fine with it; no one needs to know you are working with another type. That's the beauty of inheritance: we can have virtual methods and properties, we can treat those objects as if they were plain vanilla Documents, and no one will notice.

That is, until you need to collaborate with another system, for example. Like in this particular case: Document had to be shared with another system over which we had no control. That system expected to get a Document, nothing else – and it expected it in the form of an XML file. .NET supports XML serialization out of the box, so we don't have much to do – one would think. But that solution failed quickly: the serialized object clearly indicated in the XML that it was not a Document but a DocumentSerializable – something the target system had never heard of (nor should have). So what do we do? Can we force XmlSerializer to mark our derived type as the base type? Nope, not that I know of – and it wouldn't make much sense anyway, would it? A derived type can introduce new fields, properties, etc. which the base type's deserializer wouldn't know how to handle.

So what do we do now? We could write our own XML serialization mechanism – sure we could. But that's just boring; we would make mistakes along the way and would have to maintain that code in the future. No fun at all. We could write another method to convert a DocumentSerializable into a Document – that would be simple; we could even use reflection to automate it. But that's yet another responsibility put into this object (which probably already carries complicated business logic – handling documents is never easy). We could think of some other solution – it would work, I'm sure. But look at what we're doing: we are searching for a way to work around – to hack – our own system design, just to achieve something simple. Doesn't that smell bad?

How could it be done differently? With composition, of course (or not even composition – those objects could be completely separate). For example:

public class Document
{
    public int Id { get; set; }
    public string Content { get; set; }
}

public class DocumentSerializable
{
    public Document Document { get; set; }

    public void WriteDocument(Stream stream)
    {
        using (var writer = new BinaryWriter(stream, Encoding.UTF8, leaveOpen: true))
        {
            writer.Write(Document.Id);
            writer.Write(Document.Content);
        }
    }

    public void ReadDocument(Stream stream)
    {
        using (var reader = new BinaryReader(stream, Encoding.UTF8, leaveOpen: true))
        {
            this.Document = new Document
            {
                Id = reader.ReadInt32(),
                Content = reader.ReadString()
            };
        }
    }
}

This way we keep serialization in a separate class but work with Document the whole time, never having to worry about a derived type. Is it a huge difference? Looking at this example, probably not. But imagine having business logic in there; imagine there are dozens of such types in the system. Suddenly it all becomes more and more complicated with inheritance, yet remains easily manageable and testable with composition and the Single Responsibility Principle.
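For completeness, here is a quick round-trip sketch of the composed version in use – it assumes Document and DocumentSerializable compile as defined above, with WriteDocument and ReadDocument reading back exactly what was written:

```csharp
using System;
using System.IO;

public static class Program
{
    public static void Main()
    {
        var original = new Document { Id = 42, Content = "hello" };

        using (var stream = new MemoryStream())
        {
            // The serializer wraps a Document instead of inheriting from it,
            // so the rest of the system only ever sees plain Documents.
            new DocumentSerializable { Document = original }.WriteDocument(stream);

            stream.Position = 0;
            var loader = new DocumentSerializable();
            loader.ReadDocument(stream);

            Console.WriteLine(loader.Document.Id);
            Console.WriteLine(loader.Document.Content);
        }
    }
}
```

Note that the object handed to XmlSerializer – or to the external system – is `loader.Document`, a plain Document, which is exactly what the base-type XML problem above called for.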