AngularJS for web UI?

Well, yes. AngularJS. Why not? It got a lot of attention recently (or maybe I just noticed that) and it seemed like a good idea to try something new after spending most of my .NET web experience with ASP.NET WebForms (and you don’t want to go that way if you don’t need to). So I picked it and started having fun.

And fun it was, indeed. Pretty simple to start and create something. Nicely working, updating the UI from the model, updating the model from the UI. Clean code, clean HTML markup. I must say I’m impressed; I didn’t expect it to be that easy and nice (the same feeling I had with NancyFX recently – I start to see a pattern: I’m doing what I want, how I want and with the tools I want – me like it).

But there are some dark corners in Angular that got me when I least expected it. First – a doubly declared ng-app attribute. Well, my fault of course. I set it on the html element first in the Master Page, did a few things, changed a few things, and by mistake added a second ng-app attribute in the actual page using this master page. A small error, but the application stopped working and nothing behaved as expected. Blank screen, no bindings. Took me quite a while to figure it out. Oh, how I wish Angular had told me – hey, man, I’ve noticed you put two ng-app attributes in your page. Sure you wanted to do that? But nope, nothing, zero, nada.

The second time it bit me was when I was doing some model data manipulation. Everything worked fine in JavaScript, but in the UI – nothing got refreshed. I used the ng-click binding to get the function executed. Oh, how much time I spent (but not wasted completely) looking for the error. Everything seemed to be ok, the code worked, just the UI wasn’t right. I looked at the documentation, looked at StackOverflow, looked at blogs. Nothing helped. I learned a lot about Angular, but I just wanted my bindings to work! Well, look at this code:
<a href="#/" ng-click="manipulate()">Link</a>
Nothing extremely complicated. What’s wrong? It has the href attribute set. Oh boy, I had a nice facepalm once I noticed it. With Angular routing set up, going to #/ caused re-creation of my controller, reloading the data and refreshing the page. But it all happened so fast that I simply did not notice, and it all looked to me like my code was not working. Remove the href and guess what? It had been working like a charm from the beginning.

I’m pretty sure I will have quite a few more problems with Angular. But it seems to be such a great tool that I am willing to solve them and learn along the way, just for the fun of working with something that clever.

NancyFx is the way to go

Another quickie. I wrote a tiny domain for fun. Now I wanted to write some UI (HTML) code and an API to access the domain. First I was thinking about a standard .NET WebService, but on Twitter I stumbled upon NancyFx. It took me about 5 minutes from that tweet to having Nancy deployed in my solution, configured and returning a test value. And it’s awesome. No more overcomplicated configuration. No more web.config modifications just to make things work. For simple things at least, Nancy seems a great choice. I hope it will keep surprising me in a positive way over and over again!

End of a day with TDD

Quick one today. You are doing TDD and it is fun and it feels right. Want to know how to do it better? End each day/work session by writing a failing test for the next thing you want to write. When you come back to your code next time you will immediately know what you were doing and what you planned to do next, and you will start right away. This is a pretty good solution to the problem of digging through the code, source control history logs etc. to figure out where you left off last time. Works for me!

Composition vs. Inheritance

This is a pretty popular question – which is better, inheritance or composition? I bet you wouldn’t need to look for too long to find out that the popular opinion is that composition is better. Not surprisingly, I have to agree with that completely. Many computer science courses will show you inheritance as the way to go in an object-oriented language, and naturally many people (me included) assume that it’s good, clear and easy. After all, we reuse code, don’t we? But it won’t take you much time in any non-Hello World project to figure out that it causes more trouble than it gives you benefits.

I’ve recently seen a code that follows pattern like this:

public class Document
{
    public int Id { get; set; }
    public string Content { get; set; }
}

public class DocumentSerializable : Document
{
    public void WriteDocument(Stream stream)
    {
        stream.Write(Encoding.UTF8.GetBytes(this.Id.ToString()));
        stream.Write(Encoding.UTF8.GetBytes(this.Content));
    }

    public void ReadDocument(Stream stream)
    {
        this.Id = stream.ReadInt();
        this.Content = stream.ReadString();
    }
}

This is of course pseudo-code, but it gives the idea. Let’s skip for now why someone needed a custom serialization mechanism (assume it was needed for some reason) and concentrate on how it was developed. We have inheritance, the code is pretty simple, anyone can take care of it, add a new property etc. No biggie. So why is this a bad design?

The first of the famous SOLID principles is the Single Responsibility Principle – one object should have not two, not three, but exactly one purpose. In our case the Document object stores (and manages) some kind of document content. That’s the reason this class was introduced. Then we add another responsibility to this object – it needs to know how to save and load itself from a stream of data. Why? We don’t exactly know, but it keeps the code in one place, so we go with it. The second responsibility isn’t that bad after all, it’s just one more.

Now see that if you are going to work with Document, you are in fact going to work with DocumentSerializable – there is not much point in using a plain Document since you cannot save it for later. Still, you are fine with it; no one needs to know you are working with another type. That’s the beauty of inheritance: we can have virtual methods and properties, we can treat those objects like they were plain vanilla Documents and no one will notice.

That is, until you need to collaborate with another system, for example. Like in this particular case. The Document had to be shared with another system over which we had no control. This system expected to get a Document, nothing else. It also expected it in the form of an XML file. .NET supports XML serialization out of the box, so we don’t have much to do – one would think. But such a solution failed quickly – the serialized object indicated clearly in the XML that it was not a Document but insisted it was a DocumentSerializable – something the target system had never heard (nor should have heard) about. So what do we do? Can we force XmlSerializer to mark our derived type as the base type? Nope, not that I know of – and it wouldn’t make much sense anyway, would it? I mean, the derived type can introduce new fields, properties etc. which the base type’s deserializer wouldn’t know how to handle.
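To see the shape of the problem without .NET at hand, here is a toy serializer in Python – not XmlSerializer, just an illustration of how the runtime type name leaks into the output when you serialize a derived type:

```python
class Document:
    def __init__(self, id, content):
        self.id = id
        self.content = content


class DocumentSerializable(Document):
    pass


def to_xml(obj):
    # A naive serializer that uses the runtime type name as the root
    # element -- so the derived name leaks into the output, just like
    # the concrete type did in the XML described above.
    tag = type(obj).__name__
    return f"<{tag}><Id>{obj.id}</Id><Content>{obj.content}</Content></{tag}>"


doc = DocumentSerializable(1, "hello")
print(to_xml(doc))
# The root element says DocumentSerializable -- a consumer expecting
# <Document> will reject it.
```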

So what do we do now? We can write our own XML serialization mechanism – sure we can. But that’s just boring; we will make mistakes along the way, we will need to maintain the code in the future, and that’s no fun at all. We can write another method that converts a DocumentSerializable into a Document – that would be simple, we could even use Reflection to automate it. But that’s yet another responsibility put into this object (which probably already carries complicated business logic – handling documents is never easy). We could think of some other solution – it would work, I’m sure. But see what we’re doing now? We are looking for a way to work around, to hack, our own design of the system to achieve something simple. Doesn’t it smell bad?

How could it be done differently? With composition, of course (or not even composition – those objects could be completely separated). For example:

public class Document
{
    public int Id { get; set; }
    public string Content { get; set; }
}

public class DocumentSerializable
{
    public Document Document { get; set; }
    public void WriteDocument(Stream stream)
    {
        stream.Write(Encoding.UTF8.GetBytes(Document.Id.ToString()));
        stream.Write(Encoding.UTF8.GetBytes(Document.Content));
    }

    public void ReadDocument(Stream stream)
    {
        this.Document = new Document();
        Document.Id = stream.ReadInt();
        Document.Content = stream.ReadString();
    }
}

This way we have serialization in a separate class, but work with Document all the time, never having to worry about a derived type. Is it a huge difference? Looking at this example, probably not. But imagine having business logic in there; imagine there are dozens of such types in the system. Suddenly it all starts to get more and more complicated with inheritance, but it stays easily manageable and testable with composition and the Single Responsibility Principle.

Perl to the rescue

Have you ever considered using Perl to make your work easier? I know I hadn’t. I didn’t know Perl at all, never used it, hardly even knew what the code looked like. A friend once joked that Perl is a write-only language, since you will never be able to understand the code you wrote a month ago. Of course, if you’re lucky enough, you are coding your everyday work in C# or Java and you have all you could wish for. If, on the other hand, you are adventurous, like me, you are working with a language used by approximately 1523 (estimate by me) people in the world, which does nothing except make your life harder.

So I’m integrating two systems. Fun. What’s not fun is that the data is shared using FTP and CSV files. What’s even less fun is when your colleagues decide that normal CSV is a lame file format and make some customizations. So now we have TSV – Tilde Separated Values! Why tilde (~)? Because it probably won’t show up in the data, while commas probably will. You can escape commas in strings using quotes? Yeah, but that’s lame.

Now, all reasonably engineered software would allow you to specify the separator when importing a CSV into a memory object. But read the first paragraph – I’m working with a language that lets me read a whole file into memory, but not line by line. It lets me find characters, but substrings must be done manually. No split or regex support. No fun at all. And I need to get that CS… TSV file into memory. It also needs to run on production, where there may be no Java, .NET etc. And your permissions are limited. And your time is limited. And people look at you funny when you want something installed.

But I knew that Perl was there and some people had written some stuff in it. It is also known for great support of regular expressions, which is something I was looking for. So I fired up Google, learned how to read and write a file (and that’s amazingly nice and easy!) and then spent two or three hours trying to figure out a regex that would do what I want. Find any string with a comma in it, select the whole column (either from the line start or from a tilde, to the line end or the next tilde). When you have this pattern, replace it with a quoted string. Then replace all tildes with commas. Sounds easy when said that way. But try putting it into a regex, my friend, and you are knocking on those special madness doors that we all have deep inside us.

The code for detecting strings with a comma is simple:

.*,.*

which translates to: any character 0 or more times, then a comma, then any character 0 or more times.

We have our first problem: the string can start either at the beginning of the line or from a tilde, so we have something like this:

(^|~)

^ is for line start, ~ for tilde. The pipe “|” is the OR operator and we are limiting this OR with parentheses. And what about the end of line? That’s very similar:

(~|$)

$ is for line end, the rest is the same as above. Putting it all together:

(^|~).*,.*(~|$)

Will it work? Sure. Will it work as intended? Not at all. Regexes are greedy by default – they try to find the longest possible matching string, which will give us wrong results. We want to select just a column, not the whole line! See the sample script below:

    #!/usr/bin/perl

    my $string = "1 2 3~1, 2 3~1 2 3";

    print "$string\n";
    $string =~ s/(^|~).*,.*(~|$)/"$&"/;
    print "$string\n";

and sample output:

1 2 3~1, 2 3~1 2 3
"1 2 3~1, 2 3~1 2 3"

Definitely not what we are looking for. By the way, $& is Perl’s way to say: give me the whole matched pattern. Surely there must be something to tell the regex engine to pick the shortest possible match. That operator is “?”. We can add it after quantifiers like * or +.

$string =~ s/(^|~).*?,.*?(~|$)/"$&"/;

We’re getting there slowly; now the output is:

1 2 3~1, 2 3~1 2 3
"1 2 3~1, 2 3~"1 2 3

So the pattern ended earlier, but it still starts at the beginning of the line instead of at the tilde. Why? Because we said that our string can contain any character (the dot “.” operator), while in fact it cannot. It cannot contain the tilde character, since that would mean we’re crossing into the next column.

$string =~ s/(^|~)[^~]*?,[^~]*?(~|$)/"$&"/;

almost there:

1 2 3~1, 2 3~1 2 3
1 2 3"~1, 2 3~"1 2 3

The only problem now is that the tildes are included in the matched pattern, but we would rather leave them out. Can we? Sure we can – hold on, that one’s fun!

$string =~ s/(^|(?<=~))[^~]*?,[^~]*?((?=~)|$)/"$&"/;

And we’re finally there:

1 2 3~1, 2 3~1 2 3
1 2 3~"1, 2 3"~1 2 3

Exactly what we wanted! What did we do there? We applied look-behind and look-ahead operators to the tildes. These are called zero-width assertions, meaning that they will check for a matching character, but will not include that character in the selected match itself. ?<= is look-behind (check whether the given character appears before the match) and ?= is look-ahead (check whether the given character appears after the match). Testing it on a longer string shows one more small issue:

1, 2 3~1 2 3~1, 2 3~1 2 3~1 2, 3
"1, 2 3"~1 2 3~1, 2 3~1 2 3~1 2, 3

Only the first match was replaced. By default a Perl regex substitution handles just the first match in the line, but if you add the g option at the end, it will handle all matches.

$string =~ s/(^|(?<=~))[^~]*?,[^~]*?((?=~)|$)/"$&"/g;

And our final output is:

1, 2 3~1 2 3~1, 2 3~1 2 3~1 2, 3
"1, 2 3"~1 2 3~"1, 2 3"~1 2 3~"1 2, 3"

Pretty, isn’t it? And the whole script to convert one file format to the other took about 8 lines of code plus comments for my future self. Is it perfect? No – there are some edge cases where I’m sure it will fail, but it is good enough for the data I’m going to work with.
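For comparison, here is roughly the same conversion sketched with Python’s re module – the regex is the one derived above; the function name and sample line are mine:

```python
import re

# The field-quoting regex derived above: a comma-containing field,
# delimited by line start/end or tildes (the tildes are kept out of
# the match by the zero-width look-behind/look-ahead).
FIELD_WITH_COMMA = re.compile(r'(^|(?<=~))[^~]*?,[^~]*?((?=~)|$)')

def tsv_to_csv_line(line):
    # Quote every field that contains a comma (re.sub replaces all
    # matches by default, like Perl's /g flag)...
    quoted = FIELD_WITH_COMMA.sub(lambda m: f'"{m.group(0)}"', line)
    # ...then the tildes can safely become commas.
    return quoted.replace('~', ',')

print(tsv_to_csv_line("1, 2 3~1 2 3~1 2, 3"))
# -> "1, 2 3",1 2 3,"1 2, 3"
```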

It was extremely easy to reproduce this regex now that I’m writing this blog post, two days after creating it at work. But figuring it out on Thursday morning was far from easy. I can never remember all those regex operators, where to put them, how to group variables, create optional characters etc. Reading regular expressions is also a pain; they seem like a bunch of random characters. But the power in them is oh-so-great! Don’t overuse them. But when there is a clear need for them – don’t hesitate! Just add a clear comment on what the regex is doing and test it carefully.