No New Ideas

What's in a name

2009-07-19T14:13:00.002+01:00

It is really important to call out the names of the big box patterns in a system and the responsibilities in each box. Even if it is immediately obvious to you how your application should hang together it may not be obvious to all the members of your team. As soon as there is uncertainty people will make guesses and a whole heap of code lands in the wrong place. This is refactoring you didn't need to incur.

Once you have identified the aspects of the code that you have names for get a picture on the wall so you can point or stare at it when you are having one of those "where should we put xyz?" conversations in your team.

Also be vigilant for code escaping its box - and name the anti-patterns related to each nefarious deed. My favourite is the Fat Controller anti-pattern - where view/service/domain logic wheedles its way into the controller, leading to a spate of Thomas the Tank Engine related jokes.
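As a rough illustration of keeping code in its box, here is a minimal sketch of a thin controller that delegates a pricing rule to a service. Every name in it (Order, PricingService, OrderController, the 10% trade discount) is hypothetical and it is not tied to any particular MVC framework.

```csharp
using System;
using System.Globalization;

// All names here are hypothetical, invented for illustration only.
public class Order
{
    public decimal Subtotal { get; set; }
    public bool IsTradeCustomer { get; set; }
}

// Service/domain box: owns the pricing rule.
public class PricingService
{
    public decimal TotalFor(Order order)
    {
        // Assumed rule for the example: trade customers get 10% off.
        var discount = order.IsTradeCustomer ? 0.10m : 0m;
        return order.Subtotal * (1 - discount);
    }
}

// Controller box: turns a request into a service call and formats the result.
// It contains no pricing logic of its own, so it stays thin.
public class OrderController
{
    private readonly PricingService pricing;

    public OrderController(PricingService pricing)
    {
        this.pricing = pricing;
    }

    public string Show(Order order)
    {
        return string.Format(CultureInfo.InvariantCulture,
                             "Total: {0:0.00}", pricing.TotalFor(order));
    }
}
```

If the discount calculation crept into Show, that would be the Fat Controller at work: the rule would be in the wrong box and could only be tested through the controller.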

Also try to be super consistent with the naming conventions used for your core application structure - it makes a huge difference.

NSynthesis 0.1.1 has escaped!

2009-06-18T15:13:00.006+01:00

I'm delighted to announce that NSynthesis 0.1.1 has finally escaped from my laptop where it has languished for many months. It can be downloaded from here.

NSynthesis is a .NET implementation of the Ruby project, Synthesis. The idea is to connect tests which mock expectations to a corresponding test of a real implementation of the mocked call. Why? To give you more confidence that all those itty bitty unit tests add up to a comprehensive test suite.

Using PostSharp to hook into the gap between your test assembly and your production code, NSynthesis monitors your unit tests as they run. For each expectation set in your tests, NSynthesis demands that you have executed every possible concrete implementation during the same test run. NSynthesis uses PostSharp to modify the test code only, so there are no changes required to your production code base.
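The bookkeeping behind that check can be sketched in a few lines. This is only an illustration of the idea, not how NSynthesis is implemented: the ExpectationLedger class and its method names are hypothetical, and it simplifies "every possible concrete implementation" down to one string key per method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical bookkeeping class. NSynthesis gathers this information by
// weaving the test assembly with PostSharp; here it is recorded by hand.
public class ExpectationLedger
{
    private readonly HashSet<string> mocked = new HashSet<string>();
    private readonly HashSet<string> exercised = new HashSet<string>();

    // Called when a test sets an expectation on a mock, e.g. "IMailer.Send".
    public void RecordExpectation(string method)
    {
        mocked.Add(method);
    }

    // Called when a test runs the real implementation of that method.
    public void RecordConcreteCall(string method)
    {
        exercised.Add(method);
    }

    // Expectations that no test in this run backed up with a real call.
    public IList<string> UntestedExpectations()
    {
        return mocked.Where(m => !exercised.Contains(m))
                     .OrderBy(m => m, StringComparer.Ordinal)
                     .ToList();
    }
}
```

A build would fail if UntestedExpectations() came back non-empty at the end of the test run.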

Currently, NSynthesis requires that you are using the following tools:

NUnit 2.5
Rhino Mocks
NAnt

The project is definitely at the curiosity stage of development. It needs to be run in on a proper project - so if you fancy giving it a go let me know how you get on with it.

The main issue which needs to be fixed next is handling multiple test runs in a build. Currently, there is no memory of the various unit test stages which you may have in a more advanced build (e.g. standalone tests, wired tests, functional tests). This is an issue if the mocked calls and the real code calls are not contained in the same test run.

Many thanks to Alex Scordellis, especially for his help with the evil .NET reflection API.

Bowling Cards

2008-11-13T21:04:00.007+00:00

Recently, my team decided we had to tackle a number of problems with our aging application code base which were causing us pain. There were a number of technology and design problems which needed sorting out. The challenge for us was to continue to deliver new features while taking measures to not only stop the signs of rot, but also to make an effort to improve the code base. If we could improve the code where we were most active we would be able to deliver features more rapidly.

The first stage was a technical retrospective, facilitated by Liv, our project manager. The whole team got together and we brainstormed the problems with the code base which were slowing down active development and causing us pain. With everything hurting us out in the open, the team voted for the issues they thought were causing us the most pain. The focus was delivery: we weren't looking to make the code more beautiful for aesthetic reasons; we were trying to reduce friction to help us deliver more code more rapidly. With the key problems agreed, we moved on to discuss how we could fix them. So far so good.

Some of our problems could be fixed the traditional agile way: write a technical card, estimate the cost and agree who would do the work and when.

Others were ongoing "try not to repeat this anti pattern" or "remember to do X" type solutions, so there was no action as such, other than to remember to not do "X" or to do "Y".

We also had a third class of problem: long-running repair or rework tasks. For example, the team has switched mocking libraries from NMock to Rhino Mocks. While we find we are more productive when we mock using Rhino Mocks, we still have a legacy of old unit tests which periodically bite us when we need to refactor. The idea of scheduling a task to rewrite all our unit tests using Rhino Mocks is just plain wrong for a number of reasons. On humanitarian grounds, I couldn't bring myself to ask a member of my team to spend a week rewriting all our tests - we'd need to put her on suicide watch. Also, from a business value perspective, we only need to rewrite those tests which cover the active areas of the code. The application has been around for a while, and we are only adding features in some areas of the code. If we are not going to be slowed down by NMock-riddled tests in a remote area of the code, why waste client money fixing them? Enter the bowling card.

The anatomy of a bowling card is very simple. The title should be a simple statement of intent. Here, we seek annihilation of all NMock calls in our unit tests.

In the middle is a grid, with a box for each step we need to take to win. In this case, we have a box for each test class which includes the NMock library.

Anyone gets to work on the bowling card, whenever the time is right. As a team you can decide when this should be. For this card the rules were that if you find yourself maintaining or extending an NMock-based class, take the time to fix it by rewriting the tests to use Rhino Mocks, or no mock at all.

There may or may not be a deadline. For this card, we don't have a deadline. There is no pressing need to replace all the NMock calls, only those in code hotspots.

You may or may not schedule the card to be played. For this card we didn't schedule work - there is no real business justification to do that. For another technical card which tracked removing inappropriate methods from a very large legacy repository, we were so slowed down by the problem that we did 'play' the card whenever we had capacity during an iteration. In this case there was clear business value: completing the card would not only speed us up, but would allow bigger architectural refactorings to take place later on in the project.

The boxes are also very important. Each box is a step towards the goal. We take the steps in two stages.

Spares and Strikes

When a developer or a pair have fixed one of the steps locally, they get to draw a single line over the box, or a 'spare'. It is not converted to a 'strike' until the code change is checked in, built and tested if appropriate. We have found this to be really useful. The spare is a reminder to check in! With a lot of these tasks, we found it was easy to get carried away and bite off too much. Invariably this led to trouble, broken code and despair, leading to reverting the code and feelings of hopelessness. Once a pair has more than a couple of spares on the board, they start to feel the pressure to 'bank' and tend to check in more frequently.

We really like our bowling cards. They are great for reminding us to make the world better, and for celebrating progress, no matter how small it feels at times. In fact Bernardo and Sinan started another card today to track legacy classes with inadequate unit test coverage - and there are a lot of boxes on that sucker!

Completing a bowling card is a celebration event. Navya was so happy, she made our first one a party hat! Huzzah!

Bowling cards have helped us improve our code. They remind us that we chose to fix something, rather than allow our code to tip. It is going to potentially take a long time, but we can all help by getting strikes when we have the chance. We also have a measure of progress that we can track, and proof that we are making things better one step at a time.

ActiveRecord lessons learnt: #2 Use lazy loading and batching to improve performance

2008-11-08T17:02:00.007+00:00

By default Castle ActiveRecord eagerly loads your object data. For a small set of connected objects, this default behaviour is fine. The problems start when it is possible to navigate from the object that you have loaded to other potentially large object graphs. As an example let's imagine that we are writing some code to track books in our house. Say I want to track which books I have by author, along with the bookshelf that I'm keeping them on. I'd also like to track chapters of interest and which editions of the book that I have. I also want to tag my bookshelves with keywords describing the types of books on the shelf.

The Bookshelf and the Author classes both contain collections of Books. Nominally, the Author owns the books and the shelf has a reference to the object. I've also decided that I need to navigate bi-directionally between the book and the author classes. Here's a basic implementation using ActiveRecord.

using System.Collections.Generic;
using Castle.ActiveRecord;

namespace BooksAndStuff.Models
{

[ActiveRecord("Bookshelf")]
public class Bookshelf
{
  [PrimaryKey(PrimaryKeyType.HiLo)]
  public int Id { get; set; }

  [HasMany(typeof(string), Table = "Tags", ColumnKey = "BookShelfId",
      Cascade = ManyRelationCascadeEnum.AllDeleteOrphan, Element = "TagName",
      Lazy = Switches.LazyCollections, BatchSize = Switches.BatchSize)]
  public IList<string> Tags { get; set; }

  [HasMany(typeof(Book), Table = "Books", ColumnKey = "BookShelfId",
      Cascade = ManyRelationCascadeEnum.None,
      Lazy = Switches.LazyCollections, BatchSize = Switches.BatchSize)]
  public IList<Book> Books { get; set; }
}

[ActiveRecord("Books")]
public class Book
{
  public Book() {}

  public Book(string title, Author author, string isbn,
              IList<string> chapters, IList<string> editions) {
      Title = title;
      Author = author;
      Chapters = chapters;
      ISBN = isbn;
      Editions = editions;
  }

  [PrimaryKey(PrimaryKeyType.HiLo)]
  public int Id { get; set; }

  [Property(NotNull = true)]
  public string Title { get; set; }

  [HasMany(typeof(string), Table = "Editions", ColumnKey = "BookId",
      Access = PropertyAccess.Property, Element = "Edition",
      Lazy = Switches.LazyCollections, BatchSize = Switches.BatchSize)]
  public IList<string> Editions { get; set; }


  [HasMany(typeof(string), Table = "Chapters", ColumnKey = "BookId",
      Access = PropertyAccess.Property, Element = "Chapter",
      Lazy = Switches.LazyCollections, BatchSize = Switches.BatchSize)]
  public IList<string> Chapters { get; set; }

  [Property(NotNull = true)]
  public string ISBN { get; set; }


  [BelongsTo(NotNull = true)]
  public Author Author { get; set; }
}


[ActiveRecord("Authors", Lazy = Switches.LazyAuthor, BatchSize = Switches.AuthorBatchSize)]
public class Author
{
  [PrimaryKey(PrimaryKeyType.HiLo)]
  public virtual int Id { get; set; }

  [Property(NotNull = true)]
  public virtual string Name { get; set; }

  [HasMany(Cascade = ManyRelationCascadeEnum.AllDeleteOrphan,
      Lazy = Switches.LazyCollections, BatchSize = Switches.BatchSize)]
  public virtual IList<Book> Books { get; set; }

}

public static class Switches
{
  public const bool LazyCollections = false;
  public const bool LazyAuthor = false;
  public const int BatchSize = 1;
  public const int AuthorBatchSize = 1;
}
}

I wouldn't normally have a switches class, but it is very useful for the next part. We are going to change the lazy loading and batch size properties of the ActiveRecord attributes to understand how they influence the way NHibernate fetches the data from the database. To get a feel for the behaviour of NHibernate, we will need some test data. Say that we have a single shelf with 3 books on it, each with a different author. We have another 2 books by one of our authors, but they are not on a shelf (I lent them to Steve).

To test this we need to start up ActiveRecord:

  [TestFixtureSetUp]
  public void FixtureSetUp()
  {
      var connectionString = @"Data Source=localhost;initial catalog=BooksAndStuff;
                              user=BooksAndStuff_user;password=Password01";
      var properties = new Dictionary<string, string>
       {
           {"show_sql", "true"},
           {"connection.driver_class", "NHibernate.Driver.SqlClientDriver"},
           {"dialect", "NHibernate.Dialect.MsSql2005Dialect"},
           {"connection.provider", "NHibernate.Connection.DriverConnectionProvider"},
           {"connection.connection_string", connectionString}
       };

      var source = new InPlaceConfigurationSource();
      source.Add(typeof(ActiveRecordBase), properties);
      ActiveRecordStarter.Initialize(source, typeof(Bookshelf).Module.GetTypes());
      ActiveRecordStarter.CreateSchema();
  }

  [SetUp]
  public void SetUp()
  {
      ActiveRecordStarter.CreateSchema();
  }

This method creates the test data in the database:

  private void GenerateTestData() {
      var lewisCarol = new Author {Name = "Lewis Carol"};
      var xmasCarol = new Author {Name = "Xmas Carol"};
      var luisLuisCarol = new Author {Name = "Luis Luis Carol"};

      var chapters = new List<string>{ "Chapter 1", "Chapter 2" };
      var editions = new List<string>{"hard back", "spanish"};

      var book1 = new Book("Alice in Wonderland", lewisCarol,"xxx-yyy", chapters, editions);
      var book2 = new Book("Alice in Wonderland II - Revenge of the Cheshire Cat", xmasCarol,"xxx-yyy", chapters, editions);
      var book3 = new Book("Not well known",lewisCarol, "xxx-yyy", chapters, editions);
      var book4 = new Book("Not well known either", lewisCarol, "xxx-yyy", chapters, editions);
      var book5 = new Book("Some book",luisLuisCarol,"xxx-yyy", chapters, editions);

      lewisCarol.Books = new List<Book> {book1, book3, book4};
      xmasCarol.Books = new List<Book>{book2};
      luisLuisCarol.Books = new List<Book>{ book5 };

      var bookshelf = new Bookshelf
                          {
                              Tags = new List<string>{"Fantasy", "Old"},
                              Books = new List<Book> {book1, book2, book5}
                          };

      using (new SessionScope())
      {
          ActiveRecordMediator<Author>.Save(lewisCarol);
          ActiveRecordMediator<Author>.Save(xmasCarol);
          ActiveRecordMediator<Author>.Save(luisLuisCarol);
          ActiveRecordMediator<Bookshelf>.Save(bookshelf);
      }
  }

Now we have everything in place, we can write a test to load in the data:

  [Test]
  public void PlayingWithBatchFetchingAndLazyLoading()
  {
      GenerateTestData();

      using (new SessionScope())
      {
          Console.WriteLine("*******************");
          Console.WriteLine("Batch Size:{0} LazyCollections:{1} LazyAuthor:{2}",
                            Switches.BatchSize, Switches.LazyCollections, Switches.LazyAuthor);

          Console.WriteLine("***\nLoading my book shelves");
          var shelves = ActiveRecordMediator<Bookshelf>.FindAll();

          Console.WriteLine("***\nCounting books on shelf: ");
          Console.WriteLine("shelves[0].Books.Count =" + shelves[0].Books.Count);

          Console.WriteLine("***\nChecking I have the right book");
          Console.WriteLine("\tshelves[0].Books[0].Id:" + shelves[0].Books[0].Id);

          Console.WriteLine("***\nCounting the book's chapters");
          Assert.That(shelves[0].Books[0].Chapters.Count, Is.EqualTo(2));

          Console.WriteLine("***\nHow many books does the first book's author have to his name?");
          Assert.That(shelves[0].Books[0].Author.Books.Count, Is.EqualTo(3));

          Console.WriteLine("***\nHow many books does the third book's author have to his name?");
          Assert.That(shelves[0].Books[2].Author.Books.Count, Is.EqualTo(1));

          Console.WriteLine("***\nHow many books does the second book's author have to his name?");
          Assert.That(shelves[0].Books[1].Author.Books.Count, Is.EqualTo(1));

          Console.WriteLine("***\nWhat is the count of the chapters in a given book?");
          Assert.That(shelves[0].Books[0].Author.Books[1].Chapters.Count, Is.EqualTo(2));

          Console.WriteLine("***\nHow many chapters does the third book have?");
          Assert.That(shelves[0].Books[2].Chapters.Count, Is.EqualTo(2));
      }
  }

When you run this, you should see the following output from NHibernate:

*******************
Batch Size:1 LazyCollections:False LazyAuthor:False
***
Loading my book shelves
NHibernate: SELECT this_.Id as Id0_0_ FROM BookShelf this_
NHibernate: SELECT books0_.BookShelfId as BookShel5___2_, books0_.Id as Id2_, books0_.Id as Id2_1_, books0_.Title as Title2_1_, books0_.ISBN as ISBN2_1_, books0_.Author as Author2_1_, author1_.Id as Id5_0_, author1_.Name as Name5_0_ FROM Books books0_ inner join Authors author1_ on books0_.Author=author1_.Id WHERE books0_.BookShelfId=@p0; @p0 = '98305'
NHibernate: SELECT chapters0_.BookId as BookId__0_, chapters0_.Chapter as Chapter0_ FROM Chapters chapters0_ WHERE chapters0_.BookId=@p0; @p0 = '65541'
NHibernate: SELECT editions0_.BookId as BookId__0_, editions0_.Edition as Edition0_ FROM Editions editions0_ WHERE editions0_.BookId=@p0; @p0 = '65541'
NHibernate: SELECT books0_.Author as Author__1_, books0_.Id as Id1_, books0_.Id as Id2_0_, books0_.Title as Title2_0_, books0_.ISBN as ISBN2_0_, books0_.Author as Author2_0_ FROM Books books0_ WHERE books0_.Author=@p0; @p0 = '32771'
NHibernate: SELECT chapters0_.BookId as BookId__0_, chapters0_.Chapter as Chapter0_ FROM Chapters chapters0_ WHERE chapters0_.BookId=@p0; @p0 = '65540'
NHibernate: SELECT editions0_.BookId as BookId__0_, editions0_.Edition as Edition0_ FROM Editions editions0_ WHERE editions0_.BookId=@p0; @p0 = '65540'
NHibernate: SELECT books0_.Author as Author__1_, books0_.Id as Id1_, books0_.Id as Id2_0_, books0_.Title as Title2_0_, books0_.ISBN as ISBN2_0_, books0_.Author as Author2_0_ FROM Books books0_ WHERE books0_.Author=@p0; @p0 = '32770'
NHibernate: SELECT chapters0_.BookId as BookId__0_, chapters0_.Chapter as Chapter0_ FROM Chapters chapters0_ WHERE chapters0_.BookId=@p0; @p0 = '65537'
NHibernate: SELECT editions0_.BookId as BookId__0_, editions0_.Edition as Edition0_ FROM Editions editions0_ WHERE editions0_.BookId=@p0; @p0 = '65537'
NHibernate: SELECT books0_.Author as Author__1_, books0_.Id as Id1_, books0_.Id as Id2_0_, books0_.Title as Title2_0_, books0_.ISBN as ISBN2_0_, books0_.Author as Author2_0_ FROM Books books0_ WHERE books0_.Author=@p0; @p0 = '32769'
NHibernate: SELECT chapters0_.BookId as BookId__0_, chapters0_.Chapter as Chapter0_ FROM Chapters chapters0_ WHERE chapters0_.BookId=@p0; @p0 = '65539'
NHibernate: SELECT editions0_.BookId as BookId__0_, editions0_.Edition as Edition0_ FROM Editions editions0_ WHERE editions0_.BookId=@p0; @p0 = '65539'
NHibernate: SELECT chapters0_.BookId as BookId__0_, chapters0_.Chapter as Chapter0_ FROM Chapters chapters0_ WHERE chapters0_.BookId=@p0; @p0 = '65538'
NHibernate: SELECT editions0_.BookId as BookId__0_, editions0_.Edition as Edition0_ FROM Editions editions0_ WHERE editions0_.BookId=@p0; @p0 = '65538'
NHibernate: SELECT tags0_.BookShelfId as BookShel1___0_, tags0_.TagName as TagName0_ FROM Tags tags0_ WHERE tags0_.BookShelfId=@p0; @p0 = '98305'
***
Counting books on shelf:
shelves[0].Books.Count =3
***
Checking I have the right book
shelves[0].Books[0].Id:65537
***
Counting the book's chapters
***
How many books does the first book's author have to his name?
***
How many books does the third book's author have to his name?
***
How many books does the second book's author have to his name?
***
What is the count of the chapters in a given book?
***
How many chapters does the third book have?

Blimey - that was a lot of SQL! 16 Select statements. Also, notice how it was all executed on the initial load of the Bookshelf. The logic for loading ran something like this:

Load all the data from the Bookshelf table for all the bookshelves
For each bookshelf
1. Load in all the data from the books and authors table for each book on the shelf and its corresponding author
2. For each book on the shelf
  1. Load all the chapters for the book
  2. Load all the editions for the book
3. For each author
  1. Load all the data from the books table for the author
  2. Repeat the book loading logic for every book belonging to the author which was not on the shelf
4. Load in all the tags for the shelf

All that so that we can ask for the first shelf. We haven't even begun to touch the object yet. Why did NHibernate do this? All of these objects can theoretically be reached from the bookshelf. Because we may want to touch any of these objects after we have retrieved the Bookshelf from the database, NHibernate decided it had better go and grab the lot so that we could get to these objects if we had to. This is eager loading, and is the default behaviour of ActiveRecord unless you explicitly instruct it otherwise. Notice also that we have an N+1 selects problem. There is a single select statement being issued for every collection for every object. This is a very expensive way to fetch data from the database.
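To make the N+1 shape concrete, here is a small self-contained sketch that counts round trips against a fake database. CountingDatabase and NPlusOneDemo are hypothetical stand-ins and have nothing to do with NHibernate's internals; the point is only that one select for the parents plus one select per parent grows linearly with the data.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a database that counts every select it serves.
public class CountingDatabase
{
    private readonly Dictionary<int, List<string>> chaptersByBook =
        new Dictionary<int, List<string>>();

    public int Selects { get; private set; }

    public CountingDatabase(int bookCount)
    {
        for (var id = 1; id <= bookCount; id++)
            chaptersByBook[id] = new List<string> { "Chapter 1", "Chapter 2" };
    }

    // One select for the parent rows.
    public List<int> SelectBookIds()
    {
        Selects++;
        return new List<int>(chaptersByBook.Keys);
    }

    // One select per parent for its child collection.
    public List<string> SelectChapters(int bookId)
    {
        Selects++;
        return chaptersByBook[bookId];
    }
}

public static class NPlusOneDemo
{
    // Loads every book and then each book's chapters, one book at a time.
    public static int SelectsToLoadAllChapters(int bookCount)
    {
        var db = new CountingDatabase(bookCount);
        foreach (var id in db.SelectBookIds())
            db.SelectChapters(id);
        return db.Selects;
    }
}
```

Calling NPlusOneDemo.SelectsToLoadAllChapters(5) issues 1 select for the books plus 5 for their chapters. The real trace above is worse still, because every book has two collections and the authors drag in further books.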

The key to optimising the fetch behaviour is how you use the Lazy and BatchSize properties of the various ActiveRecord attributes.

Removing N+1 Select with BatchSize

The BatchSize property can be used on collections or on ActiveRecord classes directly. When you use the setting on a collection, NHibernate will attempt to populate the collection for up to [BatchSize] of the objects which it is aware of and has not populated yet.

We'll start by adding batching:

public static class Switches
{
  public const bool LazyCollections = false;
  public const bool LazyAuthor = false;
  public const int BatchSize = 2;
  public const int AuthorBatchSize = 10;
}

Batch Size:2 LazyCollections:False LazyAuthor:False
***
Loading my book shelves
SELECT this_.Id as Id3_0_ FROM BookShelf this_
SELECT books0_.BookShelfId as BookShel5___2_, books0_.Id as ...
SELECT chapters0_.BookId...WHERE chapters0_.BookId in (@p0, @p1); @p0 = '65541', @p1 = '65540'
SELECT editions0_.BookId...WHERE editions0_.BookId in (@p0, @p1); @p0 = '65541', @p1 = '65540'
SELECT books0_.Author...WHERE books0_.Author in (@p0, @p1); @p0 = '32771', @p1 = '32770'
SELECT chapters0_.BookId...WHERE chapters0_.BookId=@p0; @p0 = '65537'
SELECT editions0_.BookId...WHERE editions0_.BookId=@p0; @p0 = '65537'
SELECT books0_.Author...WHERE books0_.Author=@p0; @p0 = '32769'
SELECT chapters0_.BookId...WHERE chapters0_.BookId in (@p0, @p1); @p0 = '65539', @p1 = '65538'
SELECT editions0_.BookId...WHERE editions0_.BookId in (@p0, @p1); @p0 = '65539', @p1 = '65538'
SELECT tags0_.BookShelfId...WHERE tags0_.BookShelfId=@p0; @p0 = '98305'
***

That improved the behaviour somewhat. We're down to 11 statements. NHibernate is now resolving the objects in the collections 2 at a time. This is not very useful as we typically have more than 2 books on a shelf, so we have 2 trips round the loop to load in all the data for our shelf. If we up the BatchSize to 5 we can get everything in 8 statements:

Batch Size:5 LazyCollections:False LazyAuthor:False
***
Loading my book shelves
SELECT this_.Id as Id3_0_ FROM BookShelf this_
SELECT books0_.BookShelfId...WHERE books0_.BookShelfId=@p0; @p0 = '98305'
SELECT chapters0_.BookId..WHERE chapters0_.BookId in (@p0, @p1, @p2); @p0 = '65541', @p1 = '65537', @p2 = '65540'
SELECT editions0_.BookId...WHERE editions0_.BookId in (@p0, @p1, @p2); @p0 = '65541', @p1 = '65537', @p2 = '65540'
SELECT books0_.Author...WHERE books0_.Author in (@p0, @p1, @p2); @p0 = '32771', @p1 = '32769', @p2 = '32770'
SELECT chapters0_.BookId...WHERE chapters0_.BookId in (@p0, @p1); @p0 = '65539', @p1 = '65538'
SELECT editions0_.BookId...WHERE editions0_.BookId in (@p0, @p1); @p0 = '65539', @p1 = '65538'
SELECT tags0_.BookShelfId...WHERE tags0_.BookShelfId=@p0; @p0 = '98305'
***

Bear in mind that BatchSize only influences how many objects are populated from the set of objects that NHibernate currently knows about. In the example above, even though we only have 5 books, we still have to make two trips to the database to load all the books in. The first time NHibernate decides it needs to load in a Book is when it loads the shelf. At this point in time, there are only 3 known books so they are fetched. Later on, while loading in the authors, the 2 remaining books are identified and fetched in a second batch.
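The arithmetic behind those traces can be written down as a one-liner. This is a simplification that assumes each "wave" of newly discovered objects is batched independently, which is what the output above suggests; BatchMath is a hypothetical helper, not part of NHibernate or ActiveRecord.

```csharp
using System;

// Hypothetical helper for reasoning about batch fetching.
public static class BatchMath
{
    // Selects needed to fill one kind of collection (e.g. Chapters) for the
    // objects NHibernate knows about at that moment: ceil(knownOwners / batchSize).
    public static int SelectsFor(int knownOwners, int batchSize)
    {
        if (knownOwners < 0) throw new ArgumentOutOfRangeException("knownOwners");
        if (batchSize < 1) throw new ArgumentOutOfRangeException("batchSize");
        return (knownOwners + batchSize - 1) / batchSize;
    }
}
```

For the Chapters collection, the 3 books on the shelf and the 2 found later via the authors give SelectsFor(3, 1) + SelectsFor(2, 1) = 5 selects with a BatchSize of 1, SelectsFor(3, 2) + SelectsFor(2, 2) = 3 with a BatchSize of 2, and SelectsFor(3, 5) + SelectsFor(2, 5) = 2 with a BatchSize of 5, which matches the number of chapter selects in each trace.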

Using Lazy Loading

The biggest problem with the fetch patterns above is the large amount of data being retrieved which may not be required. This is potentially very wasteful. Lazy loading allows you to control this. With our current object model, it probably doesn't make sense to always pull in the author (and the related books of the author) every time we load a bookshelf. Let's look at the results of lazy loading the author:

public static class Switches
{
  public const bool LazyCollections = false;
  public const bool LazyAuthor = true;
  public const int BatchSize = 5;
  public const int AuthorBatchSize = 10;
}

*******************
Batch Size:5 LazyCollections:False LazyAuthor:True
***
Loading my book shelves
SELECT this_.Id as Id3_0_ FROM BookShelf this_
SELECT books0_.BookShelfId... WHERE books0_.BookShelfId=@p0; @p0 = '98305'
SELECT chapters0_.BookId ... WHERE chapters0_.BookId in (@p0, @p1, @p2); @p0 = '65541', @p1 = '65537', @p2 = '65540'
SELECT editions0_.BookId ... WHERE editions0_.BookId in (@p0, @p1, @p2); @p0 = '65541', @p1 = '65537', @p2 = '65540'
SELECT tags0_.BookShelfId ... WHERE tags0_.BookShelfId=@p0; @p0 = '98305'
***
Counting books on shelf:
shelves[0].Books.Count =3
***
Checking I have the right book
shelves[0].Books[0].Id:65537
***
Counting the book's chapters
***
How many books does the first book's author have to his name?
SELECT author0_.Id ... WHERE author0_.Id in (@p0, @p1, @p2); @p0 = '32769', @p1 = '32770', @p2 = '32771'
SELECT books0_.Author ... WHERE books0_.Author in (@p0, @p1, @p2); @p0 = '32771', @p1 = '32769', @p2 = '32770'
SELECT chapters0_.BookId ... WHERE chapters0_.BookId in (@p0, @p1); @p0 = '65539', @p1 = '65538'
SELECT editions0_.BookId ... WHERE editions0_.BookId in (@p0, @p1); @p0 = '65539', @p1 = '65538'
***
How many books does the third book's author have to his name?
***
How many books does the second book's author have to his name?
***
What is the count of the chapters in a given book?
***
How many chapters does the third book have?

This has dramatically changed the way we retrieve data from the database. Now we do not load any of the books which are not on the shelf until we touch the author property of the book. The initial load is much cheaper, although we still pull in a load of data which is not required. How about we lazy load the collections as well?
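For reference, the switch settings for that run would look something like this. The first three values are taken from the header line of the output that follows; AuthorBatchSize is an assumption, carried over unchanged from the previous run.

```csharp
public static class Switches
{
    public const bool LazyCollections = true;  // collections now load on first touch
    public const bool LazyAuthor = true;
    public const int BatchSize = 5;
    public const int AuthorBatchSize = 10;     // assumed unchanged from the previous run
}
```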

*******************
Batch Size:5 LazyCollections:True LazyAuthor:True
***
Loading my book shelves
NHibernate: SELECT this_.Id as Id3_0_ FROM BookShelf this_
***
Counting books on shelf:
SELECT books0_.BookShelfId... WHERE books0_.BookShelfId=@p0; @p0 = '98305'
shelves[0].Books.Count =3
***
Checking I have the right book
shelves[0].Books[0].Id:65537
***
Counting the book's chapters
SELECT chapters0_.BookId... WHERE chapters0_.BookId in (@p0, @p1, @p2); @p0 = '65537', @p1 = '65540', @p2 = '65541'
***
How many books does the first book's author have to his name?
SELECT ... WHERE author0_.Id in (@p0, @p1, @p2); @p0 = '32769', @p1 = '32770', @p2 = '32771'
SELECT books0_.Author... WHERE books0_.Author in (@p0, @p1, @p2); @p0 = '32769', @p1 = '32770', @p2 = '32771'
***
How many books does the third book's author have to his name?
***
How many books does the second book's author have to his name?
***
What is the count of the chapters in a given book?
NHibernate: SELECT chapters0_.BookId... WHERE chapters0_.BookId in (@p0, @p1); @p0 = '65538', @p1 = '65539'
***
How many chapters does the third book have?

As you can see, the up-front cost of loading the bookshelves is now very cheap; we are progressively loading in data as we touch more and more of the object graph. I would suggest that this is generally the best behaviour. In this case we never retrieve the tags or editions from the database because we don't touch them. If another method wants to load a bookshelf to edit its tags, we won't incur the cost of loading all the books on the shelf (and there could be hundreds of those).

In summary, I'd recommend that you default to setting Lazy loading and batch fetching for all collections and consider lazy loading large aggregate classes - such as the Author in this example, so that you don't inadvertently pull out huge chunks of the database. Also, keep watching your SQL output by setting show_sql to true now and again. Quite innocent seeming changes to your domain model can quickly magnify the number of objects that can be navigated to from another domain object. If you don't have a Lazy constraint between your objects to act as an NHibernate firebreak, then you will cause the amount of SQL issued to increase dramatically.

Don't forget that lazy loading can only work while you remain inside the NHibernate session. If you detach your object from the session, a LazyInitializationException will be thrown when you attempt to access a collection or object that has not been initialised.
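The mechanism can be imitated in plain C#. In this sketch FakeSession, LazyList and LazyDemo are hypothetical stand-ins for the NHibernate session and a lazily loaded collection, and an InvalidOperationException plays the part of LazyInitializationException; it shows why touching the collection before the session closes makes the later access safe.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the NHibernate session.
public class FakeSession : IDisposable
{
    public bool IsOpen { get; private set; }
    public FakeSession() { IsOpen = true; }
    public void Dispose() { IsOpen = false; }
}

// Hypothetical stand-in for a lazily loaded collection.
public class LazyList<T>
{
    private readonly FakeSession session;
    private readonly Func<List<T>> loader;
    private List<T> items;

    public LazyList(FakeSession session, Func<List<T>> loader)
    {
        this.session = session;
        this.loader = loader;
    }

    public int Count
    {
        get
        {
            if (items == null)
            {
                // NHibernate would throw LazyInitializationException here.
                if (!session.IsOpen)
                    throw new InvalidOperationException(
                        "session closed before the collection was loaded");
                items = loader();
            }
            return items.Count;
        }
    }
}

public static class LazyDemo
{
    public static int TouchInsideSession()
    {
        var session = new FakeSession();
        var chapters = new LazyList<string>(
            session, () => new List<string> { "Chapter 1", "Chapter 2" });

        using (session)
        {
            // First touch happens while the session is open, so the load succeeds.
            Console.WriteLine("Loaded inside session: {0}", chapters.Count);
        }

        // Safe: the collection was initialised before the session closed.
        return chapters.Count;
    }
}
```

Skip the touch inside the using block and the final Count throws, which is the failure you see when an uninitialised collection escapes the session.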

ActiveRecord lessons learnt: #1 Never forget there's a database

2008-11-05T13:51:00.007+00:00

We have been using Castle's ActiveRecord very heavily on my current project, and we have recently spent a lot of time tuning the performance of the application and its use of ActiveRecord. We learnt a lot over the last few weeks, and in the next few posts I'll try to capture some of these lessons.

ActiveRecord is a great abstraction of the database for a .NET application. NHibernate and Hibernate are OK, but leave you with a legacy of XML mapping files in your application, which is no fun at all. The appeal of ActiveRecord over vanilla NHibernate was the chance to shed all that XML cruft. Now, with a few simple attributes and a simple repository class, I can transform my plain old C# object into a persistent domain object with ease. It is almost as if the database was not there at all...

[ActiveRecord]
public class MyAggregate
{
  private int id;
  // Field access with the camel-case strategy maps SomeItems to a field called someItems
  private readonly IList<SomeObject> someItems;

  [PrimaryKey(Access = PropertyAccess.FieldCamelcase, UnsavedValue = Entity.IdUnassignedString)]
  public int Id{get { return id; } }

  [HasMany(typeof(SomeObject), Access=PropertyAccess.FieldCamelcase)]
  public IList<SomeObject> SomeItems{ get { return someItems; } }
}

Look at that! Just three Attributes and my object is ready for the database. Marvelous!

This has been the source of many problems for us. It was just too easy to work with the database. In the green field days of the project, you could almost forget that the database existed at all. We had ActiveRecord auto-generate the database, and we wrote test fixtures which could create test data in memory or in the database. This was superb, as we could write standalone unit tests to drive out the behaviour of the domain and application components; soon we had hundreds of fast, database-free unit tests. We could also write wired tests for our repositories to prove that we could persist domain objects to and read them back from SQL Server. The same test fixtures could be reused in the automated acceptance tests to modify the application data as required.

Fantastic! So what went wrong? Well, we built a slow app; at least it started out slow when it hit UAT (it's much, much faster now). We fell victim to Joel's Law of Leaky Abstractions, which states that "All non-trivial abstractions, to some degree, are leaky". In this case, we have ActiveRecord abstracting NHibernate, which in turn abstracts the database access. The minute you forget this, you will get bitten. The Hibernate team certainly acknowledge this. Just three pages into the first chapter of Java Persistence with Hibernate, Christian Bauer and Gavin King remind you that:

"To use Hibernate effectively, a solid understanding of the relational model and SQL is a prerequisite. You need to understand the relational model and topics such as normalization to guarantee the the integrity of your data, and you'll need to use your knowledge of SQL to tune the performance of your Hibernate application"

In other words, try to remember that you are still talking to a relational database; we're just making it easier for you to do so from your OO world.

I really like ActiveRecord and I would recommend using it on .NET projects when there is a need for an OR mapping tool. The fault lay with us, the developers. We didn't pay enough attention to the fact that there was a database at the end of the calls that the repository was making. We didn't spot that some of the calls which our test code was making were potentially very expensive in real usage scenarios. And so we got bitten by the law of leaky abstractions, which manifested itself through a number of bad usage patterns that caused immediate performance problems.

Avoidable Mistake - Absence of good quality test data

We took too long to recognise we had performance problems because we didn't work hard enough to obtain truly representative test data for our app. If we had made more of an effort to use 'real' data sooner, then we would have spotted much more, much earlier in our project's lifecycle. There is no substitute for realistic data. Get or generate some and plan to integrate it into your application build process. Make it easy for developers to use realistic data. We now have a NAnt target which restores a UAT or production database onto our developer machines and upgrades it to the latest version using dbdeploy.NET. This is great as it means we can also automate testing our data and schema migration scripts with a production-like database.

Avoidable Mistake - Adoption of the "select all and filter with a bit of LINQ" anti pattern

This is a real killer. The code snippet below illustrates the problem:

public Book FindBookWithTitle(string title)
{
  // AllBooks() loads every book; the filter only runs afterwards, in memory.
  return repository.AllBooks().Where(book => book.Title == title).FirstOrDefault();
}

This is fine from a C# perspective, and is nice, clean, expressive code. The problem is that we had to pull all the books out of the database to run the query. Fine with a small test database, but very bad with the Amazon book catalogue.

Provide a richer API through the repository instead and push the filtering to the database:

public class BookRepository : IBookRepository
{
    //Snip

    // Let the database do the filtering, so only the matching row is loaded.
    public Book GetFirstBookWithTitle(string title)
    {
        DetachedCriteria crits = DetachedCriteria.For(typeof(Book));
        crits.Add(Expression.Eq("Title", title));
        return ActiveRecordMediator<Book>.FindFirst(crits);
    }

    //Snip
}

This is a specific example of a general problem - making the application do work it doesn't need to do.

I'd generally be very suspicious of repositories which return unfiltered lists of any big aggregates. They can be very useful for test purposes, but can really kill you if they get into the production code. Keep an eye out for filtering being performed in C# code which would be simpler and more efficient as a relational database query. These tasks should be pushed down into your repository classes and rewritten in HQL or Criteria queries.

Avoidable Mistake - Loading in data that you don't need

Bear in mind that calls to the database are expensive, especially if said database is on a remote machine. Every time you go to the database for something that you don't need, you are going to slow down your app and needlessly waste resources. Here are some examples of mistakes we made:

Loading objects from the database to perform unnecessary validation.
Eagerly loading objects on page load "just in case" the app might use them.
Repeated loads of objects from the database due to page lifecycle events.
Loading aggregates when a report query or a summary object would be better.

Avoidable Mistake - Forgetting to check the SQL being generated

Make sure you check the SQL statements being used by NHibernate to satisfy your requests. Sometimes you will get quite a surprise. To enable logging of the SQL statements to the console when running your tests, add

<add key="show_sql" value="true" />

to the activerecord section in your app.config file. If you want to log application output, Ken Egozi has a post showing you how to do it here.

Dogfooding NSynthesis

2008-09-05T08:38:00.003+01:00

Earlier this year George and I came up with the idea of trying to join up the mock interactions in our unit tests to improve the confidence that our tests were coherent and that all our mocked interactions hung together. The result of this was Synthesis.

The only problem was that I'm working on .NET projects at the moment. Enter NSynthesis.

The aim of NSynthesis is to provide 'code coverage for mocked interactions for .NET'. It uses PostSharp to detect mock expectations being set by your test code and verifies that there is a test of the real production implementation in the same test run. If this call is missing, we fail the build. Alex and Bernardo and I are busy working on getting NSynthesis to the point where we can have a 0.0.1 release and it is looking quite promising now. Out of curiosity, I thought I'd grab the trunk build and try to run NSynthesis on a subset of my current project.

I disabled the generic support as we are still working on this and it is not really stable yet. I also filtered the classes which we are using to include only our services namespace. Services should be easy to unit test, as there is always a repository or another service between them and the real world. We also find this is where we have the most mocked interactions.

The results. 54 unit tests analysed, 12 unique mock calls made, 6 unique untested mock calls!
That's 50% of the mock interactions not being unit tested! Blimey!

So. There is definitely value in using this tool! It is not that this code is untested - there are lots of wired tests and acceptance tests. However, what it highlighted was that we could have tested a lot more code at a lower level, and possibly would have sped up our test suite as a result. It was a surprise for the team too - we all thought we were already unit testing everywhere we could.

When I extended NSynthesis to run against the whole code base, I ran into another problem. What broke NSynthesis was the situation where a class 'acquires' an implementation of an interface.

Here, my repository does not actually implement Exists itself, rather it derives from an existing implementation. In the code, this was our own BaseRepository, which in turn inherited from Castle's ARRepository<T>. Now when I mock the repository, I mock the interface. When I test my repository, I'm going to test the Repository class itself. NSynthesis then completely failed to realise that I have a tested implementation for IRepository because it is looking for a concrete implementation by a class which implements the interface IRepository. And BaseRepository does not.
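
A rough Java analogue may make the shape of the problem easier to see. NSynthesis itself is .NET, and the names IRepository, BaseRepository and BookRepository below are made up for illustration, so treat this as a sketch rather than the real code. The concrete class declares the interface but inherits the method body from a base class that never mentions the interface, so a tool has to walk up the hierarchy to find where the implementation really lives:

```java
import java.lang.reflect.Method;

// All names here are hypothetical; this only mirrors the shape of the problem.
interface IRepository {
    boolean exists(int id);
}

// Supplies exists(), but never declares that it implements IRepository.
class BaseRepository {
    public boolean exists(int id) {
        return id > 0;
    }
}

// Declares the interface and "acquires" the implementation from its base class.
class BookRepository extends BaseRepository implements IRepository {
}

class AcquiredImplementationSketch {
    // Returns the class that declares the public method resolved on 'concrete'.
    static Class<?> implementingClass(Class<?> concrete, String name, Class<?>... parameterTypes) {
        try {
            Method method = concrete.getMethod(name, parameterTypes);
            return method.getDeclaringClass();
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(concrete + " has no public method " + name, e);
        }
    }

    public static void main(String[] args) {
        // Is the base class an IRepository as far as the type system is concerned?
        System.out.println(IRepository.class.isAssignableFrom(BaseRepository.class));
        // Which class does the subclass actually get exists() from?
        System.out.println(implementingClass(BookRepository.class, "exists", int.class).getSimpleName());
    }
}
```

A tool that only looks for methods declared by classes implementing the interface would miss BaseRepository here; resolving the method on the concrete class and then asking for its declaring class is one possible way round that.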

I think getting this working could be fun :-)

Synthesis comes to Manchester

2008-04-01T08:16:00.003+01:00

Following on from our presentation to EuRuKo 2008, George and I will be speaking at the next ThoughtWorks Manchester Geek Night on the 8th April.

I'm an alumnus of UMIST - so I'm really looking forward to seeing the old place again, it has been a long time since I was last in Manchester!

Synthesis and test confidence

2008-03-22T18:14:00.008+00:00

George and I spoke to a bunch of friends about Synthesis a few weeks ago at the inaugural Reading Geek Night. Around six of us met in a pub in Reading and got really confused looks from the punters looking on as we waxed enthusiastic about TDD concepts and how we thought that they could be improved. I found it very interesting to get feedback from a group of really smart non-ThoughtWorkers. They don't all practice “full on agile” but they certainly do understand what it means to build and ship working code, on time, with tight business pressures. It has taken a couple of weeks to digest the feedback, hence the slow time to post about the event.

So, here is another attempt to explain why we may need Synthesis on our projects, in the light of Reading Geek Night #1 and earlier discussions with the guys at ThoughtWorks UK...

If you practice TDD, then the chances are that you already have a large number of unit tests. You may have a bunch of other automated tests of different types as well (functional, integration, performance...), and if you do, that's great. Keep doing that.

At the other end of the spectrum, you will also have some form of acceptance testing structure in place. The format of this varies from project to project. It could be a set of manual scripts which your Testing/QA team/Nominated Guy run, or it could be a bunch of system tests using FIT or one of the BDD frameworks. Maybe you paid for a commercial product and have something like a bunch of Rational Robot scripts.

We currently have a large disconnect between our high level tests and our low level unit tests, especially when you consider how confident you would be that the system works as desired by running one or more of the different types of test in the system.

At the bottom of the scale there is a system which has no tests. I would not be at all confident that this system worked. At the top end of the scale, I have a system which is running live in the real production environment and is being used by the actual user base. I am very confident that this system is working. At some point in the past, most businesses came to the conclusion that IT cannot blindly put systems live to discover if they work, hence all the interest in testing techniques.

High level system tests give us much more confidence that a system may work in production than unit tests because of the simple fact that they are exercising the production code in a more realistic manner; the data is more real, and all the interactions between the components are 'real', or as close to real as we can get in a test environment.

Unit tests provide a much weaker level of confidence due to a number of issues when you want to use them to prove that a system works.

Unit tests are incomplete
We cannot unit test all of our code. Note that good design practices can massively reduce the amount of 'untestable' code in the system and by practicing TDD and running code coverage tools it is simple to provide a view of how well we are doing. However, there are always parts of the system which you either cannot test or neglect to unit test for some reason.

Unit tests are disconnected from other unit tests
Unit tests test a small amount of code, by definition, and often in isolation. If you are testing interactions between components then these are simulated using mocks. How can I be confident that I got my mock interactions correct? Also, how can I be confident that I have tested or even completed coding up the real version of whatever it is I am mocking? Again, we can mitigate this by reducing the number of interaction based unit tests and by writing state based unit tests when we can. However sometimes it is much more natural to follow the interaction test approach, and this incurs a risk that we are not simulating our interactions accurately.

Please don't take away from this that unit tests are bad. Unit tests are the lifeblood of a healthy project and are great for proving that the individual cogs are properly machined. What they do not tell you is whether you have the right number of cogs, or if they all fit together properly. If I want to know that components A, B, and C really play well together, then I need to write another aggregate test which puts ABC together and validates that the unit tests got the interactions correct in a more realistic environment. George and I would call these functional tests, but you may use another term on your project. You may not have these tests on your project either – and that could be more worrying still – this means there could be gaps in your coverage at the levels that developers work at.

So, if I have functional tests plus unit tests, I am more confident that my system is working before I commit to running the slow system tests. By running functional tests I have confidence that I have wired up all my well behaved units of code in a meaningful manner and that they are playing well with each other as I predicted. The chances are that the wider system is going to work. I still am not as confident that the system works as I will be when my acceptance test suite is run by [insert machine/human here], but I can probably sleep well enough.

Now, some functional tests are inescapable and add a view into an aspect of the system that a unit test just cannot provide. A great example if you are using Spring would be to prove that you can get your components from the container and that they are wired up correctly. However, if you are purely proving that your simulated interactions are valid in the unit tests, then repetition is creeping into the process - which is wasteful.

This is where Synthesis comes in. Synthesis monitors the simulated interactions which you create in your unit tests and verifies that there is a corresponding unit test that exercises and validates the real object in your system. If everything joins up, then your build passes. If there are disconnects, then the build fails. So, if I have unit tests for A, B, and C, then it is fine for me to simulate A's interactions with B and C using mocks, provided Synthesis can match all my expectations with real tests which make the very same calls to the real B and C. The result is a synthetic bigger test, where A, B, and C are virtually linked by their expectations.
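
The check described above can be boiled down to a comparison of two sets. The Java below is only a toy model of that idea and not how Synthesis is actually implemented; InteractionLedger, its method names and the example signatures are all invented for illustration. Expectations set on mocks go into one set, calls made against real objects go into another, and anything expected but never exercised fails the build:

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Toy illustration of the matching idea; not how Synthesis is actually implemented.
class InteractionLedger {
    private final Set<String> expected = new LinkedHashSet<>();
    private final Set<String> exercised = new LinkedHashSet<>();

    // Record a mock expectation set by a unit test.
    void expectationSet(String type, String signature) {
        expected.add(type + "#" + signature);
    }

    // Record the same call being made against the real object in another test.
    void realCallMade(String type, String signature) {
        exercised.add(type + "#" + signature);
    }

    // Expectations with no matching test of a real implementation.
    Set<String> untested() {
        Set<String> gaps = new LinkedHashSet<>(expected);
        gaps.removeAll(exercised);
        return gaps;
    }

    // The build passes only when every expectation joins up.
    boolean buildPasses() {
        return untested().isEmpty();
    }
}

class InteractionLedgerDemo {
    public static void main(String[] args) {
        InteractionLedger ledger = new InteractionLedger();
        // A's unit test mocks two collaborators.
        ledger.expectationSet("B", "load(int)");
        ledger.expectationSet("C", "save(Order)");
        // Only B has a unit test that makes the same call for real.
        ledger.realCallMade("B", "load(int)");

        System.out.println("untested: " + ledger.untested());
        System.out.println("build passes: " + ledger.buildPasses());
    }
}
```

Matching on a type name plus a signature string is the coarse end of the scale; comparing the actual argument values would be stricter.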

Obviously, there is a sliding scale of 'matching'. At one end you have basic method name matching, and at the other end of the scale there is full data matching for every call. The closer you get to true data matching, the more confidence is gained, but the more restrictions are placed on developers. We are not sure how far we need to go in order for a team to get an optimal balance between speed and safety – but I'd be very interested to get opinions on this matter :-)

Currently, Synthesis validates the call signature completely, but does not check the content of data passing between the mocks and data used to test an object for real. We plan to add this soon. However, there is a benefit to be had by ensuring that all your interactions join up. This will catch method mismatches, dynamic calls which are not tested such as Active Record queries, and gaps in your test coverage which have simply been missed by those fallible human developers!

Dear Microsoft

2008-01-05T18:00:00.000+00:00

I thought it was very sporting of you to somehow break your MSDN subscriber downloads page for your flagship IE7 web browser:

But not for Firefox:

Well done! I didn't even realize that the site had any problems at all until I switched over to my virtual machine of XP from Ubuntu in order to download the ISO I needed via your custom file transfer application. Why do you insist that I use this at all? It seems really odd.

Anyway, perhaps I should just try to get your File Transfer Manager to run in Wine and I could have a seamless experience browsing your web site from a proper OS....

Love and hugs,

Stu

Damn Skippy

2007-02-11T23:32:00.000+00:00

From a conversation whilst pairing with a client developer as we attempted to test infect some legacy code last week.

Me: Phew! Well, that little bit is under test now. We can start to fix it up now.

Client Dev: That seemed too hard....you know, if the guys who wrote that had had to put in tests as they went, the API would NEVER have looked like this. It would just have been too painful to work with.

It really is nice to witness the penny drop. True job satisfaction.

Test Code and Production Code – two distinct beasties

2007-02-05T21:03:00.000+00:00

There are some common qualities that production and test code must possess in order to enhance their status as ‘good code’. Typical dimensions which one could apply may be conciseness, clarity of code, performance, elegance and extensibility (the list goes on). These are all good methods of assessing ‘good code’. However what differentiates good test code from good production code is not so clear when only these traditional qualities are taken into account. It may be more useful to consider what the drivers for the two types of code are.

For production code, these traditional values are useful, but cannot be assessed meaningfully if the code does not fulfil a business requirement. Ultimately this is the only driver of production code – it has to get the job done for the business. If your widget selling app for Widgets R Us cannot help to sell widgets, then no one will really care how speedy, elegant and extensible your Ultra Widget Framework is. The business code also needs to be robust enough for the business demands to allow the application to stay up and running cheaply enough for them to get a return on the build. It is a harsh world we live in and production code has to cut it in an unjust world.

So what about the test code? Test code has a different set of drivers which complement the production code. The primary driver is to assist in delivering high quality production code which robustly meets the requirements of our client. To this end the code also needs to drive the design and document the expected behaviour of our production code.

Driving the design of the code pushes the code towards delivery of production requirements, but this is only part of the story. Test code can be used to accurately model the expected behaviour of the system and ensure that all key aspects of the system really do behave as per the requirements. By driving the design of the code with tests, the system can meet its key delivery and quality drivers while giving the developers the opportunity to make other improvements to the traditional values of the code. For example, a developer is more likely to refactor towards an elegant emerging pattern when a safety net is in place than without one. If our tests measure performance then we are more likely to improve our performance qualities.

The test code is also a driver to document the behaviour accurately and clearly in order to provide a suitable level of traceability to the team and the stakeholders to demonstrate that the business needs really have been met. When the test code is clear and comprehensive then the application benefits from not only the traditional test safety net, but also executable documentation that can highlight where and how the application has veered from the course of the business requirements.

So do traditional values not matter any more? Not at all - these values are crucial to quality code – in test and in production. However, you may assess the values differently in test and production code once you consider what the code is driving to achieve. What may be reasonable in production code may be unsuitable for test code, as a particular approach may obstruct the test code's drive to shape the design of the production code, or cloud its documentation qualities by making a test less clear.

Put another way, just like in acting, it is important to consider your code’s motives first and then apply the traditional values in the context of what the code is trying to achieve.

Where there's no sense, there's no feeling

2006-11-14T19:53:00.000+00:00

I finally got a chance to go beating again on Saturday. I help out as a beater at a couple of the local pheasant shoots; a good day's beating is about as much fun as you can have in the countryside with your trousers on!

The season started over a month ago now and I have already been to one shoot. Unfortunately though I have been unable to take Lyra with me due to her suffering from 'lady dog problems'. Luckily for both of us this is all over and done with now. So we eagerly headed off to Ram Alley for a spot of pheasant chasing.

As this was Lyra's first outing of the season, I was a little nervous about how she would perform. Would there be any signs of canine resentment at being left behind on the previous shoots? Would the lack of training during those summer months I had spent away from home show at all?

The first drive put all my fears to rest. I had to work Lyra through a patch of maize. This sort of cover crop can cause problems - dogs get very excited chasing through the cover looking for game (it probably feels like one of those jungle chase scenes in Predator to the dog) and the lack of visual contact can mean you lose control relatively easily. But she did really well - even retrieving a bird that had been overlooked by the pickers up :-)

And so all through the day we went from good to better until we got home. Only then was it clear that my, by now dog tired, hound was finding it hard to settle down on the carpet. On closer inspection we realized that, although her paws and muzzle seemed completely immune to nettle rash, the inside of her ears were not. All inside both ear flaps she was red raw with nettle rash!

Clearly she was having so much fun during the day that she hadn't noticed that her ears were glowing. I think I can understand though. Lyra is a working dog, and she is driven to work in the same way that many of the truly good developers I know are driven to code. Imagine a world where coding was only permitted September-January and you could only get access to a really useful computer 2 or 3 days a week at best. This is what it is probably like for Lyra (shudder). If this were the case, I don't think you would stop coding if you got a paper cut on a couple of fingers...

Given the way the winter is going there will be nettles in the woods for a few weeks yet and there's no way I can leave the dog behind. So, now I'm surfing the net for nettle rash cures for Weimaraners...

Is the clutter in your test trying to tell you something?

2006-11-14T17:04:00.001+00:00

We’ve all encountered it at some point or another. Your unit test is doing the job, but the test is so bogged down with setup code that it is impossible to see the wood for the trees. The interesting test code is in there somewhere, but where exactly?

The common causes of clogs in your test code are:

Complex object creation code.
Preceding interactions with the test subject.

Complex object creation is simple to deal with. You can create test fixtures to create the object for you, or better still you can refactor the code to try and simplify things a little.

But what about the interaction clutter? This is more subtle in the way that it insinuates itself into the test code. The clutter sneaks up on you. It demands more and more code to be added due to 'unavoidable' interactions with one object after another. Soon the useful to useless LOC ratio in the test case has tipped. Suddenly it becomes the norm that you should have N interactions in the test case to simulate calls to the database and N more calls out to other supporting services, even though the aspect of the code which you want to test has no real need to go through these extra steps.

For a particularly contrived example, let's imagine that we are building an online Pie ordering service for Weebl & Bob's Pie Delivery Service…and we are implementing it using WebWork.

We are in the final stages of implementing our pie ordering action. We want to verify that on execution the action invokes a number of support services:


public void testShouldOrderBobAYummyPieOnSubmit() {
 PieOrder yummyPieOrder = new PieOrder("yummypie", 1, "for Bob");
 PieOrderingAction pieAction = new PieOrderingAction(orderService);
 pieAction.setAction(Action.SUBMIT);
 pieAction.execute();
 assertEquals(yummyPieOrder, orderService.getLastPieOrdered());
}

But our test fails to run - it never reaches our orderService, throwing nasty null object exceptions instead. But why? Looking more carefully at our execute method we make a number of discoveries:

When our action executes, it actually loads our order from the database using its repository and the supplied pie-ID. This test doesn’t supply a repository or a pie-ID. So we buckle down and add it into the test.
Weebl and Bob’s Pie Delivery Service often runs out of stock. So before we can allow our customer to order our pie we need to call out to our pieInventoryService to ensure that the yummy pie actually exists and is ready to be cooked for our customer. Grrrr. We add more support code into the test.

Now our test has become a bit of a monster:


public void testShouldOrderBobAYummyPieOnSubmit() {
  PieOrder yummyPieOrder = new PieOrder("yummypie", 1, "for Bob");
  PieOrderingAction pieAction = new PieOrderingAction(orderService);
  // Extra setup needed only because execute() loads the order first...
  pieAction.setPieOrderRepository(pieOrderRepositoryMock);
  pieOrderRepositoryMock.expects(once()).method("loadOrder")
     .will(returnValue(yummyPieOrder));
  // ...and then checks that the pie is in stock.
  pieAction.setPieInventoryService(pieAlwaysInStockStub);
  pieAction.setAction(Action.SUBMIT);
  pieAction.setPieID(yummyPieOrder.getPieID());

  pieAction.execute();

  assertEquals(yummyPieOrder, orderService.getLastPieOrdered());
}

OK, so this isn’t the end of the world. But it isn’t so hot either; in pastry-free, real world scenarios, this could be much worse. The problem here is that our implementation of execute has 3 phases:

Load PieOrder
Verify PieOrder
Submit PieOrder


public class PieOrderingAction…{
    public void execute(){
          //Load pie order…
          //validate pie order…
          //submit pie order…
    }
}

We really only want to test phase 3 in isolation. Someone has already implemented tests and production code for steps 1 and 2 – we don’t want to repeat ourselves.

Time to listen to the test. Our test code is suggesting that this is not an optimal implementation. We could choose to hide all this away in obscure setup code, or we could try to find a better way of implementing our code. So we pick the best option and look into the API of our actions and discover that we could implement this differently. WebWork supports actions which implement prepare() and validate() methods. Woot! We can move our code into these calls instead and simplify our production code.


public class PieOrderingAction…{
 public void prepare(){
    //Load pie order…
 }
 public void validate (){
    //validate pie order…
 }
 public void execute(){
    //submit pie order…
 }
}

The beauty of this is that, in theory we can chop up our tests.

Our first set of tests check that the action loads the PieOrder during prepare().
Our second set of tests prove that the action validates its order during validate(), given that the action has already loaded the order.
Our third set of tests verify that the action will submit the pie order during execute(), given that the action has already loaded and validated the order.

So our test case looks like this now:



public void testShouldOrderBobAYummyPieOnSubmit() {
 PieOrder yummyPieOrder = new PieOrder("yummypie", 1, "for Bob");
 PieOrderingAction pieAction = new PieOrderingAction(orderService);
 // Put the action straight into its "order already loaded" state.
 TestUtils.setPrivateField(pieAction, "pieOrder", yummyPieOrder);
 pieAction.setAction(Action.SUBMIT);

 pieAction.execute();

 assertEquals(yummyPieOrder, orderService.getLastPieOrdered());
}

Those following closely will have noticed a call to set the pieOrder field via reflection. So why did we set the private field here (using evil reflection of all things!)? Well, this comes back to not test driving the code the wrong way. We could have added a setter to allow a more traditional way of setting the object state. However, this is not useful to the real world. We are only setting the internal field in order to get our object into the correct state to test it. It is less evil to use reflection to put an object into a valid state for a test situation than it is to add access to internal state legally for all code.
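
The TestUtils.setPrivateField helper itself is not shown in this post. A minimal sketch of how such a helper might be written with plain java.lang.reflect follows; the implementation is an assumption, and PieHolder is a made-up class used only to exercise it:

```java
import java.lang.reflect.Field;

// Assumed implementation of a test-only helper; the real TestUtils may differ.
final class TestUtils {
    private TestUtils() {
    }

    // Sets a (possibly private) field on the target, searching superclasses too.
    static void setPrivateField(Object target, String fieldName, Object value) {
        try {
            Field field = findField(target.getClass(), fieldName);
            field.setAccessible(true);
            field.set(target, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not set field '" + fieldName + "'", e);
        }
    }

    // Walks up the class hierarchy until a field with the given name is found.
    private static Field findField(Class<?> type, String fieldName) throws NoSuchFieldException {
        for (Class<?> current = type; current != null; current = current.getSuperclass()) {
            try {
                return current.getDeclaredField(fieldName);
            } catch (NoSuchFieldException ignored) {
                // Not declared here; try the superclass.
            }
        }
        throw new NoSuchFieldException(fieldName);
    }
}

// Hypothetical class with internal state and deliberately no setter.
class PieHolder {
    private String pieOrder;

    String describe() {
        return "order=" + pieOrder;
    }
}

class SetPrivateFieldDemo {
    public static void main(String[] args) {
        PieHolder holder = new PieHolder();
        TestUtils.setPrivateField(holder, "pieOrder", "yummypie");
        System.out.println(holder.describe());
    }
}
```

Failing loudly with an exception when the field is missing is deliberate: if the field is renamed, the test breaks immediately rather than quietly running against an object in the wrong state.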

So what are the issues that we have to watch out for here?

Obviously the test is now more tightly coupled to the implementation details of our test subject. If we change the internal structure of our object, we could well break our tests. We can mitigate this by not making our object complicated. In this example we only reach into our object to set one field. Try and keep things this simple when using this technique.

There is a risk that the tests don’t join up properly. If you are going to reach into an object to set its state, you must have corroborating test cases which prove that the object is in fact in this state at the end of the preceding calls in its lifecycle. It may also be worth writing a small number of ‘integration tests’ which run the object through the whole call cycle (here, prepare -> validate -> execute) in order to demonstrate the expected behaviour of the caller and to prove that you got things right with the lower level testing.

Buried Intent. Bad for campers, bad for TDD

2006-10-19T14:29:00.000+01:00

Remember that your test code is not production code. The purpose of your test code is to prove that your production code works and to provide live documentation. Qualities which make good test and production code are not necessarily the same. For example, it is important that a test case clearly demonstrates its expectations inline. This is especially true when working on an XP project, as these tests are your primary means of documenting the expected behaviour of your classes under test. If another developer cannot read your tests easily and determine what was intended of the production code then trouble will ensue.

Consider the following test case:


public void testValidateOffPeakUserConnectionRejectedDuringPeakTimes() {
    ClientConnectionRequest connectRequest = new ConnectionRequest("bob", "12345");
    setMonitorExpectations("bob", "Some account", "12345", ON_PEAK, true, true);

    boolean requestResult = connectionMonitor.validate(connectRequest);
    assertEquals(false, requestResult);
}

Superficially, this test seems to be OK. The test is short and sweet, and we are checking that our monitor rejects the request. But what is going on exactly? Well, to work this out we have to dig down into the setMonitorExpectations method. More information is revealed:


private void setMonitorExpectations(
    String clientName,
    String accountName,
    String clientId,
    boolean introductory,
    boolean unlimited,
    boolean offPeakOnly) {

    UserAccount account = new UserAccount(12345L);
    account.setAccountName(accountName);
    account.setClientName(clientName);

    buildAccountType(introductory, account);

    repositoryMock.expects(atLeastOnce()).method("find").will(returnValue(account));
    repositoryMock.expects(once()).method("isUnlimited").will(returnValue(unlimited));

    if (offPeakOnly) {
        account.setAccountRestrictions(AccountUsage.RESTRICTED);
        repositoryMock.expects(once()).method("getAllowedConnectPeriods")
                .with(eq(12345L),eq(clientId)).will(returnValue(offPeakOnly));
    }
}

Oh, so we are setting up a bunch of attributes of the account, conditionally flagging the account as restricted, and then going on to add an expectation. But we still don't have the full picture, as there is yet another call out to buildAccountType.


private void buildAccountType(boolean introductory, UserAccount account) {
    if(introductory) {
      account.setAccountType(AccountType.INTRODUCTORY);
    }
}

Now we have a number of problems.

Although the test code works, and is compact, it is failing to fulfil the role of documenting the production code clearly. In order to determine what my test method is trying to prove I have to dig down a number of levels.
I suspect that the helper method is actually too rich for this method. There is conditional logic in there to change the expected test behaviour based on the test situation. This may be acceptable in production code, but conditional logic has no place in test code.
The fact that the interactions with other objects are buried in the helper functions is also hiding potential issues in the production code. In this case it is the repeated calls to the repository. By inlining the expectations it would be more obvious that the classes have a chatty relationship, and this could be harnessed by changing the test expectations and driving the production code to become more efficient.

This is a natural trap which we can fall into when we are constructing unit tests. Our IDE allows us to extract methods so easily these days that we often find that we are extracting "common code" from our test calls because we believe that this is helping to keep the code base compact and efficient. In fact, this naive refactoring can cause real problems. Now that our test code is less clear to a casual reader, the overhead of maintaining the test suite could rise!

Ways Out Of this Situation?
In this situation the simplest route out is to inline all the buried methods and step back. In this case we initially end up with:


public void testValidateOffPeakUserConnectionRejectedDuringPeakTimes() {
    ClientConnectionRequest connectRequest = new ConnectionRequest("bob", "12345");

    UserAccount account = new UserAccount(12345L);
    account.setAccountName("Some account");
    account.setClientName("bob");

    if (ON_PEAK) {
        account.setAccountType(AccountType.INTRODUCTORY);
    }

    repositoryMock.expects(atLeastOnce()).method("find").will(returnValue(account));
    repositoryMock.expects(once()).method("isUnlimited").will(returnValue(true));

    if (true) {
        account.setAccountRestrictions(AccountUsage.RESTRICTED);
        repositoryMock.expects(once()).method("getAllowedConnectPeriods")
                .with(eq(12345L), eq("12345")).will(returnValue(true));
    }

    boolean requestResult = connectionMonitor.validate(connectRequest);
    assertEquals(false, requestResult);
}

This is clearer than we had before, and when the method is rearranged and the account creation logic extracted into a helper method we end up with:


public void testValidateOffPeakUserConnectionRejectedDuringPeakTimes() {
    UserAccount account = buildAccount(
        12345L, "Some account name", "bob client", INTRODUCTORY, RESTRICTED);
    ClientConnectionRequest connectRequest = 
        new ConnectionRequest(account.getClientName(), account.getId());

    repositoryMock.expects(atLeastOnce()).method("find").will(returnValue(account));
    repositoryMock.expects(once()).method("isUnlimited").will(returnValue(true));        
    repositoryMock.expects(once()).method("getAllowedConnectPeriods")
        .with(eq(12345L), eq("12345")).will(returnValue(true));
    
    boolean requestResult = connectionMonitor.validate(connectRequest);
    assertEquals(false, requestResult);
}
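The buildAccount helper itself is not shown in the post. A minimal sketch of what it might look like follows, assuming stand-in versions of UserAccount, AccountType and AccountUsage (the STANDARD and UNRESTRICTED values and the getters are hypothetical, added only so the example compiles on its own). The point is that the helper contains no conditional logic: every attribute is passed in explicitly by the test.

```java
public class AccountFixtures {
    // Hypothetical stand-ins for the production types. STANDARD and
    // UNRESTRICTED are assumed defaults that the post does not mention.
    enum AccountType { STANDARD, INTRODUCTORY }
    enum AccountUsage { UNRESTRICTED, RESTRICTED }

    static class UserAccount {
        private final long id;
        private String accountName;
        private String clientName;
        private AccountType accountType = AccountType.STANDARD;
        private AccountUsage accountRestrictions = AccountUsage.UNRESTRICTED;

        UserAccount(long id) { this.id = id; }

        long getId() { return id; }
        String getAccountName() { return accountName; }
        String getClientName() { return clientName; }
        AccountType getAccountType() { return accountType; }
        AccountUsage getAccountRestrictions() { return accountRestrictions; }

        void setAccountName(String accountName) { this.accountName = accountName; }
        void setClientName(String clientName) { this.clientName = clientName; }
        void setAccountType(AccountType accountType) { this.accountType = accountType; }
        void setAccountRestrictions(AccountUsage usage) { this.accountRestrictions = usage; }
    }

    // Test helper with no branching: what the test passes in is what it gets.
    static UserAccount buildAccount(long id, String accountName, String clientName,
                                    AccountType type, AccountUsage usage) {
        UserAccount account = new UserAccount(id);
        account.setAccountName(accountName);
        account.setClientName(clientName);
        account.setAccountType(type);
        account.setAccountRestrictions(usage);
        return account;
    }

    public static void main(String[] args) {
        UserAccount account = buildAccount(12345L, "Some account name", "bob client",
                AccountType.INTRODUCTORY, AccountUsage.RESTRICTED);
        System.out.println(account.getClientName() + " " + account.getAccountType()
                + " " + account.getAccountRestrictions());
    }
}
```

Because the helper only assembles data, the reader of the test never has to open it to find out which expectations are in play.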

Note that this refactored test is really not much larger than the original. However the difference is that the test expectations are clearly visible. As a result, the developer is more likely to be in a position to improve the code. Here the mock object interactions are showing that we have an overly chatty relationship with another object. This could be a sign that we are suffering from the 'feature-envy' anti pattern in our production code, which in this example could lead to us making bitty calls to the database and could become a performance bottleneck. By exposing the relationship explicitly in the test case, we can now choose what to do about the situation.

Live with the problem - this is not an important feature
Change the expectations to drive improvements in the production code. It would now be easy to change the test case to forbid the look up calls to the repository and drive the production code down a more performant path.
Break up the method under test.
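The second option can be sketched without a mocking library. In jMock terms it would amount to something like tightening atLeastOnce() to once() on the "find" expectation and dropping the other look ups. The example below uses a hand-rolled counting test double instead, and every type in it (AccountRepository, CountingRepository, this simplified ConnectionMonitor and UserAccount) is a hypothetical stand-in rather than the real production code.

```java
public class ChattyRepositoryExample {
    // Hypothetical account: carries the data the monitor needs to decide.
    static class UserAccount {
        final boolean offPeakOnly;
        UserAccount(boolean offPeakOnly) { this.offPeakOnly = offPeakOnly; }
    }

    // Hypothetical repository interface reduced to a single look up.
    interface AccountRepository {
        UserAccount find(String clientId);
    }

    // Hand-rolled test double that counts how often it is called.
    static class CountingRepository implements AccountRepository {
        private final UserAccount account;
        int findCalls = 0;

        CountingRepository(UserAccount account) { this.account = account; }

        public UserAccount find(String clientId) {
            findCalls++;
            return account;
        }
    }

    // Hypothetical monitor after the change: one look up, then it
    // decides from the returned account instead of asking again.
    static class ConnectionMonitor {
        private final AccountRepository repository;

        ConnectionMonitor(AccountRepository repository) { this.repository = repository; }

        boolean validate(String clientId, boolean onPeak) {
            UserAccount account = repository.find(clientId);
            return !(account.offPeakOnly && onPeak);
        }
    }

    public static void main(String[] args) {
        CountingRepository repository = new CountingRepository(new UserAccount(true));
        ConnectionMonitor monitor = new ConnectionMonitor(repository);

        boolean accepted = monitor.validate("12345", true);

        if (accepted) {
            throw new AssertionError("off-peak account should be rejected at peak time");
        }
        // The tightened expectation: exactly one trip to the repository.
        if (repository.findCalls != 1) {
            throw new AssertionError("expected exactly one look up, got " + repository.findCalls);
        }
        System.out.println("rejected after " + repository.findCalls + " repository call");
    }
}
```

With the count asserted in the test, any production change that reintroduces extra round trips fails immediately rather than surfacing later as a performance problem.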

The Testing Anti-Patterns Drafts Vol. 1 (draft2)

2006-09-16T12:04:00.000+01:00

George and I both consider ourselves fortunate enough to work as members of teams where TDD is practiced by default.

There are two reasons beyond the obvious as to why we find it relatively harder to code in a non TDD approach. Test Driving our code enhances our vision on how to design/model/instrument the application's universe. Also, starting with a test means work begins and ends with coding, not meetings, discussions or modelling our vision in pictures bound to be proven unrealistic when the first coding bottlenecks arise.

This is not to say that discussions or modelling are intrinsically bad things, a good brisk white board discussion can really help – but don't ever forget that this is a relatively abstract activity.

Tests have a much closer relationship with your code that allows you to discover and document the behaviour of your system in a much more detailed manner. In fact it is the close relationship between tests and code that can lead to problems. This relationship must be kept in balance if the process is to be successful.

Production code is the bread winner. This is where your business value lies. However, due to the nature of the TDD process, the production code is also somewhat dim-witted; it doesn't really have a clear vision or motivation in life. The test code is there to explain to the code what is expected of it and to highlight any mistakes the code makes by identifying in a precise manner why the code doesn't quite do the right thing and to point out how to head off in the correct direction. In some respects this could be likened to the relationship between a boxer and his trainer. Ultimately, it is the boxer who has to go into the ring and win the fight, but it is the trainer who gets the boxer into the zone mentally and physically.

Another equally important role of the test code is to document the expected behaviour of the production system. This is fantastic! Suddenly, the technical documentation is no longer a static and dusty tome on a shelf, rather an ever accurate and clear guide to what the code really does. If your code is not fit for purpose, this should be made glaringly obvious in your tests, and by changing your documentation you should be able to change the behaviour of your code to make it do the right thing in the right way.

However, we don't live in a perfect world and testing properly is not easy, despite the presence of opposable thumbs and shiny new Macs. Once you get past the simplistic examples and into the real world you find that it is difficult to write effective tests, and often the tests are much harder to craft than the actual code. In many ways, this should not come as a surprise. When you write a test, you are attempting to satisfy a number of aspects of the system: quality, design, documentation. And all of this expressed through the medium of a software language!

So, what makes our tests go bad? There seem to be a number of patterns starting to emerge, but to list them all here is going to take too long – I think George and I are going to have to break this out (hopefully with lots of assistance from our friends ;-)). In my mind the coarse categories of TDD anti patterns are:

Driving the code the wrong way.
Hiding deficiencies in the code.
General test smells

Test Driving The Wrong Way

This is the situation where your code starts to bend towards the test code to meet requirements of the testing framework rather than to meet the needs of the application. Examples of this include:

Adding calls to your production data layer code to allow setup/teardown/query operations which are not required by the application.

Providing access to properties which should not be made visible normally (this is what George is getting at with Design Pervasive Testing)

Forcing the use of IOC where in fact it makes more sense to create objects internally or access static methods.

Clearly there is a sliding scale of smelliness here. Overuse of IOC is not really such a bad thing – the code will still function as required. It just leaves you thinking that there has to be a better way.

Adding a deleteAllClients() method to your production code is a big deal though!

Sometimes, this can be a sign that the test framework that you are using to test your application is not suitable, as switching to a different technology can help get around the smell.

For example - the use of DbUnit may make it easier to put your database into the correct state for a functional test and remove the need for the dodgy setup code in your data layer. Newer mocking libraries like JMockit can remove the absolute requirement for IOC based design patterns by allowing you to hook the creation of new objects and accessing static methods.
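Another way to avoid a test-only deleteAllClients() method is to give every test a fresh fixture, so that there is never any state left over to delete. The sketch below illustrates that idea with a hypothetical in-memory ClientRepository; it is not DbUnit or JMockit, and all of the names in it are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class FreshFixtureExample {
    // Hypothetical production interface: only what the application needs.
    // Note that there is no deleteAll-style method for the tests' benefit.
    interface ClientRepository {
        void add(String clientName);
        List<String> findAll();
    }

    // Hypothetical in-memory implementation used by the tests.
    static class InMemoryClientRepository implements ClientRepository {
        private final List<String> clients = new ArrayList<>();

        public void add(String clientName) { clients.add(clientName); }

        public List<String> findAll() { return new ArrayList<>(clients); }
    }

    // Test-side helper: each test asks for its own empty repository,
    // so cleanup never has to live in production code.
    static ClientRepository freshRepository() {
        return new InMemoryClientRepository();
    }

    public static void main(String[] args) {
        ClientRepository first = freshRepository();
        first.add("bob");

        ClientRepository second = freshRepository();
        if (!second.findAll().isEmpty()) {
            throw new AssertionError("a fresh repository should start empty");
        }
        System.out.println("first=" + first.findAll() + " second=" + second.findAll());
    }
}
```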

Hiding deficiencies in the code

This is where we seem to be suffering the most. It is very easy to hide problems in the code base instead of actually driving them out with your tests. Examples of this phenomenon:

Hidden behaviour. The production code performs a variety of complex activities in a given situation. However this is not clear because the test has buried the behaviour deep in a number of (often obscurely named) helper functions. This is often a sign that you have too many complex relationships between your objects, or that the conversations the component has with its collaborators are overly chatty.

Supporting actors stealing the show. The test is so polluted with setup code that you can't actually make out what the test is trying to show you. This can often be a sign that the step is too complex.

General Test Smells

Badly named tests. This is especially bad when you consider the need to document the code through the medium of tests.

Inappropriate use of stubs. For example, stubbing a simple data type.

Etc.
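As a small illustration of the stub smell: when a collaborator is a simple data type, the real object is usually cheaper and clearer than a stub. The sketch below assumes a hypothetical ConnectionRequest value class and a hypothetical describe method standing in for the code under test; neither is taken from a real code base.

```java
public class RealValueObjectExample {
    // Hypothetical simple data type: two fields and two getters.
    static final class ConnectionRequest {
        private final String clientName;
        private final String clientId;

        ConnectionRequest(String clientName, String clientId) {
            this.clientName = clientName;
            this.clientId = clientId;
        }

        String getClientName() { return clientName; }
        String getClientId() { return clientId; }
    }

    // Hypothetical code under test.
    static String describe(ConnectionRequest request) {
        return request.getClientName() + "/" + request.getClientId();
    }

    public static void main(String[] args) {
        // No stub and no expectations on getters: just build the real thing.
        ConnectionRequest request = new ConnectionRequest("bob", "12345");

        if (!"bob/12345".equals(describe(request))) {
            throw new AssertionError("unexpected description: " + describe(request));
        }
        System.out.println(describe(request));
    }
}
```

Stubbing the two getters would add expectations to maintain without isolating the test from anything that could realistically fail.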

In summary, I think we really need to care about the quality of our test code and learn to treat it quite differently to the production code, realising that the two portions of our code base serve different purposes in our development activities.

So...over to George to fill in some more blanks ;-)

“The Testing Anti-Patterns Drafts” is a collaborative effort between George “spring is overrated” Malamidis and myself which aims to identify cases of testing gone bad. It consists of a single document that will undergo constant enhancements and modifications, in a “pair-authoring” manner, utilising our respective weblogs as the platform. We hope to get input from anyone following the document, our goal being to produce an interesting resource for the TDD community, or for the testing-oriented community in general.

Son, it's time to be a geek...

2006-09-10T16:16:00.000+01:00

This weekend a shiny new desktop arrived at the Caborn household. After many long months of procrastination, I finally bought a PC for my kids to use. The previous family PC had met an untimely end due to a spilt glass of orange juice about 6 months ago.

While the kids were at school I spent a nice geeky morning setting it up for them and now Heather is happily playing preschooler games and Skyping to my laptop downstairs. Fantastic.

My son wants to be a computer geek like his dad when he grows up. Well, not exactly like me; he plans to drive a DB7 and be a rock star Mondays and Fridays. So, if he's going to lead such a cool life, he needs to learn to program. I'd like to do this with him, in the same way that my dad tried to teach me how to be an engineer by getting out the Meccano set. PCs are easy to get to grips with and Phillip already understands how to put together a logical argument.

At the age of six I think he may be ready to start to learn to program. The only question is: what language should we start with? I posed this question to a number of fellow ThoughtWorkers and, unsurprisingly, they came back with a variety of answers!

Lisp - "You should teach the guy a proper language and Lisp has everything in it"
Logo - "There's nothing more rewarding than drawing a picture with a turtle and it made me what I am today!"
Flash - "Flash is easy to use and there's instant gratification - it is very visual"
Lego Mind Storms - "It's just so cool".
Java - Java is easy to learn and I'm familiar with it.
Javascript - I can't remember why this was a good idea.

So. I'm not certain, but I think I may go with Flash. I like the visual aspects of Flash and most of the games and silly things the kids love on the net are made with Flash. Wouldn't it be great to write a Flash game together including scanned artwork provided by Heather (age 4)....

Time to learn to do stuff with Flash. I think I've got about a week before Phillip finishes his latest PS2 game and remembers that dad promised to teach him to code. Should be plenty of time to learn to do something cool in Flash!

It's tiddly, it's a wiki...

2006-08-01T23:09:00.000+01:00

It's TiddlyWiki!

I'm playing with this at the minute as a replacement for the myriad of little emails, .txt files and other crud that I scatter across my laptop. It seems very nice and extremely user friendly.

The appeal for me at the moment is that I can shove the whole lot onto a USB stick and keep it with me wherever I am.

On graffiti and broken windows

2006-07-10T13:49:00.001+01:00

In his book The Tipping Point, Malcolm Gladwell describes how graffiti and broken windows can have a dramatic effect on the behaviour of the residents in a city. For those of you who have not read The Tipping Point (and I strongly recommend you read the book), the key premises go something like this:

Social change does not occur in the smooth linear way which many people imagine.

Often social values will suddenly transition or ‘tip’ from one state into another. Social change such as crime rates and fashion can often behave in a manner which is similar to the spread of diseases.

A tip can often be achieved by the compound effect of relatively benign factors.

Too much graffiti and too many broken windows can tip a neighbourhood from being a good neighbourhood into a crime ridden no-go area. Additionally, the behaviour of key individuals in a social group can make quite bizarre actions (such as suicide) not just acceptable in a group, but also fashionable and desirable.

In order to repair a problem, you need to perpetually guard against seemingly insignificant factors in order to effect change.

When the New York underground decided to try to clean up the crime and vandalism on the tube network, they started with the graffiti. The theory was that by cleaning up the appearance of the trains the feel of the tube network would be improved, people would start to feel more secure and crime would be discouraged. They cleaned up the graffiti in a staged and defensive manner. A small number of trains were designated as ‘clean trains’. These trains were not allowed to slip and become covered in graffiti even though other trains remained ‘dirty’. The number of clean trains was extended in a sustainable manner, at the rate the tube system could cope with, until the problem tipped and became controllable.

There goes the neighbourhood...

Applied to neighbourhoods, the principle is that an area can hit a point at which there are so many broken windows and walls covered in graffiti that it can change people's social values. It suddenly becomes 'OK' to break more windows and deface property, and this can move on to more serious crimes being committed once people get a taste for misdemeanours. At this point the neighbourhood has tipped and will rapidly go downhill.

In order to effect change and tip a bad district back into the light, it is necessary to actively repair broken windows and clean up graffiti, because without improving the environment that people live in there will not be enough social impetus to allow the residents to control and discourage antisocial behaviour.

Back in the world of software…

It is interesting to consider whether the concepts of the tipping point can be applied to software. I believe that they can. Many applications are perceived by their developers and maintainers to contain either 'good' or 'bad' code. Good code is much cherished by the teams who maintain it – it brings joy and happiness to the world. Bad code is a millstone around the necks of the maintenance teams and is painful to maintain. But how do we judge good and bad code?

A number of factors can come into play here, but the main dimensions that I feel developers and QAs tend to use to decide if code is good or bad are perceived design quality, and the number of defects in the code base. Poorly designed or ‘smelly’ code is the graffiti of software, and bugs are our broken windows.

Code Quality and graffiti

Code quality is interesting. There are a number of motivating factors which drive for high quality code. Primarily it comes down to the values of the team. If a team values good quality code, then it will attempt to write production code ‘the right way’. However, in any project there are a number of competing drivers which can hamper the realisation of the quality code.

In my mind the biggest anti-quality drivers are:

The fact that there is a tight deadline.
Members of the team who do not value quality code.
Working in a bad district.

The first 2 of these anti quality drivers are common fare. But what about the bad code district? Despite the desire of a team to write good code, they may well struggle if they find themselves in a bad neighbourhood. If the code contains too many code smells (there is a high level of graffiti and broken windows) you may well find that you are producing more smelly code.

Why? Because the developers have lost hope.

The motivation to write good code and behave is much reduced if every preceding developer has treated the code so badly. It becomes very easy for our developer to scrawl on the walls by writing a piece of smelly code or smash a few windows with the odd poorly handled exception because there are so many examples of this about.

The very same developer would be much more likely to run up some scaffolding or lay down the dustsheets with the odd unit test in a good neighbourhood, before going on to build in that new feature.

However, you can get on top of this and agile techniques can be most effective. For a start – agile methodologies value good code. They also build in checks and balances which look to defend against the failure modes of human beings. This is important. If developers were machine-like, they would not care if they were working in a bad neighbourhood and so there would be relatively little impact on the quality of new code introduced there. However humans are influenced by their surroundings and your process needs to take this into account. A number of XP practices can help here:

TDD – this is essential to defend against the broken windows initially. You must not add to the problems in your run down area by recklessly adding code. Use the techniques outlined in “Working Effectively with Legacy Code” to gradually ‘test infect’ your code base.
Pair programming – two developers have more courage and are more likely to “do the right thing”. Pair programming is a very effective way of cementing the desired values of the team – you can even use this to ‘inject values’ into the team by clever choices of pairs. If you don’t do pair programming – try using design and code reviews to achieve the same effect.
Continuous integration – regular builds maximise the return on investment of the TDD by providing lots of regular feedback. This helps to build momentum, which is essential if you are to tip your code from a run down neighbourhood into an up and coming district.
Code coverage – not strictly an XP thing – but an important guardian of your defended streets – build this into your CI system.
Acceptance tests – write them for the new features and get them into the build. If you have time you could try to retro fit them but I have never seen this work well. Better to add them as you add new features or fix bugs.

So – if you have the values and practices in place, how do you ‘tip’ the code base?

If the team values improving the code base and is supported, then you can make tactical improvements. Don’t try to repair the neighbourhood all at once. Rather identify the houses and streets which you repeatedly visit and isolate them. Fix the broken windows by building out unit tests and functional testing to cover just these areas and defend them against the rest of the neighbourhood. Once you have a module under test – defend it with code coverage and automated tests to allow you to spot any breakages and patch them up immediately. Do not allow these clean areas to fall. (todo – relate to the tube trains here).

So – we’re all done then…?

Well, not really. No neighbourhood stays clean and tidy without ongoing effort. You need to actively guard against the problems which can drag your neighbourhood back down to the skids:

Maverick coders who churn out vast amounts of poor quality code and are not controlled by their managers or team mates.
Project schedules - prefabs of code are hastily erected to serve a short term need, but are still there 50 years later. This is fine in the short term – but you must plan to repay your design debts in a timely fashion or the surrounding properties in your code will start to suffer.
False values. You will not succeed if your team does not truly believe in the values they are supposed to believe in. This is the biggest killer of all - as people do not perform at their best when their hearts aren’t in the job at hand.
Lack of policing. All too often there is not enough effort put into policing your new district. If you don’t crack down on the perps who break builds and flout the rules with non-TDD coding, then you will suffer from a rising crime rate, more graffiti and more broken windows - "there goes the neighbourhood..."

The inspiration for this entry came from a discussion last Friday in the pub with Jon. He has already blogged about this, but I wasn't going to let that stop me ;-)