Wednesday, April 13, 2005

Testing software with G-clamps

Do you like G-clamps? I do. They’re great when you want to hold pieces of wood together while the glue sets or, even better, for holding a huge piece of plywood to the bench while cutting sweeping curves with a jigsaw.
[Image: G-clamp]
To get to the point, I use G-clamps most often when explaining how I think about testing software. The metaphor is that I take some part of a software system and put a clamp around it. The part of the system being tested is the part inside the clamp. The clamp has two “ends”; at one end I pretend to be a user of this section (clicking buttons or calling methods) and see what happens, while at the other end I provide an environment for the test (put data in the database or simulate the response of an external system). In this kind of testing there are 3 roles:
1. The narrative of the test, or the tester, pretending to be a user of the system.
2. The system being tested.
3. The components not being tested, but which the system being tested (role 2) needs in order to work. The behaviour of these components must be simulated.

Or 1. Stimulate, 2. Operate, 3. Simulate (catchy! - not)
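To make the three roles concrete, here is a minimal sketch in Python. The class names (`PriceConverter`, `StubRateSource`) and the currency scenario are invented for illustration and are not from any real system; the point is only to show where each role sits in a test.

```python
class StubRateSource:
    """Role 3 (simulate): stands in for an external exchange-rate system.

    Hypothetical collaborator; it simply returns canned rates.
    """

    def __init__(self, rates):
        self._rates = rates

    def rate(self, currency):
        return self._rates[currency]


class PriceConverter:
    """Role 2 (operate): the part of the system inside the clamp."""

    def __init__(self, rate_source):
        self._rate_source = rate_source

    def convert(self, amount, currency):
        return round(amount * self._rate_source.rate(currency), 2)


def test_convert_uses_rate_from_source():
    # Bottom end of the clamp: provide the environment for the test.
    rates = StubRateSource({"EUR": 1.5})
    converter = PriceConverter(rates)

    # Top end of the clamp (role 1, stimulate): act as a user of the code.
    result = converter.convert(10, "EUR")

    assert result == 15.0


test_convert_uses_rate_from_source()
```

The test body is the narrative, the converter is the thing being squeezed, and the stub is whatever the clamp rests on.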
These roles map onto a G-clamp as follows:



G-clamps come in lots of different sizes, just as I like to test different-sized pieces of my software system. I try to unit-test every part of the system (with small clamps) and I want to have integration tests that test the whole system at once (with a huge clamp fastened around the whole system).




G-clamps also have some handy properties:
They are mobile, suggesting that you can test lots of different parts of the system with the same tool.
They have an adjustable screw, reminding me that I can choose the thickness of the section to test.

TDD helps with many aspects of design. One of these is understanding how each component depends on those around it. Although it’s not always possible, it’s desirable for the dependencies to form an acyclic graph, i.e. it’s a good thing to avoid cyclic dependencies. This applies to both large components like tiers, and to tiny components like individual classes. I find it helpful to visualise dependency graphs with the most depended-on things at the bottom of the page. If I draw my components out following this pattern then I find that I always “stimulate” further up the page than where I “simulate”. In other words, I find that all my clamps are vertical.
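As a rough illustration of drawing the graph "most depended-on at the bottom" and checking that it has no cycles, the sketch below uses `graphlib` from the Python standard library (Python 3.9+). The component names are made up, and the helper functions `bottom_up_order` and `has_cycle` are my own assumptions, not part of any tool mentioned in this post.

```python
from graphlib import CycleError, TopologicalSorter


def bottom_up_order(depends_on):
    """List components with the most depended-on ones first (the 'bottom')."""
    return list(TopologicalSorter(depends_on).static_order())


def has_cycle(depends_on):
    """Return True if the dependency graph contains a cycle."""
    try:
        bottom_up_order(depends_on)
    except CycleError:
        return True
    return False


# Hypothetical three-tier system: each key depends on the names in its set.
layers = {
    "web": {"service"},
    "service": {"repository"},
    "repository": set(),
}

# The same system with an unwanted dependency from the bottom back to the top.
tangled = {
    "web": {"service"},
    "service": {"repository"},
    "repository": {"web"},
}

print(bottom_up_order(layers))
print(has_cycle(layers), has_cycle(tangled))
```

With the acyclic layout, a vertical clamp around `service` would stimulate from the `web` side and simulate on the `repository` side.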


Where do mock objects fit into this picture? I see them as a convenient way of providing the bottom half of the clamp, most suited to a unit testing scenario, where you are only testing one layer of the graph. Are mocks desirable in all unit tests? If you are testing at the bottom of the dependency graph, then you are doing something wrong if you are using mocks, because there is no lower level to simulate. However, my experience is that objects that can be unit tested in isolation like this are very rare.
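A small sketch of both cases, using `unittest.mock.Mock` from the Python standard library. The `FlatTaxRule` and `Invoice` classes are hypothetical examples invented here: the first sits at the bottom of the graph and needs no mock, the second is one layer up and has its lower layer mocked.

```python
from unittest.mock import Mock


class FlatTaxRule:
    """Bottom of the graph: depends on nothing else (hypothetical example)."""

    def __init__(self, percent):
        self._percent = percent

    def tax_on(self, net_pence):
        return net_pence * self._percent // 100


class Invoice:
    """One layer up: needs something with a tax_on() method beneath it."""

    def __init__(self, tax_rule):
        self._tax_rule = tax_rule
        self._net_pence = 0

    def add_line(self, net_pence):
        self._net_pence += net_pence

    def total_pence(self):
        return self._net_pence + self._tax_rule.tax_on(self._net_pence)


def test_flat_tax_rule_needs_no_mock():
    # Nothing lies below this object, so there is nothing to simulate.
    assert FlatTaxRule(20).tax_on(1000) == 200


def test_invoice_with_mocked_tax_rule():
    # The mock is the bottom half of the clamp.
    tax_rule = Mock()
    tax_rule.tax_on.return_value = 30

    invoice = Invoice(tax_rule)
    invoice.add_line(100)
    invoice.add_line(50)

    assert invoice.total_pence() == 180
    tax_rule.tax_on.assert_called_once_with(150)


test_flat_tax_rule_needs_no_mock()
test_invoice_with_mocked_tax_rule()
```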

What type of test is best? (What is the best size of clamp?) Our vocabulary for types of test is poor right now (and some tests probably shouldn’t be called tests at all). I think it’s important to separate two concerns: what you intend to gain from your test, and what the scale of the test is. It’s also important to appreciate that lots of different test scopes are possible. We tend to think of integration tests as horrible big beasts that use all the layers in our stack, while in reality we can write integration tests for just two classes, and at every scale right up to the whole stack. So Alan's quote:

that if you never tested any collaborations with live objects, your rapid feedback test suite had limited use.

is spot on; your rapid feedback suite should include tests of varying scale, some integrating with “real” objects. These don’t need to be the big heavy integration tests that take half an hour to run; you can write something just a shade bigger than a unit test, or whatever you need to find the things that break often.
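As an illustration of a test "just a shade bigger than a unit test", here is a sketch that clamps two real classes together with nothing simulated. `Tokenizer` and `Adder` are toy classes invented for this example.

```python
class Tokenizer:
    """Lower-level collaborator (hypothetical): splits text on whitespace."""

    def tokens(self, text):
        return text.split()


class Adder:
    """Higher-level class that relies on a tokenizer to read its input."""

    def __init__(self, tokenizer):
        self._tokenizer = tokenizer

    def sum_of(self, text):
        return sum(int(token) for token in self._tokenizer.tokens(text))


def test_adder_with_real_tokenizer():
    # No stub or mock: the clamp spans both classes, yet still runs instantly.
    adder = Adder(Tokenizer())

    # Repeated spaces and empty input are the kind of collaboration detail
    # that a hand-written stub tokenizer could easily paper over.
    assert adder.sum_of("1 2  3") == 6
    assert adder.sum_of("") == 0


test_adder_with_real_tokenizer()
```

A test like this stays in the rapid feedback suite because it is as quick as a unit test, but it exercises the real hand-off between the two objects.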

In general, we could test every line of code many times over, by unit testing it and by testing every combination of adjacent components. My proposal is that all code should be unit tested, and integration tests should exist where they help to trap bugs. So the choice of which scales of integration test are best (or which sizes of clamp are best) comes down to a question of which scales will be most efficient at trapping bugs.

1 Comments:

At 12:20 am, Blogger Fabio Pereira said...

Hi, this GClamp metaphor about testing has been very useful for me. I'm doing a presentation at ThoughtWorks this weekend and I'll talk about your post. I'll make sure I mention it was your idea... And add the link to the post... However, all the images are gone :(
Cheers and congrats for the post
www.fabiopereira.me/blog

 
