Seven popular causes of unit test indeterminism

This post was almost in its entirety written in 2010. I found it recently in the Drafts folder and decided to publish it now. Most of the content is still relevant.

Most people want their tests to be deterministic and reproducible. Most people don’t want the added excitement that comes with a unit test being sooo green when run in the IDE and failing so hard on the continuous integration server. In xUnit Test Patterns: Refactoring Test Code, Gerard Meszaros calls a test exhibiting this kind of behaviour an Erratic Test.

Root causes for erratic tests include platform differences (causing tests to exhibit different behaviour when executed on different computers), lack of state isolation (causing build failures only when tests are executed in an unfortunate order) and lack of proper synchronization (causing tests to fail seemingly at random).

I think we can all agree that Erratic Tests are anti-patterns of the first order. So here goes (in no particular order):

Seven popular causes of unit test indeterminism

1. Usage of Set instead of List. The java.util.Set contract doesn’t define any iteration order for its elements, so an order-sensitive check such as new ArrayList<String>(set).equals(Arrays.asList("a", "b")) is not always true, even when set contains exactly "a" and "b". (Set.equals itself ignores order, so comparing one Set with another Set is safe.) Most times the order-sensitive check passes, though. Though far less often on the CI server than on the workstation, in my experience. Advice: only use set collections in production code if it really makes sense (the set-specific properties have a meaning in the domain).
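A minimal sketch of how a test can avoid depending on iteration order. The class and method names (SetOrderExample, loadTags and so on) are made up for illustration; only the java.util calls are real.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SetOrderExample {

    // Hypothetical production method: returns its elements in no particular order.
    static Set<String> loadTags() {
        return new HashSet<String>(Arrays.asList("b", "a", "c"));
    }

    // Order-independent comparison: Set.equals only looks at membership.
    static boolean sameElements(Set<String> actual, String... expected) {
        return actual.equals(new HashSet<String>(Arrays.asList(expected)));
    }

    // If the assertion really needs a sequence, impose an order first.
    static List<String> sorted(Set<String> actual) {
        return new ArrayList<String>(new TreeSet<String>(actual));
    }

    public static void main(String[] args) {
        Set<String> tags = loadTags();
        System.out.println(sameElements(tags, "c", "b", "a"));
        System.out.println(sorted(tags).equals(Arrays.asList("a", "b", "c")));
    }
}
```

Either compare sets with sets, or sort before comparing with a list. Both variants give the same result no matter how the HashSet happens to lay out its buckets.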

2. Line break representation. A rather common scenario in corporate settings is the devs (reluctantly) running Windows workstations while the integration server is hosted on a Linux box. Line breaks are represented differently on these systems (\r\n on Windows, \n on Linux), which can cause problems in both production and test code. System.getProperty("line.separator") to the rescue!
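Two ways of keeping line breaks out of the assertion, as a rough sketch. LineBreakExample, report and normalize are hypothetical names. System.lineSeparator() (available since Java 7) returns the same value as the line.separator property.

```java
class LineBreakExample {

    // Hypothetical production method that uses the platform line separator.
    static String report() {
        return "first" + System.lineSeparator() + "second";
    }

    // Option 1: build the expected value with the same platform separator.
    static String expectedReport() {
        return "first" + System.getProperty("line.separator") + "second";
    }

    // Option 2: normalize Windows (\r\n) and lone \r line breaks to \n
    // before comparing, so the test data can use plain \n everywhere.
    static String normalize(String text) {
        return text.replace("\r\n", "\n").replace('\r', '\n');
    }

    public static void main(String[] args) {
        System.out.println(report().equals(expectedReport()));
        System.out.println(normalize(report()).equals("first\nsecond"));
    }
}
```

Normalizing is often the more convenient choice when the expected text lives in a file checked into version control, since that file may itself be converted on checkout.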

3. Default character set. By default, an InputStreamReader decodes the bytes of the InputStream it wraps using the platform default character set. The default character set varies wildly from operating system to operating system (and locale to locale). If your test data includes high bit characters (i.e. characters not contained within the 2^7 symbols of the original ASCII spec), as it should, you had better be specific about the encoding of your InputStream. Advice: always specify the character set explicitly when working with character streams, e.g. new InputStreamReader(is, Charset.forName("UTF-8")).
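A small sketch of reading a stream with an explicit encoding. CharsetExample and readUtf8 are illustrative names. StandardCharsets.UTF_8 (Java 7 and later) is equivalent to Charset.forName("UTF-8").

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class CharsetExample {

    // Reads the whole stream as text, always decoding it as UTF-8
    // regardless of the platform default character set.
    static String readUtf8(InputStream is) {
        try (Reader reader = new InputStreamReader(is, StandardCharsets.UTF_8)) {
            StringBuilder text = new StringBuilder();
            char[] buffer = new char[1024];
            int read;
            while ((read = reader.read(buffer)) != -1) {
                text.append(buffer, 0, read);
            }
            return text.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Test data with non-ASCII characters, written as Unicode escapes
        // so the source file encoding does not matter either.
        String original = "sm\u00f6rg\u00e5sbord";
        byte[] bytes = original.getBytes(StandardCharsets.UTF_8);
        System.out.println(readUtf8(new ByteArrayInputStream(bytes)).equals(original));
    }
}
```

Note that the same rule applies on the writing side: String.getBytes() without an argument also falls back to the platform default.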

4. Case insensitive file names. Another platform dependent favourite: Windows is perfectly happy to new FileInputStream("ListOfAccountsToProcess.data") even though the file happens to have been christened ListOfaccountsToProcess.data. When the code is moved to a Unix environment, the lower case a in the file name will suddenly start to matter.

5. Static state is something best avoided like the plague. It really kills test isolation, requiring painstaking (and easily broken) manual clean-up code to be inserted everywhere. Even if you have done your best to avoid static data, you sometimes have to work with Java system properties. These are inherently static: setting a property can potentially influence any test run thereafter.
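One way to make the clean-up less easy to forget is to tie it to a try-with-resources block. The sketch below is only one possible approach. SystemPropertyGuard and the example.mode property are invented for the example.

```java
class SystemPropertyGuard implements AutoCloseable {

    private final String key;
    private final String previous; // null if the property was not set before

    // Remembers the current value and then overrides it.
    SystemPropertyGuard(String key, String value) {
        this.key = key;
        this.previous = System.getProperty(key);
        System.setProperty(key, value);
    }

    // Restores the property to whatever it was before the guard was created.
    @Override
    public void close() {
        if (previous == null) {
            System.clearProperty(key);
        } else {
            System.setProperty(key, previous);
        }
    }

    public static void main(String[] args) {
        try (SystemPropertyGuard guard = new SystemPropertyGuard("example.mode", "test")) {
            System.out.println(System.getProperty("example.mode"));
        }
        // The override is gone again once the block has been left.
        System.out.println(System.getProperty("example.mode"));
    }
}
```

The restore runs even if the test body throws, which is exactly the case where hand-written clean-up tends to get skipped. It does not help if tests run in parallel within the same JVM, since the property is still global while the guard is open.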

6. Time flies. A common mistake is failing to take account of the most intrinsic property of time: time advances. Now is not the same time as a split second ago. The expression new LocalDate().isBefore(new LocalDate(2013,12,24)), while perfectly true as of this writing, will not be particularly true after this Christmas. I prefer to trigger CI builds not only after commits but also periodically, among other reasons to expose these kinds of deficiencies as early as possible. When testing time dependent logic, be careful about hard coding dates in the test data and prefer to work with dates relative to now. Or, even better, ensure that all access to the current time in application code goes through an easily mocked interface.
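A sketch of the "easily mocked interface" idea. The expression above uses Joda-Time. The example below swaps in java.time.Clock from Java 8 instead, which plays the same role, and ChristmasCountdown is a made-up class.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;

class ChristmasCountdown {

    private final Clock clock;

    // Production code would pass Clock.systemDefaultZone(),
    // tests pass a fixed clock.
    ChristmasCountdown(Clock clock) {
        this.clock = clock;
    }

    // All access to "now" goes through the injected clock.
    boolean isBeforeChristmasEve2013() {
        return LocalDate.now(clock).isBefore(LocalDate.of(2013, 12, 24));
    }

    public static void main(String[] args) {
        Clock november2013 = Clock.fixed(Instant.parse("2013-11-01T00:00:00Z"), ZoneOffset.UTC);
        Clock january2014 = Clock.fixed(Instant.parse("2014-01-01T00:00:00Z"), ZoneOffset.UTC);
        System.out.println(new ChristmasCountdown(november2013).isBeforeChristmasEve2013());
        System.out.println(new ChristmasCountdown(january2014).isBeforeChristmasEve2013());
    }
}
```

With the clock pinned in the test, the result no longer depends on when the build happens to run.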

7. Database integration tests. Programmatic integration tests are controversial. In a perfect world we might not need these slow and fragile misfits, but for now few applications can do without them, particularly with regard to persistence. One of the many problems with integration tests is the question of how the test fixture is managed. In unit tests, preconditions are typically set up in memory within the scope of each test, so each test is guaranteed a fresh, pristine test harness. This approach is typically not feasible for database integration tests: even with an in-memory database, dropping, creating and populating the database with test data for every test comes at a prohibitive cost. Other options, such as manually resetting the database to its original state after each test or writing test methods that share a common fixture, are laborious and unattractive and can lead to poor defect isolation.

The venerable Spring Framework provides a test runner that promises an alluring solution. By virtue of a @Transactional annotation, the framework starts a fresh transaction before executing the test method and rolls it back once the test method has finished, thus guaranteeing perfect test fixture isolation even though the database is reused.

The only problem is that many features of the typical ORM stack are not properly exercised if the transaction is never committed! In fact, the JPA PersistenceContext/Hibernate session typically does not flush to the database until the transaction commits. Additionally, database constraints are typically deferred until commit time.
