Wednesday November 21, 2007 Robert Donaway
I've recently been working on Java code that calls a couple of web services to update another system when the data in our system changes. I used a tool to generate the client code that actually calls the web service. The existing code in the system was not written by me, but I think it’s nicely designed and separated into the usual layers of data access, service and view. So when I began I didn't have a whole lot of freedom in how I was going to implement the new feature -- basically I had to create service layer objects that use other services to get data from our system, insert this into the web service client object model and then invoke the service.
Originally I had intended to use Test Driven Development (TDD) to implement this but quickly found this very difficult. My first problem was that I had no clue what methods or method signatures would be in these service objects or even what a successful test would be until I played around with the existing APIs. So I decided to abandon TDD and just started implementing things little by little and refactored frequently. In less than two days I had working code that I was pretty happy with and I was anxious to try it with some test data against the web service. I spent another day getting some meaningful data loaded into my test database and by the end of the third day I had successfully tested the process from end to end.
Well, I was quite pleased with myself and was about to check in all that nice code I wrote and then remembered -- #$%^@, I haven't written any unit tests! Of course, I was tempted to just check the code in anyway, but this is a system that's used by a lot of people, and has a reputation for being bug-free (thanks to my Near Infinity colleagues!). Also, the other developers on the project notice when our code coverage percentage goes down. I didn't want my reputation to suffer, so I thought, I'll be a hero and make sure that all this code I've written has good unit tests with 100% code coverage. I also thought that this will probably take about a day. After all, I'm not a testing newbie. I've been writing JUnit tests daily for over six years now and I'm quite familiar with the tools used in the project, such as DBUnit and HSQL. Right. Well, seven work days later I gave up writing unit tests. I hadn't found a single new bug, but had achieved over 90% test coverage of my new code, which was higher than our current average, so I just said "screw it" and checked the code in.
By the end of this ordeal, I was so angry and resentful I was about ready to strangle someone. And I thought, is there something wrong with me? Am I just whinging? Why did it take so long to write the tests? Am I a moron? Am I going senile? Is it wrong to have a 4:1 ratio of time spent on test to time spent on source? Is it because I didn't use TDD? Is this really the best use of my customer's time? Should I quit software development and become a dog walker? Should I commit suicide? Well, rather than choosing the last option, I began searching the internet for some answers. Below I've included links and brief summaries of some of the articles I read.
After reading and thinking, thinking and reading, I've come up with a new philosophy regarding testing. I'm certainly not saying this is right for everyone, or for languages other than ones like Java and C#, or even for anyone other than me -- just something to consider.
First, software testing is important, and automated tests can help you discover bugs as soon as they crop up, including bugs introduced by future refactoring and adding new features. However, more automated tests mean more code in your system, which makes the system more complex and harder to maintain. Thinking back over my career, the time I've spent maintaining and fixing unit tests doesn't really pay for the relatively few bugs they've uncovered. In fact, when I have refactored code, many of the bugs are introduced by the unit tests themselves, not the source code. Hence, like most issues in life, good testing requires a balance between two extremes, trying to minimize risk without incurring excessive costs. After all, if you bought a $100 product and the store offered a guarantee that it would never break or fail for any reason, would you pay an additional $400 for that guarantee? Or even an additional $100? Probably not, and your customers wouldn’t either.
Insisting on 100% code coverage is one of the extremes to avoid. It's almost as harmful as the other extreme of not writing any tests. Moreover, code coverage percentages and other metrics such as C.R.A.P. and Cyclomatic Complexity are just tools to give engineers greater insight into parts of their code and its tests. They help developers achieve the balance between the extremes, writing tests that are worthwhile and most likely to uncover bugs in the future. However, these metrics are virtually meaningless when applied to an entire system. For one class, 100% coverage might be appropriate, for another even 20% might be too much. The bottom line is that the individual developer, along with anyone reviewing the code, is really the only one that can make this determination.
Well, rage on against my opinions if you must. But at least take a look at some of the links below with an open mind. In the meantime I'll try to read more about TDD to see if, in retrospect, I could have used this methodology.
Is Complete Test Coverage Desirable - or Even Attainable? discusses testability from a theoretical and practical approach.
100% Test Coverage? tries to answer the question by referring to a number of other articles. In a comment, Cedric Beust points to one of his blog posts, saying "Trying to achieve 100% test coverage is not just silly, it's dangerous."
Testing Web Apps Effectively with twill is geared towards Python developers, but has a lot of interesting comments, such as "I think that the art of testing is not in how to test but rather in what to test. The best advice and answer to 'How do I test a web application?' is probably 'Make a priority list of the things you would like to test, and test as little as possible'."
Motley says: "Test both private and public methods" presents a fictional conversation between Motley, who initially insists on 100% test coverage, and Maven, who gradually helps him to conclude "I should just test public methods and code coverage will just come if I use TDD. ... getting to that magical 100% code coverage number is often not worth the extra effort, and although code coverage is a good measure of your tests, the analysis is more important."
Can You Have Too Many Unit Tests? is a bit dated, but I think still worth reading.
How do you achieve 100% test coverage? suggests deleting code that is difficult to test. I'm not sure about that one, but there are quite a few interesting comments that discuss TDD and some of the issues I am raising.
There are some fairly radical points of view also, such as Wil Shipley's Unit testing is teh suck, Urr. tempered by bbum's Unit Testing.
In his blog post on unit testing, a response to Wil Shipley, Michael Tsai says "My overall point is that time is limited, so you should use it wisely. And this is why extensive unit testing is a big win. Yes, it’s not possible for your tests to cover all the pathways through the code, with all the possible inputs. And even if they could, it probably wouldn’t be a good idea to spend your time writing tests to do that."
I have used Hibernate on several projects that required that users be able to enter rich text comments for many of the major domain objects.
In each case there needed to be a one-to-many relationship between each entity and its comments, so that different users could comment on an entity
over time. To store these in the database, my DBAs have requested a universal COMMENTS table that would hold comments for any entity
in a CLOB column. Among other benefits, this allows for easy searching over the entire set of comments.
Here is the (Oracle) DDL for a situation like this, where two domain objects, Person and Vehicle, stored in the
PERSON and VEHICLE tables, store their comments in the COMMENTS table:
CREATE TABLE COMMENTS
( COMMENT_ID NUMERIC(12) NOT NULL
, CREATED_ON DATE NOT NULL
, CREATED_BY VARCHAR(50) NOT NULL
, COMMENT_TYPE CHAR(1) NOT NULL
, PERSON_ID NUMERIC(12)
, VEHICLE_ID NUMERIC(12)
, COMMENT_TEXT CLOB NOT NULL
, PRIMARY KEY (COMMENT_ID)
, FOREIGN KEY (PERSON_ID) REFERENCES PERSON(PERSON_ID)
, FOREIGN KEY (VEHICLE_ID) REFERENCES VEHICLE(VEHICLE_ID)
, CHECK (COMMENT_TYPE IN ('P', 'V'))
, CHECK ((PERSON_ID IS NULL AND VEHICLE_ID IS NOT NULL) OR (PERSON_ID IS NOT NULL AND VEHICLE_ID IS NULL))
);
Note that this table has a single synthetic primary key, which is generated sequentially as records are inserted. Also, I chose to have separate
columns for the PERSON and VEHICLE keys so that I can enforce integrity with a foreign key constraint. The check constraint
ensures that exactly one of these is not null. The COMMENT_TYPE column will be 'P' if the comment is about a Person and 'V'
if the comment is about a Vehicle.
COMMENT_TYPE can be inferred from the data in the PERSON_ID
and VEHICLE_ID columns, but it will come in handy, as we will see later.
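The rule behind the two check constraints can also be written down in Java. This is only a minimal sketch: the CommentTypes class and its inferType method are hypothetical names used for illustration and are not part of the application described here. It shows how the COMMENT_TYPE value follows from which of the two key columns is populated.

```java
// Hypothetical helper, for illustration only.
class CommentTypes {

    /**
     * Derives the COMMENT_TYPE value from the two foreign keys, applying the
     * same "exactly one is not null" rule as the table's check constraint.
     */
    static char inferType(Long personId, Long vehicleId) {
        // Both null or both non-null violates the constraint.
        if ((personId == null) == (vehicleId == null)) {
            throw new IllegalArgumentException(
                "Exactly one of personId and vehicleId must be set");
        }
        return personId != null ? 'P' : 'V';
    }
}
```

For example, CommentTypes.inferType(42L, null) yields 'P', while passing two nulls or two values is rejected, just as the database would reject such a row.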
In our Java application, both the Person and Vehicle objects have a collection of Comment objects.
However, a Comment object should also have a reference back to its domain object, so that we can navigate to it easily.
One way to model this type of relationship is to create specific PersonComment and VehicleComment classes, which extend the
generic Comment class. Here is some sample code, leaving out the accessor methods and other details:
public class Person {
private Long personId;
...
private Set comments;
...
}
public class Vehicle {
private Long vehicleId;
...
private Set comments;
...
}
public abstract class Comment {
private Long commentId;
private Date createdOn;
private String createdBy;
private String commentText;
...
}
public class PersonComment extends Comment {
private Person person;
...
}
public class VehicleComment extends Comment {
private Vehicle vehicle;
...
}
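Because each comment refers back to its domain object, both ends of the relationship have to be kept in step when a comment is added. The following is a rough sketch of one way to do that, using cut-down versions of the classes above. The addComment method is my own hypothetical convenience method, and the generic Set type assumes Java 5 or later.

```java
import java.util.HashSet;
import java.util.Set;

// Simplified stand-ins for the classes above; only the fields needed here are shown.
abstract class Comment {
    private String commentText;

    String getCommentText() { return commentText; }
    void setCommentText(String commentText) { this.commentText = commentText; }
}

class PersonComment extends Comment {
    private Person person;

    Person getPerson() { return person; }
    void setPerson(Person person) { this.person = person; }
}

class Person {
    private final Set<PersonComment> comments = new HashSet<PersonComment>();

    Set<PersonComment> getComments() { return comments; }

    // Hypothetical convenience method: sets both ends of the association together,
    // so the comment knows its Person and the Person's collection contains the comment.
    void addComment(PersonComment comment) {
        comment.setPerson(this);
        comments.add(comment);
    }
}
```

A Vehicle and VehicleComment pair would follow the same pattern.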
So how do we map these classes using Hibernate? Since all instances of a Comment will be stored in the COMMENTS table,
we use the "Table per class hierarchy" mapping strategy. The COMMENT_TYPE column will be the discriminator that tells
Hibernate what concrete type to construct when querying for Comment objects. Here is the main part of the mapping file, Comment.hbm.xml,
that maps the PersonComment and VehicleComment classes to the COMMENTS table:
<hibernate-mapping>
<class name="Comment" table="COMMENTS">
<id name="commentId" column="COMMENT_ID"> ... </id>
<discriminator column="COMMENT_TYPE"/>
<property name="createdOn" column="CREATED_ON"/>
<property name="createdBy" column="CREATED_BY"/>
<property name="commentText" column="COMMENT_TEXT" type="text"/>
<subclass name="PersonComment" discriminator-value="P">
<many-to-one name="person" class="Person" not-null="true">
<column name="PERSON_ID"/>
</many-to-one>
</subclass>
<subclass name="VehicleComment" discriminator-value="V">
<many-to-one name="vehicle" class="Vehicle" not-null="true">
<column name="VEHICLE_ID"/>
</many-to-one>
</subclass>
</class>
</hibernate-mapping>
Notice that there is no Java property that maps to the discriminator column. Once we specify the discriminator value in each subclass tag, we can forget
about it, since Hibernate takes care of populating it in the database. The collection mappings for the Person and Vehicle
classes are straightforward. Here is the one for Person:
<set name="comments" inverse="true">
<key column="PERSON_ID"/>
<one-to-many class="PersonComment"/>
</set>
One very nice feature of this mapping strategy is the way queries work. If you are using HQL, the query "from Comment where ..." will
return a list of Comment objects of varying concrete types, either PersonComment or VehicleComment depending
on the value of the discriminator column in the corresponding row. When iterating through this list, you can use the instanceof keyword
to determine whether the object has a getPerson() or getVehicle() method. This will allow you to get at the details of the
entity the comment is about.
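Here is a small sketch of that instanceof check. It uses simplified stand-ins for the mapped classes so that it runs without Hibernate. The name and plate fields and the CommentDescriber class are hypothetical and exist only for this example. In a real application the list would come from a Hibernate query rather than being built by hand.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for the mapped classes; fields are illustrative only.
class Person {
    private final String name;
    Person(String name) { this.name = name; }
    String getName() { return name; }
}

class Vehicle {
    private final String plate;
    Vehicle(String plate) { this.plate = plate; }
    String getPlate() { return plate; }
}

abstract class Comment { }

class PersonComment extends Comment {
    private final Person person;
    PersonComment(Person person) { this.person = person; }
    Person getPerson() { return person; }
}

class VehicleComment extends Comment {
    private final Vehicle vehicle;
    VehicleComment(Vehicle vehicle) { this.vehicle = vehicle; }
    Vehicle getVehicle() { return vehicle; }
}

class CommentDescriber {
    // Uses the concrete type of the comment to reach the entity it is about.
    static String describe(Comment comment) {
        if (comment instanceof PersonComment) {
            return "Person: " + ((PersonComment) comment).getPerson().getName();
        } else if (comment instanceof VehicleComment) {
            return "Vehicle: " + ((VehicleComment) comment).getVehicle().getPlate();
        }
        throw new IllegalArgumentException("Unknown comment type: " + comment.getClass());
    }

    // Describes every comment in a mixed list, in order.
    static List<String> describeAll(List<Comment> comments) {
        List<String> result = new ArrayList<String>();
        for (Comment comment : comments) {
            result.add(describe(comment));
        }
        return result;
    }
}
```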
A while back I tried this mapping strategy using Hibernate 2, but it didn't give me predictable results. With Hibernate 3 it works well.

