Mapping a universal comment table using Hibernate 3.0

| | Comments (8) | TrackBacks (0)

I have used Hibernate on several projects that required the users be able to enter rich text comments for many of the major domain objects. In each case there needed to be a one-to-many relationship between each entity and comments, so that different users could comment on an entity over time. To store these in the database, my DBA's have requested a universal COMMENT table that would hold comments for any entity in a CLOB column. Among other benefits, this allows for easy searching over the entire set of comments.

Here is the (Oracle) DDL for a situation like this, where two domain objects, Person and Vehicle, stored in the PERSON and VEHICLE tables, store their comments in the COMMENTS table:

CREATE TABLE COMMENTS
( COMMENT_ID   NUMERIC(12) NOT NULL
, CREATED_ON   DATE        NOT NULL
, CREATED_BY   VARCHAR(50) NOT NULL
, COMMENT_TYPE CHAR(1)     NOT NULL
, PERSON_ID    NUMERIC(12)
, VEHICLE_ID   NUMERIC(12)
, COMMENT_TEXT CLOB        NOT NULL
, PRIMARY KEY (COMMENT_ID)
, FOREIGN KEY (PERSON_ID)  REFERENCES PERSON(PERSON_ID)
, FOREIGN KEY (VEHICLE_ID) REFERENCES VEHICLE(VEHICLE_ID)
, CHECK (COMMENT TYPE IN ('P', 'V'))
, CHECK ((PERSON_ID IS NULL AND VEHICLE_ID IS NOT NULL) OR (PERSON_ID IS NOT NULL AND VEHICLE_ID IS NULL))
);

Note that this table has a single synthetic primary key, which is generated sequentially as records are inserted. Also, I chose to have separate columns for the PERSON and VEHICLE keys so that I can enforce integrity with a foreign key constraint. The check constraint ensures that exactly one of these is not null. The COMMENT_TYPE column will be 'P' if the comment is about a Person and 'V' if the comment is about a Vehicle. Of course, the COMMENT_TYPE can be inferred from the data in the PERSON_ID and VEHICLE_ID columns, but it will come in handy, as we will see later.

In our Java application, both the Person and Vehicle objects have a collection of Comment objects. However, a Comment object should also have a reference back to its domain object, so that we can navigate to it easily. One way to model this type of relationship is to create specific PersonComment and VehicleComment classes, which extend the generic Comment class. Here is some sample code, leaving out the accessor methods and other details:

public class Person {
    private Long personId;
    ...
    private Set<PersonComment> comments;
    ...
}

public class Vehicle {
    private Long vehicleId;
    ...
    private Set<VehicleComment> comments;
    ...
}

public abstract class Comment {
    private Long commentId;
    private Date createdOn;
    private String createdBy;
    private String commentText;
    ...
}

public class PersonComment extends Comment {
    private Person person;
    ...
}

public class VehicleComment extends Comment {
    private Vehicle vehicle;
    ...
}

So how do we map these classes using Hibernate? Since all instances of a Comment will be stored in the COMMENTS table, we use the "Table per class hierarchy" mapping strategy. The COMMENT_TYPE column will be the discriminator that tells Hibernate what concrete type to construct when querying for Comment objects. Here is the main part of the mapping file, Comment.hbm.xml, that maps the PersonComment and VehicleComment classes to the COMMENTS table:

<hibernate-mapping>
<class name="Comment" table="COMMENTS">
    <id name="commentId" column="COMMENT_ID"> ... </id>
    <discriminator column="COMMENT_TYPE"/>
    <property name="createdOn" column="CREATED_ON"/>
    <property name="createdBy" column="CREATED_BY"/>
    <property name="commentText" column="COMMENT_TEXT" type="clob"/>
    
    <subclass name="PersonComment" discriminator-value="P">
        <many-to-one name="person" class="Person" not-null="true">
            <column name="PERSON_ID"/>
        </many-to-one>
    </subclass>
    
    <subclass name="VehicleComment" discriminator-value="V">
        <many-to-one name="vehicle" class="Vehicle" not-null="true">
            <column name="VEHICLE_ID"/>
        </many-to-one>
    </subclass>
</class>
</hibernate-mapping>

Notice that there is no Java property that maps to the discriminator column. Once we specify it in the subclass tag, we can forget about it since Hibernate takes care of populating it in the database. The collection mappings for the Person and Vehicle classes are straightforward. Here is the one for Person:

    <set name="comments" inverse="true">
        <key><column="PERSON_ID"/></key>
        <one-to-many class="PersonComment"/>
    </set>

One very nice feature of this mapping strategy is the way queries work. If you are using HQL, the query "from Comment where ..." will return a list of Comment objects of varying concrete types, either PersonComment or VehicleComment depending on the value of the discriminator column in the corresponding row. When iterating through this list, you can use the instanceof keyword to determine whether the object has a getPerson() or getVehicle() method. This will allow you to get at the details of the entity the comment is about.

A while back I tried this mapping strategy using Hibernate 2, but it didn't give me predictable results. With Hibernate 3 it works well.

8 Comments

Excellent post Bob. It's wonderful to see 1. the use of surrogate keys and 2. not overloading one column as a foreign key to multiple tables.

I realize you menioned COMMENT_TYPE can be inferred, but the duplication bothers me and I can't help thinking what happens if this gets out of sync with the PERSON_ID and VEHICLE_ID foreign key's. Surely there is some way to set the discriminator column as PERSON_ID and the discriminator-value as "NOT NULL" or something like that?

These days people would be insane to use an ORM tool if they cannot use surrogate keys, e.g. if their DBAs are, shall we say, not very accomodating -- i.e. the ones who have never actually had to build an application on top of their pretty, academic, 4th and 5th normal form data structures. Second, there is no way I know of in Hibernate to use the discriminator column in the way you are suggesting. ORM tools are not designed in general to understand multiple foreign keys within a class hierarchy, particularly in the table-per-class-hierarchy strategy. Besides, you should use unit tests that prove your code works properly and that it sets the foreign key and discriminator correctly as in this example. Of course there is always the chance it could happen, but usually is not worth worrying about versus the effort. Also, though this structure works, I actually do not prefer having all those foreign keys in the same table, and if I didn't need bi-directional navigation I might simply use unidirectional one-to-many with join tables. That way there is no "foreign-key pollution" in the target table, in this case the COMMENT table. Now this does create more tables in the database but there is always a tradeoff. Also, while I agree that using a single "overloaded" foreign key is not ideal, sometimes it is very convenient. One application I work on uses it and uses a Hibernate "many-to-any" mapping type which is quite obscure and has its own warts but in Hibernate 2.x as Bob mentioned the mapping described didn't work for me (granted it was two years ago).

I could have prevented Lee's concern by writing a better check constraint:

CHECK ((PERSON_ID IS NULL AND VEHICLE_ID IS NOT NULL AND COMMENT_TYPE = 'V') OR (PERSON_ID IS NOT NULL AND VEHICLE_ID IS NULL AND COMMENT_TYPE = 'P'))

Good catch!

I'm curious to find out if this strategy would work if we merged the PERSON_ID and VEHICLE_ID columns into an ENTITY_ID column. That would get rid of all of the nulls in the table, but we couldn't enforce foreign key constraints. I bet Hibernate could handle this, especially since you would map ENTITY_ID only in the two subclasses.

Yes, that check constraint does it, I'm happy now.

Regarding minimizing null foreign keys (as Bob mentioned) it might be better to subclass comment types than merge foreign keys and lose referential integrity. I think this is also what Scott was saying (please correct me if I'm wrong):

CREATE TABLE VEHICLE_COMMENTS
( COMMENT_ID NUMERIC(12) NOT NULL
, VEHICLE_ID NUMERIC(12) NOT NULL
, PRIMARY KEY (COMMENT_ID)
, FOREIGN KEY (COMMENT_ID) REFERENCES COMMENTS(COMMENT_ID)
, FOREIGN KEY (VEHICLE_ID) REFERENCES VEHICLE(VEHICLE_ID)
);

Then since VEHICLE_COMMENTS.COMMENT_ID is both a primary key and a foreign key you end up with a one-to-one relationship and no more null foreign keys and you still get referential integrity. You'd need to add a similar PERSON_COMMENT too of course.

I suppose this way you also end up with one Hybernate class for each table. Would Hybernate handle an approach like this?

Hibernate can handle primary key associations and it can handle one to many associations using an intermediate join table. In addition, both of these are transparent to your domain model so you would *not* need a separate domain model class for all the join tables, which is a plus.

And this is the type of design I was talking about, i.e. you'd have no foreign keys in COMMENT and you'd have separate join tables for each type of comment you wanted, e.g. VEHICLE_COMMENT and PERSON_COMMENT. This is probably the "correct" design from a normalization standpoint, but it is still annoying to need all those separate join tables. However, it is at least transparent to your domain model.

The last thing we've been tossing around is whether a generic ENTITY_ID and type discriminator column is good or not. While it might not be properly normalized, it is possible to implement this and still get data integrity using a trigger that does the referential integrity checking in the database. In the application where we used this type of design, we didn't even bother with the trigger, since we own the tables and no one else can write to them. (We have unit tests that prove we are always doing the correct thing inserting the data.)

Andrew Avenoso said:

The one benefit I see to using a generic ENTITY_ID and type discriminator column is that you can decide to add comments to a new table without having to modify your DDL.
While this is not a big deal for most projects, if you're on a project where database changes are delayed by weeks to months of procedure, it can be a big boon.

Sarah said:

I recently came across your blog and have been reading along. I thought I would leave my first comment. I don't know what to say except that I have enjoyed reading. Nice blog. I will keep visiting this blog very often.

Sarah

RS said:

How would I do this if I use an intermediate table to make the realize the relationship with Comments.

Instead of having PersonID and VehicleID in Comments, I have an intermediate table called Party with PartyID
Person table has a column PartyId, so does the Vehicle table. The comments table has PartyID too. They all reference Party.PartyId

Leave a comment


Type the characters you see in the picture above.

0 TrackBacks

Listed below are links to blogs that reference this entry: Mapping a universal comment table using Hibernate 3.0.

TrackBack URL for this entry: http://www.nearinfinity.com/mt/mt-tb.cgi/479