Saturday, July 19, 2008

Foreign Keys Are Your Friends

Imagine a database where only some of the natural parent-child relationships are actually represented as foreign key relationships. Some are, some aren't... and now you have to write queries to join these tables.

Say, in MobiLink download_cursor scripts to implement synchronized partitioning among multiple databases, where the partitioning is controlled by data in some other table, not the one being downloaded.

How do you get from one table to another when there's no foreign key path between them? What columns do use in the INNER JOIN ... ON clauses?

Well, it's not easy. You might be tempted to use a NATURAL JOIN to automatically generate ON clauses based on columns with the same name in both tables. In the real world, however, NATURAL JOIN sucks... sometimes columns that should be used have different names, and other times columns that shouldn't be used have the same names.

And so you pick the wrong columns, and get the wrong rows. Or you pick too many columns and get too few rows. Or too few columns, and get too many rows.

With explicit foreign key relationships, a developer stands a very good chance of writing correct, reliable queries involving complex joins by simply inspecting the database schema. Foreign keys are a way for the database designer to communicate with the developer, providing direction: "this is how you get from here to there."

And that's one of the reasons that I say to developers, "foreign keys are your friends". Even if you disregard the safety of referential integrity, even if you don't consider the benefits that foreign key indexes might bring to performance, foreign keys are STILL good things.

Even for us ordinary folk.


Anonymous said...

This friendship seems even more valuable when using the shorthand KEY JOIN syntax:)

That is one of my most favourite SQL Anywhere features as it really increases the usefulness of RI: Not only consistent and self-documented data but also shorter and more comprehensible query code.

The one disadvantage may be that this is a SQL Anywhere only feature AFAIK.


stuffittrans said...

I don't think that a sensible developer can ignore RI and performance benefits that FK give :).