On Primary Key Names

If you use frameworks like Microsoft Azure Mobile Services or Ruby on Rails, then you’re accustomed to complying with a host of development conventions. Frameworks are often said to be opinionated, forcing certain design decisions on the developers who use them. Given that very few software design choices are perfect in every situation, the value of having an opinion is often more about consistency than it is about correctness.

“A foolish consistency is the hobgoblin of little minds, adored by little statesmen and philosophers and divines. With consistency a great soul has simply nothing to do.” — Ralph Waldo Emerson, Essay: First Series on Self-Reliance

Emerson’s famous quote about the potential perils of standardization is sometimes misapplied. For example, I once attended a seminar by a software vendor where the speaker referred to Ruby on Rails developers as hobgoblins because of their unswerving reliance on programming conventions. Yet those who understand the Ruby on Rails framework know that it is loaded with touchstones that lead us to exhibit good and helpful behaviors most of the time. The seminar speaker’s broad derision of the Rails framework was based either on a misunderstanding or on an intent to misdirect the hapless audience for some commercial gain. In his famous essay, Emerson condemns only foolish consistency: the friendly yet troublesome habits that lead to folly are the ones to be avoided.

“One man’s justice is another’s injustice; one man’s beauty another’s ugliness; one man’s wisdom another’s folly.” — Ralph Waldo Emerson

Yet it is true now and again that the conventions expressed in Rails, Azure Mobile Services, Django, CakePHP and many other frameworks can lead to unfortunate consequences, whether through the tools’ misapplication or through circumstances that simply could not be foreseen by the frameworks’ designers. Nowhere is this more true than in the area of data access. A data pattern that the framework designer considers beautiful in many situations may repulse some database administrators in practice. Application frameworks are often developed and sold for so-called greenfield solutions, those not naturally constrained by prior design decisions in the database and elsewhere. However, many real-world implementations of application frameworks are of the brownfield variety, mired in the muck and the gnarled, organic growth of the thousands of messy technical decisions that came before. In environments like that, the framework designer’s attempt to enforce one kind of wisdom may prove to be quite foolish.

Primary keys are one of the concerns that application frameworks tend to be opinionated about. Ruby on Rails, Microsoft Azure Mobile Services and Django all name their surrogate primary keys id by default, short for identifier or identity. In fact, the word identity comes from the Latin idem, meaning the same, which is itself built on id, meaning it or that one. So in these frameworks, when you use the primary key to identify a specific record, you’re sort of saying “I mean that one.”

Database developers and administrators often argue that relational databases aren’t so-called object databases and that naming primary keys the same for all tables leads to confusion and errors in scripting. It’s true that when you read the SQL code in a database that uses id for all the primary key names, it can be a bit confusing. Developers must typically use longer, more meaningful table aliases in their queries to keep them understandable. Ironically, when the application frameworks that desire uniformity in primary key names generate database queries dynamically, they often emit short table aliases or ones that have little or no relationship to the names of the tables they represent. Have you ever tried to analyze a complex query at runtime that has been written by an Object-Relational Mapping (ORM) tool? It can be positively maddening for exactly that reason.
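To show the kind of scripting error database people have in mind, here is a minimal sketch using Python’s standard sqlite3 module. The orders and line_items tables and their rows are invented for illustration, and SQLite stands in for whatever database you actually run. The first query leaves id unqualified and the engine rejects it as ambiguous; the second qualifies every id with its table name and renames the columns on the way out.

```python
import sqlite3

# Hypothetical schema: both tables follow the "every primary key is id" convention.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE line_items (id INTEGER PRIMARY KEY, order_id INTEGER, product TEXT);
    INSERT INTO orders VALUES (1, 'Ada');
    INSERT INTO line_items VALUES (10, 1, 'Widget');
""")

# An unqualified id could belong to either table, so the query is rejected.
try:
    conn.execute("SELECT id FROM orders JOIN line_items ON order_id = orders.id")
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)

# Qualifying each id (and aliasing the output columns) removes the ambiguity.
qualified = conn.execute(
    "SELECT orders.id AS order_id, line_items.id AS line_item_id "
    "FROM orders JOIN line_items ON line_items.order_id = orders.id"
).fetchall()

print(error_message)
print(qualified)
```

The fix is not hard, but it has to be applied in every query that touches more than one table, which is where the longer, more meaningful aliases come from.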

Another problem with having all the primary keys named the same is that it sometimes inhibits other kinds of useful conventions. For example, MySQL supports a highly expressive feature in its JOIN syntax that allows you to write code like this:

SELECT * FROM `Order` INNER JOIN LineItem USING (OrderID);

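In this case, because the Order table’s primary key is named the same as the LineItem table’s foreign key to orders, the USING clause makes it really simple to connect the two tables. (Order is a reserved word in MySQL, so the table name has to be quoted with backticks.) One has to admit that’s a very natural-feeling expression. The aforementioned application frameworks’ fondness for naming all primary keys id rules this practice out, because the key is called id in one table and something like order_id in the other, so there is no shared column name for USING to match.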
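Here is a small runnable sketch of the two naming styles side by side, again using Python’s standard sqlite3 module (SQLite also accepts USING, so it stands in for MySQL here, with double quotes rather than backticks around the reserved word). The schemas, column names and sample rows are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Style 1: the primary key and the foreign key share the name OrderID.
    CREATE TABLE "Order" (OrderID INTEGER PRIMARY KEY, Customer TEXT);
    CREATE TABLE LineItem (
        LineItemID INTEGER PRIMARY KEY,
        OrderID INTEGER REFERENCES "Order"(OrderID),
        Product TEXT
    );
    INSERT INTO "Order" VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO LineItem VALUES (10, 1, 'Widget'), (11, 1, 'Gadget'), (12, 2, 'Gizmo');

    -- Style 2: every primary key is named id, so the foreign key needs a different name.
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE line_items (
        id INTEGER PRIMARY KEY,
        order_id INTEGER REFERENCES orders(id),
        product TEXT
    );
    INSERT INTO orders VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO line_items VALUES (10, 1, 'Widget'), (11, 1, 'Gadget'), (12, 2, 'Gizmo');
""")

# Shared column name: USING expresses the join in one short clause.
using_rows = conn.execute(
    'SELECT Customer, Product FROM "Order" '
    "INNER JOIN LineItem USING (OrderID) ORDER BY LineItemID"
).fetchall()

# Different column names: the join condition must be spelled out with ON.
on_rows = conn.execute(
    "SELECT o.customer, li.product FROM orders AS o "
    "INNER JOIN line_items AS li ON li.order_id = o.id ORDER BY li.id"
).fetchall()

print(using_rows)
print(on_rows)
```

Both queries return the same rows. The difference is only in how much the author has to spell out, which is the whole of the complaint.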

Now that I’ve spent some time besmirching the popular application frameworks for the way they name primary keys, let me defend them a bit. As I said in the beginning, when it comes to frameworks, their opinions are oftentimes more about consistency than objective or even circumstantial correctness. For application developers working in languages like Ruby or C#, having all the primary keys named similarly gives the database a more object-oriented feel. When data models have members that are named consistently from one object to the next, it feels as though the backing database is somewhat object-oriented, with some sort of base table that has common elements in it. If such conventions are reliable, all sorts of time-saving and confusion-banishing practices can be established.
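As one illustration of the time-saving practices a reliable convention allows, here is a sketch of a generic lookup helper in Python with the standard sqlite3 module. The find function, the customers and products tables, and their rows are all hypothetical; the point is only that a single function can serve every table because it may assume the key is always called id.

```python
import sqlite3

def find(conn, table, record_id):
    """Fetch one row by primary key, assuming every table's key is named id."""
    # Table names cannot be bound as parameters, so check against the real schema first.
    known = {row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")}
    if table not in known:
        raise ValueError(f"unknown table: {table}")
    return conn.execute(
        f'SELECT * FROM "{table}" WHERE id = ?', (record_id,)
    ).fetchone()

# Hypothetical tables that follow the convention.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE products (id INTEGER PRIMARY KEY, title TEXT);
    INSERT INTO customers VALUES (1, 'Ada');
    INSERT INTO products VALUES (2, 'Gadget');
""")

customer = find(conn, "customers", 1)
product = find(conn, "products", 2)
missing = find(conn, "customers", 99)

print(customer, product, missing)
```

With keys named CustomerID and ProductID, the same helper would need a per-table lookup of the key column, or one function per table.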

Having done a lot of data architecture work in my career and an equal amount of application development work, my opinion is that naming all database primary keys the same has more benefits than drawbacks across the ecosystem. My opinion is based on the belief that application developers tend to make more mistakes in their interpretations of data than database people do. I believe this is true because, as stewards of information, database developers and administrators live and breathe data as their core job function, while Ruby and C# developers treat data as just one of many facets that they manage in building applications. Of course, this is the sort of argument where everyone is correct and no one is. So I’ll not try to claim that my opinion is authoritative. I’m interested in hearing your thoughts on the subject.
