In relational databases it’s common to use foreign keys to reference rows in other tables, so that when a value in the referenced table changes you maintain referential integrity without paying a performance cost. For some reason, many Rails programmers don’t apply this concept to state machines in ActiveRecord. In this post I’ll describe the benefits of not storing state names in the database.
Let’s say you have a community site that allows users to sign up. Further, you need to verify their email addresses. You might start with a class like this:
class User < ActiveRecord::Base
  state_machine :status do
    state :unverified
    state :verified
  end
end
Over time your community grows and some users stop visiting the site. You decide to retire inactive accounts so that their usernames can be claimed by new users, which means adding the concepts of “active” and “inactive”. You update your class to look like this:
class User < ActiveRecord::Base
  state_machine :status do
    state :unverified
    state :active
    state :inactive
  end
end
Unfortunately, you decided to store the states as strings in the database, which means that you need to run a migration to update existing users:
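class AddActiveAndInactiveToUsers < ActiveRecord::Migration
  # Rewrites every verified row, which is what makes this slow on a large table.
  def self.up
    execute "update users set status = 'active' where status = 'verified';"
  end

  # Restores the old name. Rows that have since become 'inactive' are left as they are.
  def self.down
    execute "update users set status = 'verified' where status = 'active';"
  end
end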
But with 27 million users in the database, this query takes the site down for over 15 minutes, so you have to schedule it for 3am. Even so, you get bad press for being down. And to think, all this could have been avoided by just storing integers in the database!
Some state machine implementations provide easy ways to accomplish this. For example, if you use PluginAWeek’s state_machine gem you could write:
class User < ActiveRecord::Base
  state_machine :status do
    state :unverified, :value => 0
    state :verified, :value => 1
  end
end
Since you store integers in the database, you can easily add a new state or change the name of an existing state without having to migrate data. Having state names in the database is a classic example of a dependency pointing the wrong way: data that is expensive to change, millions of stored rows, ends up relying on names in your code, which are likely to change more often.
If your state machine doesn’t support storing values in the database as integers (or GUIDs), I suggest you look into upgrading to one that does, or patching your existing state machine to support integers. You may end up with one less sleepless night because of it!
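If you are curious what such a patch boils down to, here is a minimal, framework-free sketch in plain Ruby. It is not how state_machine or any other gem implements this; the STATUSES constant, the status_value attribute and the accessor methods are all hypothetical names made up for illustration, and status_value merely stands in for an integer column. The point is only that the names live in code while the integers live in the database.

```ruby
# Hypothetical sketch: not taken from any gem, and no Rails required.
class User
  # State name => integer stored in the database.
  # Renaming :verified to :active only changes the key; the stored 1 is untouched.
  STATUSES = {
    :unverified => 0,
    :active     => 1, # was :verified; existing rows holding 1 need no update
    :inactive   => 2  # new state, so it takes a fresh integer
  }.freeze

  # Stands in for the integer column an ActiveRecord model would expose.
  attr_reader :status_value

  def initialize(status_value = STATUSES[:unverified])
    @status_value = status_value
  end

  # Translate the stored integer back into the current state name.
  def status
    STATUSES.key(@status_value)
  end

  # Accept a state name and store its integer, rejecting unknown names.
  def status=(name)
    @status_value = STATUSES.fetch(name) do
      raise ArgumentError, "unknown status: #{name.inspect}"
    end
  end
end

# A row written as "verified" (1) before the rename now simply reads as :active.
legacy_user = User.new(1)
puts legacy_user.status
```

With this shape, the rename from the earlier example becomes a one-line code change and the 3am migration disappears, because no stored value ever contained the old name.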