179: Seed Data
Rails has recently been updated to version 2.3.4. This release focuses mainly on security and bug fixes but there are a couple of interesting new features too. One of these new features allows you to seed your application’s database with the data it needs to get your application up and running.
When you create a Rails application under Rails 2.3.4, a file called seeds.rb is created in the db directory. This is now the conventional place to define any initial data that your application needs. This data can then be created by running a new rake task: rake db:seed.
To give a quick demonstration of this we’ll add a puts statement to the seed file:
puts "Seed data goes here."
Then run the rake task, where we’ll see the output.
% rake db:seed
(in /Users/eifion/rails/apps_for_asciicasts/ep179/seeder)
Seed data goes here.
At first sight this might seem to be a simple new feature, and so it is. What makes it worthy of note though is that it means that there is now a conventional place to put seed data in applications.
Creating Seed Data
Let’s say that we’re writing an application where users have to choose which operating system they’re running when they register. To enable this we’ll create a model called OperatingSystem that has a name column. We’ll generate the model in the usual way.
script/generate model operating_system name:string
The list of operating systems isn’t something that will be created by the users so we’ll need to define some initial data. But where should we do this? One place where Rails developers sometimes add seed data is within the migration files, like this:
class CreateOperatingSystems < ActiveRecord::Migration
  def self.up
    create_table :operating_systems do |t|
      t.string :name
      t.timestamps
    end

    # Create the seed data
    ["Linux", "Mac OS X", "Windows"].each do |os|
      OperatingSystem.find_or_create_by_name os
    end
  end

  def self.down
    drop_table :operating_systems
  end
end
This works, but it isn’t really the best way to do this. Migrations are best left to the job they’re designed for: creating the structure of your database. Creating seed data in them can also lead to your seed data being scattered across several migration files.
From Rails 2.3.4 onwards we have a central place to create seed data, so we can move the seed data code from the migration file into the seeds.rb file.
["Linux", "Mac OS X", "Windows"].each do |os|
  OperatingSystem.find_or_create_by_name os
end
Note that we’re using find_or_create_by_name so that the models are only created if they don’t already exist, meaning that the seed data file can be run more than once and won’t repeatedly create the same operating systems.
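The find-or-create idea can be seen in isolation with a small plain-Ruby sketch. This is only an illustration under stated assumptions: FakeOperatingSystem is a hypothetical in-memory stand-in, not ActiveRecord, and its find_or_create_by_name merely imitates the behaviour described above.

```ruby
# A hypothetical in-memory stand-in for a model class (not ActiveRecord).
class FakeOperatingSystem
  Record = Struct.new(:id, :name)

  @records = []

  class << self
    attr_reader :records

    # Return the existing record with this name, or create one if none exists.
    def find_or_create_by_name(name)
      existing = @records.find { |r| r.name == name }
      return existing if existing

      record = Record.new(@records.size + 1, name)
      @records << record
      record
    end
  end
end

# The same loop as in the seed file, pointed at the stand-in class.
def seed_operating_systems
  ["Linux", "Mac OS X", "Windows"].each do |os|
    FakeOperatingSystem.find_or_create_by_name(os)
  end
end

# Seeding twice still leaves exactly one record per name.
2.times { seed_operating_systems }
puts FakeOperatingSystem.records.map(&:name).inspect
```

However many times seed_operating_systems runs, the list of records stays at three, which is the property that makes a seed file safe to re-run.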
Another example of the sort of data you might want to seed an application with is a list of countries for an address form. To do this we’ll generate another model for a country that has a name and a code.
script/generate model country name:string code:string
Entering all of the country data would be fairly tedious, even if we only have to do it once. Fortunately, a text file is available online that contains a list of country codes and names separated by a vertical bar.
AF|Afghanistan
AL|Albania
DZ|Algeria
AS|American Samoa
AD|Andorra
…
We can use the data in this file to populate our Country model.
Country.delete_all
open("http://openconcept.ca/sites/openconcept.ca/files/country_code_drupal_0.txt") do |countries|
  countries.read.each_line do |country|
    code, name = country.chomp.split("|")
    Country.create!(:name => name, :code => code)
  end
end
This time we’re populating the data in a slightly different way. First we delete any existing countries, then open the text file and loop through each line in it, creating a country from the code and name. This provides a quick and simple way to populate the country models. The code above uses OpenURI to fetch the file, so for it to work we’ll need to require it at the top of the seed file.
require 'open-uri'
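The parsing step can be tried on its own, without the network or the database. The sketch below is an assumption-laden illustration rather than part of the seed file: parse_countries and SAMPLE_COUNTRIES are hypothetical names, and the sample text simply repeats the first few lines shown earlier.

```ruby
# Sample lines in the same "CODE|Name" format as the country file.
SAMPLE_COUNTRIES = "AF|Afghanistan\nAL|Albania\nDZ|Algeria\nAS|American Samoa\nAD|Andorra\n"

# Hypothetical helper: turn pipe-delimited text into an array of attribute hashes.
def parse_countries(text)
  text.each_line.map do |line|
    code, name = line.chomp.split("|", 2)
    next if code.nil? || name.nil? # skip blank or malformed lines
    { :code => code, :name => name }
  end.compact
end

parse_countries(SAMPLE_COUNTRIES).each do |attrs|
  puts "#{attrs[:code]} => #{attrs[:name]}"
end
```

Each hash has the same keys that Country.create! is given in the seed file, so a helper like this could sit between fetching the file and creating the records. Skipping malformed lines is a design choice made for this sketch; the seed code above does not do so.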
Now that we’ve written our seed script we can run it to see if it works. Before we do we’ll need to run our migrations to create the database tables for the two models.
rake db:migrate
Then we can run our seed task.
rake db:seed
This will take a couple of seconds to run and when it finishes our database will be populated. We can check this by running script/console.
Our operating systems are there:
>> OperatingSystem.all
+----+----------+-------------------------+-------------------------+
| id | name     | created_at              | updated_at              |
+----+----------+-------------------------+-------------------------+
| 1  | Linux    | 2009-09-14 20:55:20 UTC | 2009-09-14 20:55:20 UTC |
| 2  | Mac OS X | 2009-09-14 20:55:20 UTC | 2009-09-14 20:55:20 UTC |
| 3  | Windows  | 2009-09-14 20:55:20 UTC | 2009-09-14 20:55:20 UTC |
+----+----------+-------------------------+-------------------------+
3 rows in set
And so are the countries.
>> Country.all
+-----+---------------------+------+---------------------+---------------------+
| id  | name                | code | created_at          | updated_at          |
+-----+---------------------+------+---------------------+---------------------+
| 1   | Afghanistan         | AF   | 2009-09-14 21:03... | 2009-09-14 21:03... |
| 2   | Albania             | AL   | 2009-09-14 21:03... | 2009-09-14 21:03... |
| 3   | Algeria             | DZ   | 2009-09-14 21:03... | 2009-09-14 21:03... |
| 4   | American Samoa      | AS   | 2009-09-14 21:03... | 2009-09-14 21:03... |
| 5   | Andorra             | AD   | 2009-09-14 21:03... | 2009-09-14 21:03... |
Fixtures
We’ll finish this episode with a final tip. If your application already has fixtures which contain the data you want to use as seed data, you can use them as the basis for your seed data.
Say we have the following data in our /test/fixtures/operating_systems.yml file.
# Read about fixtures at http://ar.rubyonrails.org/classes/Fixtures.html

windows:
  name: Windows

mac:
  name: Mac OS X

linux:
  name: Linux
We can import it by replacing the code that generates the operating systems in seeds.rb with this.
require 'active_record/fixtures'
Fixtures.create_fixtures("#{Rails.root}/test/fixtures", "operating_systems")
If we re-run our seed task the operating system models will be recreated.
>> OperatingSystem.all
+------------+----------+-------------------------+-------------------------+
| id         | name     | created_at              | updated_at              |
+------------+----------+-------------------------+-------------------------+
| 303122256  | Linux    | 2009-09-14 21:28:31 UTC | 2009-09-14 21:28:31 UTC |
| 387181413  | Mac OS X | 2009-09-14 21:28:31 UTC | 2009-09-14 21:28:31 UTC |
| 1676117404 | Windows  | 2009-09-14 21:28:31 UTC | 2009-09-14 21:28:31 UTC |
+------------+----------+-------------------------+-------------------------+
3 rows in set
There is one noticeable difference when getting the data from the fixture file: the ids are rather wonky. This is because, when ids are not explicitly specified in the fixture file, fixtures generate them by hashing each fixture’s label.
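To get a feel for why label-derived ids look so arbitrary, here is a rough plain-Ruby sketch. It rests on assumptions: label_to_id, MAX_ID and FIXTURE_YAML are hypothetical names, and the CRC32-based scheme is chosen only for illustration. The exact function Rails uses depends on the version, so the numbers this prints should not be expected to match the ids in the console output above.

```ruby
require 'yaml'
require 'zlib'

# The fixture data from above, inlined so the sketch is self-contained.
FIXTURE_YAML = <<-YAML
windows:
  name: Windows
mac:
  name: Mac OS X
linux:
  name: Linux
YAML

# Assumed upper bound that keeps ids within a positive integer range.
MAX_ID = 2**30 - 1

# Assumed scheme: derive a stable integer id from a fixture label via CRC32.
def label_to_id(label)
  Zlib.crc32(label.to_s) % MAX_ID
end

# Each top-level key is a fixture label; its value is the attribute hash.
YAML.load(FIXTURE_YAML).each do |label, attrs|
  puts "#{label}: id=#{label_to_id(label)} name=#{attrs['name']}"
end
```

The useful property is that the same label always maps to the same id, so one fixture can refer to another by label without either file stating an id explicitly.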
There is some controversy among developers about what exactly constitutes seed data. For some it’s the minimum amount of data that is required to get an application functioning, while others like to add user records and other user-generated content. If you want to keep the data that is absolutely necessary to get an application set up separate from any other data it might need you can use a rake task. Episode 126 used this approach to populate a database with a large amount of test data. If you’re trying to test the performance of your application when it has a lot of data, or simulate what it will look like when it has a large amount of data this is an excellent approach.
While we’re on the topic of seed data it’s worth looking at the seed-fu library which is another way of generating seed data. If you’re not using the latest version of Rails then this provides a useful alternative way of generating seed data. Alternatively the BootStrapper library is also well worth investigating.