ruby-on-railsrubyactiverecorddata-migrationschema-migration

When I run a schema migration before a data migration, with ActiveRecord, data does not properly update in DB


As of now, I have a users table with columns id, name, email, status

status field is an integer type with values 1 and 2 representing an Active and Inactive user, respectively.

I would like to change the status field to a string type and migrate the data -- Convert 1 to "Active" and 2 to "Inactive"

I generated 2 migration files with rails g migration

user.rb

class User < ApplicationRecord
  module Status
    ACTIVE = 'Active'.freeze
    INACTIVE = 'Inactive'.freeze

    ALL = [ACTIVE, INACTIVE].freeze
  end

  validates :status, presence: true
  validates :status, inclusion: Status::ALL
end

db/migrate/20190906115523_update_user_status_type.rb

def UpdateUserStatusType < ActiveRecord::Migration[5.2]
  def up
    change_column :users, :status, :string, default: User::Status::ACTIVE, 
  end

  def down
    User.where(status: User::Status::ACTIVE).update_all(status: 1)
    User.where(status: User::Status::INACTIVE).update_all(status: 2)
    change_column :users, :status, :integer, default: 1
  end
end

db/migrate/20190906115828_update_user_statuses.rb

def UpdateUserStatuses < ActiveRecord::Migration[5.2]
  def data
    User.where(status: 1).update_all(status: User::Status::ACTIVE)
    User.where(status: 2).update_all(status: User::Status::INACTIVE)  
  end
end

After running rails db:migrate

Expected: Each user's status should be converted to either "Active" or "Inactive" after migrations are finished.

Actual Results: Each user's status are converted to "0" of string type.


Solution

  • You're assuming after the first migration runs (change_column :users, :status, :string, default: User::Status::ACTIVE) you can still fetch the old values from the status column which is not the case. When you change the type of that column to string all the integer values are invalid so I suspect your database just changes all the invalid values to be "0" instead.

    If I was told to make this change to an application that is heavily used in production, I would be roll out this change in a few separate pull requests/migrations. I'd create a whole new separate column, iterate through all the users, set the value of the new column depending on what the value in the old column is, and then delete the old column. This is a much safer way to make this change.