ruby-on-rails regex ruby migration

How can I extract commands from a Rails migration file using regex?


I'm trying to extract the commands from a Rails migration file as an array, filtered by a specific migration command, using regex. My code works in most cases, but it breaks when a command spans multiple lines, and I couldn't fix it.

Example

class AddMissingUniqueIndices < ActiveRecord::Migration
  def self.up
    add_index :tags, :name, unique: true

    remove_index :taggings, :tag_id
    remove_index :taggings, [:taggable_id, :taggable_type, :context]
    add_index :taggings,
              [:tag_id, :taggable_id, :taggable_type, :context, :tagger_id, :tagger_type],
              unique: true, name: 'taggings_idx'
  end

  def self.down
    remove_index :tags, :name

    remove_index :taggings, name: 'taggings_idx'
    add_index :taggings, :tag_id
    add_index :taggings, [:taggable_id, :taggable_type, :context]
  end
end

My goal is to return an array with each command as a separate string. What I expect:

[
  "add_index :tags, :name, unique: true", 
  "remove_index :taggings, :tag_id", 
  "remove_index :taggings, [:taggable_id, :taggable_type, :context]", 
  "add_index :taggings, [:tag_id, :taggable_id, :taggable_type, :context, :tagger_id, :tagger_type], unique: true, name: 'taggings_idx'"
]

First, I extract the body of change or self.up (for old migrations), and then use the regex below to collect each add_index/remove_index command into an array:

migration_content = 'migration file in txt'
@table_name = 'taggings'
regex_pattern = /(add|remove)_index\s+:#{@table_name}.*\w+:\s+?\w+/m
columns_to_process = migration_content.to_enum(:scan, regex_pattern).map { Regexp.last_match.to_s.squish }
puts columns_to_process
=> ["remove_index :taggings, :tag_id remove_index :taggings, [:taggable_id, :taggable_type, :context] add_index :taggings, [:tag_id, :taggable_id, :taggable_type, :context, :tagger_id, :tagger_type], unique: true"]

As you can see, it didn't work: the commands come back merged into a single string, and the last one is cut off before name:. The regex works fine for single-line commands. The problem starts when a command is split across several lines, like the last call in self.up, especially when it has this many arguments. I couldn't adapt the regex to cover all cases. I also tried capturing everything between one add_index/remove_index and the next one (or end), but that didn't work either. Can anyone help me?


Solution

  • I think that, before scanning the file content, you could replace every line break that follows a comma with a space:

    migration_content = migration_content.gsub(/,\s*\R/, ', ')
    

    You could also use gsub(/\(\s*\R/, '(') to join multiline method calls where a line ends with (.
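Putting the two substitutions together, a minimal sketch of the whole extraction could look like the following. The helper name index_commands is hypothetical (it is not part of Rails), the pattern is a simplified variant of the one in the question, and plain strip/gsub stands in for Rails' squish so the snippet runs without ActiveSupport. It assumes every continued line ends with a comma or an opening parenthesis. A closing parenthesis on its own line would not be joined.

```ruby
# Hypothetical helper (not part of Rails): returns every add_index/remove_index
# call for `table_name` as a single-line string.
def index_commands(migration_content, table_name)
  joined = migration_content
           .gsub(/,\s*\R\s*/, ', ')  # join lines that end with a comma
           .gsub(/\(\s*\R\s*/, '(')  # join lines that end with an opening paren

  # Without the /m flag, `.` stops at a line break, so each match is one command.
  pattern = /^\s*((?:add|remove)_index[ (]\s*:#{Regexp.escape(table_name)}\b.*)$/

  # scan returns one-element arrays (one capture group), hence flatten.
  joined.scan(pattern).flatten.map { |command| command.strip.gsub(/\s+/, ' ') }
end

# Body of self.up from the question, already separated from the class.
up_block = <<~RUBY
  add_index :tags, :name, unique: true

  remove_index :taggings, :tag_id
  remove_index :taggings, [:taggable_id, :taggable_type, :context]
  add_index :taggings,
            [:tag_id, :taggable_id, :taggable_type, :context, :tagger_id, :tagger_type],
            unique: true, name: 'taggings_idx'
RUBY

puts index_commands(up_block, 'taggings')
```

Because the line breaks are removed first, the pattern no longer needs the /m flag, which is what let the greedy .* in the original regex run across several commands.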