Tags: ruby, mongodb, mongoid3

Speeding up data insertion with MongoDB and Mongoid


I need to insert 2,000,000 rows of data into MongoDB, row by row, but once it reaches about 200,000 inserts the process becomes very slow. I am using Mongoid, and I don't know whether I can use bulk inserts because I need to validate the data before inserting. How can I speed up this process? Thank you very much for the help!


Solution

  • For a significant performance improvement, try batch insertion via Moped::Collection#insert. Since this bypasses Mongoid's persistence layer, you will have to call #valid? explicitly yourself. Try something like the following, assuming that data_rows is an array of MyModel instances (a Mongoid model).

    # Process the rows in slices so each driver call inserts many documents at once.
    slice_size = 1000
    data_rows.each_slice(slice_size) do |slice|
      # Run Mongoid validations by hand, since the driver-level insert skips them.
      slice.each { |data_row| raise "validation error" unless data_row.valid? }
      # Moped::Collection#insert accepts an array of hashes for a batch insert.
      MyModel.collection.insert(slice.collect { |data_row| data_row.serializable_hash })
    end
    

    If you can intercept your import data in "raw", non-Mongoid form, you can bypass some of that overhead by inserting arrays of hashes directly, but then you will have to write your own validations, since Mongoid's model validations will no longer apply; a rough sketch follows.
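
    Here is a minimal sketch of that raw-hash approach, assuming the same MyModel collection as above; raw_rows and valid_row? are hypothetical names standing in for your own data source and whatever checks your data actually requires, not anything provided by Mongoid or Moped.

    # Hypothetical helper: replace the body with the real checks your data needs.
    def valid_row?(row)
      row[:name].is_a?(String) && !row[:name].empty?
    end

    # raw_rows is assumed to be an array of plain hashes, not Mongoid documents.
    raw_rows.each_slice(1000) do |slice|
      # Hand-rolled validation, since Mongoid's model validations don't run here.
      bad = slice.reject { |row| valid_row?(row) }
      raise "validation error: #{bad.inspect}" unless bad.empty?
      # Insert the whole slice of hashes in one driver call.
      MyModel.collection.insert(slice)
    end

    This skips building Mongoid documents entirely, so it trades convenience (validations, callbacks, type coercion) for raw insert throughput.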