I am fetching all the rows from a collection and experience a delay at the 100th row. I understand that the find method returns a cursor rather than all the data up front, and that at some point it needs to fetch more data. But the 100th row is the only place where I see a delay.
Checking image 99
Checking image 100
*pause*
Checking image 101
And then there is no visible delay all the way up to image 100 000.
The Ruby script I used:
require 'mongo'
time_start = Time.now
mongo = Mongo::MongoClient.new("localhost", 27017)
db = mongo["pics"]
images = db["images"]
albums = db["albums"]
orphans = []
images.find().each do |row|
  puts "Checking image #{row['_id']}"
end
# puts orphans
time_end = Time.now
puts "Total time taken: #{time_end - time_start}"
mongoimport --db pics --collection images file_name
The questions are:
Thank you
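The default "batch size" of a MongoDB cursor is about 100 documents (the server documentation gives 101 for the first batch). The driver receives that first batch up front and then has to go back to the server for the next one, which is why you see the delay at that point. Later batches are limited by their size in bytes rather than by document count, so with small documents they hold far more rows and the pauses become much less noticeable. All drivers should provide a method "batch_size()" or similar on the cursor object for setting and retrieving the batch size.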
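To make the batching behaviour concrete, here is a minimal sketch in plain Ruby that imitates a cursor without touching a database. SimulatedCursor and its parameters are hypothetical names invented for this illustration, not part of the mongo gem. It assumes a first batch of 101 rows and an arbitrary 50,000 rows for every later batch (the real server limits later batches by bytes, so the actual count depends on document size).

```ruby
# Hypothetical stand-in for a driver cursor (not the mongo gem's class).
# It yields rows one at a time but "fetches" them in batches, recording
# the row offset at which each simulated round trip happens.
class SimulatedCursor
  attr_reader :fetches

  # total_rows:  number of documents in the pretend collection
  # first_batch: assumed size of the initial batch
  # next_batch:  assumed size of later batches (illustrative only)
  def initialize(total_rows, first_batch: 101, next_batch: 50_000)
    @total_rows = total_rows
    @first_batch = first_batch
    @next_batch = next_batch
    @fetches = []
  end

  def each
    sent = 0
    size = @first_batch
    while sent < @total_rows
      # A round trip to the server would happen here, before row `sent`.
      @fetches << sent
      last = [sent + size, @total_rows].min
      (sent...last).each { |id| yield({ '_id' => id }) }
      sent = last
      size = @next_batch
    end
  end
end

cursor = SimulatedCursor.new(100_000)
rows_seen = 0
cursor.each { |_row| rows_seen += 1 }

puts "Rows seen: #{rows_seen}"
puts "Round trips happened before rows: #{cursor.fetches.inspect}"
```

With _id values starting at 0, the first pause after the initial fetch falls between image 100 and image 101, which matches the output in the question. In the 1.x Ruby driver used there, I believe the batch size can be passed as an option, for example images.find({}, :batch_size => 1000), but check the documentation for your driver version.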