I have a 3-shard MongoDB cluster (version 4.4) that I want to move to another cluster. Each shard has 500 million documents for collection "A". I am trying to speed up the process by using mongodump / mongorestore per shard and moving data in parallel for all shards at once.
After moving the data, the destination cluster already has data in all 3 shards. Is there a way to update or start the sharding so mongos already recognizes data in all shards?
I tried the shardCollection command (https://www.mongodb.com/docs/manual/reference/command/shardCollection/),
but it only succeeded for collection "A" on the 1st shard (rs0).
This is the status output:
'Migration Results for the last 24 hours': {
'44': "Failed with error 'aborted', from rs0 to rs1",
'68': "Failed with error 'aborted', from rs0 to rs2",
'682': 'Success'
}
I have enabled sharding and created the sharding key.
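For reference, the setup commands were along these lines (run in mongosh against the destination mongos; `mydb` and `myKey` are placeholders for the real database name and shard key, which the question does not give):

```javascript
// Placeholders: replace "mydb" and { myKey: 1 } with the actual database and shard key.
sh.enableSharding("mydb")                            // enable sharding for the database
db.getSiblingDB("mydb").A.createIndex({ myKey: 1 })  // index backing the shard key
sh.shardCollection("mydb.A", { myKey: 1 })           // shard collection "A"
```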
I don't know if this is even possible, because normally you shard a collection and the balancer automatically distributes its data across the available shards based on the shard key.
The most suitable options are as follows (depending on whether you want to back up/restore the whole sharded cluster or only a sharded collection):
1. Back up/restore the MongoDB sharded cluster via file system snapshot; the official procedure is here.
2. Back up/restore the MongoDB sharded collection via mongodump/mongorestore (this option is best if the target cluster already has other collections and the collection is relatively small; bigger collections can take time).
2.1 Create a mongodump via mongos for the whole collection (if the collection is big this may take some time, as it needs to read all 500M docs from the 3x shards).
2.2 In the target cluster, create the collection, create the necessary indexes, shard it, and pre-split it, so that it is empty but has the necessary number of chunks across all 3x shards before loading.
2.3 mongorestore the collection via mongos to the target cluster.
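The steps above can be sketched roughly as follows (hostnames, the database name `mydb`, and the shard key `myKey` are placeholders not given in the question; the actual split points depend on your key distribution):

```shell
# 2.1 Dump the collection through mongos on the source cluster
mongodump --host source-mongos:27017 --db mydb --collection A --out /backup

# 2.2 On the destination cluster, in mongosh: create, index, shard, and pre-split
#     sh.enableSharding("mydb")
#     db.getSiblingDB("mydb").A.createIndex({ myKey: 1 })
#     sh.shardCollection("mydb.A", { myKey: 1 })
#     sh.splitAt("mydb.A", { myKey: <boundary value> })          // repeat per split point
#     sh.moveChunk("mydb.A", { myKey: <boundary value> }, "rs1") // spread chunks across shards

# 2.3 Restore through mongos on the destination cluster
mongorestore --host dest-mongos:27017 --db mydb --collection A /backup/mydb/A.bson
```

Pre-splitting and distributing the chunks before the restore is what lets all 3x shards receive writes in parallel instead of funneling everything through one shard and waiting on the balancer.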