We need to free some of our MongoDB space, and we identified 100Gb + worth of documents that can be safely removed from a collection.
So we removed them from our test environment which has this setting:
When done, we found out that the space on disk was still used and needed to be reclaimed. We found this post and it helped us: after running both
db.runCommand({repairDatabase: 1})
and
db.runCommand({compact: collection-name })
We freed 100Gb +.
We then proceeded in production, forgetting that the setting was different since we had 1 replica node:
After removing the documents, we run
db.runCommand({repairDatabase: 1})
and got the OK message (after a while, 10 min +). We tried running
db.runCommand({compact: collection-name })
and got this error:
will not run compact on an active replica set primary as this is a slow blocking operation. use force:true to force
So we run
db.runCommand({compact: collection-name, force: true })
and got the OK message (almost instantly), but the disk on space is still used, it wasn't freed.
we searched for solutions for running the repairDatabase
and compact
commands with replica-set but the advise was focused on avoiding downtime as if that was the only issue. However, we can schedule downtime and our problem is rather that the commands don't work as expected since the space is not actually reclaimed.
What did we do wrong?
For replica set configurations, the best and the safest method to recover space is to perform an initial sync. If you need to recover space from all nodes in the set, you can perform a rolling initial sync. That is, perform initial sync on each of the secondaries, before finally stepping down the primary and perform initial sync on it.
Note that the rolling initial sync is only possible if your deployment contains at least three nodes replica set (for reasons I will describe below).
Rolling initial sync method is the safest method to perform replica set maintenance, and it also involves no downtime as a bonus.
Having said that, there are some things that are worth mentioning:
compact
:The compact
command on WiredTiger on MongoDB 3.0.x was affected by this bug: SERVER-21833 which was fixed in MongoDB 3.2.3. Prior to this version, compact
on WiredTiger could silently fail.
repairDatabase
:Please don't run repairDatabase
on replica set nodes. This is strongly not recommended, as mentioned in the repairDatabase page. The name repairDatabase
is a bit misleading, since the command doesn't attempt to repair anything. The command was intended to be used when there's disk corruption, which could lead to corrupt documents.
The repairDatabase
command could be more accurately described as "salvage database". That is, it recreates the databases by discarding corrupt documents, in an attempt to get the database into a state where you can start it and salvage intact document from it.
In a replica set, MongoDB expects all nodes in the set to contain identical data. If you run repairDatabase
on a replica set node, there is a chance that the node contains undetected corruption, and repairDatabase
will dutifully remove the corrupt documents. Predictably, this makes that node contains a different dataset from the rest of the set. If an update happens to hit that single document, the whole set could crash. To make matters worse, it is entirely possible that this situation could stay dormant for a long time, only to strike suddenly with no apparent reason.
I noticed that in your production environment, you created a replica set with two nodes. This setup is not recommended, since the failure of a single node will render the remaining node to become a secondary, and thus disallowing writes to the set.
Due to the way MongoDB high availability works (see Replica Set Election), it's strongly recommended to deploy three data-bearing nodes at a minimum, or at least add an arbiter node (see Replica Set Members) so that your replica set contains an odd number of members.
Having only two members in a replica set also makes rolling upgrades/initial sync/maintenance much harder or even impossible in some cases.
MongoDB 3.0.1 was released in March 17, 2015, which is more than 2 years ago as of this writing. If you're forced to use MongoDB 3.0 series, please consider moving to 3.0.15. Or better yet, to 3.4.7 (the latest as of Aug 10, 2017), which contains massive improvements over 3.0.1.