mongodb wiredtiger

Unable to start mongod with one missing collection wt file


I have a working mongod instance (v3.2.21).

But it suddenly stopped working. When I run the mongo command, it throws the following error:

2019-06-04T13:52:41.725+0000 W NETWORK  [thread1] Failed to connect to 127.0.0.1:27017, in(checking socket for error after poll), reason: errno:111 Connection refused

When I checked the log, it showed:

2019-06-04T13:36:43.388+0000 E STORAGE  [initandlisten] WiredTiger (2) [1559655403:388180][8404:0x7fb74e904c80], file:collection-1-1305830686620002691.wt, WT_SESSION.open_cursor: /mnt/volume-fra1-01//collection-1-1305830686620002691.wt: handle-open: open: No such file or directory
2019-06-04T13:36:43.388+0000 E STORAGE  [initandlisten] no cursor for uri: table:collection-1-1305830686620002691
2019-06-04T13:36:43.388+0000 F -        [initandlisten] Invalid access at address: 0x58
2019-06-04T13:36:43.398+0000 F -        [initandlisten] Got signal: 11 (Segmentation fault).

The collection file that used to be there is now missing from the data directory.

I tried to use --repair, but the process stops at this collection.

I have looked at various resources but couldn't figure out how to get it working. Is there a way to make WiredTiger skip this collection?


Solution

  • mongod 4.0.3 and newer have better repair facilities, as per SERVER-19815.

    One thing you can try:

    1. Copy your original dbpath, so that an unsuccessful attempt doesn't clobber your original data
    2. Download the 4.0.3 binaries (or newer; the latest is currently 4.0.10)
    3. Run mongod --repair with the 4.0.3 binaries against the copied dbpath
    4. If the repair succeeds, try running the 3.2.21 binaries against the repaired dbpath

    Please note that this repair attempt is best-effort and there's no guarantee of success. Having an up-to-date backup is still recommended. You might also want to investigate how the dbpath came to be missing a file in the first place.
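
The steps above could be scripted roughly as follows. This is only a sketch: the dbpath is the one shown in the log, while the copy location, the binary paths and the function names are hypothetical placeholders to adjust for your setup. Make sure no mongod is running on the dbpath before copying it.

```shell
#!/bin/sh
# Sketch of the repair-on-a-copy procedure.
# Only DBPATH comes from the log; every other path is a hypothetical placeholder.

DBPATH="${DBPATH:-/mnt/volume-fra1-01}"                 # original dbpath (from the log)
COPY="${COPY:-/mnt/dbpath-copy}"                        # assumed location for the working copy
MONGOD_40="${MONGOD_40:-/opt/mongodb-4.0/bin/mongod}"   # assumed path to a 4.0.3+ mongod
MONGOD_32="${MONGOD_32:-/opt/mongodb-3.2/bin/mongod}"   # assumed path to the 3.2.21 mongod

# Step 1: copy the dbpath so the original is never modified.
# Usage: copy_dbpath <source dbpath> <destination>
copy_dbpath() {
    mkdir -p "$2" && cp -a "$1/." "$2/"
}

# Step 3: run the repair with the newer binary against the copy.
# Usage: repair_copy <mongod binary> <copied dbpath>
repair_copy() {
    "$1" --dbpath "$2" --repair
}

# Step 4: start the old binary on the repaired copy.
# A non-default port (27018) is an arbitrary choice to keep it apart
# from the production port.
# Usage: start_old <mongod binary> <copied dbpath>
start_old() {
    "$1" --dbpath "$2" --port 27018
}
```

With these definitions, the sequence would be `copy_dbpath "$DBPATH" "$COPY" && repair_copy "$MONGOD_40" "$COPY"`, followed by `start_old "$MONGOD_32" "$COPY"` only if the repair exits successfully. Working on the copy means a failed repair leaves the original files as they were.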