How does MongoDB reclaim the space that should be freed after documents “shrink”?

  mongodb, question

I am using the default WiredTiger storage engine with mongodb-win32-x86_64-3.2.1.

A collection holds about 4 million documents and occupies 40 GB. I later ran a script that updated every document to strip out the parts that were no longer needed (no documents were deleted). The script finished on schedule, and according to its output statistics the documents shrank to about 45% of their original size overall, which should have freed more than 20 GB. However, the database files did not get smaller; they actually grew slightly (about 1%).

I read the storage chapter of the documentation, https://docs.mongodb.com/v3.2 …, which says that the empty space left by deleted documents can be reclaimed with compact (when using the WiredTiger storage engine), but it does not mention the “empty space” left by shrinking documents as described above. I ran compact anyway; it finished in a few seconds and the storage files did not get smaller. So what should I do?

> use cache
switched to db cache
> db.htmldocs.runCommand("compact")
{ "ok" : 1 }
> db.runCommand({compact: "htmldocs"})
{ "ok" : 1 }

Yes, with WiredTiger, compact on recent versions really can reclaim space, but because of a bug this only takes effect from some release after 3.2.1. I do not remember the exact version number (3.2.9?), but 3.2.1 is definitely affected, and upgrading to the latest version, 3.2.12, will definitely solve this problem.
Compared with repairDatabase, compact is indeed lighter, but it is unavoidably still a very heavy operation. Think of Windows disk defragmentation; the principle is the same. A better approach with less impact is to resync the members of the replica set one at a time, in a rolling fashion, which also releases the space.
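
Before rerunning compact after an upgrade, it can help to confirm which version the server is actually running. Below is a minimal sketch of such a check. The helper name versionAtLeast is made up for illustration, and the 3.2.12 threshold is only the version suggested in this answer, not a confirmed fix version.

```javascript
// Hypothetical helper (not part of MongoDB): compare dotted version strings.
// Returns true when `current` is the same as or newer than `minimum`.
function versionAtLeast(current, minimum) {
  var toInts = function (v) {
    return v.split(".").map(function (part) {
      return parseInt(part, 10) || 0; // non-numeric parts count as 0
    });
  };
  var a = toInts(current);
  var b = toInts(minimum);
  var n = Math.max(a.length, b.length);
  for (var i = 0; i < n; i++) {
    var x = a[i] || 0;
    var y = b[i] || 0;
    if (x !== y) {
      return x > y;
    }
  }
  return true; // all components equal
}

// In the mongo shell (assumption: 3.2.12 is the target from this answer):
//   versionAtLeast(db.version(), "3.2.12")
```

If the check returns false, running compact again is unlikely to behave differently from the attempt shown in the question.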

The more important question is whether reclaiming this space really matters to you. In most cases, if the system occupied this space once, it may well occupy it again (unless the system design changes). It is better to leave the space in place for reuse than to release it and have it allocated again later.
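
To see how much of the file is already free for reuse, the collection statistics can be inspected. The sketch below assumes the input has the shape of db.htmldocs.stats() output under WiredTiger, including the "block-manager" section with "file bytes available for reuse" and "file size in bytes". The helper name reusableSpaceReport is made up for illustration.

```javascript
// Hypothetical helper: summarize how much of a collection's file is free
// for reuse. `stats` is assumed to look like db.collection.stats() output.
function reusableSpaceReport(stats) {
  var blockManager = (stats.wiredTiger || {})["block-manager"] || {};
  var reusable = Number(blockManager["file bytes available for reuse"] || 0);
  // Fall back to storageSize if the block-manager file size is missing.
  var fileSize = Number(
    blockManager["file size in bytes"] || stats.storageSize || 0
  );
  return {
    dataBytes: Number(stats.size || 0), // uncompressed size of the documents
    fileBytes: fileSize,                // size of the collection file on disk
    reusableBytes: reusable,            // free blocks inside that file
    reusableRatio: fileSize > 0 ? reusable / fileSize : 0
  };
}

// In the mongo shell:
//   printjson(reusableSpaceReport(db.htmldocs.stats()))
```

A large reusableBytes value would suggest that the space freed by the shrunken documents is still inside the file and will be filled by new writes, even though the file itself has not become smaller.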