The first query against a MongoDB database that has not been accessed for a long time is very slow. How can this be solved?

  mongodb, question

I ran into a MongoDB problem in a project and have not been able to solve it for many days. I hope to get some expert advice.
The specific problem: when the database has not been accessed for a long time, the first query takes a long time, but subsequent queries are very fast.

(1) The whole database is about 1.9 TB;
(2) the collection I query holds roughly 7 million documents;
(3) a single query returns about 230,000 documents;
(4) the server has 120 GB of memory;
(5) an index has been created on the query criteria, and the index is about 600 MB;
(6) the first query takes about 20 seconds, and subsequent queries finish within 1 second.

Reasons currently under consideration:

Since MongoDB is not responsible for memory management, the data in memory becomes cold when the database is not accessed for a long time, and the operating system's memory manager evicts it. The next query then has to reload the data into memory, which is time-consuming. At present I cannot determine whether the time is spent loading indexes or loading data. MongoDB provides a touch command (which can load a collection's index data or document data into memory), but I am using the WiredTiger storage engine, which does not support this command.

Need help:

(1) Is the problem caused by the reason above?
(2) If so, how can I determine whether the time is spent loading indexes or loading data?
(3) Is there a better solution?
Note: This collection can grow to about 25 GB and the database contains many other collections, so keeping all of this collection's data in memory is not advisable. If it can be confirmed that the time is spent loading the index, I could load the index into memory periodically, but the WiredTiger storage engine offers no method for this, which is another problem.

The problem you describe is related to the working set.

1. What is the working set?

The working set is an important concept in MongoDB memory management: the data that the application accesses frequently, together with the related indexes, should be kept in memory as much as possible.
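
To check whether data is actually being read from disk on the slow first query, one option is to look at the WiredTiger cache counters in db.serverStatus(). Below is a minimal sketch, assuming the counter names "bytes currently in the cache", "maximum bytes configured" and "pages read into cache" under wiredTiger.cache; the helper name cacheSummary is made up for illustration.

```javascript
// Summarize WiredTiger cache usage from a serverStatus() document.
// cacheSummary is an illustrative helper, not part of MongoDB.
function cacheSummary(status) {
  const cache = status.wiredTiger.cache;
  // Number() guards against counters that arrive as 64-bit wrapper objects.
  const used = Number(cache["bytes currently in the cache"]);
  const max = Number(cache["maximum bytes configured"]);
  const GB = 1024 ** 3;
  return {
    usedGB: used / GB,
    maxGB: max / GB,
    fillRatio: used / max,
    pagesReadIntoCache: Number(cache["pages read into cache"]),
  };
}

// In the mongo shell:
//   printjson(cacheSummary(db.serverStatus()));
```

Running it before and after the slow first query and comparing pagesReadIntoCache should give a rough idea of how much had to be loaded from disk.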

2. How do you put the working set in memory?

What you describe is really a need to preload, or warm up, the working set in memory. How exactly? You mentioned touch (MMAP engine), so how can this be done in later versions (WT engine)?

With a relational database, the usual method is SELECT *. This is common in performance testing: to get good results, a batch of SELECT statements is run in advance to warm up memory.

In MongoDB, consider:

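1) If the business query can use a covered index directly, or when the index itself needs to be warmed up:

db.collection.find({}, {"_id" : 0, "field_a" : 1, "field_b" : 1}).hint({"field_a" : 1, "field_b" : 1}).explain("executionStats")

Note that since MongoDB 3.0 a bare explain() only plans the query without running it; the "executionStats" mode is what actually executes the scan.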

2) When warming up the working set, the premise is that you know which data in your collection is accessed frequently. Usually this is the data for a certain period of time, so warm up that period's data using the same approach as above.

The difference is that warming up an index scans the whole index, so the query condition is {}, whereas warming up the working set loads only part of the collection's data, so the query condition is typically a time range.
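
The two warm-up variants above can be wrapped in small shell helpers, roughly as follows. This is only a sketch: warmIndex and warmRange are hypothetical names, coll stands for a shell collection such as db.collection, and the created_at field in the usage comment is an assumed time field, not something from your schema.

```javascript
// Hypothetical helpers for the mongo shell; not part of MongoDB itself.

// Scan one whole index with a covered query: only indexed fields are
// projected and _id is excluded, so no documents need to be fetched.
function warmIndex(coll, keyPattern) {
  const projection = { _id: 0 };
  for (const field of Object.keys(keyPattern)) {
    projection[field] = 1; // include every indexed field
  }
  const plan = coll.find({}, projection).hint(keyPattern).explain("executionStats");
  return plan.executionStats;
}

// Read the documents matching `filter` (for example a recent time range)
// through the same index, so index pages and document pages are both loaded.
function warmRange(coll, filter, keyPattern) {
  const plan = coll.find(filter).hint(keyPattern).explain("executionStats");
  return plan.executionStats;
}

// Usage in the shell (field names are placeholders):
//   warmIndex(db.collection, { field_a: 1, field_b: 1 }).executionTimeMillis
//   warmRange(db.collection,
//             { created_at: { $gte: ISODate("2024-01-01") } },
//             { created_at: 1 }).totalDocsExamined
```

Comparing executionTimeMillis from the two helpers right after a long idle period may also help with your question (2): if the index-only scan is already slow, loading the index is likely the expensive part; if only the second call is slow, it is more likely the documents.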

For reference.

Love MongoDB! Have fun!
