How efficient is it to keep articles and comments in the same MongoDB document?

  mongodb, question

I am storing comments and articles together in one document, and I have a question: when the number of comments becomes very large, will that make queries for the article list pages inefficient?
If comments were split off into another collection, would that relieve the pressure when only the list of articles is displayed?

{
	"_id" : ObjectId(),
	"author" : "",
	"comment_num" : "",
	"comments" : [
		{
			"text" : "",
			"created" : ISODate(),
			"author" : ""
		},
	],
	"created" : ISODate(),
	"text" : "",
	"title" : ""
}

@halty made good points, but I don't fully agree with him. If there are not many comments, the embedded design is a very good fit, as the answer above explains. But if there are too many comments, problems will arise. The two most important starting points are:
1. The hard disk is too slow;
2. As long as the data is in memory, there is no problem.

  1. find
    Once the data grows very large, a query has to read a lot of data from disk: the Memory Mapped File brings the whole of it into memory, while we only need a small part. The main problem is that the OS may then page other data out to the hard disk. As far as listing articles is concerned, memory is not used effectively.
  2. insert
    For the files on disk, it is not good for a document to grow again and again. If new data is added, such as a new comment, and the larger document no longer fits in its original place, a new place must be found for it; the hole left behind will be reused. The problem is that the document's location has changed, so every index entry pointing to it has to change as well. If there is also an index on the array, for example on the comment author's user name, the number of index entries to update grows linearly with the length of the array.
  3. size
    The answer above covered this point well: a document has an upper limit of 16MB.
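
To get a feel for the size limit, here is a rough back-of-the-envelope sketch. The 16MB figure is the document limit mentioned above; the article size and average comment size are hypothetical numbers chosen only for illustration, and the function name is made up.

```javascript
// Rough estimate of how many embedded comments fit in one document.
// MAX_DOC_BYTES is the 16MB document limit; the sizes passed in below
// are hypothetical averages, not measurements.
const MAX_DOC_BYTES = 16 * 1024 * 1024;

function maxEmbeddedComments(articleBytes, avgCommentBytes) {
  if (avgCommentBytes <= 0) {
    throw new RangeError("avgCommentBytes must be positive");
  }
  // Space left for the comments array after the article's own fields.
  const room = MAX_DOC_BYTES - articleBytes;
  return Math.max(0, Math.floor(room / avgCommentBytes));
}

// Example: a 10KB article with comments averaging 200 bytes each.
console.log(maxEmbeddedComments(10 * 1024, 200));
```

Under these assumed sizes the limit is only reached after tens of thousands of comments, so in practice the find and insert costs above are likely to be felt well before the size limit is.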

To sum up, when there are too many comments, the performance will be affected.
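
For that case, the split the question asks about could look roughly like the sketch below. It is plain JavaScript operating on an in-memory object, not driver code; the function name `splitArticle`, the `article_id` field, and the sample values are assumptions made for illustration.

```javascript
// Split one embedded article document into an article document without the
// comments array, plus one document per comment that points back to it.
// "article_id" is a hypothetical field name for that back-reference.
function splitArticle(doc) {
  const { comments = [], ...article } = doc;
  const commentDocs = comments.map((c) => ({ ...c, article_id: doc._id }));
  return { article, commentDocs };
}

const embedded = {
  _id: "a1", // stands in for an ObjectId
  author: "alice",
  comment_num: 2,
  comments: [
    { text: "first", created: new Date(0), author: "bob" },
    { text: "second", created: new Date(1000), author: "carol" },
  ],
  created: new Date(0),
  text: "body",
  title: "hello",
};

const { article, commentDocs } = splitArticle(embedded);
console.log(Object.keys(article)); // article fields, no "comments"
console.log(commentDocs.length);   // one document per comment
```

With this layout the list page only touches the small article documents, and a detail page would need a second query for the comments of one article, for example by `article_id`.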

In conclusion, schema design should consider

  1. The size of the data. As long as the frequently accessed data is in memory, access is not a problem. The underutilization of memory mentioned under point 1 (find) above is not really a big problem: there are always many people reading the comments of popular articles, so it is also good to have those in memory. And if a document keeps growing, MongoDB will automatically allocate more disk space for it.
  2. Fit the access pattern. Comments are written far less often than articles and comments are read. Take the Twitter data: on average 5K tweets/s are written, while timelines are read at 300K/s. That is 60 times as many! As long as read requests can be served from memory, MongoDB eliminates the need for a separate cache. Only when traffic really becomes too large, or in the extreme case of too many writes, is there anything more to discuss. When that day comes, MongoDB's sharding will come in handy.
  3. Convenient development
    The cost of a product is not only the cost of machine hardware and network, but also the development cost of programmers. Salaries are so high … So it is also very important to be able to write code quickly, conveniently, and without mistakes, right? This also explains why the flexibility of MongoDB's document model is widely praised.

On the other hand, I don't think most articles in this kind of application will get more than a hundred or so comments. In that range the single document plays to its strengths. Hundreds of comments will be fine, and the problem the asker describes will not be a problem. I hope the asker's application can break through this number …
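
As an illustration of how simple the single-document design stays, adding a comment is one update on one document. The sketch below only builds the filter and update objects, so it runs without a server. The helper name `buildAddCommentUpdate` and the collection name `articles` are assumptions, and it assumes `comment_num` is stored as a number so that `$inc` can be applied to it.

```javascript
// Build the filter and update document for appending one comment to an
// article that embeds its comments. Assumes comment_num is numeric.
function buildAddCommentUpdate(articleId, author, text, now = new Date()) {
  return {
    filter: { _id: articleId },
    update: {
      // Append the new comment to the embedded array ...
      $push: { comments: { text, created: now, author } },
      // ... and keep the counter used by the list page in step.
      $inc: { comment_num: 1 },
    },
  };
}

const op = buildAddCommentUpdate("a1", "bob", "nice post", new Date(0));
console.log(JSON.stringify(op.update));
// In the shell this would be applied along the lines of:
//   db.articles.updateOne(op.filter, op.update)
```

Because both changes land in the same document, the comment and its counter are updated together, which is part of what makes this design convenient while comment counts stay modest.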