MongoDB aggregation: duplicate data when paging sorted results?

  mongodb, question

I use aggregate with $sort, $skip, and $limit.
When many documents have the same value for the $sort field, the pages returned by the paged query contain a lot of repeated documents.


If I paginate the results with find instead of aggregate, there is no problem.
How can I avoid duplicated data when paging with aggregation?

Think of it this way: the sort criteria you gave are not sufficient. The database did sort by {count: -1} as you asked, but because many documents have the same count, placing any of them before or after the others does not violate your requirement, which is only descending order of count. From the database's point of view, since you stated no further requirement, it returns the results in whatever way is most efficient. That means ignoring any order other than count, because that uses the fewest resources.
So what is most efficient? This touches on the database internals. On a single machine, if no index supports the sort, the database scans all the data and then sorts it in memory. To save resources, that sort only goes as far as satisfying descending count; for every other field it is effectively first come, first served. As a result, the order among documents with the same count is unspecified and can differ from one query to the next. It may be affected by:

  1. The order of the data on disk, because it determines which record the database traverses first. Note that the on-disk order can change when data is updated.
  2. In theory, the sorting algorithm the database uses. I have not looked into which algorithm is used here, so I cannot give any further detail.
  3. In a sharded cluster, which shard returns its data first. A sort in a sharded environment is performed on each shard first, and the per-shard results are then merge-sorted.
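
To make the effect concrete, here is a small in-memory sketch in plain JavaScript. It is not MongoDB code: the documents, the two "scan orders", and the helper names are all made up for illustration, and it only assumes that ties on count come out in whatever order the scan delivered them.

```javascript
// Sort that only honours `count`. Array.prototype.sort is stable (ES2019+),
// so documents with equal count keep the order in which they were scanned.
function sortByCountDesc(docs) {
  return [...docs].sort((a, b) => b.count - a.count);
}

// Equivalent of $skip + $limit on an already sorted array.
function page(sorted, pageNo, pageSize) {
  const skip = (pageNo - 1) * pageSize;
  return sorted.slice(skip, skip + pageSize);
}

// Five hypothetical documents that all share the same count.
const docs = [1, 2, 3, 4, 5].map((id) => ({ _id: id, count: 10 }));

// The first query scans the documents in one order ...
const scanA = docs;
// ... and the second query happens to scan them in another order.
const scanB = [...docs].reverse();

const page1 = page(sortByCountDesc(scanA), 1, 3).map((d) => d._id);
const page2 = page(sortByCountDesc(scanB), 2, 3).map((d) => d._id);

// _ids that show up on both pages.
const duplicates = page1.filter((id) => page2.includes(id));
console.log(page1, page2, duplicates);
```

Both queries are valid answers to "descending count", yet documents 1 and 2 appear on both pages while 4 and 5 never appear at all.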

That is all I can think of for now. In short: what you do not specify, the database does not guarantee.
The solution is then clear: specify a sort that fully determines the order, for example:

{$sort: {count: -1, _id: 1}}

Be aware, however, that this makes the database do extra work to honor the second sort key. In real use, judge for yourself whether that is actually worthwhile in your situation.
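
As a rough sketch, a paged pipeline with the tie-breaker could be assembled as follows. The function names, the page numbering starting at 1, and the collection name `items` are assumptions for illustration; the in-memory `runPage` helper only imitates the ordering so the idea can be checked without a database.

```javascript
// Build the stages for one page. `_id` is unique, so it breaks every tie on `count`.
function buildPagePipeline(pageNo, pageSize) {
  return [
    { $sort: { count: -1, _id: 1 } },
    { $skip: (pageNo - 1) * pageSize },
    { $limit: pageSize },
  ];
}

// In-memory stand-in for the same ordering (count descending, then _id ascending).
function runPage(docs, pageNo, pageSize) {
  const sorted = [...docs].sort((a, b) => b.count - a.count || a._id - b._id);
  const skip = (pageNo - 1) * pageSize;
  return sorted.slice(skip, skip + pageSize).map((d) => d._id);
}

// Hypothetical documents with identical count, scanned in two different orders.
const docs = [1, 2, 3, 4, 5].map((id) => ({ _id: id, count: 10 }));
const reversed = [...docs].reverse();

const pipeline = buildPagePipeline(2, 3);
const firstPage = runPage(docs, 1, 3);      // one scan order
const secondPage = runPage(reversed, 2, 3); // a different scan order
console.log(JSON.stringify(pipeline), firstPage, secondPage);

// In the mongo shell the stages would be passed as:
//   db.items.aggregate(pipeline)   // `items` is a hypothetical collection name
```

Because the compound sort leaves no ties, the pages no longer depend on the scan order and do not overlap.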

Last but not least, when asking questions in the future, please post the relevant code and results as text rather than screenshots wherever possible. Screenshots are convenient for you, but they make it quite troublesome for others to reuse your code or data for experiments when answering. People with less patience may simply skip your question, which does not help you find an answer.