Does the performance of `upsert` keep getting worse as the number of documents increases?
What about `insert`? It should not have this problem, right?
My workload is write-heavy and read-light. How can I get the best write speed, and keep it from deteriorating as more and more data accumulates in the collection?
Comparing the two, `insert` is certainly the better performer, because an `upsert` has to `find` whether the document already exists before it does the `insert`, and that is an extra cost. In theory, though, there is no essential difference, because a B-tree lookup has a time complexity of O(log2(N)), so the time taken does not increase significantly as the data volume grows.
If the `upsert` condition cannot hit an index, the time complexity is O(N). This is the 45-degree line in the figure above: the time taken is proportional to the amount of data.
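To make the O(log2(N)) versus O(N) difference concrete, here is a small sketch that only counts lookup steps. It does not touch MongoDB: a binary search over a sorted list stands in for descending a B-tree index, and a plain loop stands in for a collection scan. The function names are made up for this illustration.

```python
def indexed_lookup_steps(sorted_keys, target):
    """Binary search (a stand-in for a B-tree descent).

    Returns (found, number_of_comparisons).
    """
    lo, hi, steps = 0, len(sorted_keys), 0
    while lo < hi:
        mid = (lo + hi) // 2
        steps += 1
        if sorted_keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(sorted_keys) and sorted_keys[lo] == target
    return found, steps


def scan_lookup_steps(keys, target):
    """Linear scan (a stand-in for an unindexed condition).

    Returns (found, number_of_keys_examined).
    """
    steps = 0
    for key in keys:
        steps += 1
        if key == target:
            return True, steps
    return False, steps


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000, 1_000_000):
        keys = list(range(n))
        missing = n  # not present, so an upsert would end up inserting
        _, indexed = indexed_lookup_steps(keys, missing)
        _, scanned = scan_lookup_steps(keys, missing)
        print(f"N={n:>9}  indexed steps={indexed:>3}  scanned keys={scanned:>9}")
```

The indexed step count grows by only a handful each time N is multiplied by 10, while the scanned count grows by the same factor of 10 as N.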
So the conclusion is:
If the `upsert` condition can hit an index, the difference from `insert` will in theory not be large, and there will be no obvious trend as the data volume grows.
If the `upsert` condition cannot hit an index, the time taken will increase in proportion to the amount of data.
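As a rough sketch of the first case, this is one way to make sure the `upsert` condition hits an index when using PyMongo. The field name `user_id` and both helper names are hypothetical; `coll` is assumed to be a `pymongo` collection (or anything exposing the same `create_index` and `update_one` methods).

```python
def ensure_upsert_index(coll, key_field="user_id"):
    """Create an ascending index on the field used in the upsert condition.

    `key_field` is a hypothetical name; use whatever your filter queries on.
    """
    return coll.create_index([(key_field, 1)])


def upsert_by_key(coll, key_value, fields, key_field="user_id"):
    """Update the matching document, or insert one if nothing matches.

    Returns True if a new document was inserted, False if one was updated.
    """
    result = coll.update_one(
        {key_field: key_value},  # condition: should be covered by the index
        {"$set": fields},
        upsert=True,
    )
    return result.upserted_id is not None


# Example wiring (assumes a local server and made-up database/collection names):
#   from pymongo import MongoClient
#   coll = MongoClient("mongodb://localhost:27017")["demo"]["users"]
#   ensure_upsert_index(coll)
#   upsert_by_key(coll, 42, {"name": "alice"})
```

Without the index on the condition field, the same `update_one` call still works, but it falls into the second case and has to scan the collection to look for a match.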
That is the theory. In practice, you have to make sure there is enough memory to hold the indexes; otherwise swapping with disk will greatly slow down index lookups. The more swapping there is, the worse the performance gets, and the O(log2(N)) curve breaks down.
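One rough way to check whether the indexes still fit is to compare the collection's total index size with the configured cache size. The sketch below assumes the WiredTiger storage engine and takes the two stats documents as plain dictionaries. The helper name and the `headroom` factor of 0.8 are arbitrary choices for illustration, not a recommendation from MongoDB.

```python
def indexes_fit_in_cache(coll_stats, server_status, headroom=0.8):
    """Return True if the total index size is within `headroom` of the cache.

    coll_stats:    result of the collStats command for one collection
    server_status: result of the serverStatus command (WiredTiger assumed)
    headroom:      hypothetical safety factor; the cache also holds data,
                   so the indexes should not be allowed to fill it entirely
    """
    index_bytes = coll_stats["totalIndexSize"]
    cache_bytes = server_status["wiredTiger"]["cache"]["maximum bytes configured"]
    return index_bytes <= cache_bytes * headroom


# Example wiring with PyMongo (database and collection names are made up):
#   db = MongoClient("mongodb://localhost:27017")["demo"]
#   ok = indexes_fit_in_cache(db.command("collStats", "users"),
#                             db.command("serverStatus"))
```

This is only a coarse signal, since the cache is shared between indexes and documents, but a `False` result is a hint that lookups may start going to disk.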