I am new to MongoDB and have read the introduction to embedded, denormalized data models on the official website, but I am still confused when I face a real example. (The examples on the official website only have two or three collections, which feels much simpler than a real application.)
Each season, a league has many teams and each team has many players. A player's attributes include position on the court and jersey number within the team (players may transfer, but belong to only one team at any given time). The league has regular-season matches and playoff matches, and each match has a home team and an away team. We need to record the points/rebounds/assists/blocks/steals of each player on the court and the timestamp of each event. Each match also has several news articles.
Typical application scenarios:
Points scored and win/loss results of the home/away team in its latest 10 matches
Points ranking of all teams
Top 10 teams this season by average points/rebounds/assists/blocks/steals
Top 10 players this season by average points/rebounds/assists/blocks/steals
For a given match, the home and away teams' points, rebounds, assists and shooting percentage
For a given match, each home and away player's points, rebounds, assists, shooting percentage, etc.
Ranking of all players in a given season
Per-season averages for each player (a player who has transferred may have played for different teams)
Could you give a complete data model design? Thank you very much!
Some questions I don't understand:
Should players be embedded in the team? How should transfers be handled?
Should the statistics of each match be embedded in the match document? If so, how are players linked to each scoring event?
Where does denormalization come in? It seems that however the model is designed, the entire database must be scanned when summarizing the historical data of a team or player and when computing rankings. Is that reasonable?
Where can indexing significantly improve performance?
Is this application scenario suitable for NoSQL or not?
The main principle is to design according to your access patterns. Fortunately, we already have the typical application scenarios.
A team-to-players one-to-many relationship can be embedded or kept separate, depending mainly on whether the two kinds of data are usually read together. An example of data read together: questions and answers on SegmentFault are also one-to-many, but they are almost always displayed together. If that is not the case here, players can live in their own collection, with basic information such as each player's name, _id and effective date copied into the team document. The cost of this denormalization is one more place to update whenever that information changes. However, since such updates (for example, transfers) are rare and reads are frequent, it is still worthwhile.
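To make the separated-but-denormalized option concrete, here is a minimal sketch using plain Python dicts in place of real collections. The field names (roster, since, jersey) and the transfer helper are my own assumptions for illustration, not a fixed schema.

```python
from datetime import date

# Hypothetical documents, written as plain dicts in the shape they might
# take in a "players" collection and a "teams" collection.
players = {
    "p1": {"_id": "p1", "name": "Player One", "position": "PG",
           "team_id": "t1", "jersey": 7},
}
teams = {
    "t1": {"_id": "t1", "name": "Team A",
           "roster": [{"player_id": "p1", "name": "Player One",
                       "jersey": 7, "since": date(2023, 10, 1)}]},
    "t2": {"_id": "t2", "name": "Team B", "roster": []},
}


def transfer(player_id, to_team_id, jersey, when):
    """Move a player and keep the denormalized roster copies in sync."""
    player = players[player_id]
    old_team = teams[player["team_id"]]
    # Place 1: remove the summary copy from the old team's roster.
    old_team["roster"] = [r for r in old_team["roster"]
                          if r["player_id"] != player_id]
    # Place 2: add a summary copy to the new team's roster.
    teams[to_team_id]["roster"].append(
        {"player_id": player_id, "name": player["name"],
         "jersey": jersey, "since": when})
    # Place 3: update the authoritative player document.
    player["team_id"] = to_team_id
    player["jersey"] = jersey


transfer("p1", "t2", 11, date(2024, 2, 1))
```

Showing a team page needs only the team document, while a transfer touches three places. That is the extra write cost being traded for cheaper reads.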
Statistics. If a player's historical statistics are recomputed from scratch, the entire data set has to be traversed no matter which database is used, right? This is another trade-off between reads and writes. The data is clearly updated far less often than it is queried, so caching the statistics makes sense. Moreover, some statistics never change, such as a team's figures for a given (finished) season. The cache can be updated in real time, and it can hold intermediate results that simplify the calculation; for example, an average can be derived from a stored total and count. Alternatively, the statistics can be refreshed once a day or after each game.
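As a rough sketch of caching intermediate results, the snippet below keeps a running total and a match count per player per season and derives the average on read. The season_stats structure and the function names are assumptions made up for this example; in MongoDB it could be one small document per player per season.

```python
# Hypothetical cache keyed by (season, player_id).
season_stats = {}


def record_match(season, player_id, points, rebounds):
    """Fold one finished match into the cached running totals."""
    s = season_stats.setdefault(
        (season, player_id), {"matches": 0, "points": 0, "rebounds": 0})
    s["matches"] += 1
    s["points"] += points
    s["rebounds"] += rebounds


def average(season, player_id, field):
    """Derive an average from the cached total and match count."""
    s = season_stats.get((season, player_id))
    if not s or s["matches"] == 0:
        return 0.0
    return s[field] / s["matches"]


record_match("2023-24", "p1", 20, 5)
record_match("2023-24", "p1", 30, 7)
record_match("2023-24", "p2", 12, 10)

# Ranking reads only the small cache, not every historical event.
player_ids = [pid for (season, pid) in season_stats if season == "2023-24"]
ranking = sorted(player_ids,
                 key=lambda pid: average("2023-24", pid, "points"),
                 reverse=True)
```

Each write does a little extra work so that the ranking and per-player averages never need to traverse the raw event history.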
The per-match statistics relate to both the match and the players, but are they usually read together with either of them? If not, and they only serve statistics, putting them in a separate collection works well too.
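A separate events collection could look roughly like the sketch below, again with plain Python dicts standing in for documents. Every field name here is an assumption for illustration, and ts is taken to be seconds elapsed in the match.

```python
from collections import Counter

# Hypothetical "events" collection: one small document per on-court event,
# referencing the match and the player by id instead of being embedded.
events = [
    {"match_id": "m1", "team_id": "t1", "player_id": "p1",
     "type": "points", "value": 2, "ts": 35},
    {"match_id": "m1", "team_id": "t1", "player_id": "p1",
     "type": "points", "value": 3, "ts": 120},
    {"match_id": "m1", "team_id": "t2", "player_id": "p2",
     "type": "rebounds", "value": 1, "ts": 140},
    {"match_id": "m2", "team_id": "t1", "player_id": "p1",
     "type": "points", "value": 2, "ts": 10},
]


def box_score(match_id):
    """Sum the events of one match into per-player totals."""
    totals = {}
    for e in events:
        if e["match_id"] != match_id:
            continue
        per_player = totals.setdefault(e["player_id"], Counter())
        per_player[e["type"]] += e["value"]
    return totals


score = box_score("m1")
```

Because every event carries both match_id and player_id, the same collection can answer per-match and per-player questions without embedding events in either document.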
Indexes help you find data faster, but they cannot shrink the result set you ask for. Once the data model is designed, choosing the indexes follows naturally.
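The point about indexes can be illustrated with a toy in-memory stand-in. The dict below plays the role of a compound index on (season, player_id); it is only an analogy for the idea, not how MongoDB implements indexes, and all names are assumptions.

```python
from collections import defaultdict

# Hypothetical per-match stat lines, one per player per match.
stat_lines = [
    {"season": "2023-24", "player_id": "p1", "match_id": "m1", "points": 20},
    {"season": "2023-24", "player_id": "p1", "match_id": "m2", "points": 30},
    {"season": "2023-24", "player_id": "p2", "match_id": "m1", "points": 12},
    {"season": "2022-23", "player_id": "p1", "match_id": "m0", "points": 18},
]


def scan(season, player_id):
    """Without an index: look at every document."""
    return [d for d in stat_lines
            if d["season"] == season and d["player_id"] == player_id]


# Toy "index": maps (season, player_id) to the matching documents.
index = defaultdict(list)
for d in stat_lines:
    index[(d["season"], d["player_id"])].append(d)


def lookup(season, player_id):
    """With the index: jump straight to the matching documents."""
    return index.get((season, player_id), [])


scanned = scan("2023-24", "p1")
found = lookup("2023-24", "p1")
```

Both calls return the same two documents. The index only changes how quickly they are located, so a query that legitimately needs a whole season of data still has to read a whole season of data.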