First, the picture. Explanation: The figure above shows a tree-shaped data structure. The requirement is to collect performance data from 50,000 machines every 5 minutes and keep it for two years (data may be deleted only after two years, so 730 days are retained). Each machine has an IP address (50,000 IP addresses for 50,000 machines) and multiple ports (a switch, for example, has eth0, eth1, …). Each port reports 16 performance metrics (packets received, packets sent, erroneous packets, and so on; these are the keys in the figure), and each key has 288 values per day (one value every 5 minutes).
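To get a feel for the scale, the numbers above can be multiplied out. This is only a rough sketch: the post does not say how many ports a machine has, so `ports_per_machine` is an assumed parameter, and the function and constant names are made up for illustration.

```python
# Figures taken from the description above.
MACHINES = 50_000
KEYS_PER_PORT = 16
SAMPLE_INTERVAL_MIN = 5
RETENTION_DAYS = 730


def samples_per_key_per_day(interval_min=SAMPLE_INTERVAL_MIN):
    """Number of values one key receives per day at the given interval."""
    return 24 * 60 // interval_min


def total_values(ports_per_machine):
    """Values retained over two years (ports_per_machine is an assumption)."""
    per_port_per_day = KEYS_PER_PORT * samples_per_key_per_day()
    return MACHINES * ports_per_machine * per_port_per_day * RETENTION_DAYS


print(samples_per_key_per_day())  # 288, matching the figure in the post
print(total_values(1))            # lower bound: one port per machine
```

Even with a single port per machine this comes to roughly 168 billion values over the retention period, so the per-value storage overhead matters a great deal when choosing the format.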
My initial idea is to store the data in a NoSQL database, with each date (the first layer of the tree) as a collection and each IP address (the second layer) as a document. With MongoDB, the ports and everything beneath them (the third layer and below) would be subdocuments. When a collected data point arrives, it is matched against the tree level by level: if the branch already exists, the value is inserted under it; if not, the missing branch is created along the way.
For example, when this data point arrives: 2013-5-8.192.168.10.1.eth0.key1.value4, value4 is inserted below value3.
Another data point: 2013-5-8.192.168.10.1.eth1.key1.value1. Here a new eth1 branch is created just below eth0, and the value is added under it.
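The insert-or-create-branch behaviour described in these two examples can be sketched with plain nested dictionaries. This is an in-memory illustration only, not the MongoDB layout itself; `insert_sample` is a hypothetical helper name, and the starting tree is assumed to already hold value1 to value3 under eth0.

```python
def insert_sample(tree, date, ip, port, key, value):
    """Walk date -> ip -> port -> key, creating any missing branch,
    then append the value to that key's list."""
    values = (
        tree.setdefault(date, {})
        .setdefault(ip, {})
        .setdefault(port, {})
        .setdefault(key, [])
    )
    values.append(value)
    return tree


# Assumed starting state: eth0/key1 already holds three values.
tree = {
    "2013-5-8": {
        "192.168.10.1": {"eth0": {"key1": ["value1", "value2", "value3"]}}
    }
}

# Existing branch: value4 lands after value3.
insert_sample(tree, "2013-5-8", "192.168.10.1", "eth0", "key1", "value4")
# Missing branch: eth1 is created next to eth0, then the value is added.
insert_sample(tree, "2013-5-8", "192.168.10.1", "eth1", "key1", "value1")
```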
However, inserting into a MongoDB subdocument this way means the target subdocument has to be queried first, so every insert involves some redundant work.
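One possible way to avoid the separate lookup, offered only as a sketch, is an upsert with `$push` on a dotted field path: MongoDB creates the document, the embedded subdocuments, and the array if they do not exist yet. The helper below just builds the filter and update documents; `build_upsert`, the `ports` field name, and the use of the IP as `_id` are assumptions for illustration, and it assumes port and key names contain no dots.

```python
def build_upsert(ip, port, key, value):
    """Build (filter, update) documents for one collected value.

    Assumed layout: one collection per date, one document per IP
    (stored as _id), with ports and keys as embedded subdocuments.
    """
    flt = {"_id": ip}
    # Dotted path, e.g. "ports.eth1.key1"; $push appends to the array
    # and creates the missing path when the field is absent.
    update = {"$push": {f"ports.{port}.{key}": value}}
    return flt, update


flt, update = build_upsert("192.168.10.1", "eth1", "key1", 1234)
print(flt)
print(update)
```

With pymongo, these two documents would be passed to something like `db["2013-5-8"].update_one(flt, update, upsert=True)`, so the insert is a single round trip. Whether this layout performs well at the data volume described is a separate question.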
Given these requirements, how should I choose a suitable database and design the storage format?
I hope everyone can discuss this and share ideas.
What are your query requirements?