<aside>
➡️ Building a Real Time Metrics Database at Datadog
</aside>
https://leetcode.com/discuss/interview-question/system-design/287678/design-a-monitoring-or-analytics-service-like-datadog-or-signalfx
1. Problem statement
Use cases
- User submits count and gauge metrics to the system
- User creates monitors and alerts
- System sends triggered monitors and alerts to users
- User views metrics and graphs in UI
- User can tag metrics
Out of scope
Constraints and assumptions
- Per-customer ballpark numbers
- $10^4$ apps (1,000s of hosts * 10s containers)
- $10^3$ metrics emitted from each app/container
- $10^0$ (1) point a second per metric
- $10^5$ secs per day (actually 86,400)
- $10^1$ bytes/point (8 byte float, amortized tags)
- $10^4 * 10^3 * 10^0 * 10^5 * 10^1 = 10^{13}$ bytes = 10 TB a day for one customer
- Number of customers = $10^4 = 10,000$
- $10^{13} * 10^4 = 10^{17}$ bytes = 100 PB a day
- 1000 queries per customer per day
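The ballpark numbers above can be checked with a few lines of arithmetic. This is a minimal sketch that uses the exact 86,400 seconds per day instead of the rounded $10^5$, so the results come out slightly below the 10 TB and 100 PB estimates. All variable names are illustrative, and decimal units (1 TB = $10^{12}$ bytes) are assumed.

```python
# Back-of-envelope capacity estimate using the figures from the list above.
APPS_PER_CUSTOMER = 10**4    # 1,000s of hosts * 10s of containers
METRICS_PER_APP = 10**3
POINTS_PER_SEC = 1           # per metric
SECS_PER_DAY = 86_400        # the notes round this up to 10^5
BYTES_PER_POINT = 10         # 8-byte float plus amortized tags
CUSTOMERS = 10**4
QUERIES_PER_CUSTOMER_PER_DAY = 1000

# Write volume
points_per_sec_per_customer = APPS_PER_CUSTOMER * METRICS_PER_APP * POINTS_PER_SEC
bytes_per_day_per_customer = points_per_sec_per_customer * SECS_PER_DAY * BYTES_PER_POINT
bytes_per_day_total = bytes_per_day_per_customer * CUSTOMERS

# Read volume
queries_per_day_total = QUERIES_PER_CUSTOMER_PER_DAY * CUSTOMERS
queries_per_sec_total = queries_per_day_total / SECS_PER_DAY

print(f"per customer: {bytes_per_day_per_customer / 1e12:.2f} TB/day")
print(f"all customers: {bytes_per_day_total / 1e15:.1f} PB/day")
print(f"queries: {queries_per_sec_total:.0f} per second on average")
```

With the exact day length this works out to roughly 8.64 TB per customer per day and 86.4 PB per day overall, which matches the order-of-magnitude estimates. The read side is tiny by comparison: $10^7$ queries a day is only on the order of 100 queries per second on average.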
2. Clarifying questions
todo
3. Functional requirements
- User
- Write stat → POST /api/stats:write
- Describe tags
- What tags are queryable for this metric?
- Tag index
- Given a time series ID, what tags were used?
- Tag inverted index
- Given some tags and a time range, what were the time series ingested?
- Point store
- What are the values of a time series between two times?
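To make the four lookups above concrete, here is a minimal in-memory sketch of how the tag index, tag inverted index, and point store could fit together. It is an illustration under stated assumptions, not how Datadog implements it: the class `MetricsStore` and all of its method names are hypothetical, series IDs are simple counters, tags are exact-match key/value pairs, and time ranges are half-open `[start, end)`.

```python
from bisect import bisect_left, insort
from collections import defaultdict


class MetricsStore:
    """Toy single-process model of the four lookups (hypothetical API)."""

    def __init__(self):
        self._series_ids = {}                  # (metric, frozenset(tags)) -> series ID
        self._metric_tags = defaultdict(set)   # describe tags: metric -> tag keys
        self._tag_index = {}                   # tag index: series ID -> tags
        self._inverted = defaultdict(set)      # inverted index: (metric, key, value) -> series IDs
        self._points = defaultdict(list)       # point store: series ID -> sorted [(ts, value)]

    def write(self, metric, tags, ts, value):
        """Ingest one point and return the ID of the series it belongs to."""
        key = (metric, frozenset(tags.items()))
        sid = self._series_ids.get(key)
        if sid is None:
            # First point for this metric + tag set: register it in every index.
            sid = len(self._series_ids)
            self._series_ids[key] = sid
            self._tag_index[sid] = dict(tags)
            for k, v in tags.items():
                self._metric_tags[metric].add(k)
                self._inverted[(metric, k, v)].add(sid)
        insort(self._points[sid], (ts, value))  # keep points ordered by timestamp
        return sid

    def describe_tags(self, metric):
        """What tags are queryable for this metric?"""
        return set(self._metric_tags.get(metric, set()))

    def tags_for(self, series_id):
        """Given a time series ID, what tags were used?"""
        return dict(self._tag_index[series_id])

    def read_points(self, series_id, start, end):
        """What are the values of a time series in [start, end)?"""
        pts = self._points.get(series_id, [])
        # (start,) sorts before any (start, value), so this finds the first ts >= start.
        lo = bisect_left(pts, (start,))
        hi = bisect_left(pts, (end,))
        return pts[lo:hi]

    def find_series(self, metric, tags, start, end):
        """Given some tags and a time range, which time series were ingested?"""
        postings = [self._inverted.get((metric, k, v), set()) for k, v in tags.items()]
        if not postings:
            return set()
        candidates = set.intersection(*postings)
        return {sid for sid in candidates if self.read_points(sid, start, end)}
```

A query from the UI would typically chain these: `find_series` resolves the tag filter to series IDs, then `read_points` fetches each series. Keeping the indexes separate from the point store reflects that they have very different sizes and access patterns, so a real system would likely back each one with a different storage engine.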