It's actually an interview question I've been thinking about for 2 months, and I can't find a suitable architecture.
The problem
We want to build a small analytics system for fraud detection on orders.
The system has the following requirements
MySql, Redis, Hadoop, S3 etc)
The system needs to provide the following API
/insertOrder(order): Order
orderId, beginTime, and finishTime as distinguished fields
/getLongestNOrdersByDuration(n: int, startTime: datetime, endTime: datetime): Order[]
startTime and endTime, finishTime - beginTime
/getShortestNOrdersByDuration(n: int, startTime: datetime, endTime: datetime): Order[]
startTime and endTime, finishTime - beginTime
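To pin down what the three calls are expected to return, here is a minimal in-memory sketch in Python. It is not a proposed architecture, only a reference for the semantics. The `OrderStore` class and the `duration` property are hypothetical names, and the sketch assumes that an order falls inside the query window when its beginTime lies between startTime and endTime (inclusive); the question does not say which timestamp the window applies to.

```python
import heapq
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass(frozen=True)
class Order:
    orderId: str
    beginTime: datetime
    finishTime: datetime

    @property
    def duration(self) -> timedelta:
        # Duration as used for ranking: finishTime - beginTime.
        return self.finishTime - self.beginTime


class OrderStore:
    """Hypothetical in-memory stand-in for the real storage layer."""

    def __init__(self) -> None:
        self._orders: List[Order] = []

    def insertOrder(self, order: Order) -> Order:
        self._orders.append(order)
        return order

    def _in_window(self, startTime: datetime, endTime: datetime) -> List[Order]:
        # ASSUMPTION: the window filters on beginTime, bounds inclusive.
        return [o for o in self._orders if startTime <= o.beginTime <= endTime]

    def getLongestNOrdersByDuration(
        self, n: int, startTime: datetime, endTime: datetime
    ) -> List[Order]:
        # nlargest returns at most n items, longest duration first.
        candidates = self._in_window(startTime, endTime)
        return heapq.nlargest(n, candidates, key=lambda o: o.duration)

    def getShortestNOrdersByDuration(
        self, n: int, startTime: datetime, endTime: datetime
    ) -> List[Order]:
        # nsmallest returns at most n items, shortest duration first.
        candidates = self._in_window(startTime, endTime)
        return heapq.nsmallest(n, candidates, key=lambda o: o.duration)
```

Every query here scans all stored orders, so it only works for small data sets. The hard part of the question is getting the same results without a full scan per request.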
Look at using the Druid database if you have time series data.
https://druid.apache.org/ - This has been used as an analytics database at scale in Fortune 500 companies.