This is actually an interview question that I have been thinking about for 2 months, and I can't find a suitable architecture.
The problem
We want to build a small analytics system for fraud detection on orders.
The system has the following requirements:
MySql, Redis, Hadoop, S3 etc)
The system needs to provide the following API:
/insertOrder(order): Order
An order has orderId, beginTime, and finishTime as distinguished fields.

/getLongestNOrdersByDuration(n: int, startTime: datetime, endTime: datetime): Order[]
Returns the n longest orders between startTime and endTime, where duration is finishTime - beginTime.

/getShortestNOrdersByDuration(n: int, startTime: datetime, endTime: datetime): Order[]
Returns the n shortest orders between startTime and endTime, where duration is finishTime - beginTime.

Look at using the Druid database if you have time series data.
https://druid.apache.org/ - This has been used as an analytics database at scale in Fortune 500 companies.
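
To pin down what the three API calls are expected to return before choosing a store, here is a minimal in-memory sketch in Python. It is not a Druid client and not a scalable design; it only illustrates the semantics. The class and method names (`OrderStore`, `insert_order`, and so on) are hypothetical, and the choice to filter the time window on `begin_time` is an assumption, since the question does not say which timestamp the window applies to.

```python
import heapq
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Order:
    # Field names mirror orderId, beginTime, finishTime from the question.
    order_id: str
    begin_time: datetime
    finish_time: datetime

    @property
    def duration(self) -> timedelta:
        # Duration as defined in the question: finishTime - beginTime.
        return self.finish_time - self.begin_time


class OrderStore:
    """Hypothetical in-memory stand-in for the real storage layer."""

    def __init__(self) -> None:
        self._orders: list[Order] = []

    def insert_order(self, order: Order) -> Order:
        self._orders.append(order)
        return order

    def _in_window(self, start_time: datetime, end_time: datetime) -> list[Order]:
        # Assumption: an order belongs to the window if it began inside it.
        return [o for o in self._orders if start_time <= o.begin_time <= end_time]

    def get_longest_n_orders_by_duration(
        self, n: int, start_time: datetime, end_time: datetime
    ) -> list[Order]:
        # nlargest keeps only n candidates in a heap while scanning the window.
        window = self._in_window(start_time, end_time)
        return heapq.nlargest(n, window, key=lambda o: o.duration)

    def get_shortest_n_orders_by_duration(
        self, n: int, start_time: datetime, end_time: datetime
    ) -> list[Order]:
        window = self._in_window(start_time, end_time)
        return heapq.nsmallest(n, window, key=lambda o: o.duration)
```

Each query here is a full scan plus a bounded heap, which is fine for a sketch. In a time series store the same shape would be a time-range filter followed by an order-by on the computed duration with a limit of n.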