I have a .NET Core service AAA that retrieves a bit of data from another Core service BBB. BBB has an in-memory cache (ConcurrentDictionary) and is deployed to 10 boxes. The total size of the data to be cached is around 100GB.
AAA will have a list of the servers that run BBB, and I was thinking of doing something along the lines of ServerId = DataItemId % 10, so that each of the boxes gets to serve and cache 10% of the total dataset. What I can't figure out is what to do when one of the BBB boxes goes down (e.g. due to Windows Update).
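For concreteness, the modulo scheme I have in mind looks roughly like this (a minimal C# sketch; the names are made up):

```csharp
using System.Collections.Generic;

class ShardRouter
{
    private readonly IReadOnlyList<string> _servers; // the 10 BBB boxes

    public ShardRouter(IReadOnlyList<string> servers) => _servers = servers;

    // ServerId = DataItemId % 10: each box owns a fixed 10% slice of the data.
    public string ServerFor(long dataItemId) =>
        _servers[(int)(dataItemId % _servers.Count)];
}
// Caveat: if one box is removed from the list, the divisor changes from 10
// to 9 and most DataItemIds remap to a different server, so most of the
// cached data on the remaining boxes no longer matches incoming requests.
```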
Is there some algorithm to split the traffic that will allow servers to go down and come back up, but still redirect most requests to the server that has the relevant data cached?
Azure Load Balancer does not interact with the application payload. It makes decisions based on a hashing function that includes the 5-tuple of the TCP/UDP transport IP packet. There is a difference between Basic and Standard LB here in that Standard LB uses an improved hashing function. There's no strict guarantee of an even share of requests, but the number of flows arriving over time should be relatively even. A health probe can be used to detect whether a backend instance is healthy or unhealthy; this controls whether new flows arrive on a backend instance. https://aka.ms/lbprobes has details.
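Conceptually, that distribution works something like the sketch below (illustrative only; the real hashing function is internal to the platform and differs from this):

```csharp
using System;

static class LbHashSketch
{
    // Hash the flow's 5-tuple to pick among the currently healthy backends.
    public static int PickBackend(
        string srcIp, int srcPort, string dstIp, int dstPort, string protocol,
        int healthyBackendCount)
    {
        int hash = HashCode.Combine(srcIp, srcPort, dstIp, dstPort, protocol);
        return (hash & 0x7FFFFFFF) % healthyBackendCount;
    }
}
// Note: the hash only sees transport-level fields, never DataItemId, so the
// load balancer cannot route a request to the box that has that item cached.
```

That last point is why a plain load balancer won't give you data affinity: for that you'd need AAA itself to pick the target server, e.g. with a consistent-hashing scheme.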