I am building a federated learning model using TensorFlow Federated. Based on what I have read in the tutorials and papers, I understand that the state-of-the-art method (FedAvg) works by selecting a random subset of clients at each round.
My concern is:
Thanks in advance
This is certainly a valid application of FedAvg and the variants proposed in the linked paper, though one that is only studied empirically in a subset of the literature. On the other hand, many theoretical analyses of FedAvg assume a situation similar to the one you're describing; at the bottom of page 4 of that linked paper, you will see that the analysis is performed in this so-called 'full participation' regime, where every client participates in every round.
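To make the contrast between the two regimes concrete, here is a minimal sketch in plain Python (not the TensorFlow Federated API). The helper `select_clients` and the client names are hypothetical, introduced only for illustration; the point is that full participation is just the special case where the per-round sample is the whole population.

```python
import random

def select_clients(client_ids, clients_per_round, rng):
    """Return the clients that take part in one round (hypothetical helper)."""
    if clients_per_round >= len(client_ids):
        # Full participation: every client trains in every round.
        return list(client_ids)
    # Partial participation: uniform random subset without replacement.
    return rng.sample(client_ids, clients_per_round)

client_ids = [f"client_{i}" for i in range(5)]  # assumed small, fixed population
rng = random.Random(0)

partial_round = select_clients(client_ids, 2, rng)
full_round = select_clients(client_ids, len(client_ids), rng)
```

With a small, fixed set of clients, passing `len(client_ids)` as the sample size gives the 'full participation' behaviour in every round.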
Often the setting you describe is called 'cross-silo'; see, e.g., section 7.5 of Advances and Open Problems in Federated Learning, which also contains many useful pointers to the cross-silo literature.
Finally, depending on the application, consider that it may be more natural to literally train on all clients, reserving portions of each client's data for validation and test. Questions around natural partitions of data to model the 'setting we care about' are often thorny in the federated setting.
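One way to reserve a portion of each client's data is to split within every client rather than across clients. The sketch below is plain Python under assumed names: `split_client_data`, the 10% fractions, and the toy `client_data` dictionary are all hypothetical and not taken from any library.

```python
import random

def split_client_data(examples, val_fraction=0.1, test_fraction=0.1, seed=0):
    """Split one client's examples into train/val/test (hypothetical helper)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed for a reproducible split
    n_val = int(len(shuffled) * val_fraction)
    n_test = int(len(shuffled) * test_fraction)
    return {
        "val": shuffled[:n_val],
        "test": shuffled[n_val:n_val + n_test],
        "train": shuffled[n_val + n_test:],
    }

# Toy data: every client keeps its own held-out validation and test examples.
client_data = {
    "client_0": list(range(20)),
    "client_1": list(range(100, 130)),
}
splits = {cid: split_client_data(ex) for cid, ex in client_data.items()}
```

Every client then contributes to training in each round, while evaluation uses the held-out `val` and `test` portions from those same clients.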