I'm using Windows Server 2008, and my program is in C++. I'm using WinSock2 and sendto() in a while(true) loop to send my packets.
Code like so:
while (true)
{
    if (c == snd->max)
        c = snd->min;                 // wrap back to the start of this thread's range
    dest.sin_addr.S_un.S_addr = hosts[c];
    iphead->destaddr = hosts[c];      // raw IP header rewritten per destination
    sendto(s, castpacket, pktsz, 0, castdest, szsad);
    ++c;
}
I need to send as much data to as many IPs in my hosts std::vector as possible, as quickly as possible.
I'm currently running on an i7 930 server, and I can only achieve 350Mbps or so.
I currently split my program into 4 threads, all running the while loop with different servers assigned to each thread. Adding more threads or running more copies of the program results in lower throughput.
I have another program running that listens for replies from the servers. I get the servers from a master list and add them to my vector. The problem at the moment is that it takes too long to cycle through all of them, and I want to check them regularly.
How exactly can I optimize my program/loop/sending here?
I would recommend moving to asynchronous (overlapped) I/O to speed things up here. The main problem with sending packets one at a time is that you cannot queue up the next packet while the network stack is processing the current one.
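A hypothetical sketch of what that could look like with overlapped WSASendTo(): queue a batch of sends, then wait once for the whole batch instead of blocking per packet. This is Windows-only and untested here; kInFlight, PendingSend, and blast() are made-up names, WSAStartup/error handling are omitted, and s, packet, and pktsz are assumed to come from your existing setup.

```cpp
#include <winsock2.h>
#include <vector>

const int kInFlight = 32;  // sends kept queued at once (tuning knob, <= 64 for WaitForMultipleObjects)

struct PendingSend {
    WSAOVERLAPPED ov;   // one OVERLAPPED per outstanding send
    WSABUF        buf;
};

// Assumes dests holds one pre-filled sockaddr_in per entry of the hosts vector.
void blast(SOCKET s, char* packet, int pktsz,
           const std::vector<sockaddr_in>& dests)
{
    std::vector<PendingSend> pending(kInFlight);
    HANDLE events[kInFlight];
    for (int i = 0; i < kInFlight; ++i)
        events[i] = CreateEvent(NULL, TRUE, FALSE, NULL);

    size_t next = 0;
    for (;;) {
        // Queue a whole batch without waiting on any single send.
        for (int i = 0; i < kInFlight; ++i) {
            PendingSend& p = pending[i];
            ZeroMemory(&p.ov, sizeof(p.ov));
            ResetEvent(events[i]);
            p.ov.hEvent = events[i];
            p.buf.buf = packet;
            p.buf.len = pktsz;
            WSASendTo(s, &p.buf, 1, NULL, 0,
                      (const sockaddr*)&dests[next], sizeof(sockaddr_in),
                      &p.ov, NULL);   // returns at once; completes asynchronously
            next = (next + 1) % dests.size();
        }
        // Block only once per batch; an I/O completion port would scale better
        // than event handles if you grow beyond a few dozen outstanding sends.
        WaitForMultipleObjects(kInFlight, events, TRUE, INFINITE);
    }
}
```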
Alternatively, you can go for a thread pool approach: you fire up a certain number of worker threads, and each one picks a client off a FIFO and sends data to it. When a thread is done with its client, it puts the client back in the FIFO and picks up a new one. You can fill the pipeline - but not swamp it - by tuning the number of worker threads.