Think MUDs/MUCKs but maybe with avatars or locale illustrations. My language of choice is ruby.
I need to handle multiple persistent connections with data being asynchronously transferred between the server and its various clients. A single database must be kept up-to-date based on activity occurring in the client sessions. Activity in each client session may require multiple other clients to be immediately updated (a user enters a room; a user sends another user a private message).
This is a goal project and a learning project, so my intention is to re-invent a wheel or two to learn more about concurrent network programming. However, I am new to both concurrent and network programming; previously I have worked almost exclusively in the world of non-persistent, synchronous HTTP requests in web apps. So, I want to make sure that I'm reinventing the right wheels.
Per emboss's excellent answer, I have been starting to look at the internals of certain HTTP servers, since web apps can usually avoid threading concerns due to how thoroughly the issue is abstracted away by the servers themselves.
I do not want to use EventMachine or GServer because I don't yet understand what they do. Once I have a general sense of how they work, what problems they solve and why they're useful, I'll feel comfortable with it. My goal here is not "write a game", but "write a game and learn how some of the lower-level stuff works". I'm also unclear on the boundaries of certain terms; for example, is "I/O-unbound apps" a superset of "event-driven apps"? Vice-versa?
I am of course interested in the One Right Way to achieve my goal, if it exists, but overall I want to understand why it's the right way and why other ways are less preferable.
Any books, ebooks, online resources, sample projects or other tidbits you can suggest are what I'm really after.
The way I am doing things right now is by using IO#select
to block on the list of connected sockets, with a timeout of 0.1 seconds. It pushes any information read into a thread-safe read queue, and then whenever it hits the timeout, it pulls data from a thread-safe write queue. I'm not sure if the timeout should be shorter. There is a second thread which polls the socket-handling thread's read queue and processes the "requests". This is better than how I had it working initially, but still might not be ideal.
I posted this question on Hacker News and got linked to a few resources that I'm working through; anything similar would be great:
Although you probably don't like to hear it I would still recommend to start investigating HTTP servers first. Although programming for them seemed boring, synchronous, and non-persistent to you, that's only because the creators of the servers did their job to hide the gory details from you so tremendously well - if you think about it, a web server is so not synchronous (it's not that millions of people have to wait for reading this post until you are done... concurrency :) ... and because these beasts do their job so well (yeah, I know we yell at them a lot, but at the end of the day most HTTP servers are outstanding pieces of software) this is the definite starting point to look into if you want to learn about efficient multi-threading. Operating systems and implementations of programming languages or games are another good source, but maybe a bit further away from what you intend to achieve.
If you really intend to get your fingers dirty I would suggest to orient yourself at something like WEBrick first - it ships with Ruby and is entirely implemented in Ruby, so you will learn all about Ruby threading concepts there. But be warned, you'll never get close to the performance of a Rack solution that sits on top of a web server that's implemented in C such as thin.
So if you really want to be serious, you would have to roll your own server implementation in C(++) and probably make it support Rack, if you intend to support HTTP. Quite a task I would say, especially if you want your end result to be competitive. C code can be blazingly fast, but it's all to easy to be blazingly slow as well, it lies in the nature of low-level stuff. And we haven't discussed memory management and security yet. But if it's really your desire, go for it, but I would first dig into well-known server implementations to get inspiration. See how they work with threads (pooling) and how they implement 'sessions' (you wanted persistence). All the things you desire can be done with HTTP, even better when used with a clever REST interface, existing applications that support all the features you mentioned are living proof for that. So going in that direction would be not entirely wrong.
If you still want to invent your own proprietary protocol, base it on TCP/IP as the lowest acceptable common denominator. Going beyond that would end up in a project that your grand-children would probably still be coding on. That's really as low as I would dare to go when it comes to network programming.
Whether you are using it as a library or not, look into EventMachine and its conceptual model. Overlooking event-driven ('non-blocking') IO in your journey would be negligent in the context of learning about/reinventing the right wheels. An appetizer for event-driven programming explaining the benefits of node.js as a web server.
Based on your requirements: asynchronous communication, multiple "subscribers" reacting to "events" that are centrally published; well that really sounds like a good candidate for an event-driven/message-based architecture.
Some books that may be helpful on your journey (Linux/C only, but the concepts are universal):
(Those were the classics)
Projects you may want to check out: