I have a webmachine REST API server running on one machine. in anticipation of having more traffic that this machine cannot handle, i would need to expand to other nodes on other cpu`s. is there a way of configuring this?
if not what is the right way of distribution here, would i need to do it manually through OTP, concurrent workers and supervisors? where a worker is spawned and ships the request to the neighboring machine.
It kind of depends on your use case. Best way would be observing where you experience problems, and react accordingly.
You could look at your application as three separate parts. First one would be REST interface, second could be businesses logic (little more later), and third would be the data itself (resources, lets call it data store, but it could be even just another service).
This one is simplest. I assume you are using separate service for this (like Riak cluster), where you could do your scaling separately. One thing you could look into is just making sure connection between Webmachine and your data store can scale enough for your needs.
If your server just can not handle enough requests, just put another one next to it. You can router to dispatch request to both of therm, ans since they will use same data store, they will stay in "sync".
REST being based on http assumes stateless communication. Meaning, any two requests (form same user or two different ones) don't share any resources and can be handled by different applications (you also don't have to share anything between your Webmachine instances).
In theory you should not have any of this in your REST API server, but still lets discuss this a little bit.
Some of your requests might require little more work than just serving content. You might be doing some computation (like serving statistics that need to be generated). You might be updating some resource, that need to change more that one data in one place (one could think of it as of transaction). It could call for more computational power, or state synchronization, which would make scaling harder.
Way around this would be separating REST from such logic. Especially introducing micro-services, which you could scale up or down independently from Webmachine itself.
In Erlang you could actually introduce separate applications inside Erlang VM. Those again could be scaled up with use of Distributed Erlang (and little more in this topic and pull of workers (like poolboy. I would recommend this approach for start, since it is easiest to implement, and due to async nature of Erlang it could always easily be ported to external micro-service.
You also should check if your box can handle such traffic. One of most common mistakes is not increasing maximal number of file descriptors in your production. But again, first you should observe such problems, and then react to them. Premature optimization in most cases doesn't pay off.
You can monitor our applications and system resources with tools like Exometer or more out-of-the-box WombatOAM.
And you can (should) stress test your application with tools like tsung or basho-bench