I know that fibers are cooperative threads. A fiber has control of the execution context, whereas a preemptive thread does not. A fiber can yield control, which means a fiber can start and stop in well-defined places.
Apparently, the reason fibers are used in evented Ruby is to clean up the nested blocks caused by the reactor pattern.
But I have difficulty grasping the control flow of the script below, which uses a fiber.
    def http_get(url)
      f = Fiber.current
      http = EventMachine::HttpRequest.new(url).get
      # resume fiber once http call is done
      http.callback { f.resume(http) }
      http.errback { f.resume(http) }
      return Fiber.yield
    end

    EventMachine.run do
      Fiber.new{
        page = http_get('http://www.google.com/')
        puts "Fetched page: #{page.response_header.status}"
        if page
          page = http_get('http://www.google.com/search?q=eventmachine')
          puts "Fetched page 2: #{page.response_header.status}"
        end
      }.resume
    end
The way I understand it:
1) EM starts its event loop
2) A fiber is created and then the resume is called. Does the block of code passed to new get executed right away or does it get executed after resume is invoked?
3) http_get is called the first time. It starts an asynchronous operation (using select, poll or epoll on Linux). We set up the event handler (in the callback method) for that asynchronous operation. Then the fiber voluntarily yields control to the thread EventMachine is on (the main thread). However, as soon as the callback is invoked, it will take control back with f.resume(http). But in this simplified example, am I supposed to put my own callback code after f.resume(http)? Because right now it seems like f.resume(http) just returns control to the fiber and does nothing else.
I think what happens after yield is the control goes to EventMachine where it enters its event loop. So the second http_get is not invoked yet. Now once the callback is invoked, then control is returned to the Fiber (we only use one Fiber.new so I assume there is only one Fiber instance in all this). But when does the second http_get get called?
Let me see if I can answer it for you. I am adding line numbers to aid the description:
    01: def http_get(url)
    02:   f = Fiber.current
    03:   http = EventMachine::HttpRequest.new(url).get
    04:
    05:   # resume fiber once http call is done
    06:   http.callback { f.resume(http) }
    07:   http.errback { f.resume(http) }
    08:
    09:   return Fiber.yield
    10: end
    11:
    12: EventMachine.run do
    13:   Fiber.new{
    14:     page = http_get('http://www.google.com/')
    15:     puts "Fetched page: #{page.response_header.status}"
    16:
    17:     if page
    18:       page = http_get('http://www.google.com/search?q=eventmachine')
    19:       puts "Fetched page 2: #{page.response_header.status}"
    20:     end
    21:   }.resume
    22: end
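1) The code first fetches the page at google.com. In Line 17 it checks whether http_get returned a valid response and, if so, performs the next request in Line 18 to search for the string eventmachine.
2) The block passed to Fiber.new does not run when the fiber is created. When the fiber is started with .resume at Line 21, Line 14 gets executed, which invokes the http_get method.
3) http_get starts the request, registers the callbacks, and suspends the fiber at Line 09, which hands control back to the EventMachine loop. When the request finishes, the callback resumes the fiber with the http object, which is now referenced by the variable page in Line 14 and subsequently used in Line 15 to print the HTTP response status.
4) Line 17 checks whether page is a truthy value and, if so, goes ahead and calls http_get again, but with a new URL. Note that the if page check is never reached when page is nil, because Line 15 would already have failed: it accesses page without a nil check.

Hope that clarifies the matter.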
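The same control flow can be traced without EventMachine at all. The sketch below is only an illustration under stated assumptions: FakeReactor, its http_get, and the "response for ..." strings are hypothetical stand-ins, not EventMachine's API. A plain array of pending callbacks plays the role of the event loop, and a log records the order in which things happen.

```ruby
require 'fiber' # provides Fiber.current on older Rubies; harmless on newer ones

# Hypothetical stand-in for EventMachine: a queue of pending callbacks
# plus a log, so the order of events can be inspected afterwards.
class FakeReactor
  attr_reader :log

  def initialize
    @pending = []
    @log = []
  end

  # Same shape as http_get in the question, minus the real network I/O.
  def http_get(url)
    f = Fiber.current
    @log << "request #{url}"
    # The "callback" does nothing except resume the fiber with a result.
    @pending << -> { f.resume("response for #{url}") }
    Fiber.yield # suspend here; returns whatever is later passed to resume
  end

  # The "event loop": fire callbacks until none are left.
  def run
    @pending.shift.call until @pending.empty?
  end
end

reactor = FakeReactor.new

fiber = Fiber.new do
  reactor.log << "fiber started"
  page = reactor.http_get("first")
  reactor.log << "got #{page}"
  page = reactor.http_get("second")
  reactor.log << "got #{page}"
end

reactor.log << "fiber created"   # the block has not run yet
fiber.resume                     # runs the block up to the first Fiber.yield
reactor.log << "back in main"    # the fiber is now suspended inside http_get
reactor.run                      # each callback resumes the fiber where it stopped

puts reactor.log
```

The second request is only issued after the first callback has resumed the fiber, which is the point where the fiber continues past its first http_get call.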
As for handling the response of the HTTP GET with additional logic, I guess you can replace the puts with some meaningful processing logic. In this sample, the puts lines are where the responses are dealt with, and the callbacks are used primarily to resume the fiber.
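To make that concrete, here is a rough sketch of where such processing logic could go. Everything in it is assumed for illustration: FakeResponse, fake_get, and the callbacks array are hypothetical and stand in for the real HTTP client and reactor. The point is only that the callback stays a one-liner that resumes the fiber, while the response handling is written as straight-line code after the wrapper returns.

```ruby
require 'fiber' # provides Fiber.current on older Rubies; harmless on newer ones

# Hypothetical response object; the real HTTP client object has a richer API.
FakeResponse = Struct.new(:status, :body)

callbacks = [] # stand-in for the reactor's pending callbacks

# Hypothetical wrapper with the same shape as http_get: register a callback
# that resumes the fiber, then suspend until that callback fires.
fake_get = lambda do |status, body|
  f = Fiber.current
  callbacks << -> { f.resume(FakeResponse.new(status, body)) }
  Fiber.yield
end

results = []

Fiber.new do
  page = fake_get.call(200, "<title>Hello</title>")
  # Processing logic lives here, in the fiber, not inside the callback.
  results << page.body[/<title>(.*?)<\/title>/, 1] if page.status == 200

  page = fake_get.call(404, "")
  results << "error #{page.status}" unless page.status == 200
end.resume

# Fire the pending callbacks, as an event loop would.
callbacks.shift.call until callbacks.empty?

p results # => ["Hello", "error 404"]
```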