[SOLVED] Extracting values from a login accessible web page post-javascript using Ruby

I use a stock trading website that is only accessible after logging in. After logging in, there is a stock value that I am trying to extract. That number is not available right away; it takes a while to load because it is being updated from the company's database.

I am trying to write a script in Ruby that will allow me to extract the number and then use it in my program.

In Firebug, the tag looks like this, but only after the number has loaded:

<span id="ContentPlaceHolderTodaysStock">10,747</span>

I have explored libraries such as Hpricot and Nokogiri and have tried code similar to the following:

require "nokogiri"
require "open-uri"

# Fetch the page and print the text of every <span> (URL is a placeholder)
doc = Nokogiri::HTML(URI.open("http://website.com/stocks"))
puts doc.xpath("//span/text()")

The problems I run into are: 1) it only reads the HTML of the login page "website.com" instead of "website.com/stocks"; 2) once I do get past the login, how do I get the HTML as it is after the JavaScript has run?

I have also tried Watir, which gets me past problem #1, but then doing something like the following doesn't help with problem #2 because it returns the original HTML source:

require 'net/http'
# Net::HTTP.get takes the host and the path as separate arguments
source = Net::HTTP.get("website.com", "/stocks")

Any help in solving this problem would be greatly appreciated. Thank you!

Solution

Since you are able to log in using Watir, you may as well use it to get the text off the page. Watir has built-in methods for waiting for asynchronous content to load; see http://watirwebdriver.com/waiting/.
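Conceptually, a wait like this is a poll-until-timeout loop: keep checking for the value, and give up after a deadline. Below is a minimal plain-Ruby sketch of that idea, not Watir's actual implementation. The helper name poll_until and its timeout and interval parameters are hypothetical, and the block at the bottom merely simulates a page whose value shows up on the third check.

```ruby
# Hypothetical helper, not part of Watir: calls the block repeatedly until it
# returns a truthy value, or raises once the timeout (in seconds) has elapsed.
def poll_until(timeout: 30, interval: 0.5)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise "condition not met within #{timeout} seconds"
    end
    sleep interval
  end
end

# Stand-in for a page whose value only appears after a few checks.
attempts = 0
value = poll_until(timeout: 5, interval: 0.01) do
  attempts += 1
  attempts >= 3 ? "10,747" : nil
end
puts value
```

In practice Watir's own wait methods do this for you, so a helper like this is only useful for seeing what the wait is doing.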

To get the text, you will want something like:

puts browser.span(:id => 'element_id').when_present.text
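To use the value in your program, keep in mind that the span text is a formatted string such as "10,747", not a number. Here is a rough sketch of the conversion, assuming the value is a whole number with commas as thousands separators. The helper name parse_stock_value is hypothetical.

```ruby
# Hypothetical helper: converts span text such as "10,747" to an Integer.
# Assumes commas are thousands separators and the value has no decimal part.
def parse_stock_value(text)
  cleaned = text.strip.delete(",")
  # Integer() raises ArgumentError on non-numeric text (e.g. an empty span),
  # instead of silently returning 0 the way String#to_i would.
  Integer(cleaned, 10)
end

puts parse_stock_value("10,747")
```

You would pass it the text returned by the Watir call above, with 'element_id' replaced by the id of your span (ContentPlaceHolderTodaysStock in the question).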