ruby nokogiri

Having difficulty with CSS selector for multiple possibilities


I'm having difficulty getting a CSS selector to work with Nokogiri. I'm trying to find all <tr> nodes that are either direct children of a <table> or direct children of a <thead> that is itself a direct child of a <table>. The following code illustrates the issue:

#!/usr/bin/ruby -w
require 'nokogiri'

puts RUBY_VERSION #=> 2.7.0
puts Nokogiri::VERSION #=> 1.13.6

html = <<~'HTML'
<html>
    <body>
    
        <table>
            <thead>
                <tr>
                    <th>city</th>
                    <th>state</th>
                    <th>classification</th>
                </tr>
            </thead>
            
            <tr>
                <th>Blacksburg</th>
                <td>Virginia</td>
                <td>College</td>
            </tr>
        </table>
        
    </body>
</html>
HTML

doc = Nokogiri::HTML(html)
table = doc.at('table')

puts table.search('> tr').length #=> 1
puts table.search('> thead > tr').length #=> 1
puts table.search('> tr, > thead > tr').length #=> barf

The first two searches work. The third search is what I actually want to do, and unless I've really forgotten how CSS selectors work (entirely possible), it seems to me that the third selector should work too. Instead I get this error message:

Traceback (most recent call last):
    12: from ./parse.rb:36:in `<main>'
    11: from /var/lib/gems/2.7.0/gems/nokogiri-1.13.6-x86_64-linux/lib/nokogiri/xml/searchable.rb:54:in `search'
    10: from /var/lib/gems/2.7.0/gems/nokogiri-1.13.6-x86_64-linux/lib/nokogiri/xml/searchable.rb:54:in `map'
     9: from /var/lib/gems/2.7.0/gems/nokogiri-1.13.6-x86_64-linux/lib/nokogiri/xml/searchable.rb:55:in `block in search'
     8: from /var/lib/gems/2.7.0/gems/nokogiri-1.13.6-x86_64-linux/lib/nokogiri/xml/searchable.rb:245:in `xpath_query_from_css_rule'
     7: from /var/lib/gems/2.7.0/gems/nokogiri-1.13.6-x86_64-linux/lib/nokogiri/xml/searchable.rb:245:in `map'
     6: from /var/lib/gems/2.7.0/gems/nokogiri-1.13.6-x86_64-linux/lib/nokogiri/xml/searchable.rb:246:in `block in xpath_query_from_css_rule'
     5: from /var/lib/gems/2.7.0/gems/nokogiri-1.13.6-x86_64-linux/lib/nokogiri/css.rb:46:in `xpath_for'
     4: from /var/lib/gems/2.7.0/gems/nokogiri-1.13.6-x86_64-linux/lib/nokogiri/css/parser_extras.rb:78:in `xpath_for'
     3: from /var/lib/gems/2.7.0/gems/nokogiri-1.13.6-x86_64-linux/lib/nokogiri/css/parser_extras.rb:68:in `parse'
     2: from (eval):3:in `do_parse'
     1: from (eval):3:in `_racc_do_parse_c'
/var/lib/gems/2.7.0/gems/nokogiri-1.13.6-x86_64-linux/lib/nokogiri/css/parser_extras.rb:86:in `on_error': unexpected '> ' after ', ' (Nokogiri::CSS::SyntaxError)

Ruby version is 2.7.0 and Nokogiri is 1.13.6. Any idea what I'm doing wrong?


Solution

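  • Well, naturally I couldn't find the answer myself until I'd asked someone first. It's a bizarre lifelong pattern. Anyway, I'll post the answer for posterity. As the error message says, the parser rejects a selector that starts with > after a comma, so the answer is to pass each selector to search as a separate argument: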

    puts table.search('> tr', '> thead > tr').length