ruby regex arrays parsing

Parsing a string representation of nested arrays into an Array


Let's say I had the string

"[1,2,[3,4,[5,6]],7]"

How would I parse that into the array

[1,2,[3,4,[5,6]],7]

?

Nesting structures and patterns are completely arbitrary in my use case.

My current ad-hoc solution involves adding a space after every comma and using YAML.load, but I'd like a cleaner one if possible.

(One that does not require external libraries if possible)


Solution

  • That particular example parses correctly with JSON, which ships with Ruby's standard library:

    s = "[1,2,[3,4,[5,6]],7]"
    #=> "[1,2,[3,4,[5,6]],7]"
    require 'json'
    #=> true
    JSON.parse s
    #=> [1, 2, [3, 4, [5, 6]], 7]
    

    If that doesn't work, you can try running the string through eval, but you must first make sure that it contains no actual Ruby code, since calling eval on untrusted input is a code injection vulnerability.

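    Edit: Here is a simple recursive, regex-based parser. It does no validation, is not tested, and is not for production use. Because the .+ in the pattern is greedy, it handles the example below but mis-parses sibling sub-arrays such as "[[1],[2]]":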

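    # Recursively collects integers and bracketed sub-arrays from s.
    def my_scan(s)
      res = []
      # The block receives the capture groups: match[1] is a run of digits,
      # match[3] is the text between an outer pair of brackets.
      s.scan(/((\d+)|(\[(.+)\]))/) do |match|
        if match[1]
          res << match[1].to_i
        elsif match[3]
          res << my_scan(match[3])
        end
      end
      res
    end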
    
    s = "[1,2,[3,4,[5,6]],7]"
    p my_scan(s).first #=> [1, 2, [3, 4, [5, 6]], 7]
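
For input where sub-arrays can sit side by side, a small recursive-descent parser is one possible alternative to the regex approach. The following is only a minimal sketch and not part of the original answer: parse_nested_array and parse_value are hypothetical names, and it assumes the input contains nothing but non-negative integers, commas, brackets and whitespace. It uses StringScanner from Ruby's standard library (strscan), so no external gems should be needed.

```ruby
require 'strscan'

# Hypothetical helper (an assumption, not from the original answer):
# parses a string such as "[1,2,[3,4,[5,6]],7]" into nested Arrays of Integers.
def parse_nested_array(str)
  scanner = StringScanner.new(str)
  result = parse_value(scanner)
  scanner.skip(/\s*/)
  # Anything left over after the first complete value is treated as an error.
  raise ArgumentError, "unexpected trailing input at position #{scanner.pos}" unless scanner.eos?
  result
end

# Parses one value: either an integer or a bracketed, comma-separated list.
def parse_value(scanner)
  scanner.skip(/\s*/)
  if scanner.scan(/\d+/)
    scanner.matched.to_i
  elsif scanner.scan(/\[/)
    items = []
    scanner.skip(/\s*/)
    return items if scanner.scan(/\]/) # empty array
    loop do
      items << parse_value(scanner)    # recurse for each element
      scanner.skip(/\s*/)
      break if scanner.scan(/\]/)
      unless scanner.scan(/,/)
        raise ArgumentError, "expected ',' or ']' at position #{scanner.pos}"
      end
    end
    items
  else
    raise ArgumentError, "unexpected input at position #{scanner.pos}"
  end
end

p parse_nested_array("[1,2,[3,4,[5,6]],7]") #=> [1, 2, [3, 4, [5, 6]], 7]
p parse_nested_array("[[1],[2]]")           #=> [[1], [2]]
```

Malformed input such as "[1,2" raises ArgumentError here instead of silently returning a partial result, which is the main design difference from the regex version.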