Tags: parsing, exception, clojure, context-free-grammar, instaparse

How to test for texts not fitting an Instaparse-grammar (Clojure)?


I wrote a project that parses strings with a context-free grammar in Instaparse (Clojure). Now I'd like to test the parsing results for several input strings, some of which might not fit the grammar. So far I have only tested that the parse result of such a string differs from the expected result. I think it would be more accurate to test for exceptions using (is (thrown? ...)). Are any exceptions thrown? It seems to me that some output (containing "Parse error...") is generated, but no exception is thrown.

My project.clj is:

(defproject com.stackoverflow.clojure/tests "0.1.0-SNAPSHOT"
  :description "Tests of Clojure test-framework."
  :url "http://example.com/FIXME"
  :license {:name "Eclipse Public License"
            :url "http://www.eclipse.org/legal/epl-v10.html"}
  :dependencies [[org.clojure/clojure "1.6.0"]
                 [instaparse "1.3.4"]])

My core source is:

(ns com.stackoverflow.clojure.testInstaparseWrongGrammar
  (:require [instaparse.core :as insta]))

(def parser (insta/parser "
    <sentence> = words <DOT>
    DOT        = '.'
    <words>    = word (<SPACE> word)*
    SPACE      = ' '
    word     = #'(?U)\\w+'
"))

(defn formatter [expr] 
  (->> (parser expr)
       (insta/transform {:word identity})
       (apply str)))

My test source is:

(ns com.stackoverflow.clojure.testInstaparseWrongGrammar-test
  (:require [clojure.test :refer :all]
            [com.stackoverflow.clojure.testInstaparseWrongGrammar :refer :all]))

(deftest parser-tests
  (is (= [[:word "Hello"] [:word "World"]] (parser "Hello World.")))
  (is (not (= [[:word "Hello"] [:word "World"]] (parser "Hello World?"))))
  ;(parser "Hello World?")     gives:
  ;
  ;Parse error at line 1, column 12:
  ;Hello World?
  ;           ^
  ;Expected one of:
  ;"." (followed by end-of-string)
  ;" "
)

(deftest formatter-tests
  (is (= "HelloWorld" (formatter "Hello World.")))
  (is (not (= "HelloWorld" (formatter "Hello World?"))))
  ;(formatter "Hello World?")     gives:
  ;"[:index 11][:reason [{:tag :string, :expecting \".\", :full true} {:tag :string, :expecting \" \"}]][:text \"Hello World?\"][:column 12][:line 1]"
)

; run the tests
(run-tests)

How should I test for such errors (here: when the sentence ends with a ? instead of a .)?


Solution

  • Instaparse does not throw an exception on a parse error; instead, it returns a "failure object" (ref: parse errors). You can test for a failure object with (insta/failure? result).
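
    As a minimal sketch, the failure object can also be tested directly, without wrapping the parser. The namespace name example.failure-test is hypothetical and the grammar is copied from the question. The :column check assumes the failure fields shown in the question's output (:index, :line, :column).

    ```clojure
    (ns example.failure-test
      (:require [clojure.test :refer [deftest is]]
                [instaparse.core :as insta]))

    ;; grammar copied from the question
    (def parser
      (insta/parser "
        <sentence> = words <DOT>
        DOT        = '.'
        <words>    = word (<SPACE> word)*
        SPACE      = ' '
        word       = #'(?U)\\w+'
    "))

    (deftest failure-object-tests
      ;; input that fits the grammar is not a failure
      (is (not (insta/failure? (parser "Hello World."))))
      ;; input that does not fit yields a failure object, not an exception
      (is (insta/failure? (parser "Hello World?")))
      ;; the failure records where parsing stopped
      (is (= 12 (:column (insta/get-failure (parser "Hello World?"))))))
    ```

    This keeps the parser's return value unchanged, so callers that already check for failures are unaffected.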

    If you want your parser/formatter to throw an exception on unexpected input, add the following to your core namespace:

    (ns com.stackoverflow.clojure.testInstaparseWrongGrammar
      (:require [instaparse.core :as insta]
                [instaparse.failure :as fail]))
    
    (def raw-parser (insta/parser "
        <sentence> = words <DOT>
        DOT        = '.'
        <words>    = word (<SPACE> word)*
        SPACE      = ' '
        word     = #'(?U)\\w+'
    "))
    
    ; pretty-print a failure as a string
    (defn- failure->string [result]
      (with-out-str (fail/pprint-failure result)))
    
    ; create an Exception with the pretty-printed failure message
    (defn- failure->exn [result]
      (Exception. (failure->string result)))  
    
    (defn parser [expr]
      (let [result (raw-parser expr)]
        (if (insta/failure? result)
          (throw (failure->exn result))
          result)))
    
    (defn formatter [expr]
      (->> (parser expr)
           (insta/transform {:word identity})
           (apply str)))
    

    ...and now you can use (is (thrown? ...)) in the test:

    (deftest parser-tests
      (is (= [[:word "Hello"] [:word "World"]] (parser "Hello World.")))
      ; the wrapped parser throws on input that does not fit the grammar
      (is (thrown? Exception (parser "Hello World?"))))
    

    This approach uses instaparse to pretty-print the failure and wraps that in an Exception. Another approach is to use ex-info as outlined in this answer.
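
    A rough sketch of the ex-info variant, under some assumptions: the namespace example.ex-info-parser and the function name parse-or-throw are hypothetical, and the choice of keys in the data map (:line, :column) is only one possibility. ex-info attaches a map to the exception, so a test can inspect the error position instead of matching on the message text.

    ```clojure
    (ns example.ex-info-parser
      (:require [instaparse.core :as insta]
                [instaparse.failure :as fail]))

    ;; grammar copied from the question
    (def raw-parser
      (insta/parser "
        <sentence> = words <DOT>
        DOT        = '.'
        <words>    = word (<SPACE> word)*
        SPACE      = ' '
        word       = #'(?U)\\w+'
    "))

    ;; hypothetical wrapper: returns the parse result or throws an ExceptionInfo
    (defn parse-or-throw [expr]
      (let [result (raw-parser expr)]
        (if (insta/failure? result)
          (let [failure (insta/get-failure result)]
            (throw (ex-info (with-out-str (fail/pprint-failure failure))
                            ;; structured data for callers and tests
                            {:line   (:line failure)
                             :column (:column failure)})))
          result)))
    ```

    In a test this could be checked with (is (thrown? clojure.lang.ExceptionInfo (parse-or-throw "Hello World?"))), or by catching the exception and comparing (ex-data e) against the expected position.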