scalaspecsscalacheck

Generating arbitrary (legal) Unicode character with scalacheck?


I'm trying to create a generator that produces (non-zero-length) legal unicode strings, with scalacheck 1.6.6 and specs 1.7 (scala 2.8.1).

I hoped I could just create generators like:

object Generators {
  def unicodeChar: Gen[Char] = 
    choose(Math.MIN_CHAR, Math.MAX_CHAR).map(_.toChar).filter(
      c => Character.isDefined(c))
  def unicodeStr: Gen[String] = for(cs <- listOf1(unicodeChar)) yield cs.mkString
}

...then use them from specs like:

import org.specs.Specification
import org.specs.matcher.ScalaCheckMatchers

object CoreSpec extends Specification with ScalaCheckMatchers {        
  "The core" should {    
    "pass trivially" in {
      Generators.unicodeStr must pass((s: String) => s == s)
    }
  }
}

But, it seems that using filter in unicodeChar causes a problem:

Specification "CoreSpec"
  The core should
  x pass trivially
    Gave up after only 64 passed tests. 500 tests were discarded.

If I remove the filter from unicodeChar, my test passes, but I run into other problems later, since my strings are not always well-defined unicode.

Thanks in advance for any suggestions on how to achieve this.


Solution

  • Try filtering out the characters before creating the generator:

    val unicodeChar: Gen[Char] = Gen.oneOf((Math.MIN_CHAR to Math.MAX_CHAR).filter(Character.isDefined(_)))
    

    It will be more memory intensive, since the complete list of unicode characters will be allocated when the generator is created, but only one instance of that list will be used so it shouldn't be a big problem.