
Parsing SemVer in Scala


I'm trying to write a SemVer (http://semver.org) parser in Scala using parser combinators, as a sort of familiarisation with them.

This is my current code:

case class SemVer(major: Int, minor: Int, patch: Int, prerelease: Option[List[String]], metadata: Option[List[String]]) {
  override def toString: String =
    s"$major.$minor.$patch" +
      prerelease.map("-" + _.mkString(".")).getOrElse("") +
      metadata.map("+" + _.mkString(".")).getOrElse("")
}

import scala.util.parsing.combinator.RegexParsers

class VersionParser extends RegexParsers {
  // Numeric identifier: a lone 0, or digits with no leading zero.
  def number: Parser[Int] = """(0|[1-9]\d*)""".r ^^ (_.toInt)
  def separator: Parser[String] = """\.""".r
  def prereleaseSeparator: Parser[String] = """-""".r
  def metadataSeparator: Parser[String] = """\+""".r
  // Alphanumeric identifier: ASCII letters, digits and hyphens.
  def identifier: Parser[String] = """[0-9A-Za-z-]+""".r

  // Dot-separated prerelease identifiers; number is tried before identifier.
  def prereleaseIdentifiers: Parser[List[String]] = (number | identifier) ~ rep(separator ~> (number | identifier)) ^^ {
    case first ~ rest => first.toString :: rest.map(_.toString)
  }

  // Dot-separated build metadata identifiers.
  def metadataIdentifiers: Parser[List[String]] = identifier ~ rep(separator ~> identifier) ^^ {
    case first ~ rest => first :: rest
  }
}

I'd like to know how I should parse identifiers for the prerelease section. The spec disallows leading zeros in numeric identifiers, but when I run my current parser on input with a leading zero (e.g. "01.2.3"), the result is simply a list containing the single element 0.

More generally, how should I detect that a string does not conform to the SemVer spec and force the parse to fail?


Solution

  • After some playing around and some searching, I've discovered that the issue was that I was calling the parse method instead of the parseAll method. parse consumes as much of the input as it can and stops when it can't go any further, so it reports success even when only a prefix of the string is valid. parseAll requires the whole input to be consumed and fails if anything is left over once parsing stops. This is exactly what I was looking for.
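
    To make the difference concrete, here is a minimal sketch using a cut-down version of the parser from the question. PrereleaseParser and Demo are names made up for this illustration, and it assumes the scala-parser-combinators module is on the classpath (it has been a separate library since Scala 2.11).

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical cut-down parser: only the prerelease part of the question's grammar.
object PrereleaseParser extends RegexParsers {
  def number: Parser[Int] = """(0|[1-9]\d*)""".r ^^ (_.toInt)
  def identifier: Parser[String] = """[0-9A-Za-z-]+""".r

  def prereleaseIdentifiers: Parser[List[String]] =
    (number | identifier) ~ rep("." ~> (number | identifier)) ^^ {
      case first ~ rest => first.toString :: rest.map(_.toString)
    }
}

object Demo extends App {
  import PrereleaseParser._

  // parse stops after the leading "0" and still reports success.
  val partial = parse(prereleaseIdentifiers, "01.2.3")
  println(partial.successful) // true
  println(partial.get)        // List(0)

  // parseAll requires the whole input to be consumed, so the same input fails.
  val strict = parseAll(prereleaseIdentifiers, "01.2.3")
  println(strict.successful)  // false

  // A well-formed prerelease string is accepted by parseAll.
  val ok = parseAll(prereleaseIdentifiers, "alpha.1")
  println(ok.get)             // List(alpha, 1)
}
```

    One caveat, as far as I can tell: because number is tried first and | does not fall back to identifier once number has succeeded, parseAll also rejects an identifier such as "0a", which the spec appears to allow as an alphanumeric identifier. Handling that case would need a stricter grammar than the one shown here.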