Formal Language Processing in Scala, Solutions to Part 4

October 12th, 2008 | In Scala

Here is the solution to the exercise from part 4.

Whether you want to use binary or n-ary operators, the first step is always to add the new “-” operator to the list of delimiters:

  lexical.delimiters += ("+", "-", "*", "(", ")")

N-ary

We currently have two chained operators, “+” and “*”. Since “+” and “-” have the same precedence, we cannot simply add a separate chain for “-”. Adding the new operator to the parser production is straight-forward:

  lazy val sum = product ~ ( ( ("+"|"-") ~> product)* )

This, however, won’t work for building the AST because we treat an Add expression as a list of sub-expressions to be added without keeping the individual operator symbols. In order to make this work with two different operators, we can replace Add with an AddChain expression that contains a list of AddChainOps. There are two concrete sub-types of AddChain for Add and Subtract operators which contain the expression to be added or subtracted:

  case class AddChain(ops: List[AddChainOp]) extends Expression

  sealed abstract class AddChainOp(expr: Expression)
  case class Add(expr: Expression) extends AddChainOp(expr)
  case class Subtract(expr: Expression) extends AddChainOp(expr)

I’ve split up the sum production to make it more readable with the new transformations:

  lazy val sum = product ~ ( sumOp* ) ^^ {
    case e ~ Nil => e
    case e ~ l => AddChain(Add(e) :: l)
  }

  lazy val sumOp = "+" ~> product ^^ Add |
    "-" ~> product ^^ Subtract

The first operand gets wrapped as an Add object and the other ones as Add or Subtract depending on the operator. Note the use of Add and Subtract in sumOp as function objects: “^^ Add” is a shortcut for “^^ { e => Add(e) }”, which can be expanded further to “^^ { e => Add.apply(e) }”.

For interpreting the AST in the compute method, we still do a left-fold, starting with 0, except now we have to examine the individual AddChainOps to decide if we need to add or subtract:

  def compute(e: Expression): Int = e match {
    ...
    case AddChain(ops) => ops.foldLeft(0) {
      case (z, Add(e)) => z + compute(e)
      case (z, Subtract(e)) => z - compute(e)
    }
  }

Binary

The solution with binary operators is a lot simpler, with the only tricky part being the associativity. The “+” and “-” operators need to be left-associative (e.g. “1-2-3″ is read as “(1-2)-3)” and not as “1-(2-3)”) but that cannot be expressed directly with a recursive-descent parser.

First of all, we define the expression classes for the binary operators:

  case class Add(left: Expression, right: Expression) extends Expression
  case class Subtract(left: Expression, right: Expression) extends Expression

This allows for a very straight-forward interpreter:

  def compute(e: Expression): Int = e match {
    ...
    case Add(left, right) => compute(left) + compute(right)
    case Subtract(left, right) => compute(left) - compute(right)
  }

In order to construct the AST in the parser, we add “-” as shown above and change the “~>” operator back to “~” so that we don’t drop the parsed “+” or “-” token. That way we still get a list of operations (now including the operator symbol). We can then change the transformation to left-fold that list into the AST:

  lazy val sum = product ~ ( ( ("+"|"-") ~ product)* ) ^^ {
    case e ~ Nil => e
    case e ~ l => l.foldLeft(e) {
      case (z, "+" ~ p) => Add(z, p)
      case (z, "-" ~ p) => Subtract(z, p)
    }
  }

There is a simpler way to do this which I will cover in part 5.

One Response to “Formal Language Processing in Scala, Solutions to Part 4”

  1. Mitch says:

    >> There is a simpler way to do this which I will cover in part 5. <<
    I’m implementing the “non-simple” way right now, but it would be enlightening to me to see the simplified version.

Leave a Reply


Close
E-mail It