Archive for the ‘Scala’ Category

Efficient Parameterized Queries in ScalaQuery

Thursday, August 6th, 2009

One question about ScalaQuery which keeps coming up is that of performance. The question is usually in the form “How does it compare to JDBC?” but that’s like comparing apples and, well, apple trees. After all, ScalaQuery is a layer on top of JDBC which mainly provides two things:

  • A nicer, more Scala-like way of handling database connections, performing queries and reading result sets. This is not optional when you access a database in your application. You will wrap SQL statement execution and result set reading in some way or another to abstract from the low-level JDBC API. Anyway, the overhead here is quite low.
  • A way of composing queries with an internal DSL based on a query monad and combinators. That’s the part I want to talk about in this post.

If you want to optimize the query generation away, you can always fall back to the StaticQuery and DynamicQuery classes. These work a bit like iBATIS, except your SQL code is embedded directly in your Scala code and not in some XML files. But you’re still writing SQL! That’s nice for the special cases which are not covered by the combinator queries and which do not need to be composable, but it’s probably not the reason why you want to use ScalaQuery in the first place.

When you’re constructing a query in the query monad, you normally have some variables which are used in the query like this:

def userNameByID(id: Int) =
  for(u <- Users if u.id is id) yield u.first

This would insert the user ID directly into the SQL statement, thus requiring a new statement to be generated by ScalaQuery and parsed and optimized by the database server (which can be very expensive) for every invocation of the query. We can improve this by using a bind variable for the user ID:

def userNameByID(id: Int) =
  for(u <- Users if u.id is id.bind) yield u.first

Now the generated SQL statement is always the same (e.g. “SELECT t1.first FROM users t1 WHERE (t1.id=?)”), so the database server needs to calculate an execution plan for it only once and can reuse it on subsequent invocations. But the query is still constructed as a tree of dozens of objects and compiled to the same SQL statement every time you call userNameByID.
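To make the difference concrete, here is a minimal sketch that only models the statement text with plain strings. It does not use ScalaQuery at all; the object BindSketch and its members are made up for illustration, and the exact shape of the inlined statement is an assumption (only the bind-variable form is quoted from above).

```scala
object BindSketch {
  // Assumed shape of the SQL when the ID is inlined as a literal:
  // the statement text differs for every distinct id.
  def inlined(id: Int): String =
    s"SELECT t1.first FROM users t1 WHERE (t1.id=$id)"

  // With a bind variable the text is constant; the value travels separately.
  def bound(id: Int): (String, Seq[Any]) =
    ("SELECT t1.first FROM users t1 WHERE (t1.id=?)", Seq(id))

  val ids = 1 to 3

  // Number of distinct statement texts the database server would see.
  val inlinedStatements: Int = ids.map(inlined).distinct.size
  val boundStatements: Int = ids.map(id => bound(id)._1).distinct.size
}
```

Three different IDs give three distinct statement texts in the inlined variant but only one in the bound variant. Neither variant avoids rebuilding the statement on the client side for every call, though.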

This can be remedied with query templates, a recent addition to ScalaQuery:

val userNameByID = for {
  id <- Parameters[Int]
  u <- Users if u.id is id
} yield u.first

If you recall how for-comprehensions are desugared, this gets translated to Parameters[Int].apply(...).flatMap(id => ...). The apply method on the Parameters object takes an implicit TypeMapper for each parameter type you specify and creates a Parameters instance. The flatMap method on Parameters takes a function which creates a Query and returns a QueryTemplate for it:

final class QueryTemplate[P, R](query: Query[ColumnBase[R]]) {
  def apply(param: P) = new AppliedQueryTemplate(built, param, query.value)
  lazy val built = QueryBuilder.buildSelect(query, NamingContext())
}

final class Parameters[P, C](c: C) {
  def flatMap[F](f: C => Query[ColumnBase[F]]): QueryTemplate[P, F] =
    new QueryTemplate[P, F](f(c))
  ...
}

object Parameters {
  def apply[P1](implicit tm1: TypeMapper[P1]) =
    new Parameters[P1, Column[P1]](new ParameterColumn(-1, tm1))
  ...
}

Note that, unlike Query, neither Parameters nor QueryTemplate is a monad. You cannot compose multiple parameter lists or query templates this way, but by providing a suitable flatMap method, the parameters can be used as the first generator in a for-comprehension which otherwise operates in the Query monad.
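The trick of using a non-monadic type as the first generator can be tried out in isolation. The following is a toy sketch, not ScalaQuery code: Params and Template are hypothetical stand-ins for Parameters and QueryTemplate, List plays the role of the Query monad, and the parameter is modelled as a plain "?" string.

```scala
object ParamsSketch {
  // Hypothetical stand-in for QueryTemplate: it just holds what the inner
  // comprehension produced.
  final class Template[P, R](val body: List[R])

  // Hypothetical stand-in for Parameters: it offers flatMap only (no map,
  // no withFilter), so it is not a monad.
  final class Params[P](placeholder: String) {
    def flatMap[R](f: String => List[R]): Template[P, R] =
      new Template[P, R](f(placeholder))
  }

  val columns = List("id", "first")

  // Sugared form: Params is the first generator, List stands in for Query.
  val sugared: Template[Int, String] = for {
    p <- new Params[Int]("?")
    c <- columns if c != "first"
  } yield s"t1.$c=$p"

  // What the compiler makes of it (current Scala versions use withFilter
  // for the guard).
  val desugared: Template[Int, String] =
    new Params[Int]("?").flatMap { p =>
      columns.withFilter(c => c != "first").map(c => s"t1.$c=$p")
    }
}
```

Only the outermost call lands on Params; the guard and the yield are handled entirely by the inner type, which is why a single flatMap method is enough here.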

Either one of the userNameByID functions/methods defined above can be used in the same way:

for(t <- userNameByID(3)) println(t)

When you apply the parameters to a QueryTemplate, you get an AppliedQueryTemplate which can be lifted to a suitable Invoker by an implicit conversion (just like a Query). The first time you do this, the SQL code gets generated and then cached in the QueryTemplate for further applications.
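The caching relies on nothing more than a lazy val. A minimal sketch of the idea, assuming a made-up Template class in which a render function and a counter stand in for the real SQL generation (in this toy the SQL is forced on the first application):

```scala
object CachedTemplateSketch {
  // Counts how often the "expensive" generation step actually runs.
  var buildCount = 0

  // Hypothetical stand-in for QueryTemplate: `render` represents the
  // SQL generation step.
  final class Template[P](render: () => String) {
    lazy val built: String = { buildCount += 1; render() }
    def apply(param: P): (String, P) = (built, param)
  }

  val userNameByID =
    new Template[Int](() => "SELECT t1.first FROM users t1 WHERE (t1.id=?)")

  // Two applications with different parameters share one generated statement.
  val first = userNameByID(3)
  val second = userNameByID(42)
}
```

After both applications buildCount is 1: the second call reuses the string computed by the first.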

If you specify more than one type parameter, the Parameters generator gives you a Projection instead of a single Column. It can be unpacked either with the extractor on the “~” object:

val userNameByIDRangeAndProduct = for {
  min ~ max ~ product <- Parameters[Int, Int, String]
  u <- Users if u.id >= min &&
    u.id <= max &&
    Orders.where(o => (u.id is o.userID) && (o.product is product)).exists
} yield u.first

…or with the Projection extractor (which is slightly more efficient but not as nice to read):

val userNameByIDRangeAndProduct = for {
  Projection(min, max, product) <- Parameters[Int, Int, String]
  u <- Users if u.id >= min &&
    u.id <= max &&
    Orders.where(o => (u.id is o.userID) && (o.product is product)).exists
} yield u.first
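If the infix pattern looks like magic: any object named “~” with an unapply method can be used that way. Here is a self-contained sketch with a hypothetical Pair class standing in for a projection; it makes no claim about how ScalaQuery’s Projection is actually laid out.

```scala
object TildeSketch {
  // Hypothetical stand-in for a projection built up pairwise from the left.
  final case class Pair[A, B](left: A, right: B)

  // An extractor object named `~` can be used as an infix pattern.
  object ~ {
    def unapply[A, B](p: Pair[A, B]): Some[(A, B)] = Some((p.left, p.right))
  }

  val params = Pair(Pair(1, 10), "widget")

  // Infix patterns associate to the left: (lo ~ hi) ~ p.
  val (min, max, product) = params match {
    case lo ~ hi ~ p => (lo, hi, p)
  }
}
```

In this toy the nested pattern calls unapply twice, once per “~”, whereas a flat extractor would need a single call.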

ScalaQuery’s invoker framework provides methods for invoking queries with parameters, but these are currently used by the simple queries only. I expect to integrate query templates with this system in the future.

ScalaQuery for different Scala versions

Wednesday, July 22nd, 2009

I have just created a new scala-2.7 branch for ScalaQuery. My original plan was to target only Scala 2.8 but since I’ve made lots of progress during the last few weeks and I’ve seen increased interest in ScalaQuery, I tried to build it with 2.7.5.

I had to change the semantics of SimpleFunction and SimpleBinaryOperator for 2.7 but I prefer the new version anyway, so it went into the main line. The code on the scala-2.7 branch is currently identical to the master branch, except for the test classes which differ in two respects:

  • Although the Scala Language Specification mandates that the part to the left of the “<-” in a for comprehension is a pattern, the wildcard pattern “_” does not work in 2.7. I have changed it to a dummy variable named “__”.
  • The type inferencer in 2.7 cannot infer the correct type for the implicit OptionMapper objects. OptionMapper[_,_,_,_] has four type parameters, the last one being used for the return type of functions which use the mapper, so it is not yet known when looking for an implicit mapper and gets inferred as Nothing. Scala 2.8 apparently knows that this type parameter is undetermined, finds the single matching implicit object for the other three parameters and then fills in the fourth. The type-correct interoperability of option and non-option types in ScalaQuery relies heavily on the improved type inferencer and I don’t see any way of making it work nicely with 2.7. The work-around is to add type annotations to the boolean operators, e.g. a && b might become a.&&[Boolean,Option[Boolean]](b). Yuck!

I’m still focused on Scala 2.8 as a target platform and will probably not spend much time integrating new features into the scala-2.7 branch but contributions are always welcome.

Implicits on Implicits

Friday, July 17th, 2009

I have recently made a change to ScalaQuery which enables the use of any type for a column without needing specific Column classes and factory methods on Table for it. For this I used an approach which I had not considered earlier and which I wasn’t even sure would work.

Scala (at least in version 2.7) only considers single function calls for implicit conversions. Take the following definitions for example:

class A
class B(a: A)
class C(b: B)

implicit def aToB(a: A) = new B(a)
implicit def bToC(b: B) = new C(b)

val a = new A

def useB(b: B) = ...
def useC(c: C) = ...

You can call useB(a) because the compiler finds the implicit conversion aToB(a), but you cannot call useC(a): the compiler does not try the chained call bToC(aToB(a)).
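A runnable variant of these definitions shows that the second conversion is still found once the first step is spelled out. This is a sketch for current Scala versions: ChainSketch is just a wrapper object for illustration, the language import and the explicit result types are additions, and useC returns its argument so the result can be inspected.

```scala
import scala.language.implicitConversions

object ChainSketch {
  class A
  class B(val a: A)
  class C(val b: B)

  implicit def aToB(a: A): B = new B(a)
  implicit def bToC(b: B): C = new C(b)

  def useC(c: C): C = c

  val a = new A

  // useC(a) would not compile: it needs two conversions in a row.
  // Spelling out the first step leaves a single conversion to insert.
  val viaExplicitCall: C = useC(aToB(a))

  // A type ascription does the same: (a: B) triggers aToB, then bToC applies.
  val viaAscription: C = useC((a: B))
}
```

In both cases the compiler inserts exactly one conversion per expression; it never chains two of them on its own.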

But this does not mean that an implicit conversion may not rely on an implicit value! The following works just fine:

trait Column[T]
class ConstColumn[T](value: T, tm: TypeMapper[T]) extends Column[T]

implicit def valueToColumn[T](value: T)(implicit tm: TypeMapper[T]) =
  new ConstColumn(value, tm)

trait TypeMapper[T]
implicit object IntTypeMapper extends TypeMapper[Int]

val c1: Column[_] = 42

The Scala compiler recognizes that valueToColumn[Int] provides the desired conversion from Int to Column[_] and then looks for the required implicit TypeMapper[Int] value, which is available through the implicit IntTypeMapper object.

In ScalaQuery, the real TypeMapper contains only a few methods and there are predefined TypeMappers for most of the basic types used by JDBC. That’s nice because it removes some redundancy from my code base but the better news is that it allows you to write your own TypeMappers. For example, if you wanted to get the java.lang.Integer columns back which I removed in favor of Int and Option[Int], you could add this implicit object to your code:

implicit object IntegerTypeMapper extends TypeMapper[java.lang.Integer] {
  def zero = null
  def sqlType = java.sql.Types.INTEGER
  def setValue(v: java.lang.Integer, p: PositionedParameters) =
    if(v eq null) p.setIntOption(None) else p.setIntOption(Some(v.intValue))
  def setOption(v: Option[java.lang.Integer], p: PositionedParameters) = v match {
    case Some(null) => p.setIntOption(None)
    case Some(i) => p.setIntOption(Some(i.intValue))
    case None => p.setIntOption(None)
  }
  def nextValue(r: PositionedResult) = r.nextIntOption match {
    case None => null
    case Some(i) => java.lang.Integer.valueOf(i)
  }
}

The same should work for any database engine- or domain-specific types.
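As a rough illustration of the domain-specific case, here is a self-contained toy version of the mechanism. The TypeMapper in this sketch is a simplified, hypothetical one with a single member (sqlTypeName is made up; the real TypeMapper has the methods shown above), and UserId is an invented domain type.

```scala
import scala.language.implicitConversions

object DomainTypeSketch {
  // Simplified, hypothetical TypeMapper; the real one has more members.
  trait TypeMapper[T] { def sqlTypeName: String }

  trait Column[T] { def mapper: TypeMapper[T] }
  class ConstColumn[T](val value: T, val mapper: TypeMapper[T]) extends Column[T]

  // The conversion itself requires an implicit TypeMapper for T.
  implicit def valueToColumn[T](value: T)(implicit tm: TypeMapper[T]): Column[T] =
    new ConstColumn(value, tm)

  // An invented domain type and the mapper that makes it usable as a column.
  final case class UserId(value: Int)
  implicit object UserIdTypeMapper extends TypeMapper[UserId] {
    def sqlTypeName = "INTEGER"
  }

  // Compiles only because UserIdTypeMapper is in implicit scope.
  val c: Column[UserId] = UserId(7)
}
```

Removing the implicit object turns the last line into a compile-time error, so a type without a mapper cannot slip into a query unnoticed.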

Formal Language Processing in Scala, Solutions to Part 5

Wednesday, July 15th, 2009

This is the solution to the exercise from part 5.

Formal Language Processing in Scala, Part 5

Sunday, July 5th, 2009

This is the fifth part in a series of articles on formal language processing in Scala. In this part I will introduce some new parser combinators, provide a specification for the Fun1 language and build an interpreter for it.

