Efficient Parameterized Queries in ScalaQuery
Thursday, August 6th, 2009
One question about ScalaQuery which keeps coming up is that of performance. The question is usually in the form “How does it compare to JDBC?” but that’s like comparing apples and, well, apple trees. After all, ScalaQuery is a layer on top of JDBC which provides mainly two things:
- A nicer, more Scala-like way of handling database connections, performing queries and reading result sets. This is not optional when you access a database in your application. You will wrap SQL statement execution and result set reading in some way or another to abstract from the low-level JDBC API. Anyway, the overhead here is quite low.
- A way of composing queries with an internal DSL based on a query monad and combinators. That’s the part I want to talk about in this post.
If you want to optimize the query generation away, you can always fall back to the StaticQuery and DynamicQuery classes. These work a bit like iBATIS, except your SQL code is embedded directly in your Scala code and not in some XML files. But you’re still writing SQL! That’s nice for the special cases which are not covered by the combinator queries and which do not need to be composable but it’s probably not the reason why you want to use ScalaQuery in the first place.
When you’re constructing a query in the query monad, you normally have some variables which are used in the query like this:
def userNameByID(id: Int) = for(u <- Users if u.id is id) yield u.first
This would insert the user ID directly into the SQL statement, thus requiring a new statement to be generated by ScalaQuery and parsed and optimized by the database server (which can be very expensive) for every invocation of the query. We can improve this by using a bind variable for the user ID:
def userNameByID(id: Int) = for(u <- Users if u.id is id.bind) yield u.first
Now the generated SQL statement is always the same (e.g. “SELECT t1.first FROM users t1 WHERE (t1.id=?)”), so the database server needs to calculate an execution plan for it only once and can reuse it on subsequent invocations. But the query is still constructed as a tree of dozens of objects and compiled to the same SQL statement every time you call userNameByID.
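To make the difference concrete, here is a minimal sketch in plain Scala. It is not ScalaQuery code: Operand, Literal, Bind and render are hypothetical names made up for this illustration, and the renderer only handles this one query shape.

```scala
object BindSketch {
  // Hypothetical miniature of a query operand: an inlined literal or a bind variable.
  sealed trait Operand
  final case class Literal(value: Int) extends Operand
  final case class Bind(value: Int) extends Operand

  // Returns the SQL text plus the values that would be set on the prepared statement.
  def render(op: Operand): (String, List[Int]) = op match {
    case Literal(v) => (s"SELECT t1.first FROM users t1 WHERE (t1.id=$v)", Nil)
    case Bind(v)    => ("SELECT t1.first FROM users t1 WHERE (t1.id=?)", List(v))
  }
}
```

With Literal, every distinct ID produces a different statement text; with Bind, the text stays the same and only the parameter list changes. Either way, the rendering step itself still runs on every call.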
This can be remedied with query templates, a recent addition to ScalaQuery:
val userNameByID = for {
  id <- Parameters[Int]
  u <- Users if u.id is id
} yield u.first
If you recall how for-comprehensions are desugared, this gets translated to Parameters[Int].apply(...).flatMap(id => ...). The apply method on the Parameters object takes an implicit TypeMapper for each parameter type you specify and creates a Parameters instance. The flatMap method on Parameters takes a function which creates a Query and returns a QueryTemplate for it:
final class QueryTemplate[P, R](query: Query[ColumnBase[R]]) {
  def apply(param: P) = new AppliedQueryTemplate(built, param, query.value)
  lazy val built = QueryBuilder.buildSelect(query, NamingContext())
}
final class Parameters[P, C](c: C) {
  def flatMap[F](f: C => Query[ColumnBase[F]]): QueryTemplate[P, F] =
    new QueryTemplate[P, F](f(c))
  ...
}
object Parameters {
  def apply[P1](implicit tm1: TypeMapper[P1]) =
    new Parameters[P1, Column[P1]](new ParameterColumn(-1, tm1))
  ...
}
Note that, unlike Query, neither Parameters nor QueryTemplate is a monad. You cannot compose multiple parameter lists or query templates this way, but because Parameters provides a suitable flatMap method, it can be used as the first generator in a for-comprehension which otherwise operates in the Query monad.
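The flatMap-only trick can be tried out without ScalaQuery. The following is a sketch under loud assumptions: Params, Template and User are hypothetical stand-ins, and a plain List plays the role of the Query monad. Unlike the real Parameters, which calls the function once with a placeholder column, this toy version simply calls it again for every applied parameter.

```scala
object DesugarSketch {
  final case class User(id: Int, first: String)
  val users = List(User(1, "Homer"), User(2, "Marge"), User(3, "Bart"))

  // Hypothetical stand-in for QueryTemplate: a function from parameter to rows.
  final class Template[P, R](run: P => List[R]) {
    def apply(p: P): List[R] = run(p)
  }

  // Hypothetical stand-in for Parameters: it has flatMap and nothing else,
  // which is all a first generator in a for-comprehension needs.
  final class Params[P] {
    def flatMap[R](f: P => List[R]): Template[P, R] = new Template[P, R](f)
  }

  val intParam = new Params[Int]

  // The first generator uses Params.flatMap, the rest runs in the List monad.
  val nameById: Template[Int, String] = for {
    id <- intParam
    u <- users if u.id == id
  } yield u.first

  // Roughly what current Scala compilers turn the comprehension above into.
  val nameByIdExplicit: Template[Int, String] =
    intParam.flatMap(id => users.withFilter(u => u.id == id).map(u => u.first))
}
```

Because Params has no map, it cannot appear as the last generator, and because Template has neither map nor flatMap, templates cannot be chained. That matches the restriction described above.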
Either one of the userNameByID functions/methods defined above can be used in the same way:
for(t <- userNameByID(3)) println(t)
When you apply the parameters to a QueryTemplate, you get an AppliedQueryTemplate which can be lifted to a suitable Invoker by an implicit conversion (just like a Query). The first time you do this, the SQL code gets generated and then cached in the QueryTemplate for further applications.
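The caching relies on nothing more than a lazy val, as the QueryTemplate listing shows. Here is a small self-contained sketch of that behaviour; Template and the builds counter are hypothetical, and a fixed string stands in for the real SQL generation.

```scala
object CachingSketch {
  // Counts how often the (pretend) SQL generation runs.
  var builds = 0

  // Hypothetical stand-in for QueryTemplate: `built` plays the role of the generated SQL.
  final class Template[P](sql: String) {
    // Evaluated on first access only, then cached inside this instance.
    lazy val built: String = { builds += 1; sql }

    // Applying a parameter forces `built` and pairs it with the parameter value.
    def apply(param: P): (String, P) = (built, param)
  }

  val userNameByID =
    new Template[Int]("SELECT t1.first FROM users t1 WHERE (t1.id=?)")
}
```

Creating the template builds nothing. The first application triggers the build, and later applications reuse the cached string, so the counter stays at 1 no matter how many parameters are applied.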
If you specify more than one type parameter, the Parameters generator gives you a Projection instead of a single Column. It can be unpacked either with the extractor on the “~” object:
val userNameByIDRangeAndProduct = for {
  min ~ max ~ product <- Parameters[Int, Int, String]
  u <- Users if u.id >= min &&
                u.id <= max &&
                Orders.where(o => (u.id is o.userID) && (o.product is product)).exists
} yield u.first
…or with the Projection extractor (which is slightly more efficient but not as nice to read):
val userNameByIDRangeAndProduct = for {
  Projection(min, max, product) <- Parameters[Int, Int, String]
  u <- Users if u.id >= min &&
                u.id <= max &&
                Orders.where(o => (u.id is o.userID) && (o.product is product)).exists
} yield u.first
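If the “~” pattern looks like magic, it may help to see how an infix extractor unpacks a left-nested pair. This is only an illustration: the case class below is a hypothetical stand-in, not ScalaQuery's actual “~” object or its Projection classes.

```scala
object ExtractorSketch {
  // Hypothetical stand-in: a plain pair type named ~ so it can be used infix in patterns.
  final case class ~[A, B](left: A, right: B)

  // Three values are stored as a pair whose left side is itself a pair.
  val params = new ~(new ~(1, 10), "Duff")

  // Infix patterns associate to the left, so this matches ~(~(min, max), product).
  val min ~ max ~ product = params
}
```

The pattern takes the nested structure apart one pair at a time, from the outside in.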
ScalaQuery’s invoker framework provides methods for invoking queries with parameters but these are currently used by the simple queries only. I expect to integrate query templates with this system in the future.