Introduction
Before introducing refinement types, let’s suppose you are a library author, and you have a tricky situation to solve. One of your library objects can refer to other objects by their name.
Very abstract don’t you think ? But when you start to look around, a lot of these cases exists.
- Relational Tables (columns registered in a table object)
- Forms
- RPC handle collections
- HTTP Routes
Two main problems arise when using these :
- The user have to read the whole definition of the parent to know which fields exists (orelse the program will error at runtime)
- The type information is lost, and we need to either on typed getter
parent.get[Int]("field")
that could fail, or even worse, type casts.
Let’s focus on the table / column exemple. I made a little modelization of it :
sealed trait Col[T]:
def values: Array[T]
case class IntCol(values: Array[Int]) extends Col[Int]
case class BoolCol(values: Array[Boolean]) extends Col[Boolean]
case class TextCol(values: Array[String]) extends Col[String]
case class Table(cols: Map[String, Col[?]])
def addCol(name: String, col: Col[?]): Table =
Table(cols + (name -> col))
object Table:
def empty = Table[EmptyTuple, EmptyTuple](Map.empty)
Nothing too complicated here, we have a common trait for Column that allow us to retrieve values, and they can be grouped into a table. Now how can we get a column from the table ?
val table = Table.empty
.addCol("id", IntCol(Array(1, 2, 3)))
.addCol("name",TextCol(Array("Alice", "Bob", "Carol")))
.addCol("flag",BoolCol(Array(true, false, true)))
table.cols("id").asInstanceOf[Col[Int]]
// fails if absent
// fails is not a Col[Int]
// How to know ? better read the codebase
A better solution
Another approach would be to strongly type field objects and use them in the API, like below.
case class ColRef[T](name: String)
case class Table(cols: Map[ColRef[?], Col[?]])
def addCol[T](ref: ColRef[T], col: Col[T]): Table =
Table(cols + (ref -> col))
def apply[T](ref: ColRef[T]); Col[T] =
table.cold(ref).asInstanceOf[Col[T]]
// ------------ //
val id = ColRef[Int]("id")
val name = ColRef[String]("name")
val flag = ColRef[Boolean]("flag")
val foo = ColRef[Int]("who cares, I can put everything here !")
val table = Table.empty
.addCol(id, IntCol(Array(1, 2, 3)))
.addCol(name, TextCol(Array("Alice", "Bob", "Carol")))
.addCol(flag, BoolCol(Array(true, false, true)))
table(id) // yay
table(foo) // *boom*
Better than previously, as type errors are now impossible ! But two more problems are still to be solved :
- We can still use the wrong
ColRef[T]
to get data (or even worse insert the wrong reference !) - We need to fish the entire codebase to know which fields are present in which table.
- All fields are to be declared by hand then used with the table
So how can we do more ? Ready your keyboard, sharpen your Quotes
, we are going to the land of macros !
Refinements as a user
Here is sneak peak of what we are going to build today :
// Simple register your columns using plain innocent strings
val table = Table.empty
.addCol("id", IntCol(Array(1, 2, 3)))
.addCol("name",TextCol(Array("Alice", "Bob", "Carol")))
.addCol("flag",BoolCol(Array(true, false, true)))
table.schema.id // id must be there, validated at compile time
We get full type safety and validation that our field is present at compile time. Also the schema here is explorable using your ide ! Hoovering over the schema, you can spot his type, a very special type.
val schema: Schema{val id: Col[Int]; val name: Col[String]; val flag: Col[Boolean]}
A way to read this is that this type is the Schema
type, with three refinements corresponding to the 3 columns of our table.
In this article, we are going to read an see the whole implementation that leads to this result.
Implemetation
Before going more in depth, I really would like to thanks Jamie Thompson for its excellent implementation of refinement type and many other type-level-macro piece of code that were relly eye-opening for me. Especially the ops-mirror repository, which is by itself tackling a very interesting problem.
Keeping the type information
First thing we need to have a way to keep the column names and types while building the map. For this I used two type-level tuples, one for the column names and one for the column types.
Note the usage of name.type
to get the litteral type corresponding to the inputed string. Scala has a good support for these, for instance the most precise type for the number 3
is 3
, which is a subtype of Int
.
case class Table[Ns <: Tuple, Ts <: Tuple](cols: Map[String, Col[?]]):
inline def addCol[T](name: String, col: Col[T]): Table[Tuple.Append[Ns, name.type], Tuple.Append[Ts, T]] =
Table(cols + (name -> col))
Setup the Schema
As seen previously, the Schema object seems very alien. Let’s focus on its definition.
class Schema(table: Table[?, ?]) extends Selectable {
def selectDynamic(name: String): Any = table.cols(name)
}
Selectable
is the key entrypoint to the machinery and usefulness of Refinement Types- When called using one of the phantom refinement field, the
selectDynamic
method will be applied withname
as the refinement name, and the result will be casted to the refinement value (hence theAny
, as we don’t care about what’s out). - the table type arguments are irrelevant (hence the wildcards) because the Schema type is already crafted when its usage occurs
The key thing to understand here is that the Schema
class is just an empty vessel, something to weaves together our refinements.
Crafting types
In order to build type we obviously need a macro. The goal of this macro will be to take a typed table in input, and output a refined schema.
Since the return type of our macro will be dependent on the input we will have to use infamous transparent
keyword together with a token Any
return type.
case class Table[Ns <: Tuple, Ts <: Tuple](cols: Map[String, Col[?]]):
// addCol impl //
transparent inline def schema: Any =
${ macros.schemaImpl('this) }
Here is the complete macro implementation, don’t worry we are going to explain it line by line.
def schemaImpl[Ns <: Tuple, Ts <: Tuple](table: Expr[Table[Ns, Ts]])(using Quotes, Type[Ns], Type[Ts]): Expr[Any] =
import quotes.reflect.* // 1
val names = utils.stringsFromTuple[Ns] // 2
val types = utils.typesFromTuple[Ts].map: // 2
case '[t] => Type.of[Col[t]] // 3
val refinements = names.zip(types).foldLeft(Type.of[Schema]: Type[?]): (acc, p) => // 4
((acc, p): @unchecked) match // 5
case ('[acc], (l, '[e])) =>
Refinement(TypeRepr.of[acc], l, TypeRepr.of[e]).asType // 6
refinements match
case '[refined] => '{new Schema($table).asInstanceOf[refined]} // 7
quotes.reflect.*
is needed to unlock powerful macro features like type building or compile time reporting- These helpers materialize the
Ns
tuples in a list of constant Strings, andTs
is a list ofType[?]
, which is the reflection entrypoint for types in scala macros. You can find the original implemention here. - Here we map the columns inner type to get the desired column value. We could have specialized more there, for instance retrurning
Type.of[IntCol]
oncase '[Int]
. - Using labels and types, we fold over the Schema type to build the refinement.
- Weird case of pattern matching which dissaloy matching
Type
(thecase '[e]
patterns)
And finally we get the desired result: compile time guarantee that we can access strongly typed fields in a user-defined semi-static object.
Conclusion
Building refinements is a very advanced feature of the language, that require a good understanding of the underlying machinery regarding types. The price is high, but the benefits to have reliable, auto-complete aware type-safe APIs is priceless.
A programing language is often judged not by its features or its complexity but by the quality of its tooling. Scala suffers from the inverse problem : a poor tooling hampering a nice, concise and beatiful language.
Let’s bring more ide support and tooling for our users, let’s use refinement types !