Introduction

Before introducing refinement types, let’s suppose you are a library author, and you have a tricky situation to solve. One of your library objects can refer to other objects by their name.

Very abstract don’t you think ? But when you start to look around, a lot of these cases exists.

  • Relational Tables (columns registered in a table object)
  • Forms
  • RPC handle collections
  • HTTP Routes

Two main problems arise when using these :

  • The user have to read the whole definition of the parent to know which fields exists (orelse the program will error at runtime)
  • The type information is lost, and we need to either on typed getter parent.get[Int]("field") that could fail, or even worse, type casts.

Let’s focus on the table / column exemple. I made a little modelization of it :

sealed trait Col[T]:
    def values: Array[T]

case class IntCol(values: Array[Int])      extends Col[Int]
case class BoolCol(values: Array[Boolean]) extends Col[Boolean]
case class TextCol(values: Array[String])  extends Col[String]

case class Table(cols: Map[String, Col[?]])
    def addCol(name: String, col: Col[?]): Table =
        Table(cols + (name -> col))

object Table:
    def empty = Table[EmptyTuple, EmptyTuple](Map.empty)

Nothing too complicated here, we have a common trait for Column that allow us to retrieve values, and they can be grouped into a table. Now how can we get a column from the table ?

val table = Table.empty
    .addCol("id", IntCol(Array(1, 2, 3)))
    .addCol("name",TextCol(Array("Alice", "Bob", "Carol")))
    .addCol("flag",BoolCol(Array(true, false, true)))

table.cols("id").asInstanceOf[Col[Int]]     
  // fails if absent
  // fails is not a Col[Int]
  // How to know ? better read the codebase

A better solution

Another approach would be to strongly type field objects and use them in the API, like below.

case class ColRef[T](name: String)

case class Table(cols: Map[ColRef[?], Col[?]])
    def addCol[T](ref: ColRef[T], col: Col[T]): Table =
        Table(cols + (ref -> col))

    def apply[T](ref: ColRef[T]); Col[T] =
      table.cold(ref).asInstanceOf[Col[T]]

// ------------ //

val id = ColRef[Int]("id")
val name = ColRef[String]("name")
val flag = ColRef[Boolean]("flag")

val foo = ColRef[Int]("who cares, I can put everything here !")

val table = Table.empty
    .addCol(id, IntCol(Array(1, 2, 3)))
    .addCol(name, TextCol(Array("Alice", "Bob", "Carol")))
    .addCol(flag, BoolCol(Array(true, false, true)))

table(id)  // yay
table(foo) // *boom*

Better than previously, as type errors are now impossible ! But two more problems are still to be solved :

  • We can still use the wrong ColRef[T] to get data (or even worse insert the wrong reference !)
  • We need to fish the entire codebase to know which fields are present in which table.
  • All fields are to be declared by hand then used with the table

So how can we do more ? Ready your keyboard, sharpen your Quotes, we are going to the land of macros !

Refinements as a user

Here is sneak peak of what we are going to build today :

// Simple register your columns using plain innocent strings
val table = Table.empty
    .addCol("id", IntCol(Array(1, 2, 3)))
    .addCol("name",TextCol(Array("Alice", "Bob", "Carol")))
    .addCol("flag",BoolCol(Array(true, false, true)))

table.schema.id // id must be there, validated at compile time

We get full type safety and validation that our field is present at compile time. Also the schema here is explorable using your ide ! Hoovering over the schema, you can spot his type, a very special type.

val schema: Schema{val id: Col[Int]; val name: Col[String]; val flag: Col[Boolean]}

A way to read this is that this type is the Schema type, with three refinements corresponding to the 3 columns of our table.

In this article, we are going to read an see the whole implementation that leads to this result.

Implemetation

Before going more in depth, I really would like to thanks Jamie Thompson for its excellent implementation of refinement type and many other type-level-macro piece of code that were relly eye-opening for me. Especially the ops-mirror repository, which is by itself tackling a very interesting problem.

Keeping the type information

First thing we need to have a way to keep the column names and types while building the map. For this I used two type-level tuples, one for the column names and one for the column types.

Note the usage of name.type to get the litteral type corresponding to the inputed string. Scala has a good support for these, for instance the most precise type for the number 3 is 3, which is a subtype of Int.

case class Table[Ns <: Tuple, Ts <: Tuple](cols: Map[String, Col[?]]):
    inline def addCol[T](name: String, col: Col[T]): Table[Tuple.Append[Ns, name.type], Tuple.Append[Ts, T]] =
        Table(cols + (name -> col))

Setup the Schema

As seen previously, the Schema object seems very alien. Let’s focus on its definition.

class Schema(table: Table[?, ?]) extends Selectable {
  def selectDynamic(name: String): Any = table.cols(name)
}
  • Selectable is the key entrypoint to the machinery and usefulness of Refinement Types
  • When called using one of the phantom refinement field, the selectDynamic method will be applied with name as the refinement name, and the result will be casted to the refinement value (hence the Any, as we don’t care about what’s out).
  • the table type arguments are irrelevant (hence the wildcards) because the Schema type is already crafted when its usage occurs

The key thing to understand here is that the Schema class is just an empty vessel, something to weaves together our refinements.

Crafting types

In order to build type we obviously need a macro. The goal of this macro will be to take a typed table in input, and output a refined schema.

Since the return type of our macro will be dependent on the input we will have to use infamous transparent keyword together with a token Any return type.

case class Table[Ns <: Tuple, Ts <: Tuple](cols: Map[String, Col[?]]):
    // addCol impl //

    transparent inline def schema: Any =
        ${ macros.schemaImpl('this) }

Here is the complete macro implementation, don’t worry we are going to explain it line by line.

def schemaImpl[Ns <: Tuple, Ts <: Tuple](table: Expr[Table[Ns, Ts]])(using Quotes, Type[Ns], Type[Ts]): Expr[Any] = 
    import quotes.reflect.* // 1
    
    val names = utils.stringsFromTuple[Ns]    // 2
    val types = utils.typesFromTuple[Ts].map: // 2
        case '[t] => Type.of[Col[t]]          // 3
    
    val refinements = names.zip(types).foldLeft(Type.of[Schema]: Type[?]): (acc, p) => // 4
          ((acc, p): @unchecked) match                                                 // 5
            case ('[acc], (l, '[e])) =>
              Refinement(TypeRepr.of[acc], l, TypeRepr.of[e]).asType                   // 6

    refinements match
        case '[refined] => '{new Schema($table).asInstanceOf[refined]}                 // 7
  1. quotes.reflect.* is needed to unlock powerful macro features like type building or compile time reporting
  2. These helpers materialize the Ns tuples in a list of constant Strings, and Ts is a list of Type[?], which is the reflection entrypoint for types in scala macros. You can find the original implemention here.
  3. Here we map the columns inner type to get the desired column value. We could have specialized more there, for instance retrurning Type.of[IntCol] on case '[Int].
  4. Using labels and types, we fold over the Schema type to build the refinement.
  5. Weird case of pattern matching which dissaloy matching Type (the case '[e] patterns)

And finally we get the desired result: compile time guarantee that we can access strongly typed fields in a user-defined semi-static object.

Conclusion

Building refinements is a very advanced feature of the language, that require a good understanding of the underlying machinery regarding types. The price is high, but the benefits to have reliable, auto-complete aware type-safe APIs is priceless.

A programing language is often judged not by its features or its complexity but by the quality of its tooling. Scala suffers from the inverse problem : a poor tooling hampering a nice, concise and beatiful language.

Let’s bring more ide support and tooling for our users, let’s use refinement types !