Introduction
Let’s talk about a kind of common issue in programing. Imagine you have a large form to fill, or a SQL table to select column from, or really any type of data where you are interested by selecting subsets of its atomic components (like form results, or SQL columns). I call this data structures field maps data, and let’s see how we can implement the using the power of the Scala 3 language.
The goal is to be able to quickly represent any subset of the fields without having to create dedidated data structures each time.
A First solution
Our goal is to design such a structure in a way that prevent any runtime exception. Here are our needs :
- The structure must be immutable : no modification in place
- Getting the data given a key
- Adding new fields to the FieldMap
Since the structure will be heterogeneous, the key has to carry in some way the type of the value to return. Given these constraints, we can write it like this :
class Key[A]
That’s it. Since this is a regular class and not a case class, each new instance will have its own identity and will carry the type information. Then, using the universal apply method, we can create keys just like this :
val name = Key[String]
val city = Key[String]
val flag = Key[Boolean]
To represent the FieldMap proper, given the interface specified, one can write :
trait FieldMap:
def get[A](key: Key[A]): Option[A]
def put[A](key: Key[A], data: A): FieldMap
Here we are kind of forced to have an Option[A]
returned in order to tackle the case of a key not present yet in the structure. We also want to emphasize immutability, so our write operations needs to return a modified copy of it.
Now that we have a solid interface, let’s implement it ! First of all, what would be the ideal data structure to back it ? An immutable Map is a natural choice, but what would be the types of the keys and value then ?
- The key type should be the identity of the Key[A] instance
- The value, well, can really be anything, so we have to resort to use Any
final case class Impl(data: Map[Key[?], Any]) extends FieldMap:
def get[A](key: Key[A]): Option[A] =
data.get(key).map(_.asinstanceOf[A])
def put[A](key: Key[A], data: A): FieldMap =
data + (key -> data)
This is pretty straightforward knowing that in scala everything is a subtype of Any. Casting the value on retrieval is completely safe since we already know the type thanks to the key type parameter.
We can then test it :
val data = Impl(Map.empty).put(name, "Peter").put(flag, false)
val success = data.get(name) // Some("Peter")
val failure = data.get(city) // None
Great success ! But there are still a lot of points of improvement :
-
Getting data means having to take into account the case of the key not being present. This will contaminate a lot of code with Option whereas we statically know when a key should be present or not in most of the cases.
-
There is no separation between reading and writing the fieldmap. Ideally, we should be able to give a read permit on all or part of the fields to “client” function. Power corrupts and our design should take this into consideration
Let’s try to address these points.
Refining the design
Encoding key sets
First, let’s try to address the issue around the Option
return type. To safely remove this, we have to track what keys are already in the FieldMap and find a way to have a compile time error when there isn’t one. This is a good application of making invalid state unrepresentable.
To properly implement the type-level keyset, we need to attach it somehow to the structure. I choose to represent it as a generic parameter, but an encoding using a type member should be possible (alas more verbose).
trait FieldMap[Keys]
To represent the set, we can use scala 3 union types, combined with singleton types. Indeed, whenever we create a key, we create a type ! Let’s look at this expression :
val city = Key[String]
What is the type of city ? One could say Key[String]
and be right, but the underlying type assigned by the scala compiler is… city.type
, which is a very specialized subtype of the Key[T]
class.
Then we use the property of union types, which are the least common supertype of their members, to build the set. Let’s see how everything mix together in the new interface definition.
Here is the put
method:
def put[A](key: Key[A], data: A): FieldMap[Keys | key.type]
As we have seen before, we modify the definition to track the keys by adding the singleton type to the union.
The retrieval is trickier :
def get[A, K <: Key[A] & Keys](key: K): A
Let’s break it down :
-
The
Option
is gone : as we are going to make it impossible to retrieve a key not present, we can simplify the retrun type. -
Instead of having a
Key[A]
as a parameter, we use a type K that needs to satisfy two conditions: It obviously need to be a subtype ofKey[A]
and it needs to be a part of the key set.
Since the set is encoded as k0.type | k1.type | … | kn.type
(given k0
, k1
, …, kn
keys), any subset would satisfy the condition. That’s why we use a intersection type to further constraint K to be a single key instance.
The implementation won’t change a lot, the signatures aside.
final class Impl[Keys](private val m: Map[Key[?], Any]) extends FieldMap[Keys]:
def get[A, K <: Key[A] & Keys](key: K): A =
m(key).asInstanceOf[A]
def put[A](key: Key[A], data: A): FieldMap[Keys | key.type] =
Impl(m + (key -> data))
We can then verify how it works :
val name = Key[String]
val city = Key[String]
val flag = Key[Boolean]
val empty: FieldMap[Nothing] = Impl(Map.empty)
val data: FieldMap[name.type | city.type] =
empty.put(name, "Peter").put(flag, false)
val success = data.get(name) // "Peter"
val failure = data.get(city) // compile error !
The compile error is pretty clear also:
Found: (city : Key[String])
Required: Key[A] & ((name : Key[String]) | (flag : Key[Boolean]))
We reached ou goal, the FieldMap
is now typesafe for the get operation !
Reader/Writer separation
A pretty nice feature would be to restrict as much as possible the set of capabilities the main structure can delegate to functions and users.
First we can eat the low hanging fruit and clearly separate the reader from the rest using trait composition:
trait FieldMapReader[Keys]:
def get[A, K <: Keys & Key[A]](key: K): A
trait FieldMap[Keys] extends FieldMapReader[Keys]:
def put[A](key: Key[A], data: A): FieldMap[Keys | key.type]
This way, we can ask for a FieldMapReader[Keys]
instead of the power of the full blown map. As our structure was immutable from the start, it was also less dangerous to pass around but this ensures another layer of safety (and it costs pretty much nothing !).
Let’s take it further. We can add contravariance to the Keys
parameter, and using the fact that A | B
is a subtype of A | B | C
, function can now specify exactly the keys they need in their signature, further locking the structure.
In order to relax the assumption, we can always bring back the Option
based get method that won’t have a strong requirement on the keyset.
// The new interface, using contravariance
trait FieldMapReader[-Keys]:
def get[A, K <: Keys & Key[A]](key: K): A
def opt[A](key: Key[A]): Option[A]
trait FieldMap[-Keys] extends FieldMapReader[Keys]:
def put[A](key: Key[A], data: A): FieldMap[Keys | key.type]
def withNameAndFlag(data: FieldMapReader[name.type | flag.type]) =
// here we can only safely get the name and flag, gated by the type system
val maybeCity = data.opt(city) // still usable but can be empty
s"Doing something with ${data.get(name)} or ${data.get(flag)}"
val data = data.put(name, "foobared").put(flag, true)
withNameAndFlag(data) // compiles and typecheck the needed fields
val onlyCity = data.put(city, "Paris")
withNameAndFlag(onlyCity) // does not compile since the required fields are not set
Conclusion
In this post, we detailed the implementation of a field map, which is a data structure that can accept typed keys and modelize sets of type-dependant key-values. Scala, and especially the union types and intersection types, helps us a lot achieving this goal.
Still, there is a lot we can to to improve it !
- The fine grained accesses that enable this implementation could be used to fuel the initialization of complex, modularized apps. A dependency can be viewed as a
FieldMapReader[Deps] => Service
ofr instance, and the layers could be combined with a state monad (maybe in a next post in this series !). - The backing structure is known at compile time, thus could be implemented more efficiently using scala 3 tuples.
I hope you enjoyed the read, and see you soon !