`Map`

s and tuples are useful data types for modeling relational operations. For example, suppose we have the following table, indexed by the `Id`

column:

```
| Id | First Name | Last Name |
|----|------------|-----------|
| 0 | Gabriel | Gonzalez |
| 1 | Oscar | Boykin |
| 2 | Edgar | Codd |
```

We can model that as a `Map`

where the key is the `Id`

column and the value is a tuple of `FirstName`

and `LastName`

:

```
import Data.Map (Map) -- from the `containers` library
import qualified Data.Map as Map
type Id = Int
type FirstName = String
type LastName = String
names :: Map Id (FirstName, LastName)
names = Map.fromList
[ (0, ("Gabriel", "Gonzalez"))
, (1, ("Oscar" , "Boykin" ))
, (2, ("Edgar" , "Codd" ))
]
```

Now suppose we have another table containing Twitter handles, also indexed by `Id`

:

```
| Id | Twitter Handle |
|----|----------------|
| 0 | GabrielG439 |
| 1 | posco |
| 3 | avibryant |
```

We can also encode that as a `Map`

:

```
type TwitterHandle = String
handles :: Map Id TwitterHandle
handles = Map.fromList
[ (0, "GabrielG439")
, (1, "posco" )
, (3, "avibryant" )
]
```

One of the nice properties of `Map`

s is that you can join them together:

```
import Control.Applicative
-- I'm using `join2` to avoid confusion with `Control.Monad.join`
join2 :: Map k v1 -> Map k v2 -> Map k (v1, v2)
join2 = ... -- Implementation elided for now
```

What if we could generalize `join2`

to work on types other than `Map`

. Perhaps we could use the `Applicative`

interface to implement `join2`

:

```
join2 :: Applicative f => f a -> f b -> f (a, b)
join2 = liftA2 (,)
```

However, the `Map`

type cannot implement `Applicative`

in its current form. The reason why is that there is no sensible definition for `pure`

:

```
pure :: v -> Map k v
pure = ???
```

This would require a `Map`

that was defined for every key, which we cannot encode. Or can we?

#### Tables

Well, who says we need to use the `Map`

type from containers? What if I were to encode my `Map`

this way:

```
import Prelude hiding (lookup)
-- | A map encoded as a lookup function
newtype Table k v = Table { lookup :: k -> Maybe v }
-- | Encode a traditional map as a lookup function
from :: Ord k => Map k v -> Table k v
from m = Table (\k -> Map.lookup k m)
```

This new type of `Map`

only permits a single operation: `lookup`

, but because we constrain our API to this single operation we can now implement the full `Applicative`

interface:

```
instance Functor (Table k) where
fmap f (Table g) = Table (\k -> fmap f (g k))
-- Same as: Table (fmap (fmap f) g)
instance Applicative (Table k) where
pure v = Table (\k -> Just v)
-- Same as: Table (pure (pure v))
Table f <*> Table x = Table (\k -> f k <*> x k)
-- Same as: Table (liftA2 (<*>) f x)
```

We can promote conventional `Map`

s to this new `Table`

type using the above `from`

function:

```
names' :: Table Id (FirstName, LastName)
names' = from names
handles' :: Table Id TwitterHandle
handles' = from handles
```

... and now the more general `Applicative`

`join2`

will work on these two tables:

```
>>> let table = join2 names' handles'
>>> :type table
table :: Table Id ((FirstName, LastName), TwitterHandle)
>>> lookup table 0
Just (("Gabriel","Gonzalez"),"GabrielG439")
>>> lookup table 1
Just (("Oscar","Boykin"),"posco")
>>> lookup table 2
Nothing
```

However, in its present form we can't dump the table's contents because we don't know which keys are present in the table. Let's fix that by adding an additional field to the `Table`

type listing the keys. We will treat functions as being defined for all keys:

```
import Data.Set (Set)
import qualified Data.Set as Set
data Keys k = All | Some (Set k)
instance Ord k => Num (Keys k) where
fromInteger 0 = Some Set.empty
fromInteger n | n > 0 = All
All + _ = All
_ + All = All
Some s1 + Some s2 = Some (Set.union s1 s2)
All * ks = ks
ks * All = ks
Some s1 * Some s2 = Some (Set.intersection s1 s2)
-- | A map encoded as a lookup function
data Table k v = Table
{ keys :: Keys k
, lookup :: k -> Maybe v
}
-- | Encode a traditional map as a lookup function
from :: Ord k => Map k v -> Table k v
from m = Table
{ keys = Some (Set.fromList (Map.keys m))
, lookup = \k -> Map.lookup k m
}
```

Even after extending `Table`

with `keys`

we can still implement the `Applicative`

interface:

```
instance Functor (Table k) where
fmap f (Table ks g) = Table ks (fmap (fmap f) g)
instance Ord k => Applicative (Table k) where
pure v =
Table 1 (pure (pure v))
Table s1 f <*> Table s2 x =
Table (s1 * s2) (liftA2 (<*>) f x)
```

... and now we can add a `Show`

instance, too!

```
instance (Show k, Show v) => Show (Table k v) where
show (Table ks f) = case ks of
All -> "<function>"
Some s -> unlines (do
k <- Set.toList s
let Just v = f k
return (show (k, v)) )
```

Let's give it a test drive:

```
>>> names'
(0,("Gabriel","Gonzalez"))
(1,("Oscar","Boykin"))
(2,("Edgar","Codd"))
>>> handles'
(0,"GabrielG439")
(1,"posco")
(3,"avibryant")
>>> join2 names' handles'
(0,(("Gabriel","Gonzalez"),"GabrielG439"))
(1,(("Oscar","Boykin"),"posco"))
```

So far, so good!

#### Alternative

However, we need to support more than just inner joins. We'd also like to support left, right, and outer joins, too.

Conceptually, a left join is one in which values from the right table may be optionally present. One way we could implement this would be to define a function that converts a finite map to a function defined on all keys. This function will return `Nothing`

for keys not present in the original finite map and `Just`

for keys that were present:

```
optional :: Table k v -> Table k (Maybe v)
optional (Table ks f) =
Table All (\k -> fmap Just (f k) <|> pure Nothing)
```

Then we could define `leftJoin`

in terms of `join2`

and `optional`

:

```
leftJoin :: Table k a -> Table k b -> Table k (a, Maybe b)
leftJoin t1 t2 = join2 t1 (optional t2)
```

However, if we try to compile the above code, the compiler will give us a really interesting error message:

```
Ambiguous occurrence ‘optional’
It could refer to either ‘Main.optional’,
or ‘Control.Applicative.optional’
```

Apparently, `Control.Applicative`

has an `optional`

function, too. Let's pause to check out the type of this mysterious function:

`optional :: Alternative f => f v -> f (Maybe v)`

Wow! That type signature is suprisingly similar to the one we wrote. In fact, if `Table k`

implemented the `Alternative`

interface, the types would match.

`Alternative`

is a type class (also provided by `Control.Applicative`

) that greatly resembles the `Monoid`

type class:

```
class Applicative f => Alternative f where
empty :: f a
(<|>) :: f a -> f a -> f a
```

... and the core `Alternative`

laws are identical to the `Monoid`

laws:

```
x <|> empty = x
empty <|> x = x
(x <|> y) <|> z = x <|> (y <|> z)
```

Let's dig even deeper and see how `optional`

uses this `Alternative`

type class:

`optional v = fmap Just v <|> pure Nothing`

Even the implementation is eerily similar! This strongly suggests that we should find a way to make our `Table`

type implement `Alternative`

. Perhaps something like this would work:

```
instance Ord k => Alternative (Table k) where
empty =
Table 0 (pure empty)
Table ks1 f1 <|> Table ks2 f2 =
Table (ks1 + ks2) (liftA2 (<|>) f1 f2)
```

Compare this to our `Applicative`

instance (duplicated here):

```
instance Ord k => Applicative (Table k) where
pure v =
Table 1 (pure (pure v))
Table s1 f <*> Table s2 x =
Table (s1 * s2) (liftA2 (<*>) f x)
```

The `Alternative`

instance mirrors our `Applicative`

instance except that we:

- replace
`pure v`

with`empty`

- replace
`(<*>)`

with`(<|>)`

- replace
`1`

with`0`

- replace
`(*)`

with`(+)`

... and what's really neat is that now the `optional`

function from `Control.Applicative`

behaves just like the original `optional`

function we wrote. (**Exercise:** Use equational reasoning to verify this)

#### Derived joins

Armed with this `Alternative`

instance, we can now implement `leftJoin`

in terms of the `optional`

provided by `Control.Applicative`

:

`leftJoin t1 t2 = join2 t1 (optional t2)`

Sure enough, it works:

```
>>> leftJoin names' handles'
(0,(("Gabriel","Gonzalez"),Just "GabrielG439"))
(1,(("Oscar","Boykin"),Just "posco"))
(2,(("Edgar","Codd"),Nothing)
```

Let's check out the type that the compiler infers for `leftJoin`

:

```
>>> :type leftJoin
leftJoin :: Alternative f => f a -> f b -> f (a, Maybe b)
```

Notice how there's no longer anything `Table`

-specific about `leftJoin`

. It works for anything that implements the `Alternative`

interface. I could `leftJoin`

two `Maybe`

s if I really wanted to:

```
>>> leftJoin (Just 1) (Just 2)
Just (1,Just 2)
>>> leftJoin (Just 1) Nothing
Just (1,Nothing)
>>> leftJoin Nothing (Just 1)
Nothing
```

... or two lists:

```
>>> leftJoin [1, 2] [3, 4]
[(1,Just 3),(1,Just 4),(1,Nothing),(2,Just 3),(2,Just 4),(2,
Nothing)]
```

In fact, I don't even really need specialized `leftJoin`

or `rightJoin`

functions. `optional`

is sufficiently light-weight that I could inline a right join on the fly:

```
>>> join2 (optional names') handles'
(0,(Just ("Gabriel","Gonzalez"),"GabrielG439"))
(1,(Just ("Oscar","Boykin"),"posco"))
(3,(Nothing,"avibryant"))
```

What happens if I try to do an "outer join"?

```
>>> -- DISCLAIMER: Technically not the same as an SQL outer join
>>> let o = join2 (optional names') (optional handles')
>>> o
<function>
```

The above "outer join" is defined for all keys (because both sides are `optional`

), so we get back a function! While we can't list the `Table`

(because it's conceptually infinite), we can still perform `lookup`

s on it:

```
>>> lookup o 0
Just (Just ("Gabriel","Gonzalez"),Just "GabrielG439")
>>> lookup o 2
Just (Just ("Edgar","Codd"),Nothing)
>>> lookup o 3
Just (Nothing,Just "avibryant")
>>> lookup o 4
Just (Nothing, Nothing)
```

... and if we were to join our "infinite" table against a finite table, we get back a finite table (**Exercise:** Try it! Define a new finite table to join against `o`

and see what happens)

What's nice about `optional`

is that we can easily left-join or right-join in multiple tables at a time. If I had four tables of types:

```
t1 :: Table k a
t2 :: Table k b
t3 :: Table k c
t4 :: Table k d
```

... I could left join `t2`

, `t3`

, and `t4`

into `t1`

by just writing:

```
liftA4 (,,,) t1 (optional t2) (optional t3) (optional t4)
:: Table k (a, Maybe b, Maybe c, Maybe d)
```

Now that I think about it, I don't even really need to provide `join2`

/`join3`

/`join4`

/`join5`

since they are not much shorter than using the `liftA`

family of functions in `Control.Applicative`

:

```
-- Exercise: What would `join1` be?
join2 = liftA2 (,)
join3 = liftA3 (,,)
join4 = liftA4 (,,,)
join5 = liftA5 (,,,,)
```

In other words, I can implement almost any imaginable join just by using `liftA{n}`

and some permutation of `optional`

s. I don't even know what I'd call this join:

`liftA5 (,,,,) t1 (optional t2) t3 (optional t4) t5`

... but the beauty is that I don't have to give a name for it. I can easily write anonymous joins on the fly using the `Control.Applicative`

module. Moreover, the above code will work for **anything** that implements the `Alternative`

interface.

#### Conclusion

`Control.Applicative`

provides a very general API for relational joins: the `Alternative`

type class (which includes `Applicative`

, since `Applicative`

is a super-class of `Alternative`

). Perhaps `Control.Applicative`

could be improved slightly by providing the `join{n}`

family of functions listed above, but it's still highly usable in its present state.

Note that this trick only works for relational abstractions embedded within `Haskell`

. This API can be generalized for external relational data stores (i.e. Postgres), which I will cover in a subsequent post.

I've made something very similar to these before to describe the interface to a database system. I called the composition of a function with Maybe a "partial function" (which unfortunately is strongly associated with computability) and a function with known keys an "enumerable partial function". The known keys can be viewed as a meet semi-lattice with an absorbing element bottom. This doesn't cover All keys, which I handled by considering any bounded lattice where we can test for bottom. A nice thing about working with lattices is when implementing the database you can define generalized indexed search trees elegantly in terms of a lattice equipped with a measure. In a way you are breaking each relation into the product of a lattice that can be reasoned about and a function that can't.

ReplyDeleteWhat I didn't realize, and your post demonstrates very nicely, is that the bounded lattice gives rise to an Alternative instance.

Amazing!

ReplyDelete