I'll start by introducing a concrete problem (StackOverflow guys like that). Say you define a simple type

```
data T a = T a
```

This type is a `Functor`, an `Applicative` and a `Monad`. Ignoring automatic deriving, to get those instances you have to write each one of them, even though `Monad` implies `Applicative`, which implies `Functor`. More than that, I could define a class like this

```
class Wrapper f where
  wrap :: a -> f a
  unwrap :: f a -> a
```
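For concreteness, the `T` from above admits such an instance. A minimal sketch (the instance body is my own, not from the question):

```haskell
data T a = T a

class Wrapper f where
  wrap :: a -> f a
  unwrap :: f a -> a

-- T wraps exactly one value, so both methods are trivial.
instance Wrapper T where
  wrap         = T
  unwrap (T a) = a
```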

This is a pretty strong condition and it definitely implies `Monad`, but I can't write

```
instance Wrapper f => Monad f where
  return = wrap
  fa >>= f = f $ unwrap fa
```

because this, for some reason, means "everything is a `Monad` (every `f`), only if it's a `Wrapper`", instead of "everything that's a `Wrapper` is a `Monad`".

Similarly, you can't define the `Monad a => Applicative a` and `Applicative a => Functor a` instances.

Another thing you can't do (which is probably only related, I really don't know) is have one class be a superclass of another one, and provide a default implementation of the subclass. Sure, it's great that `class Applicative a => Monad a`, but it's much less great that I still have to define the `Applicative` instance before I can define the `Monad` one.

This isn't a rant. I wrote a lot because otherwise this would quickly be marked as "too broad" or "unclear". The question boils down to the title. I know (at least I'm pretty sure) that there is some theoretical reason for this, so I'm wondering what exactly the benefits are here.

As a sub question, I would like to ask if there are viable alternatives that still keep all (or most) of those advantages, but allow what I wrote.

Addition: I suspect one of the answers might be something along the lines of "What if my type is a `Wrapper`, but I don't want to use the `Monad` instance that it implies?". To this I ask: why couldn't the compiler just pick the most specific one? If there is an `instance Monad MyType`, surely it's more specific than `instance Wrapper a => Monad a`.

# Best How To :

There's a lot of questions rolled into one here. But let's take them one at a time.

First: why doesn't the compiler look at instance contexts when choosing which instance to use? This is to keep instance search efficient. If you require the compiler to consider only instances whose instance contexts are satisfied, you essentially end up requiring your compiler to do a backtracking search among all possible instances, at which point you have implemented 90% of Prolog. If, on the other hand, you take the stance (as Haskell does) that you look only at instance heads when choosing which instance to use, and then simply enforce the instance context, there is no backtracking: at every moment, there is only one choice you can make.
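To see head-only selection in action, here is a small sketch (the class and instance are made up for illustration):

```haskell
class Describe a where
  describe :: a -> String

-- GHC commits to this instance purely by matching the head `[a]`;
-- the `Show a` context is enforced after the choice, never used to make it.
instance Show a => Describe [a] where
  describe xs = "a list: " ++ show xs
```

Asking for `describe [not]` fails with a missing `Show (Bool -> Bool)` instance rather than a missing `Describe` instance: the instance was already chosen by its head, and only then did the context fail.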

Next: why can't you have one class be a superclass of another one, and provide a default implementation of the subclass? There is no fundamental reason for this restriction, so GHC offers this feature as an extension. You could write something like this:

```
{-# LANGUAGE DefaultSignatures #-}

-- ap comes from Control.Monad
class Applicative f where
  pure  :: a -> f a
  (<*>) :: f (a -> b) -> f a -> f b

  default pure :: Monad f => a -> f a
  default (<*>) :: Monad f => f (a -> b) -> f a -> f b

  pure  = return
  (<*>) = ap
```

Then once you had provided an `instance Monad M where ...`, you could simply write `instance Applicative M` with no `where` clause and have it Just Work. I don't really know why this wasn't done in the standard library.
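Here is a self-contained sketch of the same idea with stand-in classes (`MyFunctor`, `MyMonad` and `Id` are my own names, chosen to avoid clashing with the Prelude's real classes):

```haskell
{-# LANGUAGE DefaultSignatures #-}

class MyFunctor f where
  myFmap :: (a -> b) -> f a -> f b

  -- The default implementation is only available when f is also a MyMonad.
  default myFmap :: MyMonad f => (a -> b) -> f a -> f b
  myFmap g fa = fa `myBind` (myReturn . g)

class MyMonad m where
  myReturn :: a -> m a
  myBind   :: m a -> (a -> m b) -> m b

newtype Id a = Id a

instance MyMonad Id where
  myReturn        = Id
  Id a `myBind` g = g a

-- No `where` clause needed: the default implementation kicks in.
instance MyFunctor Id
```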

Last: why can't the compiler allow many instances and just pick the most specific one? The answer to this one is sort of a mix of the previous two: there are very good fundamental reasons this doesn't work well, yet GHC nevertheless offers an extension that does it. The fundamental reason this doesn't work well is that the most specific instance for a given value can't be known before runtime. GHC's answer to this is, for polymorphic values, to pick the most specific instance compatible with the full polymorphism available. If later that thing gets monomorphised, well, too bad for you. The result of this is that some functions may operate on some data with one instance and others may operate on that same data with another instance; this can lead to very subtle bugs. If after all this discussion you still think that's a good idea, and refuse to learn from the mistakes of others, you can turn on `IncoherentInstances`.
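As a sketch of the kind of subtle bug this causes (the class and instances are made up for illustration):

```haskell
{-# LANGUAGE FlexibleInstances, IncoherentInstances #-}

class Label a where
  label :: a -> String

-- A catch-all instance and a more specific one.
instance Label a where
  label _ = "something"

instance Label Bool where
  label _ = "a Bool"

-- Inside this polymorphic function, the most specific instance compatible
-- with full polymorphism is the catch-all one, so GHC commits to it here.
polymorphic :: a -> String
polymorphic = label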

I think that covers all the questions.