How to use two different monads in the same do-expression?

138 views Asked by At

To practice my Haskell I decided to write a little JSON parser. In the main file I call the different parts of my parser, print the results so I have more information for debugging and then write the parsed JSON back to a file:

{-# LANGUAGE OverloadedStrings #-}

module Main where

import qualified Lexer as L
import qualified Parser as P
import qualified Printer as PR
import qualified Data.Text.Lazy.IO as TIO

main :: IO ()
main = do
    input  <- TIO.readFile "input.json"
    case L.tokenize input of
        Nothing -> putStrLn "Syntax error!"
        Just tokens -> do
            print tokens
            case P.parse tokens of
                Nothing -> putStrLn "Parse error!"
                Just parsedValue -> do
                    print parsedValue
                    TIO.writeFile "output.json" $ PR.toText parsedValue

Unfortunately, I get this ugly nested code where I use multiple do-expressions inside of each other. In my understanding, one of the main reasons to use monads and do-notation is to avoid this kind of code-nesting. I could for example use the Maybe monad to evaluate the different parsing steps (lexing, parsing) without the need to check the success of each step individually. Sadly this is not possible in this example because I need to use functions such as print and writeFile that require an IO monad alternately with functions that require a Maybe monad.

How could I refactor this code to be less nested and to include less do-expressions? Or more generally, how can I write clean code that contains calls to functions of different monads? Is it somehow possible to "mix" two monads in the same do-expression, a bit like this?

main :: IO ()
main = do
    input  <- TIO.readFile "input.json"
    tokens <- L.tokenize input
    print tokens
    parsedValue <- P.parse tokens
    print parsedValue
    TIO.writeFile "output.json" $ PR.toText parsedValue
2

There are 2 answers

0
HTNW On

First of all, good intuition about do notation! In this case, you want to combine the Either String monad together with the IO monad. The result will be a new monad in which you will get a flat do-block. (Note you don't want Maybe, since Maybe doesn't let you record error information.) The combined monad of Either String and IO is called ExceptT String IO, where ExceptT is the following type defined in the transformers package (which should come with any installation of GHC).

newtype ExceptT e m a = ExceptT (m (Either e a))
instance Monad m => Monad (ExceptT e m) -- and other instances

You'll want to use it together with a function like

orError :: Functor f => e -> f (Maybe a) -> ExceptT e f a
orError err x = ExceptT $ maybe (Left err) Right <$> x

that annotates the "uninformative" failure of Maybe with a given error message, as well as a function like

printingError :: ExceptT String IO () -> IO ()
printingError x = do
  result <- runExceptT x
  case result of
    Left err -> putStrLn err
    Right _  -> pure ()

which "handles" the ExceptT String effect and leaves behind just IO. You will also need the function (defined in transformers)

lift :: IO a -> ExceptT e IO a

to fit existing IO actions into this new monad.

main :: IO ()
main = printingError $ do
    input  <- lift $ TIO.readFile "input.json"
    tokens <- orError "Syntax error!" $ L.tokenize input
    lift $ print tokens
    parsedValue <- orError "Parse error!" $ P.parse tokens
    lift $ print parsedValue
    lift $ TIO.writeFile "output.json" $ PR.toText parsedValue

An alternative solution is to just use a function like

orError :: String -> Maybe a -> IO a
orError err Nothing    = ioError $ userError err -- in System.IO.Error
orError err (Just ret) = pure ret

-- in which case
main :: IO ()
main = do
    input  <- TIO.readFile "input.json"
    tokens <- orError "Syntax error!" =<< L.tokenize input
    print tokens
    parsedValue <- orError "Parse error!" =<< P.parse tokens
    print parsedValue
    TIO.writeFile "output.json" $ PR.toText parsedValue

This leverages the fact that IO already has a built-in exception mechanism. However, the difference (I consider it an issue) with this is that the errors that can arise from an action are not clear in its type anymore, and therefore well-structured error handling is not enforced. (E.g. note that I'm not forced to handle my exceptions with a function like printingError anymore. The exceptions just bubble up past main and get handled by the runtime system.)

(NB: I haven't tested anything in this answer. Sorry if there are errors.)

0
Silvio Mayolo On

This is actually a very big question you've stumbled upon, and to a certain extent it's still an active area of research.

The problem is simple to state. If I have a Functor f and a Functor g, I can get a Functor (Compose f g) as

data Compose f g a = Compose { runCompose :: f (g a) }

instance (Functor f, Functor g) => Functor (Compose f g) where
    fmap f (Compose x) = Compose $ fmap (fmap f) x

That is, the composition of two Functors is a Functor.

Likewise, if I have an Applicative f and an Applicative g, then their composition is also an Applicative:

instance (Applicative f, Applicative g) => Applicative (Compose f g) where
    pure = Compose . pure . pure
    Compose ff <*> Compose xx = Compose (fmap (<*>) ff <*> xx)

However, if I have a Monad f and a Monad g, then it's not immediately clear how to make a Monad (Compose f g). And in fact, in general, the composition of two monads is not a monad. We can certainly try.

-- Does NOT compile!
instance (Monad f, Monad g) => Monad (Compose f g) where
    return = pure
    Compose x >>= f = Compose $ ??? $ fmap (fmap f) x

But we can't fill in that ???. No matter how we try, once we apply f, we'll get a f (g (Compose f g b)) (up to isomorphism, an f (g (f (g b))), and we need a Compose f g b (equiv, f (g b)).

Every monad can join. That is, every monad can flatten two layers of itself down to one

join :: Monad m => m (m a) -> m a

But we have an f (g (f (g b))) and want a f (g b). If only we could swap that middle g and f. We could go

f (g (f (g b))) -> f (f (g (g b))) -> f (g b)
                ^                  ^
                Some new           |
                operation          Joins

Fundamentally, figuring out which monads can compose is all about identifying which f and g can do that middle operation. That is, for which f and g we can write

forall b. g (f b) -> f (g b)

We can already put some constraints on this. If g is IO and f is a pure monad (read: Doesn't do IO), then that signature is

forall b. IO (f b) -> f (IO b)

and we would have to pull an f outside of the IO. Without unsafePerformIO this is impossible, and I'm not going to write a Monad instance that relies on unsafePerformIO. So Compose f IO is not a Monad in general (though Compose IO g can often be one).

There are a few different solutions to this problem. The simplest is monad transformers, implementing by the transformers package. transformers changes the frame a bit. Rather than writing "monads", we write a sort of factory for a monad called a "monad transformer". In vanilla Haskell with no external packages, we would just say Either e :: * -> * is a Monad.

data Either e a = Left e | Right a

Under transformers, we wrap this in an additional layer of indirection.

data ExceptT e m a = ExceptT (m (Either e a))

Then we say, whereas Either e :: * -> * is a Monad, ExceptT e :: (* -> *) -> * -> * is a monad transformer. That is, given a monad m, we can apply ExceptT e to get a monad ExceptT e m, and that monad is the result of composing an Either on the inside of m. It's a map from monads to monads.

We can get Functor and Applicative instances for ExceptT e m using the same technique as Compose.

instance Functor m => Functor (ExceptT e m) where
    -- Ordinary Functor composition
    fmap f (ExceptT x) = ExceptT $ fmap (fmap f) x

instance Applicative m => Applicative (ExceptT e m) where
    -- Ordinary Applicative composition
    pure = ExceptT . pure . Right
    ExceptT ff <*> ExceptT xx = ExceptT (fmap (<*>) ff <*> xx)

Now when we write the same thing for our (attempted) Monad instance, we say

instance Monad m => Monad (ExceptT e m) where
    return = pure
    ExceptT x >>= f = ExceptT $ ??? $ fmap (fmap f) x

The type of that unknown ??? is m (Either e (ExceptT e m b)) -> m (Either e b), which we can actually write.

instance Monad m => Monad (ExceptT e m) where
    return = pure
    ExceptT x >>= f = ExceptT $ go $ fmap (fmap f) x
        where go value = value >>= \value' ->
                         case value' of
                           Left e -> pure (Left e)
                           Right (ExceptT ma) -> ma

If you're wondering how I came up with the go function, I suggest you try it yourself as an exercise. You want a function of type m (Either e (ExceptT e m b)) -> m (Either e b), so just start trying to write it and follow the type errors.

So now we have a way to, given an arbitrary monad m, get a monad which is m plus the effects of Either. And ExceptT is a real type that exists in transformers.

If you want to convert an Either e a to an ExceptT e m a (that is, add an effect of m without using it), that's except. If you want to convert an m to an ExceptT e m a, that's lift. So your proposed main function would look something like

-- Not exact code, just an approximation to give you an idea.
mainExcept :: ExceptT String IO ()
mainExcept = do
    input  <- lift $ TIO.readFile "input.json"
    tokens <- except $ L.tokenize input
    lift $ print tokens
    parsedValue <- except $ P.parse tokens
    lift $ print parsedValue
    lift $ TIO.writeFile "output.json" $ PR.toText parsedValue

Now, as you're probably already noticing, this gets a bit verbose. Nobody wants to constantly be lift . lift . excepting all of their code all over the place. And that's where the cutting edge is: There's several different approaches to making this less noisy.

  • mtl (the monad transformer library) defines a bunch of typeclasses that "magically" insert the lift calls to make the types line up. Basically, we leverage Haskell's very powerful type inference and typeclass resolution system to insert boilerplate for us.
  • polysemy uses a category theory construct called free monads to essentially make a "super monad" called Sem which is capable of representing effects in a very abstract way.

Entire books could be written on how to make this ergonomic, but it all boils down to the fundamental problem you've posed here: "I have two monads and I want to squish them together and get another monad"