CMPUT 325 Schedule and Outcomes - 10. Week 8

CMPUT 325 Schedule and Outcomes

10. Week 8 - Mar 03

10.1 Topics

Modules
Non-strict evaluation and recursively defined sequences. Fun with Currying and slicing.
Side-effects, sequencing, and monads.
Hoogle - a Haskell function search engine! http://www.haskell.org/hoogle
The Haskell Cheat Sheet by Justin Bailey. You can find it at http://cheatsheet.codeslower.com
Yet another Haskell tutorial http://www.haskell.org/tutorial/modules.html

The simplest module is just a collection of types and function definitions. The module name should map to a file name, which means that you should use valid Haskel type names for your modules (use _ not -, for example).

The simplest module just exports everything, as in code/Week08/ExpTreeAll.hs

module ExpTree where

-- export everything

-- Expression trees

-- a is the type of the values in the tree,

-- f is the type of the operations performed on the values

data ExpTree a f =

V a |

Op f (ExpTree a f) (ExpTree a f)

deriving (Eq, Show)

-- an evaluator on expression trees

evalExp (V n) = n

evalExp (Op f e1 e2) = (f (evalExp e1) (evalExp e2))

But of course it is beter to explicitly export the interface to your module. code/Week08/ExpTree.hs

module ExpTree ( ExpTree(V,Op), evalExp) where

-- export just the interface, anything else we add is hidden

-- Expression trees

-- a is the type of the values in the tree,

-- f is the type of the operations performed on the values

data ExpTree a f =

V a |

Op f (ExpTree a f) (ExpTree a f)

deriving (Eq, Show)

-- an evaluator on expression trees

evalExp (V n) = n

evalExp (Op f e1 e2) = (f (evalExp e1) (evalExp e2))

The simplest use of a module is to just import it into your current namespace, which means that names in the module might collide with names already in the namespace. code/Week08/exptree-test-unqualified.hs

import ExpTree

t1 = (Op (+) (V 10) (V 20))

t2 = (Op (*) (V 4) (V (- 1)))

t3 = (Op (+) t1 t2)

-- evalExp t1

-- evalExp t2

-- evalExp t3

-- Note that the trees are functions of the types, so here is a tree containing

-- lists, and operations between lists

tl1 = (V [1, 2, 3])

tl2 = (V [4, 5, 6, 7])

tl3 = (Op (++) tl1 tl2)

-- evalExp tl1

-- evalExp tl2

-- evalExp tl3

-- Now try

-- tm1 = (Op (++) tl3 t3)

When the collision occurs, you can qualify names with the module in order to resolve the collision. code/Week08/exptree-test-resolved.hs

import ExpTree

data Operation f = Op f

-- qualify to the specific name space

o1 = Main.Op (-)

t1 = (ExpTree.Op (+) (ExpTree.V 10) (ExpTree.V 20))

-- only one evalExp in scope, so no need to resolve, but you should.

v = evalExp t1

Sometimes it makes sense to use an explicit qualifier. code/Week08/exptree-test-qualified.hs

-- import ExpTree as ET

import qualified ExpTree as ET

t1 = (ET.Op (+) (ET.V 10) (ET.V 20))

t2 = (ET.Op (*) (ET.V 4) (ET.V (- 1)))

t3 = (ET.Op (+) t1 t2)

-- evalExp t1

-- evalExp t2

-- evalExp t3

-- Note that the trees are functions of the types, so here is a tree containing

-- lists, and operations between lists

tl1 = (ET.V [1, 2, 3])

tl2 = (ET.V [4, 5, 6, 7])

tl3 = (ET.Op (++) tl1 tl2)

-- evalExp tl1

-- evalExp tl2

-- evalExp tl3

-- Now try

-- tm1 = (Op (++) tl3 t3)

10.3 Using Non-Strict Evaluation to Recursively Define Sequences

the main goal of this section is to make you comfortable with dealing with the various function currying and slicing operations of Haskell, and with the power of non-strict evaluation.

The usual recursive definition of arithmetic sequences is highly inefficient since earlier values in the sequence are repeatedly computed. Here is the typical bad Fibonacci: code/Week08/fib-usual.hs

-- the usual exponentially bad definition of the Fibonacci sequence

fib 0 = 1

fib 1 = 1

fib n = fib(n-1) + fib(n-2)

Memoization is one technique to speed up such computations. By keeping track of previously computed values the repeated computation of the same value is elimiated. You can think of the previous terms of a sequence as a kind of memoization. Add in the non-strict evaluation (implemented via laziness) and you can write an equation for the sequence (as we did in Racket). Here is a fast version of the Fibonacci sequence. code/Week08/fib-fast.hs

-- computing fibonacci in linear time

-- shifted sum of a list

-- ugly version: ssum lst@(h:rest) = zipWith (+) lst rest

-- elegant version:

ssum lst = zipWith (+) lst (drop 1 lst)

-- make sure full precision integers

fib_seq :: [Integer]

-- recursive equation for the sequence

fib_seq = [0, 1] ++ ssum fib_seq

fib n = fib_seq !! n

You can generalize this idea of defining an arithmetic sequences in terms of a fixedpoint equation as follows. code/Week08/linseq.hs

{-

Development of a generic inifinite linear sequence generator

-}

import Data.Ratio

-- the usual exponentially bad definition of the Fibonacci sequence

fib_slow 0 = 1

fib_slow 1 = 1

fib_slow n = fib_slow(n-1) + fib_slow(n-2)

{-

First we begin with the Fibonacci sequence:

fib 0 = 0

fib 1 = 1

fib (n+2) = fib (n+1) + fib n -- for n >= 0

NOTE: you can have n+k patterns in Haskell if you use

ghci -XNPlusKPatterns

or put this in your source file:

{-# LANGUAGE NPlusKPatterns #-}

The sequence has the property that it satisfies the recurrence relation

fib = [0, 1] ++ zipWith (+) fib (drop 1 fib)

Easy to verify:

fib = 0 1 1 2 3 5 ...

drop 1 fib = 1 1 2 3 5 8 ...

zipWith = 1 2 3 5 8 13 ...

fib = [0, 1] ++ zipWith (+) fib (drop 1 fib)

-}

-- make sure full precision integers

fib :: [Integer]

-- recursive equation for the sequence

fib = [0, 1] ++ zipWith (+) fib (drop 1 fib)

-- get the n'th term, 0-origin indexing

fib_n n = fib !! n

{-

Now consider a 3 term recurrence sequence, n >= 0, where seq_k is the

k'th term in seq, 0-origin indexing.

seq_(n+3) = seq_(n+2) + seq_(n+1) + seq_n

seq_0 = 0

seq_1 = 1

seq_2 = 2

We can only zip 2 lists at a time, but zipWith (+) associates to the left

so we can zip 3 lists by zipping the first 2, then 3rd. So again we get

a recurrence relation.

seq = [0, 1, 2] ++ zipWith (+) (zipWith (+) lst (drop 1 lst)) (drop 2 lst)

Now note the form of the zipWith above. It is doing a foldl where the

operator is (zipWith (+)) (Haskell currying to the rescue!). So we have

this

-}

seq' :: [Integer]

seq' = [0, 1, 2] ++ foldl (zipWith (+)) seq' [drop 1 seq', drop 2 seq']

{-

Now, these sequences are infinite, so we can use the foldl1 form of

fold that takes the initial value of the foldl from the first element

of the sequence being processed. This puts the sequence generator into

a very general form:

seq :: [Integer]

seq = [0, 1, 2] ++ foldl1 (zipWith (+)) [drop 0 seq, drop 1 seq, drop 2 seq]

Note the [drop 0 seq, drop 1 seq, drop 2 seq] looks like a map of

(\i -> drop i seq) onto [0, 1, 2]

This raises the possibility of supplying just the list of indicies of previous

terms used for the current term. For example, perhaps the sequence was

s_(n+3) = s_(n+2) + s_(n)

skipping s_(n+1).

Let's actually create a generator that does this. We pass it a list, terms,

of the previous indices used to construct the current value of the sequence.

Lets assume that terms is monotonic increasing ordered.

If terms is [i_0, ..., i_k]

then

l = max [i_0, ..., i_k]

is the maximum look backwards. We need to define the initial first l terms

of the sequence

[s_0, ..., s_l]

So the defining equation of each term in the sequence s is

s_(n+l+1) = s_(n+i_0) + s_(n+i_1) + ... + s_(n+i_k)

and seq_gen_s does this computation

So the defining equation is

s = [s_0, ..., s_l] ++ seq_gen_s terms s

-}

seq_gen_s :: [Int] -> [Integer] -> [Integer]

seq_gen_s terms seq = foldl1 (zipWith (+)) (map (\i -> drop i seq) terms )

fib1 = [0, 1] ++ seq_gen_s [0, 1] fib1

-- current term is sum of previous 3 terms

seq_3 = [1, 2, 3] ++ seq_gen_s [0, 1, 2] seq_3

-- current term is sum of two terms before with a gap

-- f_n+3 = f_(n+2) + f_n

seq_2 = [1, 2, 3] ++ seq_gen_s [0, 2] seq_2

{-

Finally, we can add coefficients to the prior terms, which are tuples

of (i, c) where i is the index of the term and c is the coefficient to

multiply the term. So we get this very general sequence generator:

If coeff is [ (i_0, c_0), ..., (i_k,c_k) ]

then

l = max [i_0, ..., i_k]

is the maximum look backwards. And the initial first l terms of the

sequence are

[s_0, ..., s_l]

Then

s = [s_0, ..., s_l] ++ seq_gen_c coeff s

is the sequence, for n >= 0

s_(n+l+1) = c_0 * s_(n+i_0) + c_1 * s_(n+i_1) + ... + c_k * s_(n+i_k)

-}

seq_gen_c :: [(Int,Integer)] -> [Integer] -> [Integer]

seq_gen_c coeff seq = foldl1 (zipWith (+))

(map (\(i,c) -> map (c *) (drop i seq)) coeff )

seq_a = [0, 1, 2] ++ seq_gen_c [ (0,1), (1,2), (2,-1) ] seq_a

{-

We still can't generate all the arithmetic sequences because we need

to have the ability to add a constant term to the recurrence.

So it is easy to define a full feature sequence generator by returning to

the original foldl and starting with the infinite sequence of the constant

term.

s_(n+l+1) = const +

c_0 * s_(n+i_0) + c_1 * s_(n+i_1) + ... + c_k * s_(n+i_k)

-}

seq_gen :: Num b => b -> [(Int, b)] -> [b] -> [b]

seq_gen const coeff seq =

foldl (zipWith (+)) (repeat const)

(map (\(i,c) -> map (c *) (drop i seq)) coeff )

nats = [0] ++ seq_gen 1 [(0,1)] nats

fib_a = [0, 1] ++ seq_gen 0 [(0,1), (1,1)] fib_a

10.4 Chaining Commands - Racket Inuition for Monads

In our discussion of Racket, we mentioned two programing concepts: accumulators and continuations. Both are a form of sequencing.

An accumulator contains some state that is common to and mutated by a chain of computations. Because the state has to be modified, it has to be passed into the computation and then passed out.

Continuations are a way of sequencing control. Control is passed into a function by calling it, and then when the function is finished, it passes control on to the continuation.

We will apply both of these notions.

If a step in the computation both computes a value of interest, and changes the world then both must be results of the function. So for example we have this: code/Week08/accum-chain.rkt

#lang racket

; compute a function of x, and record this in the log state

; the default value of the state is the empty list

(define (my-action x [log-state '()])

(letrec (

[ result (+ 1 x) ]

[ new-log-state (append log-state

(list (format "Computed ~a" result))) ]

)

(list result new-log-state)

))

; chain a sequence of actions, passing state to next one

; and remembering the result

(letrec (

[ c1 (my-action 42 '()) ]

[ c2 (my-action 7 (second c1)) ]

[ c3 (my-action (+ (first c1) (first c2)) (second c2)) ]

)

c3

)

; converting this same let chain into lambdas puts the actions into

; reverse order!

((lambda (c1)

((lambda (c2)

((lambda (c3) c3)

(my-action (+ (first c1) (first c2)) (second c2)) ; compute c3

)

(my-action 7 (second c1)) ; compute c2

)

(my-action 42 '()) ; compute c1

)

; if we pull out the function result v and the state s

; the computation is clearer

(letrec (

[ c1 (my-action 42 '()) ]

[ v1 (first c1) ]

[ s1 (second c1) ]

[ c2 (my-action 7 s1) ]

[ v2 (first c2) ]

[ s2 (second c2) ]

[ c3 (my-action (+ v1 v2) s2) ]

[ v3 (first c3) ]

[ s3 (second c3) ]

)

(list v3 s3) ; put the result back into the caller's context

)

; it would be nice to have a chaining operator that pulled out

; the result into a new variable, and automatically passed the

; updated state to the next action. The chaining operator is

; applied to the initial state, and next-action must be a

; function that is applied to the next-state, i.e. a chain operation

; (chain action arg result-var next-action)

; is a function from the current state to a final result

; where

; action is a function (op arg state-in) -> (result, state-out)

; that takes an argument and current state, applies op on them,

; and returns a list of the operation result and next state.

; result-var is a variable identifier that is bound to the result

; next-action is a chain function that can reference any variable

; from any enclosing chain operation

(define-syntax-rule (chain action arg result-var next-chain)

(lambda (state-in)

(letrec (

[ c (action arg state-in) ] ; perform the action on the current state

[ result-var (first c) ] ; capture the result in a variable

[ state-out (second c) ] ; capture the new state

)

(next-chain state-out) ; feed the new state into the next chain

; which can make reference to result-var

)))

((chain my-action 42 v1

(chain my-action 7 v2

(chain my-action (+ v1 v2) v3 ; previous v1 v2 are available

(lambda (s) (list v3 s))

)

'())

; if we do not want state to escape into the outside world

; then we can define a done statement that prevents us from

; obtaining it.

(define-syntax-rule (done exp)

(lambda (state-in) exp)

)

((chain my-action 42 v1

(chain my-action 7 v2

(chain my-action (+ v1 v2) v3

(done v3)

)

'())

; and finally, using the default input state of the action

; we can start off the chain

(define-syntax-rule (chain-start op arg result-var next-action)

(letrec (

[ c (op arg) ] ; exploit default initial state of op

[ result-var (first c) ]

[ state-out (second c) ]

)

(next-action state-out)

))

(chain-start my-action 42 v1

(chain my-action 7 v2

(chain my-action (+ v1 v2) v3

(done v3)

)))

Of course you can have many kinds of actions, and they can be chained together, so long as they have the same common type of state.

Continuations are a way of sequencing control. Control is passed into a function by calling it, and then when the function is finished, it passes control on to the continuation.

10.5 Monads

Let's recap the discussion above, but in Haskell. First of all, when we take an action, it might not need an argument (for example it reads a value, or generates a pseudo random number), and it might not return a result other than affecting the state (for example printing a value). What is needed to do the chaining is a way of wrapping the current state so it can be passed on to the next action implicitly. This is something a monad can do.

The do notation of Haskell does the same kind of thing as the chaining operations we implemented above in Racket. code/Week08/do-notation.hs

-- Perform the actions a1 to an in order, binding each result to variable vi,

-- and then compute a final result.

do

v1 <- a1

v2 <- a2

.

vn <- an

return (f v1 v2 ... vn)

-- The do syntax is special syntax, equivalent to

a1 >>= \ v1 ->

a2 >>= \ v2 ->

.

an >>= \ vn ->

return (f v1 v2 ... vn)

-- The >>= operator has type

(>>=) :: Monad m => m a -> (a -> m b) -> m b

-- The return operator has type

return :: Monad m => a -> m a

Here is an example of using the IO and Maybe monad. code/Week08/monad1.hs

-- from haskell-fast-hard

import Data.Maybe

import Data.Char

{- reads does string parsing

reads "1 2 3" :: [(Int,String)]

-}

maybeRead :: Read a => String -> Maybe a

maybeRead s = case reads s of

[(x,"")] -> Just x

_ -> Nothing

getListFromString :: String -> Maybe [Integer]

getListFromString str = maybeRead $ "[" ++ str ++ "]"

-- all of the actions in the do are in IO

main = do

putStrLn "Enter a list of numbers (separated by comma):"

input <- getLine

let maybeList = getListFromString input in

case maybeList of

Just l -> print (sum l)

Nothing -> error "Bad format. Good Bye."

Note how the return operator is different from our done above. It keeps the result in the same monad as the rest of the actions.

10.6 Flipped Work

We will post a collection of Haskell study questions for the upcoming midterm.

10. Week 8 - Mar 03
CMPUT 325 Schedule / Version 2.31 2014-04-04

10.1 Topics

10.2 Haskell Modules

10.3 Using Non-Strict Evaluation to Recursively Define Sequences

10.4 Chaining Commands - Racket Inuition for Monads

10.5 Monads

10.6 Flipped Work