Haskell Tutorial

Introduction

Haskell is a general purpose, purely functional programming language named after the logician Haskell B. Curry. It was designed in 1988 by a 15-member committee to satisfy, among others, the following constraints.

It should be suitable for teaching, research, and applications, including building large systems.
It should be freely available.
It should be based on ideas that enjoy a wide consensus.
It should reduce unnecessary diversity in functional programming languages.

Its features include higher-order functions, non-strict (lazy) semantics, static polymorphic typing, user-defined algebraic datatypes, type-safe modules, stream and continuation I/O, lexical, recursive scoping, curried functions, pattern-matching, list comprehensions, extensible operators, and a rich set of primitive data types.

The Structure of Haskell Programs

A module defines a collection of values, datatypes, type synonyms, classes, etc. and exports some of these resources, making them available to other modules.

A Haskell program is a collection of modules, one of which, by convention, must be called Main and must export the value main. The value of the program is the value of the identifier main in module Main. In the version of Haskell described here, main must have type Dialogue; in Haskell 98 and later, main has type IO t for some type t.

Modules may reference other modules via explicit import declarations, each giving the name of a module to be imported, specifying its entities to be imported, and optionally renaming some or all of them. Modules may be mutually recursive.

The name space for modules is flat, with each module being associated with a unique module name.
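A minimal sketch of a single-module program with two import declarations. It assumes a current Haskell compiler, where main has type IO () and renaming is done with a qualified import; the helper shout is a hypothetical name introduced only for this illustration.

```haskell
module Main (main) where

-- Import only the named entity from Data.Char.
import Data.Char (toUpper)
-- Import a whole module under a shorter qualified name.
import qualified Data.List as L

-- Hypothetical helper, defined only for this illustration.
shout :: String -> String
shout = map toUpper

main :: IO ()
main = putStrLn (shout (L.intercalate ", " ["hello", "world"]))
```

Running the program prints HELLO, WORLD.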

There are no mandatory type declarations, although Haskell programs often contain type declarations. The language is strongly typed. No delimiters (such as semicolons) are required at the end of definitions - the parsing algorithm makes intelligent use of layout. Note that the notation for function application is simply juxtaposition, as in sq n.

Single line comments are preceded by ``--'' and continue to the end of the line. For example:

succ n = n + 1  -- this is a successor function

Multiline and nested comments begin with {- and end with -}. Thus

{- this is a 
     multiline
        comment -}

Lexical Issues

Haskell code will be written in ``typewriter font'' as in ``f (x+y) (a-b)''. Case matters. Bound variables and type variables are denoted by identifiers beginning with a lowercase letter; types, constructors, modules, and classes are denoted by identifiers beginning with an uppercase letter.

Haskell provides two different methods for enclosing declaration lists. Declarations may be explicitly enclosed between braces { } or by the layout of the code.

For example, instead of writing:

r = f a + f b where { a = 5; b = 4; f x = x + 1 }

one may write:

r = f a + f b where a = 5; b = 4
                    f x = x + 1

Function application is curried, associates to the left, and always has higher precedence than infix operators. Thus ``f x y + g a b'' parses as ``((f x) y) + ((g a) b)''.
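The parsing rule can be checked directly. In this small sketch the bodies chosen for f and g are arbitrary assumptions; only the shape of the two expressions matters.

```haskell
-- Arbitrary two-argument functions, chosen only for illustration.
f :: Int -> Int -> Int
f x y = x * y

g :: Int -> Int -> Int
g a b = a - b

main :: IO ()
main = do
  -- Application binds tighter than (+), so both lines compute the same value.
  print (f 2 3 + g 10 4)
  print (((f 2) 3) + ((g 10) 4))
```

Both lines print 12.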

Values and Types

All computation is done via the evaluation of expressions (syntactic terms) to yield values. Values are divided into disjoint sets called types: integers, functions, lists, etc. Values are first-class objects: they may be passed as arguments to functions, returned as results, placed in data structures, etc. Every value has a type (intuitively a type is a set of values). Type expressions are syntactic terms which denote type values (or just types). Types are not first-class in Haskell.

Expressions are syntactic terms that denote values and thus have an associated type.

Type System

Haskell is strongly typed: every expression has exactly one ``most general'' type (called the principal type), and it is always possible to infer this type statically.

Types may be polymorphic, i.e. they may contain type variables which are universally quantified over all types. User-supplied type declarations are optional.

Pre-defined datatypes

Haskell provides several pre-defined data types: Integer, Int, Float, Double, Bool, and Char.

Pre-defined structured datatypes

Haskell provides for structuring of data through tuples and lists. Tuples have the form:

(e₁, e₂, ..., e_n) where n >= 2

If e_i has type t_i then the tuple has type (t₁, t₂, ..., t_n).

Lists have the form: [e₁, e₂, ..., e_n] where n >= 0 and every element e_i must have the same type, say t, and the type of the list is then [t]. The above list is equivalent to: e₁:e₂:...:e_n:[] that is, ``:'' is the infix operator for ``cons''.
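A short sketch of these forms. The value names are chosen here for illustration, and the last line checks that the bracketed and ``cons'' notations denote the same list.

```haskell
-- A 2-tuple and a 3-tuple; the component types may differ.
pair :: (Int, String)
pair = (1, "one")

triple :: (Char, Bool, Double)
triple = ('x', True, 2.5)

-- The same list written with brackets and with the cons operator.
bracketed :: [Int]
bracketed = [1, 2, 3]

consed :: [Int]
consed = 1 : 2 : 3 : []

main :: IO ()
main = do
  print pair
  print triple
  print (bracketed == consed)
```

The final line prints True.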

User Defined Types

User defined datatypes are done via a ``data'' declaration having the general form:

data T u₁ ... u_n = C₁ t₁₁ ... t_1k₁
                  | ...
                  | C_n t_n1 ... t_nk_n

where T is a type constructor; the u_i are type variables; the C_i are (data) constructors; and the t_ij are the constituent types (possibly containing some u_i).

The presence of the u_i implies that the type is polymorphic --- it may be instantiated by substituting specific types for the u_i.

Here are some examples:

data Bool = True | False
data Color = Red | Green | Blue | Indigo | Violet
data Point a = Pt a a
data Tree a = Branch (Tree a) (Tree a) | Leaf a

Bool and Color are nullary type constructors because they have no arguments. True, False, Red, etc. are nullary data constructors. Bool and Color are enumerations because all of their data constructors are nullary. Point is a product or tuple type constructor because it has only one constructor; Tree is a union type, often called an algebraic data type.
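The declarations above can be exercised as follows. This is a sketch only: the deriving clauses and the helpers mirror and leaves are additions made here so that values can be printed and inspected.

```haskell
data Color = Red | Green | Blue | Indigo | Violet
  deriving (Show, Eq)

data Point a = Pt a a
  deriving (Show, Eq)

data Tree a = Branch (Tree a) (Tree a) | Leaf a
  deriving (Show, Eq)

-- Hypothetical helper: swap the two coordinates of a point.
mirror :: Point a -> Point a
mirror (Pt x y) = Pt y x

-- Hypothetical helper: count the leaves of a tree.
leaves :: Tree a -> Int
leaves (Leaf _)     = 1
leaves (Branch l r) = leaves l + leaves r

-- Tree instantiated at the enumeration type Color.
example :: Tree Color
example = Branch (Leaf Red) (Branch (Leaf Green) (Leaf Blue))

main :: IO ()
main = do
  print (mirror (Pt 1.5 2.5))
  print (leaves example)
```

The program prints Pt 2.5 1.5 and then 3.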

Functions

Functions are first-class and therefore ``higher-order.'' They may be defined via declarations, or ``anonymously'' via lambda abstractions. For example,

  \x -> x+1

is a lambda abstraction and is equivalent to the function succ defined by:

succ x = x + 1

If ``x'' has type t₁ and ``exp'' has type t₂ then `` \x -> exp'' has type t₁->t₂. Function definitions and lambda abstractions are ``curried'', thus facilitating the use of higher-order functions. For example, given the definition

add x y = x + y

the function succ defined earlier might be redefined as:

succ = add 1

The curried form is useful in conjunction with the function map which applies a function to each member of a list. In this case,

map (add 1) [1, 2, 3] => [2,3,4]

map applies the curried function add 1 to each member of the list [1,2,3] and returns the list [2,3,4].

Functions are defined by using one or more equations. To illustrate the variety of forms that function definitions can take, here are several definitions of the factorial function. The first definition is based on the traditional recursive definition.

fac n = if n == 0 then 1
        else n * fac (n - 1)

The second definition uses two equations and pattern matching of the arguments to define the factorial function.

fac 0     = 1
fac (n+1) = (n+1)*fac(n)

The next definition also uses two equations and pattern matching of the arguments, but replaces the explicit recursion with the library function product, which returns the product of the elements of a list. It is more efficient than the traditional recursive factorial function.

fac 0     = 1
fac (n+1) = product [1..(n+1)]

The final definition uses a more sophisticated pattern matching scheme and provides error handling.

fac n | n <  0    = error "input to fac is negative"
      | n == 0    = 1
      | n >  0    = product [1..n]

The infix operators are really just functions. For example, the list concatenation operator is defined in the Prelude as:

(++) :: [a] -> [a] -> [a]
[] ++ ys = ys
(x:xs) ++ ys = x : (xs++ys)

Since infix operators are just functions, they may be curried. Curried operators are called sections. For example, the first two functions add three and the third is used when passing the addition function as a parameter.

(3+)
(+3)
(+)
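A small sketch of sections in use. The function names are chosen here for illustration; the division examples show that the side on which the argument is supplied matters for operators that are not commutative.

```haskell
addThreeLeft :: Int -> Int
addThreeLeft = (3+)      -- same as \x -> 3 + x

addThreeRight :: Int -> Int
addThreeRight = (+3)     -- same as \x -> x + 3

halve :: Double -> Double
halve = (/2)             -- same as \x -> x / 2

twoOver :: Double -> Double
twoOver = (2/)           -- same as \x -> 2 / x

total :: [Int] -> Int
total = foldr (+) 0      -- (+) passed as an ordinary function

main :: IO ()
main = do
  print (map addThreeLeft [1, 2, 3])
  print (map addThreeRight [1, 2, 3])
  print (map halve [1, 2, 4])
  print (map twoOver [1, 2, 4])
  print (total [1, 2, 3])
```

One caveat: (-3) is the number negative three, not a section; the Prelude function subtract can be used instead, as in subtract 3.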

Block structure

It is also permitted to introduce local definitions on the right hand side of a definition, by means of a ``where'' clause. Consider for example the following definition of a function for solving quadratic equations (it either fails or returns a list of one or two real roots):

quadsolve a b c | delta < 0  = error "complex roots"
                | delta == 0 = [-b/(2*a)]
                | delta > 0  = [-b/(2*a) + radix/(2*a),
                                -b/(2*a) - radix/(2*a)]
                  where
                  delta = b*b - 4*a*c
                  radix = sqrt delta

The first equation uses the builtin error function, which causes program termination and printing of the string as a diagnostic.

Where clauses may occur nested, to arbitrary depth, allowing Haskell programs to be organized with a nested block structure. Indentation of inner blocks is compulsory, as layout information is used by the parser.
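A sketch of a nested where clause. The function ringArea and its local definitions are hypothetical examples, not taken from any library; note how each inner block is indented further than the definition it belongs to.

```haskell
-- Area of the ring between two concentric circles.
ringArea :: Double -> Double -> Double
ringArea outer inner = area outer - area inner
  where
    -- area is local to ringArea
    area r = pi * square r
      where
        -- square is local to area
        square x = x * x

main :: IO ()
main = print (ringArea 2 1)
```

The result is three times pi, since the two circles have areas 4*pi and pi.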

Polymorphism

Functions and datatypes may be polymorphic; i.e., universally quantified in certain ways over all types. For example, the ``Tree'' datatype is polymorphic:

data Tree a = Branch (Tree a) (Tree a) | Leaf a

``Tree Int'' is the type of trees of fixed-precision integers; ``Tree (Char -> Bool)'' is the type of trees of functions mapping characters to Booleans, etc. Furthermore:

fringe (Leaf x)            = [x]
fringe (Branch left right) = fringe left ++ fringe right

``fringe'' has type ``Tree a -> [a]'', i.e. ``for all types a, fringe maps trees of a into lists of a.''

Here

id x = x
[] ++ ys = ys
(x:xs) ++ ys = x : (xs++ys)
map f [] = []
map f (x:xs) = f x : map f xs

id has type a->a, (++) (append) has type: [a]->[a]->[a], and map has type (a->b)->[a]->[b]. These types are inferred automatically, but may optionally be supplied as type signatures:

id   :: a -> a
(++) :: [a] -> [a] -> [a]
map  :: (a->b) -> [a] -> [b]

Type synonyms

For convenience, Haskell provides a way to define type synonyms --- i.e. names for commonly used types. Type synonyms are created using type declarations. Examples include:

type String = [Char]
type Person = (Name, Address)
type Name   = String
data Address = None | Addr String

This definition of String is part of Haskell, and in fact the literal syntax "hello" is shorthand for:

['h','e','l','l','o']
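The synonyms above can be used as in the following sketch. The helper label and the sample data are made up for illustration; String itself is not redeclared because it is already part of the Prelude.

```haskell
type Name    = String
data Address = None | Addr String
type Person  = (Name, Address)

-- Hypothetical helper: render a Person as one line of text.
label :: Person -> String
label (name, None)     = name ++ " (no address)"
label (name, Addr str) = name ++ ", " ++ str

main :: IO ()
main = do
  -- A string literal is just a list of characters.
  print ("hello" == ['h', 'e', 'l', 'l', 'o'])
  putStrLn (label ("Ann", Addr "12 Main Street"))
  putStrLn (label ("Bob", None))
```

Because Person is only a synonym, a plain pair such as ("Ann", None) is already a Person; no constructor is needed.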

Pattern Matching

We have already seen examples of pattern-matching in functions (fringe, ++, etc.); it is the primary way that elements of a datatype are distinguished.

Functions may be defined by giving several alternative equations, provided the formal parameters have different patterns. This provides another method of doing case analysis which is often more elegant than the use of guards. We here give some simple examples of pattern matching on natural numbers, lists, and tuples; another definition of the factorial function and a definition of Ackermann's function appear with the n+k patterns later in this section.

Accessing the elements of a tuple is also done by pattern matching. For example the selection functions on 2-tuples can be defined thus

fst (a,b) = a
snd (a,b) = b

Here are some simple examples of functions defined by pattern matching on lists:

sum [] = 0
sum (a:x) = a + sum x

product [] = 1
product (a:x) = a * product x

reverse [] = []
reverse (a:x) = reverse x ++ [a]

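n+k patterns are useful when writing inductive definitions over integers. They were removed from the language in Haskell 2010, so current compilers accept them only when an extension is enabled (NPlusKPatterns in GHC). For example: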

x ^ 0     = 1
x ^ (n+1) = x*(x^n)

fac 0 = 1
fac (n+1) = (n+1)*fac n

ack 0 n = n+1
ack (m+1) 0 = ack m 1
ack (m+1) (n+1) = ack m (ack (m+1) n)
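The same three functions can be written without n+k patterns, which current compilers reject by default. The following is a sketch using guards and explicit subtraction; the name power stands in for the operator ^ to avoid a clash with the Prelude, and ack assumes non-negative arguments.

```haskell
power :: Integer -> Integer -> Integer
power _ 0 = 1
power x n
  | n > 0     = x * power x (n - 1)
  | otherwise = error "power: negative exponent"

fac :: Integer -> Integer
fac 0 = 1
fac n
  | n > 0     = n * fac (n - 1)
  | otherwise = error "fac: negative argument"

-- Assumes m >= 0 and n >= 0; equations are tried from top to bottom.
ack :: Integer -> Integer -> Integer
ack 0 n = n + 1
ack m 0 = ack (m - 1) 1
ack m n = ack (m - 1) (ack m (n - 1))

main :: IO ()
main = do
  print (power 2 10)
  print (fac 5)
  print (ack 2 3)
```

The program prints 1024, 120 and 9.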

As-patterns are used to name a pattern for use on the right-hand side. For example, the function which duplicates the first element in a list might be written as:

f (x:xs) = x:x:xs

but using an as-pattern as follows:

f s@(x:xs) = x:s

Wild-cards. A wild-card will match anything and is used where we don't care what a certain part of the input is. For example:

head (x:_)  = x
tail (_:xs) = xs

Case Expressions

Pattern matching is specified in the Report in terms of case expressions. A function definition of the form:

f p11 ... p1k = e1
...
f pn1 ... pnk = en

is semantically equivalent to:

f x1 ... xk = case (x1, ..., xk) of (p11, ..., p1k) -> e1
                                    ...
                                    (pn1, ..., pnk) -> en
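A concrete instance of this translation, as a sketch. The function describe is a hypothetical example; describe' is the same function written with an explicit case expression over a tuple of the arguments.

```haskell
-- Definition by several equations.
describe :: Int -> [a] -> String
describe 0 []    = "zero and empty"
describe _ []    = "empty"
describe n (_:_) = "non-empty, tagged " ++ show n

-- The same function written with an explicit case expression.
describe' :: Int -> [a] -> String
describe' x1 x2 = case (x1, x2) of
  (0, [])  -> "zero and empty"
  (_, [])  -> "empty"
  (n, _:_) -> "non-empty, tagged " ++ show n

main :: IO ()
main = do
  putStrLn (describe 0 "")
  putStrLn (describe' 0 "")
  putStrLn (describe 3 "ab")
  putStrLn (describe' 3 "ab")
```

In both versions the alternatives are tried from top to bottom and the first matching one is used.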

Lists

Lists are pervasive in Haskell and Haskell provides a powerful set of list operators. Lists may be appended by the `++' operator. The operator `\\' does list subtraction. Other useful operations on lists include the infix operator `:' which prefixes an element to the front of a list, and infix `!!' which does subscripting. Here are some examples:

["Mon","Tue","Wed","Thur","Fri"] ++ ["Sat","Sun"] is
     ["Mon","Tue","Wed","Thur","Fri","Sat","Sun"]
[1,2,3,4,5] \\ [2,4] is [1,3,5]
0:[1,2,3] is [0,1,2,3]
[0,1,2,3]!!2 is 2

Note that lists are subscripted beginning with 0. The following table summarizes the list operators.

Symbol	Operation
x:List	prefix an element to a list
List ++ List	concatenate two lists
List \\ List	list difference
List !! n	n-th element of a list, counting from n = 0

Arithmetic sequences

There is a shorthand notation for lists whose elements form an arithmetic series.

[1..5]    -- yields [1,2,3,4,5]
[1,3..10] -- yields [1,3,5,7,9]

In the second list, the difference between the first two elements is used to compute the remaining elements in the series.

List Comprehensions

List comprehensions give a concise syntax for a rather general class of iterations over lists. The syntax is adapted from an analogous notation used in set theory (called ``set comprehension''). A simple example of a list comprehension is:

[ n*n | n <- [1..100] ]

This is a list containing (in order) the squares of all the numbers from 1 to 100. The above expression would be read aloud as ``list of all n*n such that n is drawn from the list 1 to 100''. Note that ``n'' is a local variable of the above expression. The variable-binding construct to the right of the bar is called a ``generator'' - the ``<-'' sign denotes that the variable introduced on its left ranges over all the elements of the list on its right. The general form of a list comprehension in Haskell is:

[ body | qualifiers ]

where each qualifier is either a generator, of the form: var <- exp, or else a filter, which is a boolean expression used to restrict the ranges of the variables introduced by the generators. When two or more qualifiers are present they are separated by commas. An example of a list comprehension with two generators is given by the following definition of a function for returning a list of all the permutations of a given list,

perms [] = [[]]
perms x  = [ a:y | a <- x, y <- perms (x \\ [a]) ]

The use of a filter is shown by the following definition of a function which takes a number and returns a list of all its factors,

factors n = [ i | i <- [1..n], n `mod` i == 0 ]
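The order in which results are produced follows from the order of the qualifiers: later generators vary fastest, and a filter discards candidates as soon as it is reached. A small sketch, with names chosen here for illustration:

```haskell
-- Two generators: y runs through all its values for each x.
pairs :: [(Int, Char)]
pairs = [ (x, y) | x <- [1 .. 3], y <- "ab" ]

-- A later generator may depend on an earlier variable, and a filter
-- keeps only the combinations that satisfy it.
sumsToFive :: [(Int, Int)]
sumsToFive = [ (a, b) | a <- [1 .. 4], b <- [a .. 4], a + b == 5 ]

main :: IO ()
main = do
  print pairs
  print sumsToFive
```

The first list is [(1,'a'),(1,'b'),(2,'a'),(2,'b'),(3,'a'),(3,'b')] and the second is [(1,4),(2,3)].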

List comprehensions often allow remarkable conciseness of expression. We give two examples. Here is a Haskell statement of Hoare's ``Quicksort'' algorithm, as a method of sorting a list,

quicksort :: Ord a => [a] -> [a]
quicksort []     = []
quicksort (p:xs) = quicksort [ x | x <- xs, x <= p ]
                   ++ [ p ] ++
                   quicksort [ x | x <- xs, x >  p ]

Here is a Haskell solution to the eight queens problem. We have to place eight queens on a chess board so that no queen gives check to any other. Since any solution must have exactly one queen in each column, a suitable representation for a board is a list of integers giving the row number of the queen in each successive column. In the following program the function "queens n" returns all safe ways to place queens on the first n columns. A list of all solutions to the eight queens problem is therefore obtained by printing the value of (queens 8).

queens 0 = [[]]
queens (n+1) = [ q:b | b <- queens n, q <- [0..7], safe q b ]
safe q b = and [ not (checks q b i) | i <- [0..(length b - 1)] ]
checks q b i = q == b!!i || abs (q - b!!i) == i+1

Lazy Evaluation and Infinite Lists

Haskell's evaluation mechanism is ``lazy'', in the sense that no subexpression is evaluated until its value is required. One consequence of this is that it is possible to define functions which are non-strict (meaning that they are capable of returning an answer even if one of their arguments is undefined). For example we can define a conditional function as follows,

cond True x y = x
cond False x y = y

and then use it in such situations as ``cond (x==0) 0 (1/x)''.
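A sketch showing that the unused branch really is left unevaluated. Integer division is used here instead of 1/x because dividing an Int by zero raises an error, which makes the effect visible; safeQuot is a hypothetical name.

```haskell
cond :: Bool -> a -> a -> a
cond True  x _ = x
cond False _ y = y

-- Hypothetical helper: 100 divided by x, or 0 when x is 0.
safeQuot :: Int -> Int
safeQuot x = cond (x == 0) 0 (100 `div` x)

main :: IO ()
main = do
  print (safeQuot 0)              -- the division is never evaluated
  print (safeQuot 4)
  print (cond True 1 undefined)   -- the undefined argument is never needed
```

The program prints 0, 25 and 1 without raising an error.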

The other main consequence of lazy evaluation is that it makes it possible to write down definitions of infinite data structures. Here are some examples of Haskell definitions of infinite lists and of functions over lists (note that there is a modified form of the ``..'' notation for endless arithmetic progressions):

nats = [0..]
odds = [1,3..]
ones = 1 : ones
nums_from n = n : nums_from (n+1)
squares = [ x^2 | x <- nums_from 0 ]
odd_squares xs = [ x^2 | x <- xs, odd x ]
cp xs ys = [ ( x, y ) | x <- xs, y <- ys ]     -- Cartesian Product
pyth n = [ ( a, b, c ) | a <- [1..n],          -- Pythagorean Triples
                         b <- [1..n],
                         c <- [1..n],
                         a + b + c <= n,
                         a^2 + b^2 == c^2 ]
fib = 1:1:[ a+b | (a,b) <- zip fib ( tail fib ) ]
primes = sieve [ 2.. ]
         where
         sieve (p:x) = p : sieve [ n | n <- x, n `mod` p > 0 ]
repeat a = x
           where x = a : x
perfects = [ n | n <- [1..], sum (factors n) == 2*n ]   -- factors n includes n itself

The elements of an infinite list are computed ``on demand'', thus relieving the programmer of specifying ``consumer-producer'' control flow.
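Demand-driven evaluation can be observed by taking finite prefixes of infinite lists. The following sketch repeats three definitions in a form that compiles on its own; the extra equation for the empty list in sieve is added here only to make the pattern match complete.

```haskell
nats :: [Integer]
nats = [0 ..]

squares :: [Integer]
squares = [ n * n | n <- nats ]

primes :: [Integer]
primes = sieve [2 ..]
  where
    sieve (p : xs) = p : sieve [ n | n <- xs, n `mod` p > 0 ]
    sieve []       = []

main :: IO ()
main = do
  print (take 5 nats)
  print (takeWhile (< 50) squares)
  print (take 10 primes)
```

Only as many elements are computed as take and takeWhile ask for: the program prints [0,1,2,3,4], then [0,1,4,9,16,25,36,49], then [2,3,5,7,11,13,17,19,23,29].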

One interesting application of infinite lists is to act as lookup tables for caching the values of a function. For example here is a (naive) definition of a function for computing the n'th Fibonacci number:

fib 0 = 0
fib 1 = 1
fib (n+2) = fib (n+1) + fib n

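This naive definition of ``fib'' performs an exponential number of additions. This can be reduced to a linear number by changing the recursion to use a lookup table, so that each Fibonacci number is computed only once, thus

fib 0 = 0
fib 1 = 1
fib (n+2) = flist!!(n+1) + flist!!n

flist = map fib [ 0.. ]

Here flist is defined at the top level so that the table is shared by all calls of fib.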

Alternatively,

fib n = fiblist !! n
        where
           fiblist = 0:1:[a+b| (a,b) <- zip fiblist (tail fiblist) ]
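A runnable sketch of the lookup-table idea for current compilers. Two things differ from the listings above and are assumptions of this sketch: the table is a top-level list called fibs, and it is built with zipWith rather than with a comprehension over zip. The sequence starts from 0, as in the naive definition.

```haskell
-- The table: each element is the sum of the two before it.
fibs :: [Integer]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

-- Look the n-th Fibonacci number up in the shared table.
fib :: Int -> Integer
fib n = fibs !! n

main :: IO ()
main = print (map fib [0 .. 10])
```

This prints [0,1,1,2,3,5,8,13,21,34,55]. Because fibs is a top-level value, the part of the list that has already been computed is reused by later calls.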

Another important use of infinite lists is that they enable us to write functional programs representing networks of communicating processes. Consider for example the Hamming numbers problem - we have to print in ascending order all numbers of the form 2^a*3^b*5^c, for a,b,c>=0. There is a nice solution to this problem in terms of communicating processes, which can be expressed in Haskell as follows

hamming = 1 : merge (f 2) (merge (f 3) (f 5))
        where
        f a = [ n*a | n <- hamming ]
        merge (a:x) (b:y) | a < b     = a : merge x (b:y)
                          | a > b     = b : merge (a:x) y
                          | otherwise = a : merge x y

Abstraction and Generalization

Haskell supports abstraction in several ways:

where expressions
function definitions
data abstraction
higher-order functions
lazy evaluation

Data Abstraction

Haskell permits the definition of abstract types, whose implementation is hidden from the rest of the program. To show how this works we give the standard example of defining stack as an abstract data type (here based on lists):

module Stack (StackType, push, pop, top, empty)
where 
data StackType a = Empty | Stk a (StackType a)
push x s = Stk x s
pop (Stk _ s) = s
top (Stk x _) = x
empty = Empty

The constructors Empty and Stk, which comprise ``the implementation'' are not exported, and thus hidden outside of the module. To make the datatype concrete, one would write:

module Stack (StackType(Empty,Stk), push, ...)
...

Higher-Order Functions

Haskell is a fully higher order language: functions are first class citizens and can be both passed as parameters and returned as results. Function application is left associative, so f x y is parsed as (f x) y, meaning that the result of applying f to x is a function, which is then applied to y.

In Haskell every function of two or more arguments is actually a higher order function. This permits partial parameterization. For example, suppose member is a function such that member x a tests if the list x contains the element a (returning True or False as appropriate); the Prelude function elem performs the same test with its arguments in the opposite order. By partially parameterizing member we can derive many useful predicates, such as

vowel = member ['a','e','i','o','u']
digit = member ['0','1','2','3','4','5','6','7','8','9']
month = member ["Jan","Feb","Mar","Apr","May","Jun",
                "Jul","Aug","Sep","Oct","Nov","Dec"]

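A sketch of the predicates above for a current compiler. Since member is not a Prelude function, it is defined here in terms of elem; that definition is an assumption of this sketch.

```haskell
-- member xs a tests whether the list xs contains the element a.
member :: Eq a => [a] -> a -> Bool
member xs a = a `elem` xs

-- Partial parameterization: supply the list now, the element later.
vowel :: Char -> Bool
vowel = member "aeiou"

digit :: Char -> Bool
digit = member ['0' .. '9']

main :: IO ()
main = do
  putStrLn (filter vowel "functional")
  print (all digit "1991")
```

The program prints uioa and then True.
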
As another example of higher order programming consider the function foldr, defined by

foldr op k [] = k
foldr op k (a:x) = op a (foldr op k x)

Many of the standard list processing functions can be obtained by partially parameterizing foldr. Here are some examples.

sum = foldr (+) 0
product = foldr (*) 1
reverse = foldr postfix []
          where postfix a x = x ++ [a]
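A few further functions obtained from foldr, as a sketch. The primed names are chosen here to avoid clashing with the Prelude functions they imitate.

```haskell
-- Count the elements: ignore each element and add one.
length' :: [a] -> Int
length' = foldr (\_ n -> n + 1) 0

-- Apply f to each element while rebuilding the list.
map' :: (a -> b) -> [a] -> [b]
map' f = foldr (\a bs -> f a : bs) []

-- Append: rebuild xs in front of ys.
append' :: [a] -> [a] -> [a]
append' xs ys = foldr (:) ys xs

main :: IO ()
main = do
  print (length' "haskell")
  print (map' (* 2) [1, 2, 3])
  print (append' [1, 2] [3, 4])
```

The program prints 7, [2,4,6] and [1,2,3,4].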

Abstract data types

Overloading

Type Classes

I/O

Arrays

Types

Simple Types

Haskell provides three simple types, boolean, character and number.

Types	Values
Bool	True, False
Char	the ASCII character set
Int	minInt, ..., maxInt
Integer	arbitrary precision integers
Float	floating point, single precision
Double	floating point, double precision
Bin	binary numbers
String	list of characters
Functions	lambda abstractions and definitions
Lists	lists of objects of type T
Tuples	Algebraic data types
Numbers	integers and floating point numbers

Composite Types

Haskell provides two composite types, lists and tuples. The most commonly used data structure is the list. The elements of a list must all be of the same type. In Haskell lists are written with square brackets and commas. The elements of a tuple may be of mixed type and tuples are written with parentheses and commas. Tuples are analogous to records in Pascal (whereas lists are analogous to arrays). Tuples cannot be subscripted - their elements are accessed by pattern matching.

Type	Representation	Values
list	[ comma separated list ]	user defined
tuple	( comma separated list )	user defined

Here are several examples of lists and a tuple:

[]
["Mon","Tue","Wed","Thur","Fri"]
[1,2,3,4,5]
("Jones",True,False,39)

Type Declarations

While Haskell does not require explicit type declarations (the type inference system provides static type checking), it is good programming practice to provide explicit type declarations. Type declarations are of the form:

e :: t

where e is an expression and t is a type. For example, the factorial function has type

fac :: Integer -> Integer

while the function length which returns the length of a list has type

length :: [a] -> Int

where [a] denotes a list whose elements may be any type.

Type Predicates

Since Haskell provides a flexible type system it also provides type predicates to check the type of an object. Haskell provides three type predicates.

Predicate	Checks if
digit	argument is a digit
letter	argument is a letter
integer	argument is an integer

Expressions

Arithmetic Operators

Haskell provides the standard arithmetic operators.

Symbol	Operation
+	addition
-	subtraction
*	multiplication
/	real division
div	integer division
mod	modulus
^	to the power of
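The difference between real and integer division, and the behaviour of div and mod on negative numbers, can be seen in this short sketch. The operands are arbitrary; div and mod are written in backquotes to use them as infix operators.

```haskell
main :: IO ()
main = do
  print (7 / 2)          -- real division: 3.5
  print (7 `div` 2)      -- integer division: 3
  print (7 `mod` 2)      -- modulus: 1
  print ((-7) `div` 2)   -- div rounds toward negative infinity: -4
  print ((-7) `mod` 2)   -- the result takes the sign of the divisor: 1
  print (2 ^ 10)         -- 1024
```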

Tuples (records)

The elements of a tuple are accessed by pattern matching. Examples are given in the section on pattern matching.

Logical Operators

The following table summarizes the logical operators.

Symbol	Operation
not	negation
&&	logical conjunction
||	logical disjunction

Boolean Predicates

The following table summarizes the comparison operators, each of which returns a boolean value.

Symbol	Operation
==	equal
/=	not equal
<	less than
<=	less than or equal
>	greater than
>=	greater than or equal

Modules

At the top level, a Haskell program consists of a collection of modules. A module is really just one big declaration which begins with the keyword module. Here is an example:

module Tree ( Tree(Leaf,Branch), fringe ) where

data Tree a                  = Leaf a | Branch ( Tree a ) ( Tree a )

fringe :: Tree a -> [a]
fringe ( Leaf x )            = [x]
fringe ( Branch left right ) = fringe left ++ fringe right

Appendix

The following functions are part of the Haskell standard prelude.

BOOLEAN FUNCTIONS
CHARACTER FUNCTIONS
NUMERIC FUNCTIONS
SOME STANDARD FUNCTIONS
Prelude: PreludeList
Haskell provides a number of operations on lists. Haskell treats strings as lists of characters so that the list operations and functions also apply to strings.
Prelude: PreludeArray
Prelude: PreludeText
Prelude: PreludeIO

References

Bird and Wadler: Introduction to Functional Programming, Prentice Hall, New York, 1988.
Field and Harrison: Functional Programming, Addison-Wesley, Wokingham, England, 1988.
The Yale Haskell Group: The Yale Haskell Users Manual, Beta Release 1.1-0, May 1991.
Hudak, Paul et al.: Report on the Programming Language Haskell, Version 1.1, August 1991.
Peyton Jones, S. L.: The Implementation of Functional Programming Languages, Prentice-Hall, Englewood Cliffs, NJ, 1987.
