Streams of Data
Streams are (potentially infinite) sequences containing elements of
the same kind. The evaluation of a part of a stream is done on
demand, whenever it is needed by the current computation. A stream is
therefore a lazy data structure.
The stream type is an abstract data type; one does not need to
know how it is implemented. We manipulate objects of this type using
constructor functions and destructor
(or selector) functions. For the convenience of the user, Objective CAML has simple
syntactic constructs to construct streams and to access their
elements.
Warning
Streams are an extension of the language, not part of the stable
core of Objective CAML.
Construction
The syntactic sugar to construct streams is inspired by that for lists
and arrays. The empty stream is written:
# [<
>]
;;
- : 'a Stream.t = <abstr>
One may construct a stream by enumerating its elements, preceding each
one with an with a single quote (character '):
# [<
'
0
;
'
2
;
'
4
>]
;;
- : int Stream.t = <abstr>
Expressions not preceded by an apostrophe are considered to be
sub-streams:
# [<
'
0
;
[<
'
1
;
'
2
;
'
3
>]
;
'
4
>]
;;
- : int Stream.t = <abstr>
# let
s1
=
[<
'
1
;
'
2
;
'
3
>]
in
[<
s1;
'
4
>]
;;
- : int Stream.t = <abstr>
# let
concat_stream
a
b
=
[<
a
;
b
>]
;;
val concat_stream : 'a Stream.t -> 'a Stream.t -> 'a Stream.t = <fun>
# concat_stream
[<
'
"if"
;
'
"c"
;'
"then"
;'
"1"
>]
[<
'
"else"
;'
"2"
>]
;;
- : string Stream.t = <abstr>
The Stream module also provides other construction functions.
For instance, the functions of_channel and
of_string return a stream containing a sequence of
characters, received from an input stream or a string.
# Stream.of_channel
;;
- : in_channel -> char Stream.t = <fun>
# Stream.of_string
;;
- : string -> char Stream.t = <fun>
The deferred computation of streams makes it possible to manipulate
infinite data structures in a way similar to the type 'a enum
defined on page ??. We define the stream of
natural numbers by its first element and a function calculating the
stream of elements to follow:
# let
rec
nat_stream
n
=
[<
'n
;
nat_stream
(n+
1
)
>]
;;
val nat_stream : int -> int Stream.t = <fun>
# let
nat
=
nat_stream
0
;;
val nat : int Stream.t = <abstr>
Destruction and Matching of Streams
The primitive next permits us to evaluate, retrieve, and
remove the first element of a stream, all at once:
# for
i=
0
to
1
0
do
print_int
(Stream.next
nat)
;
print_string
" "
done
;;
0 1 2 3 4 5 6 7 8 9 10 - : unit = ()
# Stream.next
nat
;;
- : int = 11
When the stream is exhausted, an exception is raised.
# Stream.next
[<
>]
;;
Uncaught exception: Stream.Failure
To manipulate streams, Objective CAML offers a special-purpose matching
construct called destructive matching. The value matched is
calculated and removed from the stream. There is no notion of
exhaustive match for streams, and, since the data type is lazy and
potentially infinite, one may match less than the whole stream. The
syntax for matching is:
Syntax
match expr with parser
[< 'p1 ...>] -> expr1
| ...
The function next could be written:
# let
next
s
=
match
s
with
parser
[<
'x
>]
->
x
;;
val next : 'a Stream.t -> 'a = <fun>
# next
nat;;
- : int = 12
Note that the enumeration of natural numbers picks up where we left it
previously.
As with function abstraction, there is a syntactic form matching a
function parameter of type Stream.t.
Syntax
parser p -> ...
The function next can thus be rewritten:
# let
next
=
parser
[<
'x>]
->
x
;;
val next : 'a Stream.t -> 'a = <fun>
# next
nat
;;
- : int = 13
It is possible to match the empty stream, but take care: the stream
pattern [<>]
matches every stream. In fact, a stream
s is always equal to the stream [<
[<>]
;
s
>]
.
For this reason, one must reverse the usual order of matching:
# let
rec
it_stream
f
s
=
match
s
with
parser
[<
'x
;
ss
>]
->
f
x
;
it_stream
f
ss
|
[<>]
->
()
;;
val it_stream : ('a -> 'b) -> 'a Stream.t -> unit = <fun>
# let
print_int1
n
=
print_int
n
;
print_string" "
;;
val print_int1 : int -> unit = <fun>
# it_stream
print_int1
[<
'
1
;
'
2
;
'
3
>]
;;
1 2 3 - : unit = ()
Since matching is destructive, one can equivalently write:
# let
rec
it_stream
f
s
=
match
s
with
parser
[<
'x
>]
->
f
x
;
it_stream
f
s
|
[<>]
->
()
;;
val it_stream : ('a -> 'b) -> 'a Stream.t -> unit = <fun>
# it_stream
print_int1
[<
'
1
;
'
2
;
'
3
>]
;;
1 2 3 - : unit = ()
Although streams are lazy, they want to be helpful, and never refuse
to furnish a first element; when it has been supplied once it is lost.
This has consequences for matching. The following function is an
attempt (destined to fail) to display pairs from a stream of integers,
except possibly for the last element.
# let
print_int2
n1
n2
=
print_string
"("
;
print_int
n1
;
print_string
","
;
print_int
n2
;
print_string
")"
;;
val print_int2 : int -> int -> unit = <fun>
# let
rec
print_stream
s
=
match
s
with
parser
[<
'x;
'y
>]
->
print_int2
x
y;
print_stream
s
|
[<
'z
>]
->
print_int1
z;
print_stream
s
|
[<>]
->
print_newline()
;;
val print_stream : int Stream.t -> unit = <fun>
# print_stream
[<
'
1
;
'
2
;
'
3
>]
;;
(1,2)Uncaught exception: Stream.Error("")
The first two two members of the stream were displayed properly, but
during the evaluation of the recursive call (print_stream
[<
3
>]
),
the first pattern found a value for x, which was
thereby consumed. There remained nothing more for y. This
was what caused the error. In fact, the second pattern is useless,
because if the stream is not empty, then first pattern always begins
evaluation.
To obtain the desired result, we must sequentialize the matching:
# let
rec
print_stream
s
=
match
s
with
parser
[<
'x
>]
->
(match
s
with
parser
[<
'y
>]
->
print_int2
x
y;
print_stream
s
|
[<>]
->
print_int1
x;
print_stream
s)
|
[<>]
->
print_newline()
;;
val print_stream : int Stream.t -> unit = <fun>
# print_stream
[<
'
1
;
'
2
;
'
3
>]
;;
(1,2)3
- : unit = ()
If matching fails on the first element of a pattern however, then we
again have the familiar behavior of matching:
# let
rec
print_stream
s
=
match
s
with
parser
[<
'
1
;
'y
>]
->
print_int2
1
y;
print_stream
s
|
[<
'z
>]
->
print_int1
z;
print_stream
s
|
[<>]
->
print_newline()
;;
val print_stream : int Stream.t -> unit = <fun>
# print_stream
[<
'
1
;
'
2
;
'
3
>]
;;
(1,2)3
- : unit = ()
The Limits of Matching
Because it is destructive, matching streams differs from matching on
sum types. We will now illustrate how radically different it can be.
We can quite naturally write a function to compute the sum of the
elements of a stream:
# let
rec
sum
s
=
match
s
with
parser
[<
'n;
ss
>]
->
n+
(sum
ss)
|
[<>]
->
0
;;
val sum : int Stream.t -> int = <fun>
# sum
[<
'
1
;
'
2
;
'
3
;
'
4
>]
;;
- : int = 10
However, we can just as easily consume the stream from the inside,
naming the partial result:
# let
rec
sum
s
=
match
s
with
parser
[<
'n;
r
=
sum
>]
->
n+
r
|
[<>]
->
0
;;
val sum : int Stream.t -> int = <fun>
# sum
[<
'
1
;
'
2
;
'
3
;
'
4
>]
;;
- : int = 10
We will examine some other important uses of streams in
chapter 11, which is devoted to lexical and syntactic
analysis. In particular, we will see how consuming a stream from the
inside may be profitably used.