Compilation

The distribution of a language depends on the processor and the operating system. For each architecture, a distribution of Objective CAML contains the toplevel system, the bytecode compiler, and in most cases a native compiler.

Command Names

Figure 7.5 shows the command names of the different compilers in the various Objective CAML distributions. The first four commands are available in all distributions.

ocaml          toplevel loop
ocamlrun       bytecode interpreter
ocamlc         bytecode batch compiler
ocamlopt       native code batch compiler
ocamlc.opt     optimized bytecode batch compiler
ocamlopt.opt   optimized native code batch compiler
ocamlmktop     new toplevel constructor

Figure 7.5: Commands for compiling.

The optimized compilers are themselves compiled with the Objective CAML native compiler. They compile faster but are otherwise identical to their unoptimized counterparts.

Compilation Unit

A compilation unit is the smallest piece of an Objective CAML program that can be compiled. For the interactive system, the unit of compilation is a phrase of the language. For the batch compiler, the unit of compilation consists of two files: the source file and the interface file. The interface file is optional; if it does not exist, all global declarations in the source file are visible to other compilation units. The construction of interface files is described in the chapter on module programming (14). The two file types (source and interface) are distinguished by their file extensions.
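As a rough illustration of what an interface file does, the following sketch mimics the pair example.ml / example.mli inside a single file, using a module with an explicit signature. The names Example, f and helper are made up for the illustration; with real files, the signature part would live in example.mli and the structure part in example.ml.

```ocaml
(* Sketch only: Example, f and helper are hypothetical names.
   The signature plays the role of example.mli,
   the structure plays the role of example.ml. *)
module Example : sig
  val f : int -> int          (* exported: visible to other units *)
end = struct
  let helper x = x * 2        (* not in the signature: hidden *)
  let f x = helper x + 1
end

let () = Printf.printf "%d\n" (Example.f 3)
(* Writing Example.helper 3 here would be rejected by the type checker,
   because helper is not listed in the signature. *)
```

Without the signature, both f and helper would be accessible from outside, which corresponds to the case where no interface file exists.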

Naming Rules for File Extensions

Figure 7.6 presents the extensions of different files used for Objective CAML and C programs.

extension   meaning

.ml         source file
.mli        interface file
.cmo        object file (bytecode)
.cma        library object file (bytecode)
.cmi        compiled interface file
.cmx        object file (native)
.cmxa       library object file (native)
.c          C source file
.o          C object file (native)
.a          C library object file (native)

Figure 7.6: File extensions.

The files example.ml and example.mli form a compilation unit. The compiled interface file (example.cmi) is used by both the bytecode and the native code compiler. The C language related files are used when integrating C code with Objective CAML code (see chapter 12).

The Bytecode Compiler

The general form of the batch compiler commands is:

command options file_name

For example:

ocamlc -c example.ml

The command-line options for both the native and bytecode compilers follow typical Unix conventions. Each option is prefixed by the character -. File extensions are interpreted in the manner described by figure 7.6. In the above example, the file example.ml is considered an Objective CAML source file and is compiled. The compiler will produce the files example.cmo and example.cmi. The option -c informs the compiler to generate individual object files, which may be linked at a later time. Without this option, the compiler will produce an executable file named a.out.
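To make the two-step workflow concrete, here is a minimal sketch of a source file, with the corresponding commands given as comments. The file name example.ml matches the text; the function f and the output name example are assumptions chosen for the illustration.

```ocaml
(* example.ml: hypothetical contents for the commands below.

   Separate compilation, then linking, then execution:
     ocamlc -c example.ml            produces example.cmo and example.cmi
     ocamlc -o example example.cmo   links a bytecode executable named example
     ocamlrun example                runs it with the bytecode interpreter
*)
let f x = x + 1

let () = print_int (f 41); print_newline ()
```

Running ocamlc example.ml without -c and without -o would compile and link in one step, leaving the executable in a.out as described above.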

The table in figure 7.7 describes the principal options of the bytecode compiler. The table in figure 7.8 indicates other possible options.

Principal options

-a                      construct a library
-c                      compile without linking
-o name_of_executable   specify the name of the executable
-linkall                link with all libraries used
-i                      display all compiled global declarations
-pp command             use command as a preprocessor
-unsafe                 turn off index checking
-v                      display the version of the compiler
-w list                 select the warning levels given in list (see fig. 7.9)
-impl file              treat file as a Caml source file (.ml)
-intf file              treat file as a Caml interface file (.mli)
-I directory            add directory to the list of directories searched

Figure 7.7: Principal options of the bytecode compiler.

Other options

light processes         -thread (see chapter 19)
debugging               -g, -noassert (see chapter 10)
standalone executable   -custom, -cclib, -ccopt, -cc (see page ??)
runtime                 -make-runtime, -use-runtime
C interface             -output-obj (see chapter 12)

Figure 7.8: Other options for the bytecode compiler.

To display the list of bytecode compiler options, use the option -help.

The different levels of warning message are described in figure 7.9. A message level is a switch (enable/disable) represented by a letter. An upper case letter activates the level and a lower case letter disables it.

Principal levels

A/a   enable/disable all messages
F/f   partial application in a sequence
P/p   incomplete pattern matching (missing cases)
U/u   unused cases in pattern matching
X/x   enable/disable all other messages

For hidden objects: M/m and V/v (see chapter 15)

Figure 7.9: Description of compilation warnings.

By default, the highest level (A) is chosen by the compiler.
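As an illustration of the kind of code these levels report, the sketch below contains a deliberately incomplete pattern matching, which the compiler signals with a warning when level P is enabled. The function describe and the file name warn.ml are made up for the example, and the exact wording of the message depends on the compiler version.

```ocaml
(* Sketch only: describe is a hypothetical function.
   The match below deliberately omits most integers, so the compiler
   reports a warning for incomplete pattern matching when level P
   is enabled (for instance with: ocamlc -w P -c warn.ml). *)
let describe n =
  match n with
  | 0 -> "zero"
  | 1 -> "one"

let () =
  print_endline (describe 0);
  (* Any argument without a matching case raises Match_failure at run time. *)
  (try print_endline (describe 2)
   with Match_failure _ -> print_endline "no case for 2")
```

The warning does not stop compilation; the program is still produced and only fails at run time if an unmatched value is actually encountered.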

Example usage of the bytecode compiler is given in figure 7.10.

Figure 7.10: Session with the bytecode compiler.

Native Compiler

The native compiler behaves much like the bytecode compiler, but produces different types of files. The compilation options are generally the same as those described in figures 7.7 and 7.8, except for the runtime options of figure 7.8, which do not apply. Options specific to the native compiler are given in figure 7.11. The warning levels are the same.

-compact         optimize the produced code for space
-S               keep the assembly code in a file
-inline level    set the aggressiveness of inlining

Figure 7.11: Options specific to the native compiler.

Inlining is an elaborate form of the macro expansion performed by a preprocessor. For functions whose arguments are fixed, inlining replaces each function call with the body of the called function, so several calls produce several copies of that body. Inlining avoids the overhead of function call setup and return, at the expense of object code size. The principal inlining levels are:

0 : The expansion will be done only when it will not increase the size of the object code.
1 : This is the default value; it accepts a slight increase in code size.
n>1 : Raise the tolerance for growth in the code. Higher values result in more inlining.
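The effect described above can be pictured on a small function. In this sketch, square and sum_squares are hypothetical names; whether the native compiler actually inlines square depends on its version and on the -inline level, so the comments describe the intended transformation rather than a guaranteed result.

```ocaml
(* Sketch only: square and sum_squares are hypothetical names. *)

(* A small function with a fixed number of arguments:
   a typical candidate for inlining. *)
let square x = x * x

(* As written: two calls to square. *)
let sum_squares a b = square a + square b

(* What inlining conceptually turns sum_squares into:
   each call is replaced by a copy of the body of square. *)
let sum_squares_inlined a b = (a * a) + (b * b)

let () =
  Printf.printf "%d %d\n" (sum_squares 3 4) (sum_squares_inlined 3 4)
```

Compiling with ocamlopt and the -S option keeps the assembly file, which makes it possible to compare the code produced for different -inline levels.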

Toplevel Loop

The toplevel loop provides only two command line options.

-I directory: adds the indicated directory to the list of search paths for compiled source files.
-unsafe: instructs the compiler not to do bounds checking on array and string accesses.

The toplevel loop provides several directives which can be used to interactively modify its behavior. They are described in figure 7.12. All these directives begin with the character # and are terminated by ;;.

#quit ;;                        quit from the toplevel interaction
#directory directory ;;         add the directory to the search path
#cd directory ;;                change the working directory
#load object_file ;;            load an object file (.cmo)
#use source_file ;;             compile and load a source file
#print_depth depth ;;           modify the depth of printing
#print_length width ;;          modify the length of printing
#install_printer function ;;    specify a printing function
#remove_printer function ;;     remove a printing function
#trace function ;;              trace the arguments of the function
#untrace function ;;            stop tracing the function
#untrace_all ;;                 stop all tracing

Figure 7.12: Toplevel loop directives.

The directives dealing with directories respect the conventions of the operating system used.

The two loading directives do not behave in exactly the same way. The directive #use reads the source file as if it were typed directly in the toplevel loop. The directive #load loads a file with the extension .cmo. In the latter case, the global declarations of this file are not directly accessible. If the file example.ml contains the global declaration f, then once the bytecode is loaded (#load "example.cmo";;), the value of f is accessed as Example.f, where Example is the file name with its first letter capitalized. This notation comes from the module system of Objective CAML (see chapter 14).

The directives for the depth and width of printing are used to control the display of values. This is useful when it is necessary to display the contents of a value in detail.

The directives for printer redefinition are used to install or remove a user-defined printing function for values of a specified type. In order to integrate these printing functions into the default printing procedure, it is necessary to use the Format library (8) for their definition.
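A minimal sketch of such a printing function is given below. The type point and the function pp_point are made up for the illustration, and the signature shown (a function taking a Format.formatter and the value to print) is the one accepted by recent versions of the toplevel; older versions may expect a different form.

```ocaml
(* Sketch only: point and pp_point are hypothetical names. *)
type point = { x : int; y : int }

(* A printer built on the Format library, so that it cooperates
   with the pretty-printing done by the toplevel. *)
let pp_point fmt p = Format.fprintf fmt "<%d, %d>" p.x p.y

(* In the toplevel loop, the printer would be installed with:
     #install_printer pp_point;;
   and removed again with:
     #remove_printer pp_point;;  *)

(* Outside the toplevel, the same function can be used directly. *)
let () = Format.printf "%a@." pp_point { x = 1; y = 2 }
```

Once the printer is installed, the toplevel uses it each time it displays a value of type point, including values nested inside lists or other structures.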

The directives for tracing arguments and results of functions are particularly useful for debugging programs. They will be discussed in the chapter on program analysis (10).

Figure 7.13 shows a session in the toplevel loop.

Figure 7.13: Session with the toplevel loop.

Construction of a New Interactive System

The command ocamlmktop can be used to construct a new toplevel executable which has specific library modules loaded by default. For example, ocamlmktop is often used for pulling native object code libraries (typically written in C) into a new toplevel.

ocamlmktop options are a subset of those used by the bytecode compiler (ocamlc):

-cclib libname, -ccopt option, -custom, -I directory, -o executable_name

The chapter on graphics programming (5) uses this command to construct a toplevel system containing the Graphics library, in the following manner:

ocamlmktop -custom -o mytoplevel graphics.cma \
           -cclib -L/usr/X11/lib -cclib -lX11

This command constructs an executable with the name mytoplevel, containing the bytecode library graphics.cma. This standalone executable (-custom, see the following section) will be linked to the library X11 (libX11.a) which in turn will be looked up in the path /usr/X11/lib.
