Compilation
The distribution of a language depends on the processor and the operating system. For each architecture, a distribution of Objective CAML contains the toplevel system, the bytecode compiler, and in most cases a native compiler.
Command Names
Figure 7.5 shows the command names of the different compilers in the various Objective CAML distributions. The first four commands are available for all distributions.
ocaml | toplevel loop
ocamlrun | bytecode interpreter
ocamlc | bytecode batch compiler
ocamlopt | native code batch compiler
ocamlc.opt | optimized bytecode batch compiler
ocamlopt.opt | optimized native code batch compiler
ocamlmktop | new toplevel constructor
Figure 7.5: Commands for compiling.
The optimized compilers are themselves compiled with the Objective CAML native compiler. They compile faster but are otherwise identical to their unoptimized counterparts.
Compilation Unit
A compilation unit corresponds to the smallest piece of an Objective CAML program that can be compiled. For the interactive system, the unit of compilation corresponds to a phrase of the language. For the batch compiler, the unit of compilation is two files: the source file and the interface file. The interface file is optional: if it does not exist, then all global declarations in the source file will be visible to other compilation units. The construction of interface files is described in the chapter on module programming (14). The two file types (source and interface) are differentiated by separate file extensions.
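The role of the interface can be sketched inside a single file with a submodule and an explicit signature. This is only an approximation: the names Counter, next, reset and state are hypothetical, the signature stands in for what an interface file would contain, and the structure stands in for the source file.

```ocaml
(* Hypothetical example: the signature plays the part of an interface
   file (.mli), the structure the part of the source file (.ml). *)
module Counter : sig
  val next : unit -> int    (* visible to other code *)
  val reset : unit -> unit  (* visible to other code *)
end = struct
  let state = ref 0         (* absent from the signature, hence hidden *)
  let next () = incr state; !state
  let reset () = state := 0
end

let a = Counter.next ()
let b = Counter.next ()
let () = Counter.reset ()
let c = Counter.next ()
(* Referring to Counter.state at this point would be rejected by the
   type checker, because the signature does not export it. *)
```

Without the signature, state would be accessible like any other global declaration of the structure, which mirrors the behavior described for a source file that has no interface file.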
Naming Rules for File Extensions
Figure 7.6 presents the extensions of different files used for Objective CAML and C programs.
extension | meaning
.ml | source file
.mli | interface file
.cmo | object file (bytecode)
.cma | library object file (bytecode)
.cmi | compiled interface file
.cmx | object file (native)
.cmxa | library object file (native)
.c | C source file
.o | C object file (native)
.a | C library object file (native)
Figure 7.6: File extensions.
The files example.ml and example.mli form a compilation unit. The compiled interface file (example.cmi) is used by both the bytecode and native code compilers.
The C language related files are used when integrating C code with Objective CAML code (12).
The Bytecode Compiler
The general form of the batch compiler commands is:
command options file_name
For example:
ocamlc -c example.ml
The command-line options for both the native and bytecode compilers follow typical Unix conventions. Each option is prefixed by the character -. File extensions are interpreted in the manner described by figure 7.6. In the above example, the file example.ml is considered an Objective CAML source file and is compiled. The compiler will produce the files example.cmo and example.cmi. The option -c tells the compiler to generate individual object files, which may be linked at a later time. Without this option, the compiler will produce an executable file named a.out.
The table in figure 7.7 describes the principal options of the bytecode compiler. The table in figure 7.8 indicates other possible options.
Principal options
-a | construct a library
-c | compile without linking
-o name_of_executable | specify the name of the executable
-linkall | link all modules of the libraries used
-i | display all compiled global declarations
-pp command | use command as preprocessor
-unsafe | turn off index checking
-v | display the version of the compiler
-w list | select the warning message levels given in list (see fig. 7.9)
-impl file | indicate that file is a Caml source (.ml)
-intf file | indicate that file is a Caml interface (.mli)
-I directory | add directory to the list of directories searched
Figure 7.7: Principal options of the bytecode compiler.
Other options
lightweight processes | -thread (19)
debugging | -g, -noassert (10)
standalone executable | -custom, -cclib, -ccopt, -cc
runtime | -make-runtime, -use-runtime
C interface | -output-obj (12)
Figure 7.8: Other options for the bytecode compiler.
To display the list of bytecode compiler options, use the option -help.
The different levels of warning message are described in figure 7.9. A message level is a switch (enable/disable) represented by a letter. An upper case letter activates the level and a lower case letter disables it.
Principal levels
A/a | enable/disable all messages
F/f | partial application in a sequence
P/p | incomplete pattern matching
U/u | unused cases in pattern matching
X/x | enable/disable all other messages
M/m and V/v | hidden objects (see chapter 15)
Figure 7.9: Description of compilation warnings.
By default, the highest level (A) is chosen by the compiler.
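As an illustration of what level P reports, here is a minimal sketch built around a hypothetical type color and function name. The incomplete variant is shown only in a comment, so the code itself compiles without any warning.

```ocaml
type color = Red | Green | Blue

(* Writing the function as
     let name = function Red -> "red" | Green -> "green"
   would leave Blue unhandled; the compiler reports this kind of
   omission as an incomplete pattern matching. *)

(* Exhaustive version: every constructor has its own case. *)
let name = function
  | Red -> "red"
  | Green -> "green"
  | Blue -> "blue"
```

Disabling the level with -w p would silence the message for the incomplete variant, but applying that variant to Blue would still fail at run time.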
Example usage of the bytecode compiler is given in figure 7.10.
Figure 7.10: Session with the bytecode compiler.
Native Compiler
The native compiler behaves much like the bytecode compiler, but produces different types of files. The compilation options are generally the same as those described in figures 7.7 and 7.8, except for the runtime options of figure 7.8, which do not apply. Options specific to the native compiler are given in figure 7.11. The warning levels are the same.
-compact | optimize the produced code for space
-S | keep the assembly code in a file
-inline level | set the aggressiveness of inlining
Figure 7.11: Options specific to the native compiler.
Inlining is an elaborate form of the macro expansion performed in a preprocessing stage. For functions whose arguments are fixed, inlining replaces each function call with the body of the function called. Several different calls produce several copies of the function body. Inlining avoids the overhead that comes with function call setup and return, at the expense of object code size. Principal inlining levels are:
- 0: The expansion will be done only when it will not increase the size of the object code.
- 1: This is the default value; it accepts a slight increase in code size.
- n>1: Raise the tolerance for growth in the code. Higher values result in more inlining.
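The transformation can be pictured with a hand-written sketch. The functions square, sum_of_squares and sum_of_squares_inlined are hypothetical, and the last one is written by hand to show the idea; it is not actual compiler output.

```ocaml
(* A small function, a natural candidate for inlining. *)
let square x = x * x

(* As written by the programmer: two calls to square. *)
let sum_of_squares a b = square a + square b

(* What inlining conceptually produces: each call is replaced by a copy
   of the body, so the call overhead disappears but the body now
   appears twice in the code. *)
let sum_of_squares_inlined a b = (a * a) + (b * b)
```

Both versions compute the same result. The difference lies only in the trade-off between call overhead and code size described above.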
Toplevel Loop
The toplevel loop provides only two command line options.
- -I directory: adds the indicated directory to the list of search paths for compiled source files.
- -unsafe: instructs the compiler not to do bounds checking on array and string accesses.
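To see what bounds checking provides when -unsafe is not given, consider the following sketch. The helper safe_get and the array t are hypothetical names; the code relies on the checked access raising Invalid_argument for an index outside the array.

```ocaml
(* Returns None instead of failing when the index is out of range.
   This relies on bounds checking being enabled (no -unsafe). *)
let safe_get (a : int array) (i : int) : int option =
  try Some a.(i) with Invalid_argument _ -> None

let t = [| 10; 20; 30 |]
let inside = safe_get t 1   (* index within the array *)
let outside = safe_get t 3  (* index past the last element *)
```

With -unsafe the out-of-range access is no longer checked, so code that depends on this exception, as safe_get does, should not be run with that option.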
The toplevel loop provides several directives which can be used to interactively modify its behavior. They are described in figure 7.12. All these directives begin with the character # and are terminated by ;;.
#quit ;; | quit from the toplevel interaction
#directory directory ;; | add the directory to the search path
#cd directory ;; | change the working directory
#load object_file ;; | load an object file (.cmo)
#use source_file ;; | compile and load a source file
#print_depth depth ;; | modify the depth of printing
#print_length width ;; | modify the length of printing
#install_printer function ;; | specify a printing function
#remove_printer function ;; | remove a printing function
#trace function ;; | trace the arguments of the function
#untrace function ;; | stop tracing the function
#untrace_all ;; | stop all tracing
Figure 7.12: Toplevel loop directives.
The directives dealing with directories respect the conventions of the operating system used.
The loading directives do not have exactly the same behavior. The directive #use reads the source file as if it were typed directly in the toplevel loop. The directive #load loads the file with the extension .cmo. In the latter case, the global declarations of this file are not directly accessible. If the file example.ml contains the global declaration f, then once the bytecode is loaded (#load "example.cmo";;), the value of f is accessed as Example.f, where the first letter of the file name is capitalized. This notation comes from the module system of Objective CAML (see chapter 14).
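The qualified notation can be tried without a separate file by writing the module explicitly. In this sketch the structure Example is a hypothetical stand-in for the module that a loaded example.cmo would define, and the body of f is invented for illustration.

```ocaml
(* Hypothetical stand-in for the module defined by example.cmo:
   the file example.ml gives rise to a module named Example. *)
module Example = struct
  let f x = x + 1
end

(* After loading, f is reachable through its qualified name. *)
let r1 = Example.f 41

(* Opening the module makes the unqualified name available as well. *)
open Example
let r2 = f 41
```

With #use, by contrast, the declarations of the file are entered directly into the toplevel environment, so f would be available without any qualification.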
The directives for the depth and width of printing are used to control the display of values. This is useful when it is necessary to display the contents of a value in detail.
The directives for printer redefinition are used to install or remove a user defined printing function for values of a specified type. In order to integrate these printer functions into the default printing procedure, it is necessary to use the Format library (8) for the definition.
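A printing function suitable for #install_printer might look like the following sketch. The type point and the function print_point are hypothetical; the function takes a Format formatter followed by the value to print.

```ocaml
type point = { x : int; y : int }

(* A printer has type Format.formatter -> point -> unit. *)
let print_point ppf p = Format.fprintf ppf "(%d, %d)" p.x p.y

(* In the toplevel one would then type:  #install_printer print_point;;  *)

(* Outside the toplevel, the same function can be used with %a. *)
let shown = Format.asprintf "%a" print_point { x = 1; y = 2 }
```

Once installed, the toplevel uses this function every time it displays a value of type point, and #remove_printer print_point;; restores the default display.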
The directives for tracing arguments and results of functions are particularly useful for debugging programs. They will be discussed in the chapter on program analysis (10).
Figure 7.13 shows a session in the toplevel loop.
Figure 7.13: Session with the toplevel loop.
Construction of a New Interactive System
The command ocamlmktop can be used to construct a new toplevel executable which has specific library modules loaded by default. For example, ocamlmktop is often used for pulling native object code libraries (typically written in C) into a new toplevel.
ocamlmktop options are a subset of those used by the bytecode compiler (ocamlc):
-cclib libname, -ccopt option, -custom, -I directory, -o executable_name
The chapter on graphics programming (5) uses this command for constructing a toplevel system containing the Graphics library in the following manner:
ocamlmktop -custom -o mytoplevel graphics.cma -cclib \
-I/usr/X11/lib -cclib -lX11
This command constructs an executable with the name mytoplevel, containing the bytecode library graphics.cma. This standalone executable (-custom, see the following section) will be linked to the library X11 (libX11.a) which in turn will be looked up in the path /usr/X11/lib.