
HTTP Servlets

A servlet is a "module" that can be integrated into a server application to respond to client requests. Although a servlet need not use a specific protocol, we will use the HTTP protocol for communication (see figure 21.1). In practice, the term servlet refers to an HTTP servlet.

The classic method of constructing dynamic HTML pages on a server is to use CGI (Common Gateway Interface) commands. These take as argument a URL which can contain data coming from an HTML form. The execution then produces a new HTML page which is sent to the client. The following links describe the HTTP and CGI protocols.

Link: http://www.cis.ohio-state.edu/cgi-bin/rfc/rfc1945.html

Link: http://hoohoo.ncsa.uiuc.edu/docs/cgi/overview.html

CGI is a rather heavyweight mechanism because it launches a new program for each request.

HTTP servlets are launched just once, and can decode arguments in CGI format to execute a request. Servlets can take advantage of the Web browser's capabilities to construct a graphical interface for an application.



Figure 21.1: communication between a browser and an Objective CAML server


In this section we will define a server for the HTTP protocol. We will not handle the entire specification of the protocol, but instead will limit ourselves to those functions necessary for the implementation of a server that mimics the behavior of a CGI application.

Earlier, we defined a generic server module Gsd. We now give the code that applies this generic server to the processing of part of the HTTP protocol.

HTTP and CGI Formats

We want to obtain a server that imitates the behavior of a CGI application. One of the first tasks is to decode the format of HTTP requests with CGI extensions for argument passing.

The clients of this server can be browsers such as Netscape or Internet Explorer.

Receiving Requests

Requests in the HTTP protocol have essentially three components: a method, a URL and some data. The data must follow a particular format.

In this section we will construct a collection of functions for reading, decomposing and decoding the components of a request. These functions can raise the exception:

# exception Http_error of string ;;
exception Http_error of string


Decoding
The function decode, which uses the helper function rep_xcode, attempts to restore the characters which have been encoded by the HTTP client: spaces (which have been replaced by +), and certain reserved characters (which have been replaced by % followed by their two-digit hexadecimal code).


# let rec rep_xcode s i =
    let xs = "0x00" in
    String.blit s (i+1) xs 2 2;
    String.set s i (char_of_int (int_of_string xs));
    String.blit s (i+3) s (i+1) ((String.length s)-(i+3));
    String.set s ((String.length s)-2) '\000';
    Printf.printf"rep_xcode1(%s)\n" s ;;
val rep_xcode : string -> int -> unit = <fun>

# exception End_of_decode of string ;;
exception End_of_decode of string

# let decode s =
    try
      for i=0 to pred(String.length s) do
        match s.[i] with
          '+' -> s.[i] <- ' '
        | '%' -> rep_xcode s i
        | '\000' -> raise (End_of_decode (String.sub s 0 i))
        | _ -> ()
      done;
      s
    with
      End_of_decode s -> s ;;
val decode : string -> string = <fun>
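The listing above decodes the string in place, which relies on mutable strings; recent OCaml releases make strings immutable, so it no longer compiles there. As a minimal sketch of the same decoding rules (+ becomes a space, %XY becomes the character with hexadecimal code XY), one possible variant builds the result in a Buffer. The name url_decode and the choice to raise Http_error on a malformed escape are assumptions made for this illustration, not part of the original code.

```ocaml
exception Http_error of string

(* Hypothetical non-mutating counterpart of decode: the input string is
   left untouched and the decoded text is accumulated in a Buffer. *)
let url_decode (s : string) : string =
  let n = String.length s in
  let buf = Buffer.create n in
  let rec loop i =
    if i < n then
      match s.[i] with
      | '+' -> Buffer.add_char buf ' '; loop (i + 1)
      | '%' ->
          (* An escape needs two more characters after the '%'. *)
          if i + 2 >= n then raise (Http_error ("Truncated escape in: " ^ s));
          let code =
            match int_of_string_opt ("0x" ^ String.sub s (i + 1) 2) with
            | Some c -> c
            | None -> raise (Http_error ("Bad escape in: " ^ s))
          in
          Buffer.add_char buf (Char.chr code);
          loop (i + 3)
      | c -> Buffer.add_char buf c; loop (i + 1)
  in
  loop 0;
  Buffer.contents buf
```

For instance, url_decode "a+b%21" evaluates to "a b!".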


String manipulation functions
The module String_plus contains some functions for taking apart character strings:

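# module String_plus =
    struct
      (* prefix s n: the first n characters of s, or s itself if n is out of range *)
      let prefix s n =
        try String.sub s 0 n
        with Invalid_argument _ -> s

      (* suffix s i: the characters of s from index i on, or "" if i is out of range *)
      let suffix s i =
        try String.sub s i ((String.length s)-i)
        with Invalid_argument _ -> ""

      (* split c s: the substrings of s delimited by the character c *)
      let rec split c s =
        try
          let i = String.index s c in
          let s1, s2 = prefix s i, suffix s (i+1) in
          s1::(split c s2)
        with
          Not_found -> [s]

      (* unsplit c ss: joins the strings of ss, separated by the character c *)
      let unsplit c ss =
        let f s1 s2 = match s2 with "" -> s1 | _ -> s1^(String.make 1 c)^s2 in
        List.fold_right f ss ""
    end ;;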
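To make the behavior of split and unsplit concrete, here is a small standalone sketch. It reimplements the two functions with standard library calls (String.index_opt, String.concat) so that it can be run on its own; it is not the module above. One difference to keep in mind: this unsplit always inserts the separator, whereas the version above drops it in front of an empty trailing string.

```ocaml
(* Standalone approximation of String_plus.split. *)
let rec split c s =
  match String.index_opt s c with
  | None -> [s]
  | Some i ->
      String.sub s 0 i
      :: split c (String.sub s (i + 1) (String.length s - i - 1))

(* Standalone approximation of String_plus.unsplit. *)
let unsplit c ss = String.concat (String.make 1 c) ss

let () =
  (* Fields of a form are separated by '&'. *)
  assert (split '&' "start=1999&end=2000" = ["start=1999"; "end=2000"]);
  (* Without the separator, the whole string is the only element. *)
  assert (split '=' "name" = ["name"]);
  (* Two adjacent separators yield an empty string between them. *)
  assert (split '/' "a//b" = ["a"; ""; "b"]);
  assert (unsplit '&' ["a"; "b"] = "a&b")
```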


Decomposing data from a form
Requests typically arise from an HTML page containing a form. The contents of the form are transmitted as a character string containing the names and values associated with the fields of the form. The function get_form_content transforms such a string into an association list; it relies on get_field_pair to split each name=value field into a pair.

# let get_field_pair s =
    match String_plus.split '=' s with
      [n;v] -> n,v
    | _ -> raise (Http_error ("Bad field format : "^s)) ;;
val get_field_pair : string -> string * string = <fun>

# let get_form_content s =
    let ss = String_plus.split '&' s in
    List.map get_field_pair ss ;;
val get_form_content : string -> (string * string) list = <fun>
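As a worked example, the following standalone sketch applies the same decomposition to the kind of string a browser would send for the form used later in this chapter. It uses String.split_on_char from the standard library in place of String_plus.split so that it runs on its own; the sample field values are invented for the illustration.

```ocaml
exception Http_error of string

(* Split one "name=value" field into a pair. *)
let get_field_pair s =
  match String.split_on_char '=' s with
  | [n; v] -> (n, v)
  | _ -> raise (Http_error ("Bad field format : " ^ s))

(* Split "n1=v1&n2=v2&..." into an association list. *)
let get_form_content s =
  List.map get_field_pair (String.split_on_char '&' s)

let () =
  let fields = get_form_content "start=1999&end=2000&action=Send" in
  assert (fields = [("start", "1999"); ("end", "2000"); ("action", "Send")]);
  (* Fields are then looked up by name. *)
  assert (List.assoc "end" fields = "2000")
```

A field without an = sign, or with more than one, is rejected with Http_error, which the server reports to the client.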


Reading and decomposing
The function query_string extracts the method and the URL from a request and stores them in an array of character strings. One can thus use a standard CGI application which retrieves its arguments from the array of command-line arguments. The function get_query_string reads the request from an input channel with the auxiliary function get and passes it to query_string. We arbitrarily limit requests to a maximum size of 2555 characters.

# let get =
    let buff_size = 2555 in
    let buff = String.create buff_size in
    (fun ic -> String.sub buff 0 (input ic buff 0 buff_size)) ;;
val get : in_channel -> string = <fun>

# let query_string http_frame =
    try
      let i0 = String.index http_frame ' ' in
      let q0 = String_plus.prefix http_frame i0 in
      match q0 with
        "GET"
          -> begin
               let i1 = succ i0 in
               let i2 = String.index_from http_frame i1 ' ' in
               let q = String.sub http_frame i1 (i2-i1) in
               try
                 let i = String.index q '?' in
                 let q1 = String_plus.prefix q i in
                 let q = String_plus.suffix q (succ i) in
                 Array.of_list (q0::q1::(String_plus.split ' ' (decode q)))
               with
                 Not_found -> [|q0;q|]
             end
      | _ -> raise (Http_error ("Unsupported method: "^q0))
    with
      (* let our own errors through, so that "Unsupported method" is not masked *)
      Http_error _ as e -> raise e
    | _ -> raise (Http_error ("Unknown request: "^http_frame)) ;;
val query_string : string -> string array = <fun>

# let get_query_string ic =
let http_frame = get ic in
query_string http_frame;;
val get_query_string : in_channel -> string array = <fun>
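To see what the resulting array looks like, here is a simplified standalone sketch of the request-line analysis. It is an approximation written for this illustration, not the function above: it splits the request line on spaces with String.split_on_char, leaves the query part undecoded, and accepts a request line without a protocol version.

```ocaml
exception Http_error of string

(* Simplified request-line analysis: returns [|method; path|] or
   [|method; path; query|] for a GET request. *)
let query_string frame =
  match String.split_on_char ' ' frame with
  | "GET" :: target :: _ ->
      (match String.index_opt target '?' with
       | None -> [| "GET"; target |]
       | Some i ->
           let path = String.sub target 0 i in
           let query =
             String.sub target (i + 1) (String.length target - i - 1) in
           [| "GET"; path; query |])
  | meth :: _ :: _ -> raise (Http_error ("Unsupported method: " ^ meth))
  | _ -> raise (Http_error ("Unknown request: " ^ frame))

let () =
  (* A plain page request: method and path only. *)
  assert (query_string "GET /baro.html HTTP/1.0" = [| "GET"; "/baro.html" |]);
  (* A form submission: the text after '?' becomes the third element. *)
  assert (query_string "GET /fees_state?start=1999&end=2000 HTTP/1.0"
          = [| "GET"; "/fees_state"; "start=1999&end=2000" |])
```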


The Server

To obtain a CGI pseudo-server, able to process only the GET method, we write the class http_servlet, whose argument fun_serv is a function for processing HTTP requests such as might have been written for a CGI application.

# module Text_Server = Server (struct type t = string
                                      let to_string x = x
                                      let of_string x = x
                               end);;

# module P_Text_Server (P : PROTOCOL) =
    struct
      module Internal_Server = Server (P)

      class http_servlet n np fun_serv =
        object(self)
          inherit [P.t] Internal_Server.server n np

          method receive_h fd =
            let ic = Unix.in_channel_of_descr fd in
            input_line ic

          method process fd =
            let oc = Unix.out_channel_of_descr fd in (
              try
                let request = self#receive_h fd in
                let args = query_string request in
                fun_serv oc args;
              with
                Http_error s -> Printf.fprintf oc "HTTP error : %s <BR>" s
              | _ -> Printf.fprintf oc "Unknown error <BR>" );
            flush oc;
            Unix.shutdown fd Unix.SHUTDOWN_ALL
        end
    end;;


As we do not expect the servlet to communicate using Objective CAML's special internal values, we choose the type string as the protocol type. The functions of_string and to_string do nothing.

# module Simple_http_server =
    P_Text_Server (struct type t = string
                          let of_string x = x
                          let to_string x = x
                   end);;

Finally, we write the primary function to launch the service and construct an instance of the class http_servlet.

# let cgi_like_server port_num fun_serv =
let sv = new Simple_http_server.http_servlet port_num 3 fun_serv
in sv#start;;
val cgi_like_server : int -> (out_channel -> string array -> unit) -> unit =
<fun>
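The type of cgi_like_server shows what a request handler must look like: a function that receives the output channel of the connection and the array of request elements. As a minimal sketch, the handler below simply echoes its arguments as an HTML page; the name echo_serv is hypothetical and chosen for this illustration.

```ocaml
(* A hypothetical handler of type out_channel -> string array -> unit:
   it writes one line per request element to the client. *)
let echo_serv (oc : out_channel) (args : string array) : unit =
  output_string oc "<HTML><BODY>\n";
  Array.iteri
    (fun i a -> Printf.fprintf oc "args.(%d) = %s<BR>\n" i a)
    args;
  output_string oc "</BODY></HTML>\n"

(* With the definitions of this section loaded, the handler could be
   started with:  cgi_like_server 4003 echo_serv  *)
```

Because the handler only depends on an out_channel, it can be exercised without a network connection, for example by passing it a channel opened on a file.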


Testing the Servlet

It is always useful during development to be able to test the parts that are already built. For this purpose, we build a small HTTP server which sends the file specified in the HTTP request as is. The function simple_serv sends the file whose name follows the GET request (the second element of the argument array). The function also displays all of the arguments passed in the request.

# let send_file oc f =
    let ic = open_in_bin f in
    try
      while true do
        output_byte oc (input_byte ic)
      done
    with End_of_file -> close_in ic;;
val send_file : out_channel -> string -> unit = <fun>

# let simple_serv oc args =
    try
      Array.iter (fun x -> print_string (x^" ")) args;
      print_newline();
      send_file oc args.(1)
    with _ -> Printf.printf "error\n";;
val simple_serv : out_channel -> string array -> unit = <fun>

# let run n = cgi_like_server n simple_serv;;
val run : int -> unit = <fun>
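The function send_file above copies the file one byte at a time and closes the input channel only when the end of the file is reached. One possible variant, sketched here under the assumption of a recent OCaml with the Bytes module, copies the file in chunks and also closes the channel if writing fails. The chunk size of 4096 bytes is an arbitrary choice for this illustration.

```ocaml
(* Chunked variant of send_file: reads up to 4096 bytes at a time. *)
let send_file (oc : out_channel) (f : string) : unit =
  let ic = open_in_bin f in
  let chunk = Bytes.create 4096 in
  let rec copy () =
    (* input returns 0 only at end of file *)
    let n = input ic chunk 0 (Bytes.length chunk) in
    if n > 0 then begin
      output oc chunk 0 n;
      copy ()
    end
  in
  match copy () with
  | () -> close_in ic
  | exception e -> close_in_noerr ic; raise e
```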


The command run 4003 launches this servlet on port 4003. In addition, we launch a browser to issue a request to load the page baro.html on port 4003. Figure 21.2 shows the display of the contents of this page in the browser.



Figure 21.2: HTTP request to an Objective CAML servlet


The browser has sent the request GET /baro.html to load the page, and then the request GET /canard.gif to load the image.

HTML Servlet Interface

We will use a CGI-style server to build an HTML-based interface to the database of chapter 6 (see page ??).

The menu of the function main will now be displayed in a form on an HTML page, providing the same selections. The responses to requests are also HTML pages, generated dynamically by the servlet. The dynamic page construction makes use of the utilities defined below.

Application Protocol

Our application will use several elements from several protocols:
  1. Requests are transmitted from a Web browser to our application server in the HTTP request format.
  2. The data items within a request are encoded in the format used by CGI applications.
  3. The response to the request is presented as an HTML page.
  4. Finally, the nature of the request is specified in a format specific to the application.
We wish to respond to three kinds of request: queries for the list of mail addresses, queries for the list of email addresses, and queries for the state of received fees between two given dates. We name these query types mail_addr, email_addr and fees_state, respectively. In the last case, we also transmit two character strings containing the desired dates; they correspond to the values of the fields start and end on an HTML form.
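The application-specific part of this protocol can be summarized by a small data type. The sketch below is an illustration only: the type request and the function request_of_query are hypothetical names that do not appear in the application, which works directly on the string array. It assumes the array layout used in this chapter (method, request name, then the form data).

```ocaml
(* Hypothetical model of the three kinds of request. *)
type request =
  | Mail_addr
  | Email_addr
  | Fees_state of string * string   (* start date, end date *)

(* Classify an analyzed request; None means "not understood". *)
let request_of_query (q : string array) : request option =
  if Array.length q < 2 || q.(0) <> "GET" then None
  else
    match q.(1) with
    | "/mail_addr" -> Some Mail_addr
    | "/email_addr" -> Some Email_addr
    | "/fees_state" when Array.length q >= 3 ->
        (* Keep only the well-formed name=value fields. *)
        let fields =
          List.fold_right
            (fun f acc ->
              match String.split_on_char '=' f with
              | [n; v] -> (n, v) :: acc
              | _ -> acc)
            (String.split_on_char '&' q.(2)) []
        in
        (match List.assoc_opt "start" fields, List.assoc_opt "end" fields with
         | Some d1, Some d2 -> Some (Fees_state (d1, d2))
         | _ -> None)
    | _ -> None
```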

When a client first connects, the following page is sent. The names of the requests are encoded within it in the form of HTML anchors.
<HTML>
<TITLE> association </TITLE>
<BODY>
<HR>
<H1 ALIGN=CENTER>Association</H1>
<P>
<HR>
<UL>
<LI>List of
<A HREF="http://freres-gras.ufr-info-p6.jussieu.fr:12345/mail_addr">
mail addresses
</A>
<LI>List of
<A HREF="http://freres-gras.ufr-info-p6.jussieu.fr:12345/email_addr">
email addresses
</A>
<LI>State of received fees<BR>
<FORM 
 method="GET" 
 action="http://freres-gras.ufr-info-p6.jussieu.fr:12345/fees_state">
Start date : <INPUT type="text" name="start" value="">
End date : <INPUT type="text" name="end" value="">
<INPUT name="action" type="submit" value="Send">
</FORM>
</UL>
<HR>
</BODY>
</HTML>
We assume that this page is contained in the file assoc.html.

HTML Primitives

The HTML utility functions are grouped together into a single class called print. It has a field specifying the output channel. Thus, it can be used just as well in a CGI application (where the output channel is the standard output) as in an application using the HTTP server defined in the previous section (where the output channel is a network socket).

The proposed methods essentially allow us to encapsulate text within HTML tags. This text is either passed directly as an argument to the method in the form of a character string, or produced by a function. For example, the principal method page takes as its first argument a string corresponding to the header of the page, and as its second argument a function that prints out the contents of the page. The method page itself emits the enclosing HTML tags.

The names of the methods match the names of the corresponding HTML tags, with additional options added in some cases.

# class print (oc0:out_channel) =
    object(self)
      val oc = oc0
      method flush () = flush oc
      method str =
        Printf.fprintf oc "%s"
      method page header (body:unit -> unit) =
        Printf.fprintf oc "<HTML><HEAD><TITLE>%s</TITLE></HEAD>\n<BODY>" header;
        body();
        Printf.fprintf oc "</BODY>\n</HTML>\n"
      method p () =
        Printf.fprintf oc "\n<P>\n"
      method br () =
        Printf.fprintf oc "<BR>\n"
      method hr () =
        Printf.fprintf oc "\n<HR>\n"
      method h i s =
        Printf.fprintf oc "<H%d>%s</H%d>" i s i
      method h_center i s =
        Printf.fprintf oc "<H%d ALIGN=\"CENTER\">%s</H%d>" i s i
      method form url (form_content:unit -> unit) =
        Printf.fprintf oc "<FORM method=\"post\" action=\"%s\">\n" url;
        form_content ();
        Printf.fprintf oc "</FORM>"
      method input_text =
        Printf.fprintf oc
          "<INPUT type=\"text\" name=\"%s\" size=\"%d\" value=\"%s\">\n"
      method input_hidden_text =
        Printf.fprintf oc "<INPUT type=\"hidden\" name=\"%s\" value=\"%s\">\n"
      method input_submit =
        Printf.fprintf oc "<INPUT name=\"%s\" type=\"submit\" value=\"%s\">"
      method input_radio =
        Printf.fprintf oc "<INPUT type=\"radio\" name=\"%s\" value=\"%s\">\n"
      method input_radio_checked =
        Printf.fprintf oc
          "<INPUT type=\"radio\" name=\"%s\" value=\"%s\" CHECKED>\n"
      method option =
        Printf.fprintf oc "<OPTION> %s\n"
      method option_selected opt =
        Printf.fprintf oc "<OPTION SELECTED> %s" opt
      method select name options selected =
        Printf.fprintf oc "<SELECT name=\"%s\">\n" name;
        List.iter
          (fun s -> if s=selected then self#option_selected s else self#option s)
          options;
        Printf.fprintf oc "</SELECT>\n"
      method options selected =
        List.iter
          (fun s -> if s=selected then self#option_selected s else self#option s)
    end ;;
We will assume that these utilities are provided by the module Html_frame.
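The following sketch shows how such a class is typically used: the caller passes page a title and a function that emits the body. To keep the example self-contained, it declares a reduced class with only three of the methods above; the function hello_page is a hypothetical name introduced for this illustration.

```ocaml
(* Reduced version of the class print, limited to three methods. *)
class print (oc : out_channel) =
  object
    method str s = output_string oc s
    method br () = output_string oc "<BR>\n"
    method page header (body : unit -> unit) =
      Printf.fprintf oc "<HTML><HEAD><TITLE>%s</TITLE></HEAD>\n<BODY>" header;
      body ();
      Printf.fprintf oc "</BODY>\n</HTML>\n"
  end

(* The body is emitted between the opening and closing tags
   produced by the method page. *)
let hello_page (p : print) : unit =
  p#page "Hello" (fun () -> p#str "Hello, world"; p#br ())
```

Calling hello_page (new print stdout) writes the page to the standard output, as a CGI program would; passing the channel of a client connection instead sends the same page over the network.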

Dynamic Pages for Managing the Association Database

For each of the three kinds of request, the application must construct a page in response. For this purpose we use the utility module Html_frame given above. This means that the pages are not really constructed, but that their various components are emitted sequentially on the output channel.
We provide an additional (virtual) page to be returned in response to a request that is invalid or not understood.

Error page
The function print_error takes as arguments an object for emitting an HTML page (i.e., an instance of the class print) and a character string containing the error message.


# let print_error (print:Html_frame.print) s =
let print_body() =
print#str s; print#br()
in
print#page "Error" print_body ;;
val print_error : Html_frame.print -> string -> unit = <fun>


All of our functions for emitting responses to requests will take as their first argument such an object for emitting an HTML page.

List of mail addresses
To obtain the page giving the response to a query for the list of mail addresses, we will format the list of character strings obtained by the function mail_addresses, which was defined as part of the database (see page ??). We will assume that this function, and all others directly involving requests to the database, have been defined in a module named Assoc.

To emit this list, we use a function for outputting simple lines:

# let print_lines (print:Html_frame.print) ls =
let print_line l = print#str l; print#br() in
List.iter print_line ls ;;
val print_lines : Html_frame.print -> string list -> unit = <fun>


The function for responding to a query for the list of mail addresses is:

# let print_mail_addresses print db =
print#page "Mail addresses"
(fun () -> print_lines print (Assoc.mail_addresses db))
;;
val print_mail_addresses : Html_frame.print -> Assoc.data_base -> unit =
<fun>


In addition to the parameter for emitting a page, the function print_mail_addresses takes the database as its second parameter.

List of email addresses
This function is built on the same principles as that giving the list of mail addresses, except that it calls the function email_addresses from the module Assoc:

# let print_email_addresses print db =
print#page "Email addresses"
(fun () -> print_lines print (Assoc.email_addresses db)) ;;
val print_email_addresses : Html_frame.print -> Assoc.data_base -> unit =
<fun>


State of received fees
The same principle also governs the definition of this function: retrieving the data corresponding to the request (which here is a pair), then emitting the corresponding character strings.

# let print_fees_state print db d1 d2 =
    let ls, t = Assoc.fees_state db d1 d2 in
    let page_body() =
      print_lines print ls;
      print#str ("Total : "^(string_of_float t));
      print#br()
    in
    print#page "State of received fees" page_body ;;
val print_fees_state :
  Html_frame.print -> Assoc.data_base -> string -> string -> unit = <fun>


Analysis of Requests and Response

We define two functions for producing responses based on an HTTP request. The first (print_get_answer) responds to a request presumed to be formulated using the GET method of the HTTP protocol. The second alters the production of the answer according to the actual method that the request used.

These two functions take as their second argument an array of character strings containing the elements of the HTTP request as analyzed by the function get_query_string (see page ??). The first element of the array contains the method, the second the name of the database request.
In the case of a query for the state of received fees, the start and end dates for the request are contained in the two fields of the form associated with the query. The data from the form are contained in the third element of the array, which must be decomposed by the function get_form_content (see page ??).


# let print_get_answer print q db =
    match q.(1) with
    | "/mail_addr" -> print_mail_addresses print db
    | "/email_addr" -> print_email_addresses print db
    | "/fees_state"
        -> let nvs = get_form_content q.(2) in
           let d1 = List.assoc "start" nvs
           and d2 = List.assoc "end" nvs in
           print_fees_state print db d1 d2
    | _ -> print_error print ("Unknown request: "^q.(1)) ;;
val print_get_answer :
  Html_frame.print -> string array -> Assoc.data_base -> unit = <fun>

# let print_answer print q db =
    try
      match q.(0) with
        "GET" -> print_get_answer print q db
      | _ -> print_error print ("Unsupported method: "^q.(0))
    with
      e
        -> let s = Array.fold_right (^) q "" in
           print_error print ("Something wrong with request: "^s) ;;
val print_answer :
  Html_frame.print -> string array -> Assoc.data_base -> unit = <fun>


Main Entry Point and Application

The application is a standalone executable that takes the port number as a parameter. It reads in the database before launching the server. The main function is obtained from the function print_answer defined above and from the generic HTTP server function cgi_like_server defined in the previous section (see page ??). The latter function is located in the module Servlet.

# let get_port_num() =
    if (Array.length Sys.argv) < 2 then 12345
    else
      try int_of_string Sys.argv.(1)
      with _ -> 12345 ;;
val get_port_num : unit -> int = <fun>

# let main() =
    let db = Assoc.read_base "assoc.dat" in
    let assoc_answer oc q = print_answer (new Html_frame.print oc) q db in
    Servlet.cgi_like_server (get_port_num()) assoc_answer ;;
val main : unit -> unit = <fun>


To obtain a complete application, we combine the definitions of the display functions into a file httpassoc.ml. The file ends with a call to the function main:
main() ;;
We can then produce an executable named assocd using the compilation command:
ocamlc -thread -custom -o assocd unix.cma threads.cma \
       gsd.cmo servlet.cmo html_frame.cmo string_plus.cmo assoc.cmo \
       httpassoc.ml -cclib -lunix -cclib -lthreads
All that's left is to launch the server, load the HTML page contained in the file assoc.html given at the beginning of this section (page ??), and click.

Figure 21.3 shows an example of the application in use.


Figure 21.3: HTTP request to an Objective CAML servlet


The browser establishes an initial connection with the servlet, which sends it the menu page. Once the entry fields are filled in, the user sends a new request which contains the data entered. The server decodes the request and calls on the association database to retrieve the desired information. The result is translated into HTML and sent to the client, which then displays this new page.

To Learn More

This application has numerous possible enhancements. First of all, the HTTP protocol used here is overly simple compared to later versions, in which a response begins with a header giving, among other things, the type and length of the page being sent. Likewise, the POST method, which transmits form data in the body of the request and is typically used for requests that modify the state of the server, is not supported.

To be able to describe the type of a page to be returned, the servlet would have to support the MIME convention, which is used for describing documents such as those attached to email messages.
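As a rough sketch of what such support could look like, the two functions below choose a MIME type from a file extension and build a response that starts with a status line and the Content-Type and Content-Length headers. Both function names are hypothetical, the extension table is deliberately tiny, and only the success case (status 200) is handled.

```ocaml
(* Hypothetical helper: guess a MIME type from the file extension. *)
let mime_type_of_file (name : string) : string =
  match String.lowercase_ascii (Filename.extension name) with
  | ".html" | ".htm" -> "text/html"
  | ".gif" -> "image/gif"
  | ".txt" -> "text/plain"
  | _ -> "application/octet-stream"

(* Hypothetical helper: prefix a body with a status line and headers.
   Header lines end with CR LF; an empty line separates them from the body. *)
let http_response ?(content_type = "text/html") (body : string) : string =
  Printf.sprintf
    "HTTP/1.0 200 OK\r\nContent-Type: %s\r\nContent-Length: %d\r\n\r\n%s"
    content_type (String.length body) body
```

A handler would then write http_response ~content_type:(mime_type_of_file f) contents to the output channel instead of the bare contents of the file f.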

The transmission of images, such as in figure 21.2, makes it possible to construct interfaces for 2-player games (see chapter 17), where one associates links with drawings of positions to be played. Since the server knows which moves are legal, only the valid positions are associated with links.

The MIME extension also allows defining new types of data. One can thus support a private protocol for Objective CAML values by defining a new MIME type. These values will be understandable only by an Objective CAML program using the same private protocol. In this way, a request by a client for a remote Objective CAML value can be issued via HTTP. One can even pass a serialized closure as an argument within an HTTP request. This, once reconstructed on the server side, can be executed to provide the desired result.
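The idea of a private protocol for values can be sketched with the standard Marshal module. The example below serializes an ordinary data value and encodes the resulting bytes in hexadecimal so that they can travel as text; the function names and the hexadecimal encoding are assumptions made for this illustration, and closures are deliberately left out. Note that Marshal.from_string is not type-safe: the receiver must annotate the result with the type that the sender actually used.

```ocaml
(* Encode each byte of s as two hexadecimal digits. *)
let hex_of_string (s : string) : string =
  let buf = Buffer.create (2 * String.length s) in
  String.iter
    (fun c -> Buffer.add_string buf (Printf.sprintf "%02x" (Char.code c)))
    s;
  Buffer.contents buf

(* Inverse of hex_of_string; assumes well-formed input. *)
let string_of_hex (h : string) : string =
  String.init (String.length h / 2)
    (fun i -> Char.chr (int_of_string ("0x" ^ String.sub h (2 * i) 2)))

(* Hypothetical transport encoding of an OCaml value (no closures). *)
let encode_value (v : 'a) : string = hex_of_string (Marshal.to_string v [])

(* The caller is responsible for the type annotation on the result. *)
let decode_value (h : string) : 'a = Marshal.from_string (string_of_hex h) 0

let () =
  let v = [("start", "1999"); ("end", "2000")] in
  let received : (string * string) list = decode_value (encode_value v) in
  assert (received = v)
```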



