Sockets

Chapters 18 and 19 presented two ways to perform interprocess communication: pipes and channels. Both methods rely on a logical model of concurrency. In general, they do not improve performance, insofar as the communicating processes share resources, in particular the same processor. The third possibility, which we present in this section, uses sockets for communication. This method originated in the Unix world. Sockets allow communication between processes executing on the same machine or on different machines.

Description and Creation

A socket is responsible for establishing communication with another socket, with the goal of transferring information. We enumerate the different situations that may be encountered as well as the commands and datatypes that are used by TCP/IP sockets. The classic metaphor is to compare sockets to telephone sets.

In order to work, the machine must be connected to the network (socket).
To receive a call, it is necessary to possess a number of the type sockaddr (bind).
During a call, it is possible to receive another call if the configuration allows it (listen).
It is not necessary to have one's own number to call another set; once the connection is established, communication takes place in both directions (connect).

Domains.

Sockets belong to different domains, according to whether they are meant to communicate internally or externally. The Unix library defines two possible domains, given by the constructors of the following type:


# type socket_domain = PF_UNIX | PF_INET;;

The first domain corresponds to local communication, and the second, to communication over the Internet. These are the principal domains for sockets.

In the following, we use sockets belonging only to the Internet domain.

Types and protocols.

Regardless of their domain, sockets define certain communications properties (reliability, ordering, etc.) represented by the type constructors:


# type socket_type = SOCK_STREAM | SOCK_DGRAM | SOCK_SEQPACKET | SOCK_RAW ;;

According to the type of socket used, the underlying communications protocol obeys definite characteristics. Each type of communication is associated with a default protocol.

In fact, we will only use the first kind of communication, SOCK_STREAM, with the default protocol TCP. This protocol guarantees reliability and ordering, prevents duplication of exchanged messages, and works in connected mode.

For more information, we refer the reader to the Unix literature, for example [Ste92].

Creation.

The function to create sockets is:


# Unix.socket ;;
- : Unix.socket_domain -> Unix.socket_type -> int -> Unix.file_descr = <fun>

The third argument allows specification of the protocol associated with communication. The value 0 is interpreted as ``the default protocol'' associated with the pair (domain, type) argument used for the creation of the socket. The value returned by this function is a file descriptor. Thus such a value can be used with the standard input-output functions in the Unix library.

We can create a TCP/IP socket with:


# let s_descr = Unix.socket Unix.PF_INET Unix.SOCK_STREAM 0 ;;
val s_descr : Unix.file_descr = <abstr>

Warning

Even though the socket function returns a value of type file_descr, the system distinguishes descriptors for files from those associated with sockets. You can use the file functions in the Unix library with descriptors for sockets, but an exception is raised when a classical file descriptor is passed to a function expecting a descriptor for a socket.
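The warning can be observed with a small experiment. The sketch below is only an illustration: it assumes a Unix-like system, and the name rejected is ours. It passes one end of a pipe, which is an ordinary descriptor, to Unix.listen and records the resulting Unix_error.

```ocaml
(* Sketch: a pipe end is a valid file descriptor but not a socket, so a
   socket-only operation such as Unix.listen is rejected by the system.
   The name [rejected] is illustrative, not taken from the text. *)
let rejected =
  let (pipe_in, pipe_out) = Unix.pipe () in
  let result =
    try Unix.listen pipe_in 1; None
    with Unix.Unix_error (err, fn, _) -> Some (Unix.error_message err, fn)
  in
  Unix.close pipe_in;
  Unix.close pipe_out;
  result

let () =
  match rejected with
  | Some (msg, fn) -> Printf.printf "%s failed: %s\n" fn msg
  | None -> print_endline "the call was unexpectedly accepted"
```

The exact error message depends on the system, which is why the sketch prints it instead of comparing it with a fixed string.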

Closing.

Like all file descriptors, a socket is closed by the function:


# Unix.close ;;
- : Unix.file_descr -> unit = <fun>

When a process finishes via a call to exit, all open file descriptors are closed automatically.
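In a long-running process it is usually safer not to rely on exit for cleanup. One possible way to make sure a socket is closed is sketched below. The helper with_tcp_socket is hypothetical (it is not part of the Unix library), and Fun.protect assumes OCaml 4.08 or later.

```ocaml
(* Hypothetical helper: run [f] on a fresh TCP socket and close the
   descriptor afterwards, even if [f] raises an exception. *)
let with_tcp_socket f =
  let fd = Unix.socket Unix.PF_INET Unix.SOCK_STREAM 0 in
  Fun.protect ~finally:(fun () -> Unix.close fd) (fun () -> f fd)

(* Example use: read the port of a fresh, still unbound socket. *)
let initial_port =
  with_tcp_socket (fun fd ->
    match Unix.getsockname fd with
    | Unix.ADDR_INET (_, port) -> port
    | Unix.ADDR_UNIX _ -> failwith "not INET")
```

The descriptor never escapes the helper, so it cannot be used by mistake after it has been closed.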

Addresses and Connections

A socket does not have an address when it is created. In order to set up a connection between two sockets, the caller must know the address of the receiver.

The address of a socket (TCP/IP) consists of an IP address and a port number. The address of a socket in the Unix domain consists simply of a file name.


# type sockaddr =
   ADDR_UNIX of string | ADDR_INET of inet_addr * int ;;
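As an illustration, both kinds of address can be built and displayed as follows. This is a sketch: the function string_of_sockaddr and the path /tmp/example.sock are hypothetical names chosen for the example, while Unix.inet_addr_loopback is the library's value for the address 127.0.0.1.

```ocaml
(* Hypothetical helper: render a socket address as text. *)
let string_of_sockaddr = function
  | Unix.ADDR_UNIX path -> "unix:" ^ path
  | Unix.ADDR_INET (addr, port) ->
      Printf.sprintf "%s:%d" (Unix.string_of_inet_addr addr) port

(* An Internet address: the loopback interface and port 12345. *)
let inet_example = Unix.ADDR_INET (Unix.inet_addr_loopback, 12345)

(* A Unix-domain address: just a file name (hypothetical path). *)
let unix_example = Unix.ADDR_UNIX "/tmp/example.sock"

let () =
  print_endline (string_of_sockaddr inet_example);
  print_endline (string_of_sockaddr unix_example)
```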

Binding a socket to an address.

The first thing to do in order to receive calls after the creation of a socket is to bind the socket to an address. This is the job of the function:


# Unix.bind ;;
- : Unix.file_descr -> Unix.sockaddr -> unit = <fun>

Indeed, we already have a socket descriptor, but the address associated with it at creation is hardly useful, as the following example shows:


# let (addr_in, p_num) = 
   match Unix.getsockname s_descr with 
     Unix.ADDR_INET (a,n) -> (a,n)
   |  _ -> failwith "not INET" ;;
val addr_in : Unix.inet_addr = <abstr>
val p_num : int = 0
# Unix.string_of_inet_addr addr_in ;;
- : string = "0.0.0.0"

We need to create a useful address and to associate it with our socket. We reuse our local address my_addr as described on page ?? and choose port 12345 which, in general, is unused.


# Unix.bind s_descr (Unix.ADDR_INET(my_addr, 12345)) ;;
- : unit = ()
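When a fixed port such as 12345 may already be taken, an alternative is to ask the system to choose one. The sketch below assumes that binding to port 0 on the loopback address is acceptable for the application. It then reads back the port that was actually assigned, using the same getsockname pattern as above. The name bound_port is ours.

```ocaml
(* Sketch: bind to port 0 so that the system picks a free port, then
   query the socket to find out which port was assigned. *)
let bound_port =
  let fd = Unix.socket Unix.PF_INET Unix.SOCK_STREAM 0 in
  Fun.protect ~finally:(fun () -> Unix.close fd) (fun () ->
    Unix.bind fd (Unix.ADDR_INET (Unix.inet_addr_loopback, 0));
    match Unix.getsockname fd with
    | Unix.ADDR_INET (_, port) -> port
    | Unix.ADDR_UNIX _ -> failwith "not INET")

let () = Printf.printf "the system assigned port %d\n" bound_port
```

A server bound this way has to publish its port by some other means, which is why well-known services use fixed port numbers instead.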

Listening and accepting connections.

Two more operations are needed before our socket is fully able to receive calls: defining its listening capacity and allowing it to accept connections. These are the respective roles of the two functions:


# Unix.listen ;;
- : Unix.file_descr -> int -> unit = <fun>
# Unix.accept ;;
- : Unix.file_descr -> Unix.file_descr * Unix.sockaddr = <fun>

The second argument to the listen function gives the maximum number of pending connection requests. The call to the accept function waits for a connection request. When accept finishes, it returns the descriptor for a socket, the so-called service socket, together with the address of the caller. This service socket is automatically linked to an address. The accept function may only be applied to sockets that have called listen, that is, to sockets that have set up a queue of connection requests.
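The sequence bind, listen, accept can be sketched in a single process. The code below is an illustration under stated assumptions, not a complete server: make_listener is a hypothetical helper, the socket is bound to the loopback address on a port chosen by the system, and the connection request is issued by the same process so that accept returns at once.

```ocaml
(* Hypothetical helper: create a listening socket on the loopback address
   and return it with the address the system assigned to it. *)
let make_listener backlog =
  let fd = Unix.socket Unix.PF_INET Unix.SOCK_STREAM 0 in
  Unix.bind fd (Unix.ADDR_INET (Unix.inet_addr_loopback, 0));
  Unix.listen fd backlog;          (* queue of pending connection requests *)
  (fd, Unix.getsockname fd)

let accepted_from_loopback =
  let (listener, addr) = make_listener 5 in
  let client = Unix.socket Unix.PF_INET Unix.SOCK_STREAM 0 in
  Unix.connect client addr;        (* the request waits in the queue *)
  let (service, peer) = Unix.accept listener in
  let from_loopback =
    match peer with
    | Unix.ADDR_INET (a, _) -> Unix.string_of_inet_addr a = "127.0.0.1"
    | Unix.ADDR_UNIX _ -> false
  in
  List.iter Unix.close [service; client; listener];
  from_loopback
```

In a real server, accept is normally called in a loop, and each service socket is handed to a separate process or thread.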

Connection requests.

The function reciprocal to accept is:


# Unix.connect ;;
- : Unix.file_descr -> Unix.sockaddr -> unit = <fun>

A call to Unix.connect s_descr s_addr establishes a connection between the local socket s_descr (which is automatically bound) and the socket with address s_addr (which must exist).
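The automatic binding of the calling socket can be checked by comparing its port before and after the call to connect. The sketch below makes the same assumptions as before: a listening socket on the loopback address, in the same process, with a system-chosen port. The names port_of, port_before, port_after and peer_is_server are ours.

```ocaml
(* Hypothetical helper: the local port of an Internet-domain socket. *)
let port_of fd =
  match Unix.getsockname fd with
  | Unix.ADDR_INET (_, port) -> port
  | Unix.ADDR_UNIX _ -> failwith "not INET"

let (port_before, port_after, peer_is_server) =
  (* The receiver must exist before the caller can connect to it. *)
  let listener = Unix.socket Unix.PF_INET Unix.SOCK_STREAM 0 in
  Unix.bind listener (Unix.ADDR_INET (Unix.inet_addr_loopback, 0));
  Unix.listen listener 1;
  let server_addr = Unix.getsockname listener in
  (* The caller never calls bind itself. *)
  let client = Unix.socket Unix.PF_INET Unix.SOCK_STREAM 0 in
  let before = port_of client in
  Unix.connect client server_addr;
  let after = port_of client in
  let same = (Unix.getpeername client = server_addr) in
  Unix.close client;
  Unix.close listener;
  (before, after, same)
```

Before the call the port is 0, as in the getsockname example above. After the call the system has assigned a port to the caller.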

Communication.

From the moment that a connection is established between two sockets, the processes owning them can communicate in both directions. The input-output functions are those in the Unix module, described in Chapter 18.
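As a minimal sketch of such an exchange, the code below sends a message in one direction and an answer in the other, using Unix.write_substring and Unix.read. Both ends live in the same process only to keep the example self-contained. The helper read_exactly and the uppercase "protocol" are assumptions made for the illustration.

```ocaml
(* Hypothetical helper: read exactly [len] bytes, looping because a single
   Unix.read may return fewer bytes than requested. *)
let read_exactly fd len =
  let buf = Bytes.create len in
  let rec loop ofs =
    if ofs < len then begin
      let n = Unix.read fd buf ofs (len - ofs) in
      if n = 0 then failwith "connection closed early";
      loop (ofs + n)
    end
  in
  loop 0;
  Bytes.to_string buf

let reply =
  (* Set up a connected pair: [client] on one side, [service] on the other. *)
  let listener = Unix.socket Unix.PF_INET Unix.SOCK_STREAM 0 in
  Unix.bind listener (Unix.ADDR_INET (Unix.inet_addr_loopback, 0));
  Unix.listen listener 1;
  let client = Unix.socket Unix.PF_INET Unix.SOCK_STREAM 0 in
  Unix.connect client (Unix.getsockname listener);
  let (service, _) = Unix.accept listener in
  (* Caller to receiver. *)
  let msg = "hello" in
  ignore (Unix.write_substring client msg 0 (String.length msg));
  let received = read_exactly service (String.length msg) in
  (* Receiver to caller, over the same connection. *)
  let answer = String.uppercase_ascii received in
  ignore (Unix.write_substring service answer 0 (String.length answer));
  let result = read_exactly client (String.length answer) in
  List.iter Unix.close [client; service; listener];
  result
```

The messages here are short enough to fit in the system's buffers, so a single process can play both roles without blocking. With separate processes, each side simply reads and writes its own descriptor.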