The Internet
The Internet is a network of networks.
Their interconnection is organized as a hierarchy of domains,
subdomains, and so on, through interfaces. An interface
is the hardware in a computer that allows it to be connected
(typically, an Ethernet card). Some computers may have
several interfaces. Each interface has a unique IP address
that respects, in general, the interconnection hierarchy.
Message routing is also organized hierarchically: from domain
to domain; then from domain to subdomains, and so on, until
a message reaches its destination interface. Besides their
interface addresses, computers usually also have a name,
as do domains and subdomains. Some machines have a particular
role in the network:
- bridges connect one network to another;
- routers use their knowledge of the topology of the Internet
to route data;
- name servers track the correspondence between machine names
and network addresses.
The purpose of the Internet Protocol (IP)
is to make the network of networks into a single entity.
This is why one can speak of the Internet.
Any two machines connected via the Internet can communicate.
Many kinds of machines and systems coexist on the Internet.
All of them use the IP protocol, and most of them also use the UDP and
TCP layers.
The different protocols and services used by the Internet are
described in RFCs (Requests for Comments),
which can be found on the Jussieu mirror site:
ftp://ftp.lip6.fr/pub/rfc
Internet Protocols and Services
The unit of transfer used by the IP protocol is the
datagram or packet. This protocol is unreliable:
it does not assure proper order, safe arrival, or non-duplication
of transmitted packets. It only deals with correct routing of
packets and signaling of errors when a packet is unable to
reach its destination. Addresses are coded into 32 bits
in the current version of the protocol: IPv4. These 32 bits
are divided into four fields, each containing values between
0 and 255. IP addresses are written with the four fields
separated by periods, for example: 132.227.60.30.
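The dotted notation can be illustrated with a minimal sketch that packs four fields into a 32-bit value and unpacks them again. The function names pack_ipv4 and dotted_of_ipv4 are hypothetical, introduced only for this illustration, and the fields are assumed to lie between 0 and 255.

```ocaml
(* Hypothetical helpers: not part of any library. *)

(* Pack four 8-bit fields (assumed to be in 0..255) into one 32-bit
   address, most significant field first. *)
let pack_ipv4 a b c d : int32 =
  List.fold_left
    (fun acc field -> Int32.logor (Int32.shift_left acc 8) (Int32.of_int field))
    0l [a; b; c; d]

(* Extract the four fields again and write them separated by periods. *)
let dotted_of_ipv4 (addr : int32) : string =
  let field shift =
    Int32.to_int (Int32.logand (Int32.shift_right_logical addr shift) 0xFFl)
  in
  Printf.sprintf "%d.%d.%d.%d" (field 24) (field 16) (field 8) (field 0)
```

Applied to the fields 132, 227, 60 and 30, the two functions round-trip to the string "132.227.60.30".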
The IP protocol is in the midst of an important change
made necessary by the exhaustion of address space and the
growing complexity of routing problems due to the expansion of the
Internet. The new version of the IP protocol is
IPv6, which is described in [Hui97].
Above IP, two protocols allow higher-level transmissions:
UDP (User Datagram Protocol) and TCP (Transmission Control
Protocol). These two protocols use IP for communication
between machines, also allowing communication between applications
(or programs) running on those machines. They deal with correct transmission
of information, independent of contents. The identification of
applications on a machine is done via a port number.
UDP is a connectionless, unreliable protocol: it is to
applications as IP is to interfaces. TCP is
a connection-oriented, reliable protocol: it manages acknowledgement,
retransmission, and ordering of packets. Further, it is capable
of optimizing transmission by a windowing technique.
The standard services (applications) of the Internet most often
use the client-server model. The server manages requests by clients,
offering them a specific service. There is an asymmetry between
client and server. The services establish high-level protocols for
keeping track of transmitted contents. Among the standard services, we note:
- FTP (File Transfer Protocol);
- TELNET (Terminal Protocol);
- SMTP (Simple Mail Transfer Protocol);
- HTTP (Hypertext Transfer Protocol).
Other services use the client-server model:
- NFS (Network File System);
- the X Window System;
- Unix services: rlogin, rwho, ...
Communication between applications takes place via sockets.
Sockets allow communication between processes residing on possibly different
machines. Different processes can read and write to sockets.
The Unix Module and IP Addressing
The Unix library defines the abstract type inet_addr
representing Internet addresses, as well as two conversion functions
between an internal representation of addresses and strings:
# Unix.inet_addr_of_string ;;
- : string -> Unix.inet_addr = <fun>
# Unix.string_of_inet_addr ;;
- : Unix.inet_addr -> string = <fun>
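As a minimal sketch of how the two conversions might be combined, the code below round-trips a string through the abstract type and tests whether a string is a well-formed address. The names normalize_address and is_valid_address are hypothetical, and the program is assumed to be linked with the Unix library. Unix.inet_addr_of_string raises Failure when its argument is malformed.

```ocaml
(* Hypothetical helpers built on the two conversion functions. *)

(* Convert a string to the internal representation and back. *)
let normalize_address (s : string) : string =
  Unix.string_of_inet_addr (Unix.inet_addr_of_string s)

(* inet_addr_of_string raises Failure on a malformed address,
   so catching it gives a simple validity test. *)
let is_valid_address (s : string) : bool =
  match Unix.inet_addr_of_string s with
  | _ -> true
  | exception Failure _ -> false
```

For example, normalize_address "132.227.60.30" gives back "132.227.60.30", while is_valid_address "not an address" is false.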
In applications, Internet addresses and port numbers for
services (or service numbers) are often replaced by names.
The correspondence between names and addresses or numbers is
managed using databases. The Unix library
provides functions to request data from these databases and
provides datatypes to allow storage of the obtained information.
We briefly describe these functions below.
Address tables.
The table of addresses (hosts database) contains
the association between machine name(s) and interface address(es).
The structure
of entries in the address table is represented by:
# type host_entry =
   { h_name : string;
     h_aliases : string array;
     h_addrtype : socket_domain;
     h_addr_list : inet_addr array } ;;
The first two fields contain the machine name and its aliases;
the third contains the address type (see page
??); the last contains a list of
machine addresses.
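Since the addresses in an entry are abstract values, displaying them requires a conversion. The following sketch, with the hypothetical names addresses_of_entry and describe_entry, shows one way the fields of a host_entry might be turned into readable text. It assumes the program is linked with the Unix library.

```ocaml
(* Hypothetical helpers for inspecting a Unix.host_entry. *)

(* All addresses of an entry, in dotted notation. *)
let addresses_of_entry (e : Unix.host_entry) : string list =
  Array.to_list (Array.map Unix.string_of_inet_addr e.Unix.h_addr_list)

(* A one-line summary: name, aliases, then addresses. *)
let describe_entry (e : Unix.host_entry) : string =
  Printf.sprintf "%s (aliases: %s) -> %s"
    e.Unix.h_name
    (String.concat ", " (Array.to_list e.Unix.h_aliases))
    (String.concat ", " (addresses_of_entry e))
```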
A machine name is obtained by using the function:
# Unix.gethostname ;;
- : unit -> string = <fun>
# let my_name = Unix.gethostname() ;;
val my_name : string = "estephe.inria.fr"
The functions that query the address table
take as argument either the machine name or the machine address.
# Unix.gethostbyname ;;
- : string -> Unix.host_entry = <fun>
# Unix.gethostbyaddr ;;
- : Unix.inet_addr -> Unix.host_entry = <fun>
# let my_entry_byname = Unix.gethostbyname my_name ;;
val my_entry_byname : Unix.host_entry =
{Unix.h_name="estephe.inria.fr"; Unix.h_aliases=[|"estephe"|];
Unix.h_addrtype=Unix.PF_INET; Unix.h_addr_list=[|<abstr>|]}
# let my_addr = my_entry_byname.Unix.h_addr_list.(0) ;;
val my_addr : Unix.inet_addr = <abstr>
# let my_entry_byaddr = Unix.gethostbyaddr my_addr ;;
val my_entry_byaddr : Unix.host_entry =
{Unix.h_name="estephe.inria.fr"; Unix.h_aliases=[|"estephe"|];
Unix.h_addrtype=Unix.PF_INET; Unix.h_addr_list=[|<abstr>|]}
# let my_full_name = my_entry_byaddr.Unix.h_name ;;
val my_full_name : string = "estephe.inria.fr"
These functions raise the Not_found exception in case
the request fails.
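A caller that prefers to handle a failed lookup without exceptions can wrap the query in an option. The sketch below is one possible approach. The names find_opt, host_by_name and first_address are hypothetical, and the program is assumed to be linked with the Unix library.

```ocaml
(* Hypothetical helpers: turn the Not_found exception into None. *)

(* Apply any query function, mapping Not_found to None. *)
let find_opt (query : 'a -> 'b) (key : 'a) : 'b option =
  try Some (query key) with Not_found -> None

(* Look up a machine by name without raising on failure. *)
let host_by_name (name : string) : Unix.host_entry option =
  find_opt Unix.gethostbyname name

(* First address of a machine in dotted notation, if the lookup
   succeeds and the entry holds at least one address. *)
let first_address (name : string) : string option =
  match host_by_name name with
  | Some e when Array.length e.Unix.h_addr_list > 0 ->
      Some (Unix.string_of_inet_addr e.Unix.h_addr_list.(0))
  | _ -> None
```

The same find_opt wrapper applies unchanged to Unix.gethostbyaddr, since it raises the same exception.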
Table of services.
The table of services contains the correspondence
between service names and port numbers.
The majority of Internet services are
standardized. The structure of entries in
the table of services is:
# type service_entry =
   { s_name : string;
     s_aliases : string array;
     s_port : int;
     s_proto : string } ;;
The first two fields are the service name and its aliases, if
any; the third field contains the port number; the last field
contains the name of the protocol used.
A service is in fact characterized by its
port number and the underlying protocol.
The query functions are:
# Unix.getservbyname ;;
- : string -> string -> Unix.service_entry = <fun>
# Unix.getservbyport ;;
- : int -> string -> Unix.service_entry = <fun>
# Unix.getservbyport 80 "tcp" ;;
- : Unix.service_entry =
{Unix.s_name="www"; Unix.s_aliases=[|"http"|]; Unix.s_port=80;
Unix.s_proto="tcp"}
# Unix.getservbyname "ftp" "tcp" ;;
- : Unix.service_entry =
{Unix.s_name="ftp"; Unix.s_aliases=[||]; Unix.s_port=21; Unix.s_proto="tcp"}
These functions raise the Not_found exception if
they cannot find the service requested.
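Programs often accept a port given either as a number or as a service name. The following sketch shows one way to handle both cases with the table of services. The name port_of_string and the default protocol "tcp" are assumptions made for this illustration. It relies on int_of_string_opt, which requires OCaml 4.05 or later, and on the Unix library being linked.

```ocaml
(* Hypothetical helper: resolve a port written as a number or as a
   service name. The default protocol "tcp" is an assumption. *)
let port_of_string ?(proto = "tcp") (s : string) : int option =
  match int_of_string_opt s with
  | Some p when p >= 0 && p <= 65535 -> Some p  (* numeric, in range *)
  | Some _ -> None                              (* numeric, out of range *)
  | None ->
      (* Not a number: query the table of services by name. *)
      (try Some (Unix.getservbyname s proto).Unix.s_port
       with Not_found -> None)
```

A numeric string is returned directly without consulting the table, so port_of_string "8080" yields Some 8080 on any machine, whereas the result for a name such as "ftp" depends on the local table of services.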