This chapter is devoted to perhaps the most important part of cluster design - the network. After all, nodes are relatively simple - you get what you pay for, you pay for what you get. For the most part, within a processor family, your serial task performance scales in fairly obvious ways with CPU clock, memory amount, type, speed, and so forth.
Not so with networking. Networks are intrinsically complex. In addition to the barebones concepts of latency and bandwidth we've already covered, issues like topology, probability, task organization, and various pieces of deep hardware-level magic come into play. Oh, and let's not forget the kernel, the device driver that interfaces the device with the kernel, the networking stack that lives on top of the hardware device and its kernel device driver, the API between the driver and/or its networking stack and your application, and of course your application itself. With maybe a layer or two more in there - no kidding.
Networks are so complex that I've found writing this chapter to be somewhat daunting and have hence postponed it again and again. They are also so rapidly varying that no matter what I write, the part at the bleeding edge will be obsolete (or at least, no longer bleeding edge and probably partly wrong) almost as fast as I write it and put it out there on the web.
Still, it's the job that's never started as takes longest to finish, as Sam Gamgee's gaffer would say. So let's have at it. If you read this chapter and find that it is still incomplete, a) no surprise, it will take me a long time to write even a decently complete first draft; and b) feel free to bug me to finish it.
By the same token, if you really understand networking already and find something in this chapter that is egregiously in error (or for that matter, very subtly in error) you should feel free to correct me, if necessary with a whomp upside the head. I'm doing my best here, but some of these networks are expensive, and until I either really need them or the vendors decide that they just have to loan me or give me a half-dozen cards' worth to test and write about, I'm going to be writing at least partly on the basis of the vendors' published specifications, what I've gleaned from the beowulf list, and what friends of mine who do run the networks have told me.
Finally, if you are a real expert in one of the high end networks and find my articles below to be hopelessly incompetent, well, remember that this is an open source, open license publication, and that I would cherish contributions from real experts. I'll even put your very own name at the top of the section, and ensure that it appears with the chapter in the TOC as well, so you get proper blame - uh, I mean credit - instead of me. High end networking companies, this goes for you as well - feel free to write your OWN (non-marketing-hype) description of your network, including a cost-benefit analysis, and I'll cheerfully include it as a subsection - look, ma, free marketing, right where it does the most good!
The major editorial point being, the readers of this online book want useful, well-informed material about your products so they can make intelligent, cost-beneficial decisions about spending their money. You'd like for them to have all this information about your product so that they'll choose to buy it. Fine, we can work together on that basis, as long as you don't disrespect others' products or get into marketspeak. The readers of this document are all very likely to be technically competent shoppers and will want a technical and economic presentation, complete with at least ballpark prices.
Now, on to the meat of the matter. I'm going to try to organize this document in the following way. First, I'll present a very modest review of the basic concepts of networking, such as the ISO/OSI layers, the concept and general structure of a ``packet'', a bit of discussion of latency and bandwidth again (sorry, but this is a key context and requires it), and anything else likely to apply to ``all networks''. As a subsection, I'll present an equally compressed view of TCP/IP as one particular, important implementation of the network and transport layers. This won't be anywhere near enough to teach you to manage a TCP/IP network, but should give you a working knowledge of its basic concepts.
Then we'll run a set of sections on particular physical networks: Ethernet, SCI, Myrinet, and possibly more exotic networks (e.g. HIPPI) as I have the time and patience to figure them out or someone knowledgeable volunteers to write for me.
To permit me to reissue a book snapshot with this chapter finally not empty before I finish it, I'm going to cheat. A lot of what I'm going to put in this chapter comes either from resources that I've patiently collected over the last seventeen or so years or from resources that are readily available online. In fact, since most of the resources I've collected are ones that I make available online, one could say that all of it can be found on the web - somewhere (except where a copyright problem exists that might preclude republication). So for starters, every currently planned section will contain at the very least a list of web resources you can click through - a reference to the vendor's website, for example.
That way, even if I'm still relatively ignorant of the network in question or am a world's-greatest-expert but just haven't had time to write the document (and which is which, you wonder, heh, heh) you'll be able to make some progress toward the all-important decision: what is the right network for my cluster?
Just to preview a part of the answer before we get started - it will almost certainly be TCP/IP on top of switched 100BT Ethernet and (possibly) one of the high end, expensive networks, depending on what you're planning to do. Switched 100BT has gone from nonexistent to expensive (as in tens of thousands of dollars) to quite cheap indeed in the years I've been doing cluster computing, and at this point it is so cheap, so ubiquitous, and so adequate for routine networking chores that it is hard to imagine a network without it. Perhaps in a few years it will be superseded by 1000BT Ethernet as it once superseded 10BT Ethernet, but in the meantime it is all but universal.
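To get a rough feel for what stepping between those Ethernet generations buys you, here is a minimal sketch of the usual back-of-the-envelope estimate: time to deliver a message is roughly latency plus size divided by bandwidth. The link speeds are the nominal ones (10, 100 and 1000 megabits per second); the latency figure is a placeholder I've assumed purely for illustration, not a measurement of any real card or switch, and the function name is made up for this example.

```python
# Nominal link speeds in bits per second for the three Ethernet generations
# named in the text. Delivered throughput on a real network will be lower.
NOMINAL_BPS = {"10BT": 10e6, "100BT": 100e6, "1000BT": 1000e6}

# ASSUMPTION: placeholder per-message latency in seconds, for illustration only.
ASSUMED_LATENCY_S = 100e-6


def transfer_time(message_bytes, link, latency_s=ASSUMED_LATENCY_S):
    """Estimate seconds to deliver one message: latency + size / bandwidth."""
    return latency_s + (message_bytes * 8) / NOMINAL_BPS[link]


if __name__ == "__main__":
    # Compare a tiny message, a roughly packet-sized one, and a large one.
    for size in (64, 1500, 1_000_000):
        row = ", ".join(
            f"{link}: {transfer_time(size, link) * 1e3:.3f} ms"
            for link in NOMINAL_BPS
        )
        print(f"{size:>9} bytes -> {row}")
```

Under this simple model the small messages are dominated by whatever latency you plug in, so a faster link helps them very little, while the large message scales almost directly with bandwidth. Measure your own hardware before trusting any of these numbers.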