Program sources by themselves don't make an application. The way you put them together and package them for distribution matters, too. Unix provides a tool for semi-automating these processes: make(1). Make is covered in most introductory Unix books. For a really thorough reference, you can consult Managing Projects with Make [Oram-Talbot]. If you're using GNU make (the most advanced make, and the one normally shipped with open-source Unixes), the treatment in Programming with GNU Software [Loukides-Oram] may be better in some respects. Most Unixes that carry GNU make will also support GNU Emacs; if yours does, you will probably find a complete make manual on-line through Emacs's info documentation system.
Ports of GNU make to DOS and Windows are available from the FSF.
If you're developing in C or C++, an important part of the recipe for building your application will be the collection of compilation and linkage commands needed to get from your sources to working binaries. Entering these commands is a lot of tedious detail work, and most modern development environments include a way to put them in command files or databases that can automatically be re-executed to build your application.
Unix's make(1) program, the original of all these facilities, was designed specifically to help C programmers manage these recipes. It lets you write down the dependencies between files in a project in one or more ‘makefiles’. Each makefile consists of a series of productions; each one tells make that some given target file depends on some set of source files, and says what to do if any of the sources are newer than the target. You don't actually have to write down all dependencies, as the make program can deduce a lot of the obvious ones from filenames and extensions.
For example: You might put in a makefile that the binary myprog depends on three object files myprog.o, helper.o, and stuff.o. If you have source files myprog.c, helper.c, and stuff.c, make will know without being told that each .o file depends on the corresponding .c file, and supply its own standard recipe for building a .o file from a .c file.
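A minimal sketch of such a makefile follows. The file names come from the example above; the exact link command and the use of the $(CC) variable are assumptions chosen for illustration. Note that the command line under the production must begin with a literal tab character.

```make
# Link rule: myprog is rebuilt whenever any of the three objects is newer.
myprog: myprog.o helper.o stuff.o
	$(CC) -o myprog myprog.o helper.o stuff.o

# No productions for the .o files are written down: make's built-in
# .c -> .o rule compiles myprog.c, helper.c, and stuff.c as needed.
```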
When you run make in a project directory, the make program looks at all productions and timestamps and does the minimum amount of work necessary to make sure derived files are up to date.
You can read a good example of a moderately complex makefile in the sources for fetchmail. In the subsections below we'll refer to it again.
Very complex makefiles, especially when they call subsidiary makefiles, can become a source of complications rather than simplifying the build process. A now-classic warning is issued in Recursive Make Considered Harmful.[136] The argument in this paper has become widely accepted since it was written in 1997, and has come near to reversing previous community practice.
No discussion of make(1) would be complete without an acknowledgement that it includes one of the worst design botches in the history of Unix. The use of tab characters as a required leader for command lines associated with a production means that the interpretation of a makefile can change drastically on the basis of invisible differences in whitespace.
make is not just useful for C/C++ recipes, however. Scripting languages like those we described in Chapter 14 may not require conventional compilation and link steps, but there are often other kinds of dependencies that make(1) can help you with.
Suppose, for example, that you actually generate part of your code from a specification file, using one of the techniques from Chapter 9. You can use make to tie the spec file and the generated source together. This will ensure that whenever you change the spec and remake, the generated code will automatically be rebuilt.
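As a rough sketch, such a production might look like the following. All the names here are hypothetical: tables.spec stands for the hand-edited specification, gentables for a project-local generator script, and tables.c for the generated source.

```make
# Regenerate tables.c whenever the spec or the generator itself changes.
# Writing to a temporary file first avoids leaving a truncated tables.c
# with a fresh timestamp if the (hypothetical) generator fails midway.
tables.c: tables.spec gentables
	./gentables tables.spec >tables.c.tmp && mv tables.c.tmp tables.c

# tables.o is derived from tables.c by make's built-in rule, so editing
# tables.spec leads to regeneration, recompilation, and relinking.
myprog: myprog.o tables.o
	$(CC) -o myprog myprog.o tables.o
```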
It's quite common to use makefile productions to express recipes for making documentation as well as code. You'll often see this approach used to automatically generate PostScript or other derived documentation from masters written in some markup language (like HTML or one of the Unix document-macro languages we'll survey in Chapter 18). In fact, this sort of use is so common that it's worth illustrating with a case study.
In the fetchmail makefile, for example, you'll see three productions that relate files named FAQ, FEATURES, and NOTES to HTML sources fetchmail-FAQ.html, fetchmail-features.html, and design-notes.html.
The HTML files are meant to be accessible on the fetchmail Web page, but all the HTML markup makes them uncomfortable to look at unless you're using a browser. So the FAQ, FEATURES, and NOTES are flat-text files meant to be flipped through quickly with an editor or pager program by someone reading the fetchmail sources themselves (or, perhaps, distributed to FTP sites that don't support Web access).
The flat-text forms can be made from their HTML masters by using the common open-source program lynx(1). lynx is a Web browser for text-only displays; but when invoked with the -dump option it functions reasonably well as an HTML-to-ASCII formatter.
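Productions of this kind could be sketched as follows. The file names are the ones given above; the recipes use only the -dump option described here, and the real fetchmail makefile may well pass additional lynx options.

```make
# Each flat-text file is rebuilt when its HTML master is newer.
FAQ: fetchmail-FAQ.html
	lynx -dump fetchmail-FAQ.html >FAQ

FEATURES: fetchmail-features.html
	lynx -dump fetchmail-features.html >FEATURES

NOTES: design-notes.html
	lynx -dump design-notes.html >NOTES
```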
With the productions in place, the developer can edit the HTML masters without having to remember to manually rebuild the flat-text forms afterwards, secure in the knowledge that FAQ, FEATURES, and NOTES will be properly rebuilt whenever they are needed.
Some of the most heavily used productions in typical makefiles don't express file dependencies at all. They're ways to bundle up little procedures that a developer wants to mechanize, like making a distribution package or removing all object files in order to do a build from scratch.
There is a well-developed set of conventions about what utility productions should be present and how they should be named. Following these will make your makefile much easier to understand and use.
all
Your all production should make every executable of your project. Usually the all production doesn't have an explicit rule; instead it refers to all of your project's top-level targets (and, not accidentally, documents what those are). Conventionally, this should be the first production in your makefile, so it will be the one executed when the developer types make with no argument.
test
Run the program's automated test suite, typically consisting of a set of unit tests[137] to find regressions, bugs, or other deviations from expected behavior during the development process. The ‘test’ production can also be used by end-users of the software to ensure that their installation is functioning correctly.
clean
Remove all files (such as binary executables and object files) that are normally created when you make all. A make clean should reset the process of building the software to a good initial state.
dist
Make a source archive (usually with the tar(1) program) that can be shipped as a unit and used to rebuild the program on another machine. This target should do the equivalent of depending on all so that a make dist automatically rebuilds the whole project before making the distribution archive. This is a good way to avoid last-minute embarrassments, like not shipping derived files that are actually needed (like the flat-text README in fetchmail, which is actually generated from an HTML source).
distclean
Throw away everything but what you would include if you were bundling up the source with make dist. This may be the same as make clean but should be included as a production of its own anyway, to document what's going on. When it's different, it usually differs by throwing away local configuration files that aren't part of the normal make all build sequence (such as those generated by autoconf(1); we'll talk about autoconf(1) in Chapter 17).
realclean
Throw away everything you can rebuild using the makefile. This may be the same as make distclean, but should be included as a production of its own anyway, to document what's going on. When it's different, it usually differs by throwing away files that are derived but (for whatever reason) shipped with the project sources anyway.
install
Install the project's executables and documentation in system directories so they will be accessible to general users (this typically requires root privileges). Initialize or update any databases or libraries that the executables require in order to function.
uninstall
Remove files installed in system directories by make install (this typically requires root privileges). This should completely and perfectly reverse a make install. The presence of an uninstall production implies a kind of humility that experienced Unix hands look for as a sign of thoughtful design; conversely, not having an uninstall production is at best careless, and (when, for example, an installation creates large database files) can be quite rude and thoughtless.
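The conventions above can be sketched as a skeleton makefile. Everything project-specific here is hypothetical: the program name, the file lists, the run-tests.sh script, the config.h file standing in for locally generated configuration, and the install locations. The .PHONY declaration is honored by GNU make and most current makes; treat the whole thing as a starting point rather than a definitive layout.

```make
# Hypothetical project: one program, two C sources, one manual page.
PROG      = myprog
OBJS      = myprog.o helper.o
DISTFILES = Makefile myprog.c helper.c helper.h myprog.1 run-tests.sh
TARBALL   = $(PROG)-1.0.tar
prefix    = /usr/local

# Utility productions name procedures, not files; declaring them phony
# keeps a stray file called "clean" or "test" from disabling them.
.PHONY: all test clean dist distclean realclean install uninstall

# First production, so a bare "make" builds everything.
all: $(PROG)

$(PROG): $(OBJS)
	$(CC) $(CFLAGS) -o $(PROG) $(OBJS)

test: all
	sh ./run-tests.sh

clean:
	rm -f $(PROG) $(OBJS)

# Depends on all, so the archive is only made from a tree that builds.
dist: all
	tar -cf $(TARBALL) $(DISTFILES)

# Also removes local configuration (config.h is a stand-in here).
distclean: clean
	rm -f config.h

# Also removes anything else the makefile can regenerate.
realclean: distclean
	rm -f $(TARBALL)

install: all
	cp $(PROG) $(prefix)/bin/$(PROG)
	cp $(PROG).1 $(prefix)/man/man1/$(PROG).1

# Exactly reverses install.
uninstall:
	rm -f $(prefix)/bin/$(PROG) $(prefix)/man/man1/$(PROG).1
```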
Working examples of all the standard targets are available for inspection in the fetchmail makefile. By studying all of them together you will see a pattern emerge, and (not incidentally) learn much about the fetchmail package's structure. One of the benefits of using these standard productions is that they form an implicit roadmap of their project.
But you need not limit yourself to these utility productions. Once you master make, you'll find yourself more and more often using the makefile machinery to automate little tasks that depend on your project file state. Your makefile is a convenient central place to put these; using it makes them readily available for inspection and avoids cluttering up your workspace with trivial little scripts.
One of the subtle advantages of Unix make over the dependency databases built into many IDEs is that makefiles are simple text files — files that can be generated by programs.
In the mid-1980s it was fairly common for large Unix program distributions to include elaborate custom shellscripts that would probe their environment and use the information they gathered to construct custom makefiles. These custom configurators reached absurd sizes. I wrote one once that was 3000 lines of shell, about twice as large as any single module in the program it was configuring — and this was not unusual.
The community eventually said “Enough!” and various people set out to write tools that would automate away part or all of the process of maintaining makefiles. These tools generally tried to address two issues:
One issue is portability. Makefile generators are commonly built to run on many different hardware platforms and Unix variants. They generally try to deduce things about the local system (including everything from machine word size up to which tools, languages, service libraries, and even document formatters it has available). They then try to use those deductions to write makefiles that exploit the local system's facilities and compensate for its quirks.
The other issue is dependency derivation. It's possible to deduce a great deal about the dependencies of a collection of C sources by analyzing the sources themselves (especially by looking at what include files they use and share). Many makefile generators do this in order to mechanically generate make dependencies.
Each different makefile generator tackles these objectives in a slightly different way. Probably a dozen or more generators have been attempted, but most proved inadequate or too difficult to drive or both, and only a few are still in live use. We'll survey the major ones here. All are available as open-source software on the Internet.
Several small tools have tackled the rule automation part of the problem exclusively. makedepend, distributed along with the X windowing system from MIT, is the fastest and most useful and comes preinstalled under all modern Unixes, including all Linuxes.
makedepend takes a collection of C sources and generates dependencies for the corresponding .o files from their #include directives. These can be appended directly to a makefile, and in fact makedepend is defined to do exactly that.
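One common way to wire this in is a utility production that runs makedepend over the project's sources. The sketch below assumes a SRCS variable listing hypothetical source files; the double-dash syntax for passing compiler flags is taken from the makedepend manual page.

```make
SRCS = myprog.c helper.c stuff.c

# "make depend" rewrites this makefile in place: makedepend replaces
# everything after its marker line (a comment beginning
# "# DO NOT DELETE", added if absent) with generated dependencies,
# lines of roughly the form "myprog.o: helper.h".
depend:
	makedepend -- $(CFLAGS) -- $(SRCS)
```

Because the generated lines land at the end of the makefile, hand-written productions should stay above the marker.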
makedepend is useless for anything but C projects. It doesn't try to solve more than one piece of the makefile-generation problem. But what it does it does quite well.
makedepend is sufficiently documented by its manual page. If you type man makedepend at a terminal window you will quickly learn what you need to know about invoking it.
Imake was written in an attempt to mechanize makefile generation for the X window system. It builds on makedepend to tackle both the dependency-derivation and portability problems.
The Imake system effectively replaces conventional makefiles with Imakefiles. These are written in a more compact and powerful notation which is (effectively) compiled into makefiles. The compilation uses a rules file which is system-specific and includes a lot of information about the local environment.
Imake is well suited to X's particular portability and configuration challenges and universally used in projects that are part of the X distribution. However, it has not achieved much popularity outside the X developer community. It's hard to learn, hard to use, hard to extend, and produces generated makefiles of mind-numbing size and complexity.
The Imake tools will be available on any Unix that supports X, including Linux, and are worth learning if you are going to do X programming. There has been one heroic effort [DuBois] to make the mysteries of Imake comprehensible to non-X-programming mortals.
autoconf was written by people who had seen and rejected the Imake approach. It generates per-project configure shellscripts that are like the old-fashioned custom script configurators. These configure scripts can generate makefiles (among other things).
Autoconf is focused on portability and does no built-in dependency derivation at all. Although it is probably as complex as Imake, it is much more flexible and easier to extend. Rather than relying on a per-system database of rules, it generates configure shell code that goes out and searches your system for things.
Each configure shellscript is built from a per-project template that you have to write, called configure.in. Once generated, though, the configure script will be self-contained and can configure your project on systems that don't carry autoconf(1) itself.
The autoconf approach to makefile generation is like imake's in that you start by writing a makefile template for your project. But autoconf's Makefile.in files are basically just makefiles with placeholders in them for simple text substitution; there's no second notation to learn. If you want dependency derivation, you must take explicit steps to call makedepend(1) or some similar tool — or use automake(1).
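A small Makefile.in might look like the sketch below. The @CC@, @CFLAGS@, @LIBS@, and @prefix@ placeholders are standard autoconf substitution variables; the program and object file names are hypothetical.

```make
# Makefile.in sketch: the generated configure script copies this file
# to Makefile, replacing each @NAME@ with the value it detected.
CC     = @CC@
CFLAGS = @CFLAGS@
LIBS   = @LIBS@
prefix = @prefix@

myprog: myprog.o helper.o stuff.o
	$(CC) $(CFLAGS) -o myprog myprog.o helper.o stuff.o $(LIBS)
```

Apart from the placeholders, everything here is ordinary make syntax.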
autoconf is documented by an on-line manual in the GNU info format, which you should be able to browse through your Emacs's help system. The source scripts of autoconf are available from the FSF archive site, but are also preinstalled on many Unix and Linux versions.
Despite its lack of direct support for dependency derivation, and despite its generally ad-hoc approach, in mid-2003 autoconf is clearly the most popular of the makefile generators, and has been for some years. It has eclipsed Imake and driven at least one major competitor (metaconfig) out of use.
A reference, GNU Autoconf, Automake and Libtool, is available [Vaughan]. We'll have more to say about autoconf, from a slightly different angle, in Chapter 17.
automake is an attempt to add Imake-like dependency derivation as a layer on top of autoconf(1). You write Makefile.am templates in a broadly Imake-like notation; automake(1) compiles them to Makefile.in files, which autoconf's configure scripts then operate on.
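For a sense of the notation, here is a sketch of a Makefile.am for a hypothetical program named myprog. The variable-name suffixes (_PROGRAMS, _SOURCES, _MANS) and EXTRA_DIST are standard automake conventions; the file names are made up for illustration.

```make
# Makefile.am sketch: declarations only, no explicit productions.
bin_PROGRAMS   = myprog
myprog_SOURCES = myprog.c helper.c helper.h stuff.c
man_MANS       = myprog.1
EXTRA_DIST     = FAQ
```

From declarations like these, automake is expected to generate the build, install, and distribution productions, along with the dependency tracking, in the resulting Makefile.in.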
automake is still relatively new technology in mid-2003. It is used in several FSF projects but has not yet been widely adopted elsewhere. While its general approach looks promising, it is as yet rather brittle — it works when used in stereotyped ways but tends to break badly if you try to do anything unusual with it.
Complete on-line documentation is shipped with automake, which can be downloaded from the FSF archive site.
[137] A unit test is test code attached to a module to verify correct performance. Use of the term ‘unit test’ suggests that the test is written concurrently with the code by the developer of the code, and implies a discipline in which module releases aren't considered complete until they have attached test code. The term and the concept originated in the “Extreme Programming” methodology popularized by Kent Beck, but have gained wide acceptance among Unix programmers since about 2001.