26.1 Python's distutils

The distutils are a rich and flexible set of tools to package Python programs and extensions for distribution to third parties. I cover typical, simple use of the distutils for the most common packaging needs. For in-depth, highly detailed discussion of distutils, I recommend two manuals that are part of Python's online documentation: Distributing Python Modules (available at http://www.python.org/doc/current/dist/), and Installing Python Modules (available at http://www.python.org/doc/current/inst/), both by Greg Ward, the principal author of the distutils.

26.1.1 The Distribution and Its Root

A distribution is the set of files to package into a single file for distribution purposes. A distribution may include zero, one, or more Python packages and other Python modules (as covered in Chapter 7), as well as, optionally, Python scripts, C-coded (and other) extensions, supporting data files, and auxiliary files containing metadata about the distribution itself. A distribution is said to be pure if all code it includes is Python, and non-pure if it also includes non-Python code (most often, C-coded extensions).

You should normally place all the files of a distribution in a directory, known as the distribution root directory, and in subdirectories of the distribution root. Mostly, you can arrange the subtree of files and directories rooted at the distribution root to suit your own organizational needs. However, remember from Chapter 7 that a Python package must reside in its own directory, and a package's directory must contain a file named __init__.py (or subdirectories with __init__.py files, for subpackages) as well as other modules belonging to that package.

26.1.2 The setup.py Script

The distribution root directory must contain a Python script that by convention is named setup.py. The setup.py script can, in theory, contain arbitrary Python code.
However, in practice, setup.py always boils down to some variation of:

    from distutils.core import setup, Extension

    setup( many keyword arguments go here )

All the action is in the parameters you supply in the call to setup. You should not import Extension if your setup.py deals with a pure distribution. Extension is needed only for non-pure distributions, and you should import it only when you need it. It is fine to have a few statements before the call to setup, in order to arrange setup's arguments in clearer and more readable ways than could be managed by having everything inline as part of the setup call.

The distutils.core.setup function accepts only keyword arguments, and there are a large number of such arguments that you could potentially supply. A few deal with the internal operations of the distutils themselves, and you never supply such arguments unless you are extending or debugging the distutils, an advanced subject that I do not cover in this book. Other keyword arguments to setup fall into two groups: metadata about the distribution, and information about what files are in the distribution.

26.1.3 Metadata About the Distribution

You should provide metadata about the distribution by supplying some of the following keyword arguments when you call the distutils.core.setup function. The value you associate with each argument name you supply is a string that is intended mostly to be human-readable; therefore, any specifications about the string's format are just advisory. The explanations and recommendations about the metadata fields in the following are also non-normative, and correspond only to common, not universal, conventions. Whenever the following explanations refer to "this distribution," it can be taken to refer to the material included in the distribution, rather than to the packaging of the distribution.
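As a rough illustration of the metadata arguments just described, the following sketch collects a few of them in a dictionary before handing them to setup. The argument names (name, version, description, author, author_email, url, license) are ones the distutils accept; every value, as well as the helper functions non_string_fields and main, is hypothetical and serves only as a placeholder.

```python
# Hypothetical metadata for an imaginary distribution named mydist.
# Every value is a placeholder, not taken from a real project.
metadata = dict(
    name="mydist",
    version="0.1",
    description="One-line summary of what mydist is for",
    author="A. N. Author",
    author_email="author@example.com",
    url="http://www.example.com/mydist/",
    license="MIT",
)


def non_string_fields(kwargs):
    """Return the names of metadata fields whose values are not strings."""
    return sorted(k for k, v in kwargs.items() if not isinstance(v, str))


def main():
    # Imported here, not at module level, because distutils is not
    # bundled with every Python version.
    from distutils.core import setup
    setup(**metadata)

# A real setup.py would end by calling main().
```

Building the dictionary first is one way to follow the earlier advice about arranging setup's arguments in a few statements before the call; whether that reads better than passing everything inline is a matter of taste.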
26.1.4 Distribution Contents

A distribution can contain a mix of Python source files, C-coded extensions, and other files. setup accepts optional keyword arguments detailing files to put in the distribution. Whenever you specify file paths, the paths must be relative to the distribution root directory and use / as the path separator. distutils adapts location and separator appropriately when it installs the distribution. Note, however, that the keyword arguments packages and py_modules do not list file paths, but rather Python packages and modules respectively. Therefore, in the values of these keyword arguments, use no path separators or file extensions. When you list subpackage names in argument packages, use Python syntax (e.g., top_package.sub_package).

26.1.4.1 Python source files

By default, setup looks for Python modules (which you list in the value of the keyword argument py_modules) in the distribution root directory, and for Python packages (which you list in the value of the keyword argument packages) as subdirectories of the distribution root directory. You may specify keyword argument package_dir to change these defaults. However, things are simpler when you locate files according to setup's defaults, so I do not cover package_dir further in this book.

The setup keyword arguments you will most frequently use to detail what Python source files to put in the distribution are the following.
For each package name string p in the list you supply as the value of packages, setup expects to find a subdirectory p in the distribution root directory, and includes in the distribution the file p/__init__.py, which must be present, as well as any other file p/*.py (i.e., all the modules of package p). setup does not search for subpackages of p: you must explicitly list all subpackages, as well as top-level packages, in the value of keyword argument packages.
For each module name string m in the list you supply as the value of py_modules, setup expects to find a file m.py in the distribution root directory, and includes m.py in the distribution.
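The rules for packages and py_modules can be sketched as a small function that lists the source files those rules would pick up. This is not how the distutils are implemented; expected_sources is a hypothetical helper written only to make the rules concrete, and it assumes the default layout (no package_dir).

```python
import glob
import os


def expected_sources(root, packages=(), py_modules=()):
    """Return root-relative paths that the rules above would pick up."""
    found = []
    for p in packages:
        # A dotted name such as "top.sub" maps to the directory top/sub.
        pkg_dir = os.path.join(root, *p.split("."))
        init_py = os.path.join(pkg_dir, "__init__.py")
        if not os.path.isfile(init_py):
            raise ValueError("package %r has no __init__.py" % p)
        # Only the package's own modules: subpackages are not searched.
        found.extend(sorted(glob.glob(os.path.join(pkg_dir, "*.py"))))
    for m in py_modules:
        module_py = os.path.join(root, m + ".py")
        if not os.path.isfile(module_py):
            raise ValueError("module %r not found" % m)
        found.append(module_py)
    # Report paths relative to the root, with / as the separator.
    return [os.path.relpath(path, root).replace(os.sep, "/") for path in found]
```

Calling expected_sources(root, ["top"]) on a tree that also holds top/sub leaves the subpackage's files out, which mirrors the requirement to list top.sub explicitly.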
Scripts are Python source files meant to be run as main programs (generally from the command line). The value of the scripts keyword lists the path strings of these files, complete with .py extension, relative to the distribution root directory. Each script file should have as its first line a shebang line, that is, a line starting with #! and containing the substring python. When distutils install the scripts included in the distribution, distutils adjust each script's first line to point to the Python interpreter. This is quite useful on many platforms, since the shebang line is used by the platform's shells or by other programs that may run your scripts, such as web servers.

26.1.4.2 Other files

To put data files of any kind in the distribution, supply the following keyword argument.
The value of keyword argument data_files is a list of pairs. Each pair's first item is a string and names a target directory (i.e., a directory where distutils places data files when installing the distribution); the second item is the list of file path strings for files to put in the target directory. At installation time, distutils places each target directory as a subdirectory of Python's sys.prefix for a pure distribution, or of Python's sys.exec_prefix for a non-pure distribution. distutils places the given files directly in the respective target directory, never in subdirectories of the target. For example, given the following data_files usage:

    data_files = [ ('miscdata', ['conf/config.txt', 'misc/sample.txt']) ]

distutils includes in the distribution the file config.txt from subdirectory conf of the distribution root, and the file sample.txt from subdirectory misc of the distribution root. At installation time, distutils creates a subdirectory named miscdata in Python's sys.prefix directory (or in the sys.exec_prefix directory, if the distribution is non-pure), and copies the two files into miscdata/config.txt and miscdata/sample.txt.

26.1.4.3 C-coded extensions

To put C-coded extensions in the distribution, supply the following keyword argument.
The keyword argument in question is ext_modules, whose value is a list of instances of the distutils.core.Extension class. All the details about each extension are supplied as arguments when instantiating Extension. Extension's constructor accepts two mandatory arguments and many optional keyword arguments, as follows.
name is the module name string for the C-coded extension. name may include dots to indicate that the extension module resides within a package. sources is the list of source files that the distutils must compile and link in order to build the extension. Each item of sources is a string giving a source file's path relative to the distribution root directory, complete with file extension .c. kwds lets you pass other, optional arguments to Extension, as covered later in this section.

The Extension class also supports other file extensions besides .c, indicating other languages you may use to code Python extensions. On platforms having a C++ compiler, file extension .cpp indicates C++ source files. Other file extensions that may be supported, depending on the platform and on add-ons to the distutils that are still in experimental stages at the time of this writing, include .f for Fortran, .i for SWIG, and .pyx for Pyrex files. See Chapter 24 for information about using different languages to extend Python.

In some cases, your extension needs no further information besides mandatory arguments name and sources. The distutils implicitly perform all that is necessary to make the Python headers directory and the Python library available for your extension's compilation and linking, and also provide whatever compiler or linker flags or options are needed to build extensions on a given platform.

When it takes additional information to compile and link your extension correctly, you can supply such information via the keyword arguments of class Extension. Such arguments may potentially interfere with the cross-platform portability of your distribution. In particular, whenever you specify file or directory paths as the values of such arguments, the paths should be relative to the distribution root directory; using absolute paths seriously impairs your distribution's cross-platform portability.
Portability is not a problem when you just use the distutils as a handy way to build your extension, as suggested in Chapter 24. However, when you plan to distribute your extensions to other platforms, you should examine whether you really need to provide build information via keyword arguments to Extension. It is sometimes possible to bypass such needs by careful coding at the C level, and the already mentioned Distributing Python Modules manual provides important examples. The keyword arguments that you may pass when calling Extension are the following:
26.1.5 The setup.cfg File

The distutils let the user who is installing your distribution specify many options at installation time. Most often the user will simply enter the following command at a command line:

    C:\> python setup.py install

but the already mentioned manual Installing Python Modules explains many alternatives in detail. If you wish to provide suggested values for some installation options, you can put a setup.cfg file in your distribution root directory. setup.cfg can also provide appropriate defaults for options you can supply to build-time commands. For copious details on the format and contents of file setup.cfg, see the already mentioned manual Distributing Python Modules.

26.1.6 The MANIFEST.in and MANIFEST Files

When you run:

    python setup.py sdist

to produce a packaged-up source distribution (typically a .zip file on Windows, or a .tgz file, also known as a tarball, on Unix), the distutils by default insert the following in the distribution:
You can add yet more files in the source distribution .zip file or tarball by placing in the distribution root directory a manifest template file named MANIFEST.in, whose lines are rules, applied sequentially, about files to add (include) or subtract (prune) from the overall list of files to place in the distribution. The sdist command of the distutils also produces an exact list of the files placed in the source distribution as a text file named MANIFEST in the distribution root directory.

26.1.7 Creating Prebuilt Distributions with distutils

The packaged source distributions you create with python setup.py sdist are the most widely useful files you can produce with distutils. However, you can make life even easier for users with specific platforms by also creating prebuilt forms of your distribution with the command python setup.py bdist.

For a pure distribution, supplying prebuilt forms is merely a matter of convenience for the users. You can create prebuilt pure distributions for any platform, including ones different from those on which you work, as long as you have available on your path the needed commands (such as zip, gzip, bzip2, and tar). Such commands are freely available on the Net for all sorts of platforms, so you can easily stock up on them in order to provide maximum convenience to users who want to install your distribution.

For a non-pure distribution, making prebuilt forms available may be more than just an issue of convenience. A non-pure distribution, by definition, includes code that is not pure Python, generally C code. Unless you supply a prebuilt form, users need to have the appropriate C compiler installed in order to build and install your distribution. This is not a terrible problem on platforms where the appropriate C compiler is the free and ubiquitous gcc. However, on other platforms, the C compiler needed for normal building of Python extensions is commercial and costly.
For example, on Windows, the normal C compiler used by Python and its C-coded extensions is Microsoft Visual C++ (Release 6, at the time of this writing). It is possible to substitute other compilers, including free ones such as the mingw32 and cygwin versions of gcc, and Borland C++ 5.5, whose command-line version you can download from the Net at no cost. However, the process of using such alternative compilers, as documented in the Python online manuals, is rather complex and intricate, particularly for end users who may not be experienced programmers. Therefore, if you want your non-pure distribution to be widely adopted on such platforms as Windows, it's highly advisable to make your distribution also available in prebuilt form.

However, unless you have developed or purchased advanced cross-compilation environments, building a non-pure distribution and packaging it up in prebuilt form is only feasible on the target platform. You also need to have the necessary C compiler installed. When those conditions are satisfied, however, the distutils make the procedure quite simple. In particular, the command:

    python setup.py bdist_wininst

creates an .exe file that is a Windows installer for your distribution. If your distribution is non-pure, the prebuilt distribution is dependent on the specific Python version. The distutils reflect this fact in the name of the .exe installer they create for you. Say, for example, that your distribution's name metadata is mydist, your distribution's version metadata is 0.1, and the Python version you use is 2.2. In this case, the distutils build a Windows installer named mydist-0.1.win32-py2.2.exe.
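The naming pattern in this example can be reproduced with a few lines of Python. The function below is a hypothetical helper, not part of the distutils; it only rebuilds the file name from the pieces named above, and it assumes that a pure distribution simply omits the Python version suffix.

```python
def wininst_name(name, version, python_version=None):
    """Build a Windows installer file name following the pattern above.

    python_version is given only for a non-pure distribution; leaving it
    out (an assumption of this sketch) drops the -pyX.Y suffix.
    """
    base = "%s-%s.win32" % (name, version)
    if python_version is not None:
        base += "-py%s" % python_version
    return base + ".exe"


print(wininst_name("mydist", "0.1", "2.2"))  # mydist-0.1.win32-py2.2.exe
```

A helper of this kind can be handy in release scripts that need to locate the installer after running bdist_wininst, but the authoritative name is always whatever file the distutils actually write.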