Appendix A. Recent Python Changes

This appendix summarizes prominent changes introduced in Python releases since the first edition of this book. It is divided into three sections, mostly because the sections on 1.6 and 2.0 changes were adapted from release note documents:

· Changes introduced in Python 2.0 (and 2.1)

· Changes introduced in Python 1.6

· Changes between the first edition and Python 1.5.2

Python 1.3 was the most recent release when the first edition was published (October 1996), and Python 1.6 and 2.0 were released just before this second edition was finished. 1.6 was the last release posted by CNRI, and 2.0 was released from BeOpen (Guido's two employers prior to his move to Digital Creations); 2.0 adds a handful of features to 1.6.

With a few notable exceptions, the changes over the last five years have introduced new features to Python, but have not changed it in incompatible ways. Many of the new features are widely useful (e.g., module packages), but some seem to address the whims of Python gurus (e.g., list comprehensions) and can be safely ignored by anyone else. In any event, although it is important to keep in touch with Python evolution, you should not take this appendix too seriously. Frankly, application library and tool usage is much more important in practice than obscure language additions.

For information on the Python changes that will surely occur after this edition's publication, consult either the resources I maintain at this book's web site (http://rmi.net/~lutz/about-pp.html), the resources available at Python's web site (http://www.python.org), or the release notes that accompany Python releases.

A.1 Major Changes in 2.0

This section lists changes introduced in Python release 2.0. Note that third-party extensions built for Python 1.5.x or 1.6 cannot be used with Python 2.0; these extensions must be rebuilt for 2.0. Python bytecode files (*.pyc and *.pyo) are not compatible between releases either.

A.1.1 Core Language Changes

The following sections describe changes made to the Python language itself.

A.1.1.1 Augmented assignment

After nearly a decade of complaints from C programmers, Guido broke down and added 11 new C-like assignment operators to the language:

+= -= *= /= %= **= <<= >>= &= ^= |=

The statement A += B is similar to A = A + B except that A is evaluated only once (useful if it is a complex expression). If A is a mutable object, it may be modified in place; for instance, if it is a list, A += B has the same effect as A.extend(B).

Classes and built-in object types can override the new operators in order to implement the in-place behavior; the non-in-place behavior is automatically used as a fallback when an object does not implement the in-place behavior. For classes, the method name is the method name for the corresponding non-in-place operator prepended with an "i" (e.g., __iadd__ implements in-place __add__ ).
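As a minimal sketch of the in-place behavior just described (the Tally class is hypothetical and exists only for illustration):

```python
# In-place list growth: A += B behaves like A.extend(B).
A = [1, 2]
alias = A              # second reference to the same list object
A += [3, 4]            # the list is modified in place, so alias sees it too

class Tally:
    """Hypothetical class that implements only the in-place form of +."""
    def __init__(self):
        self.items = []
    def __iadd__(self, other):
        self.items.append(other)
        return self    # the returned object is what the name is rebound to

t = Tally()
same = t
t += "spam"            # calls t.__iadd__("spam"); t is still the same object
```

Because __iadd__ returns self here, the name t still refers to the original object after the statement; a class that defines only __add__ would get a new object instead.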

A.1.1.2 List comprehensions

A new expression notation was added for lists whose elements are computed from another list (or lists):

[<expression> for <variable> in <sequence>]

For example, [i**2 for i in range(4)] yields the list [0,1,4,9]. This is more efficient than using map with a lambda, and at least in the context of scanning lists, avoids some scoping issues raised by lambdas (e.g., using defaults to pass in information from the enclosing scope). You can also add a condition:

[<expression> for <variable> in <sequence> if <condition>]

For example, [w for w in words if w == w.lower( )] yields the list of words that contain no uppercase characters. This is more efficient than filter with a lambda. Nested for loops and more than one if are supported as well, though using them seems to yield code that is as complex as nested maps and lambdas (see Python manuals for more details).
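The forms above can be tried with a short sketch like the following. The word list is made up, and the list( ) call around filter is only needed on later Pythons, where filter is lazy:

```python
words = ["spam", "Eggs", "ham", "Toast"]

squares = [i ** 2 for i in range(4)]
lowercase_only = [w for w in words if w == w.lower()]

# Two for clauses nest left to right, like nested for statements.
pairs = [(x, y) for x in range(3) for y in range(3) if x < y]

# The same filter written with filter and a lambda, for comparison.
filtered = list(filter(lambda w: w == w.lower(), words))
```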

A.1.1.3 Extended import statements

Import statements now allow an "as" clause (e.g., import mod as name), which saves an assignment of an imported module's name to another variable. This works with from statements and package paths too (e.g., from mod import var as name). The word "as" was not made a reserved word in the process. (To import odd filenames that don't map to Python variable names, see the __import__ built-in function.)
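A brief sketch of both forms, using standard library modules (the short names osp and root are arbitrary choices):

```python
import os.path as osp              # dotted module path bound to a shorter name
from math import sqrt as root      # a single imported name, renamed

# Roughly what the first line saves you from writing:
#   import os.path
#   osp = os.path

joined = osp.join("usr", "lib")
three = root(9)                    # 3.0
```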

A.1.1.4 Extended print statement

The print statement now has an option that makes the output go to a different file than the default sys.stdout. For instance, to write an error message to sys.stderr, you can now write:

print >> sys.stderr, "spam"

As a special case, if the expression used to indicate the file evaluates to None, the current value of sys.stdout is used (like not using >> at all). Note that you can always write to file objects such as sys.stderr by calling their write method; this optional extension simply adds the extra formatting performed by the print statement (e.g., string conversion, spaces between items).

A.1.1.5 Optional collection of cyclical garbage

Python is now equipped with a garbage collector that can hunt down cyclical references between Python objects. It does not replace reference counting (and in fact depends on the reference counts being correct), but can decide that a set of objects belongs to a cycle if all their reference counts are accounted for in their references to each other. A new module named gc lets you control parameters of the garbage collection; an option to the Python "configure" script lets you enable or disable the garbage collection. (See the 2.0 release notes or the library manual to check if this feature is enabled by default or not; because running this extra garbage collection step periodically adds performance overheads, the decision on whether to turn it on by default is pending.)
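As a rough illustration, the sketch below builds a self-referencing list, drops the only name for it, and asks the collector to find it. It assumes the gc module is available in your build. The exact count returned by gc.collect may vary, so it is not shown here:

```python
import gc

gc.collect()                 # start from a clean slate

cycle = []
cycle.append(cycle)          # the list now refers to itself
del cycle                    # unreachable, yet its reference count is still 1

found = gc.collect()         # run the cycle detector by hand
enabled = gc.isenabled()     # whether automatic collection is switched on
```

Reference counting alone would never reclaim this list, because the list's reference to itself keeps its count above zero.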

A.1.2 Selected Library Changes

This is a partial list of standard library changes introduced by Python release 2.0; see 2.0 release notes for a full description of the changes.

A.1.2.1 New zip function

A new function zip was added: zip(seq1,seq2,...) is equivalent to map(None,seq1,seq2,...) when the sequences have the same length. For instance, zip([1, 2, 3], [10, 20, 30]) returns [(1,10), (2,20), (3,30)]. When the lists are not all the same length, the shortest list defines the result's length.
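A short sketch of zip in use. The list( ) calls are redundant on 2.0, where zip already returns a list; they are included so the code also runs on later Pythons where zip is lazy:

```python
same_length = list(zip([1, 2, 3], [10, 20, 30]))
truncated = list(zip("abc", [1, 2]))     # the shortest sequence sets the length

# Parallel traversal of two lists in a for loop.
total = 0
for x, y in zip([1, 2, 3], [10, 20, 30]):
    total += x * y
```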

A.1.2.2 XML support

A new standard module named pyexpat provides an interface to the Expat XML parser. A new standard module package named xml provides assorted XML support code in (so far) three subpackages: xml.dom, xml.sax, and xml.parsers.

A.1.2.3 New web browser module

The new webbrowser module attempts to provide a platform-independent API to launch a web browser. (See also the LaunchBrowser script at the end of Chapter 4.)

A.1.3 Python/C Integration API Changes

Portability was ensured to 64-bit platforms under both Linux and Win64, especially for the new Intel Itanium processor. Large file support was also added for Linux64 and Win64.

The garbage collection changes resulted in the creation of two new slots on an object, tp_traverse and tp_clear. The augmented assignment changes result in the creation of a new slot for each in-place operator. The GC API creates new requirements for container types implemented in C extension modules. See Include/objimpl.h in the Python source distribution.

A.1.4 Windows Changes

New popen2, popen3, and popen4 calls were added in the os module.

The os.popen call is now much more usable on Windows 95 and 98. To fix this call for Windows 9x, Python internally uses the w9xpopen.exe program in the root of your Python installation (it is not a standalone program). See Microsoft Knowledge Base article Q150956 for more details.

Administrator privileges are no longer required to install Python on Windows NT or Windows 2000. The Windows installer also now installs by default in \Python20\ on the default volume (e.g., C:\Python20), instead of the older-style \Program Files\Python-2.0\.

The Windows installer no longer runs a separate Tcl/Tk installer; instead, it installs the needed Tcl/Tk files directly in the Python directory. If you already have a Tcl/Tk installation, this wastes some disk space (about 4 MB) but avoids problems with conflicting Tcl/Tk installations and makes it much easier for Python to ensure that Tcl/Tk can find all its files.

Python 2.1 Alpha Features

Like the weather in Colorado, if you wait long enough, Python's feature set changes. Just before this edition went to the printer, the first alpha release of Python 2.1 was announced. Among its new weapons are these:

· Functions can now have arbitrary attributes attached to them; simply assign to function attribute names to associate extra information with the function (something coders had been doing with formatted documentation strings).

· A new rich comparison extension now allows classes to overload individual comparison operators with distinct methods (e.g., __lt__ overloads < tests), instead of trying to handle all tests in the single __cmp__ method.

· A warning framework provides an interface to messages issued for use of deprecated features (e.g., the regex module).

· The Python build system has been revamped to use the Distutils package.

· A new sys.displayhook attribute allows users to customize the way objects are printed at the interactive prompt.

· Line-by-line file input/output (the file readline method) was made much faster, and a new xreadlines file method reads just one line at a time in for loops.

· Also: the numeric coercion model used in C extensions was altered, modules may now set an __all__ name to specify which names they export for from * imports, the ftplib module now defaults to "passive" mode to work better with firewalls, and so on.

· Other enhancements, such as statically nested scopes and weak references, were still on the drawing board in the alpha release.
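Two of the items in this list, function attributes and rich comparisons, can be sketched as follows. The handler function, its attribute names, and the Version class are all hypothetical:

```python
def handler():
    return "ok"

# Any attribute name may be assigned on a function object.
handler.author = "hypothetical metadata"
handler.calls_allowed = 3

class Version:
    """Hypothetical class that overloads < and == with separate methods."""
    def __init__(self, number):
        self.number = number
    def __lt__(self, other):            # runs for: self < other
        return self.number < other.number
    def __eq__(self, other):            # runs for: self == other
        return self.number == other.number

older = Version(1) < Version(2)
equal = Version(2) == Version(2)
```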

As usual, of course, you should consult this book's web page (http://www.rmi.net/~lutz/about-pp.html) and Python 2.1 and later release notes for Python developments that will surely occur immediately after I ship this insert off to my publisher.

A.2 Major Changes in 1.6

This section lists changes introduced by Python release 1.6; by proxy, most are part of release 2.0 as well.

A.2.1 Incompatibilities

The append method for lists can no longer be invoked with more than one argument. This used to append a single tuple made out of all arguments, but was undocumented. To append a tuple, write l.append((a, b, c)).

The connect, connect_ex, and bind methods for sockets require exactly one argument. Previously, you could call s.connect(host, port), but this was not by design; you must now write s.connect((host, port)).

The str and repr functions are now different more often. For long integers, str no longer appends an "L"; str(1L) is "1", which used to be "1L", and repr(1L) still returns "1L". For floats, repr now gives 17 digits of precision to ensure that no precision is lost (on all current hardware).

Some library functions and tools have been moved to the deprecated category, including some widely used tools such as find. The string module is now simply a frontend to the new string methods, but given that this module is used by almost every Python module written to date, it is very unlikely to go away.

A.2.2 Core Language Changes

The following sections describe changes made to the Python language itself.

A.2.2.1 Unicode strings

Python now supports Unicode (i.e., 16-bit wide character) strings. Release 1.6 added a new fundamental datatype (the Unicode string), a new built-in function unicode, and numerous C APIs to deal with Unicode and encodings. Unicode string constants are prefixed with the letter "u", much like raw strings (e.g., u"..."). See the file Misc/unicode.txt in your Python distribution for details, or visit web site http://starship.python.net/crew/lemburg/unicode-proposal.txt.

A.2.2.2 String methods

Many of the functions in the string module are now available as methods of string objects. For instance, you can now say str.lower( ) instead of importing the string module and saying string.lower(str). The equivalent of string.join(sequence,delimiter) is delimiter.join(sequence). (That is, use " ".join(sequence) to mimic string.join(sequence), whose delimiter defaults to a single space.)
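For instance, a few string module calls and their method equivalents, as a small sketch with a made-up string:

```python
text = "Spam And Eggs"

lowered = text.lower()          # was: string.lower(text)
words = text.split()            # was: string.split(text)
dashed = "-".join(words)        # was: string.join(words, "-")
spaced = " ".join(words)        # was: string.join(words)
```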

A.2.2.3 New (internal) regular expression engine

The new regular expression engine, SRE, is fully backward-compatible with the old engine, and is invoked using the same interface (the re module). That is, the re module's interface remains the way to write matches, and is unchanged; it is simply implemented to use SRE. You can explicitly invoke the old engine by importing pre, or the SRE engine by importing sre. SRE is faster than pre, and supports Unicode (which was the main reason to develop yet another underlying regular expression engine).

A.2.2.4 apply-like function calls syntax

Special function call syntax can be used instead of the apply function: f(*args, **kwds) is equivalent to apply(f, args, kwds). You can also use variations like f(a1, a2, *args, **kwds), and can leave one or the other out (e.g., f(*args), f(**kwds)).
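A minimal sketch of the call forms, with a hypothetical function f that simply returns its arguments:

```python
def f(a, b, c=0, d=0):
    return (a, b, c, d)

args = (1, 2)
kwds = {"c": 3, "d": 4}

both = f(*args, **kwds)         # was: apply(f, args, kwds)
positional = f(*args)           # was: apply(f, args)
mixed = f(1, *(2,), **kwds)     # explicit arguments first, then the rest
```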

A.2.2.5 String to number conversion bases

The built-ins int and long take an optional second argument to indicate the conversion base, but only if the first argument is a string. This makes string.atoi and string.atol obsolete. (string.atof already was.)
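A quick sketch of the base argument, shown for int only. The last part checks that a base is rejected when the first argument is not a string; the TypeError caught there is what current interpreters raise, so treat the exact exception as an assumption for older releases:

```python
hex_value = int("ff", 16)       # was: string.atoi("ff", 16)
binary_value = int("1010", 2)
default_base = int("42")        # base 10 when no base is given

try:
    int(255, 16)                # a base only makes sense for string input
except TypeError:
    rejected = True
else:
    rejected = False
```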

A.2.2.6 Better errors for local name oddities

When a local variable is known to the compiler but undefined when used, a new exception UnboundLocalError is raised. This is a class derived from NameError, so code that catches NameError should still work. The purpose is to provide better diagnostics in the following example:

x = 1
def f( ):
    print x        # x is local to f (assigned below), but has no value yet
    x = x+1

This used to raise a confusing NameError on the print statement.
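The sketch below shows the same situation in a form you can run and catch. It uses the "except ... as" spelling accepted by later Pythons (2.0 itself spells it "except NameError, exc"), and reads x in an expression rather than a print:

```python
x = 1

def f():
    # x is assigned below, so the compiler treats it as local throughout f.
    y = x + 1      # fails: the local x is read before it has a value
    x = y
    return x

try:
    f()
except NameError as exc:        # UnboundLocalError is a NameError subclass
    caught = exc.__class__.__name__

is_subclass = issubclass(UnboundLocalError, NameError)
```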

A.2.2.7 Membership operator overloading

You can now override the in operator by defining a __contains__ method. Note that it has its arguments backward: x in a runs a.__contains__(x) (that's why the name isn't __in__).
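As an illustration, here is a hypothetical container class whose membership test is computed rather than looked up:

```python
class EvenNumbers:
    """Hypothetical container that 'holds' every even integer."""
    def __contains__(self, item):
        return item % 2 == 0

evens = EvenNumbers()
has_four = 4 in evens           # runs evens.__contains__(4)
has_five = 5 in evens
lacks_five = 5 not in evens     # not in uses the same method, negated
```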

A.2.3 Selected Library Module Changes

This section lists some of the changes made to the Python standard library.

distutils

New; tools for distributing Python modules.

zipfile

New; read and write zip archives (module gzip does gzip files).

unicodedata

New; access to the Unicode 3.0 database.

_winreg

New; Windows registry access (one without the _ is in progress).

socket, httplib, urllib

Expanded to include optional OpenSSL secure socket support (on Unix only).

_tkinter

Support for Tk versions 8.0 through 8.3.

string

This module no longer uses the built-in C strop module, but takes advantage of the new string methods to provide transparent support for both Unicode and ordinary strings.

A.2.4 Selected Tools Changes

This section lists some of the changes made to Python tools.

IDLE

Completely overhauled. See the IDLE home page at http://www.python.org for more information.

Tools/i18n/pygettext.py

Python equivalent of xgettext message text extraction tool used for internationalizing applications written in Python.

A.3 Major Changes Between 1.3 and 1.5.2

This section describes significant language, library, tool, and C API changes in Python between the first edition of this book (Python 1.3) and Python release 1.5.2.

A.3.1 Core Language Changes

The following sections describe changes made to the Python language itself.

A.3.1.1 Pseudo-private class attributes

Python now provides a name-mangling protocol that hides attribute names used by classes. Inside a class statement, a name of the form __X is automatically changed by Python to _Class__X, where Class is the name of the class being defined by the statement. Because the enclosing class name is prepended, this feature limits the possibilities of name clashes when you extend or mix existing classes. Note that this is not a "private" mechanism at all, just a class name localization feature to minimize name clashes in hierarchies and the shared instance object's namespace at the bottom of the attribute inheritance links chain.
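A small sketch of the mangling at work, with two hypothetical classes that each use the same __count attribute name on one shared instance:

```python
class Base:
    def __init__(self):
        self.__count = 1            # stored as _Base__count

class Derived(Base):
    def __init__(self):
        Base.__init__(self)
        self.__count = 99           # stored as _Derived__count: no clash

obj = Derived()
names = list(obj.__dict__.keys())
names.sort()

# Mangling localizes names; it does not make them inaccessible.
still_reachable = obj._Base__count
```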

A.3.1.2 Class exceptions

Exceptions may now take the form of class (and class instance) objects. The intent is to support exception categories. Because an except clause will now match a raised exception if it names the raised class or any of its superclasses, specifying superclasses allows try statements to catch broad categories without listing all members explicitly (e.g., catching a numeric-error superclass exception will also catch specific kinds of numeric errors). Python's standard built-in exceptions are now classes (instead of strings) and have been organized into a shallow class hierarchy; see the library manual for details.
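The category idea can be sketched with a hypothetical three-class hierarchy. The "except ... as" spelling is the one accepted by later Pythons; earlier releases use a comma instead of "as":

```python
class NumericError(Exception):
    pass

class DivideError(NumericError):
    pass

class OverflowProblem(NumericError):
    pass

def risky(kind):
    if kind == "divide":
        raise DivideError("divide by zero")
    raise OverflowProblem("too big")

caught = []
for kind in ("divide", "overflow"):
    try:
        risky(kind)
    except NumericError as exc:     # naming the superclass matches both
        caught.append(exc.__class__.__name__)
```

New subclasses of NumericError added later are caught by the same except clause without any change to the try statement.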

A.3.1.3 Package imports

Import statements may now reference directory paths on your computer by dotted-path syntax. For instance:

import directory1.directory2.module # and use path
from directory1.directory2.module import name # and use "name"

Both load a module nested two levels deep in packages (directories). The leftmost package name in an import path (directory1) must be a directory within a directory that is listed in the Python module search path (sys.path initialized from PYTHONPATH). Thereafter, the import statement's path denotes subdirectories to follow. Paths prevent module name conflicts when installing multiple Python systems on the same machine that expect to find their own version of the same module name (otherwise, only the first on PYTHONPATH wins).

Unlike the older ni module that this feature replaces, the new package support is always available (without running special imports) and requires each package directory along an import path to contain a (possibly empty) __init__.py module file to identify the directory as a package, and serve as its namespace if imported directly. Packages tend to work better with from than with import, since the full path must be repeated to use imported objects after an import.
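To experiment with package imports without touching your own directories, the following sketch builds a throwaway package tree in a temporary directory and imports from it. The directory and module names mirror the example above; tempfile.mkdtemp is a later library addition and is assumed here only for convenience:

```python
import os
import sys
import tempfile

root = tempfile.mkdtemp()                   # stands in for a PYTHONPATH entry
dir1 = os.path.join(root, "directory1")
dir2 = os.path.join(dir1, "directory2")
os.makedirs(dir2)

# Each package directory needs an __init__.py file (empty is fine).
for d in (dir1, dir2):
    open(os.path.join(d, "__init__.py"), "w").close()

f = open(os.path.join(dir2, "module.py"), "w")
f.write("name = 'spam'\n")
f.close()

sys.path.insert(0, root)                    # make directory1 findable

import directory1.directory2.module
from directory1.directory2.module import name

via_path = directory1.directory2.module.name   # full path repeated after import
via_from = name                                # short name after from
```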

A.3.1.4 New assert statement

Python 1.5 added a new statement:

assert test [, value]

which is the same as:

if __debug__:
    if not test:
        raise AssertionError, value

Assertions are mostly meant for debugging, but can also be used to specify program constraints (e.g., type tests on entry to functions).
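As a sketch of an assertion used as an entry constraint, consider the hypothetical average function below. The sum built-in and the "except ... as" spelling are later conveniences, used here so the code runs on current interpreters:

```python
def average(values):
    # Constraint checked on entry; the check is skipped under python -O.
    assert len(values) > 0, "average() needs at least one value"
    return sum(values) / float(len(values))

ok = average([1, 2, 3])

try:
    average([])
except AssertionError as exc:
    message = str(exc)              # the optional value becomes the message
```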

A.3.1.5 Reserved word changes

The word "assert" was added to the list of Python reserved words; "access" was removed (it has now been deprecated in earnest).

A.3.1.6 New dictionary methods

A few convenience methods were added to the built-in dictionary object to avoid the need for manual loops: D.clear( ), D.copy( ), D.update( ), and D.get( ). The first two methods empty and copy dictionaries, respectively. D1.update(D2) is equivalent to the loop:

for k in D2.keys( ): D1[k] = D2[k]

D.get(k) returns D[k] if it exists, or None (or its optional second argument) if the key does not exist.
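The four methods in action, as a short sketch with made-up dictionaries:

```python
D1 = {"spam": 1, "eggs": 2}
D2 = {"eggs": 20, "ham": 30}

backup = D1.copy()              # shallow copy, unaffected by the update below
D1.update(D2)                   # same as: for k in D2.keys(): D1[k] = D2[k]

found = D1.get("ham")           # 30
missing = D1.get("toast")       # None, rather than a KeyError
fallback = D1.get("toast", 0)   # 0, the optional second argument

D2.clear()                      # D2 is now empty
```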

A.3.1.7 New list methods

List objects have a new method, pop, to fetch and delete the last item of the list:

x = s.pop( ) 
...is the same as the two statements...
x = s[-1]; del s[-1]

and extend, to concatenate a list of items on the end, in place:

s.extend(x) 
...is the same as... 
s[len(s):len(s)] = x

The pop method can also be passed an index to delete (it defaults to -1). Unlike append, extend is passed an entire list and adds each of its items at the end.
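A brief sketch contrasting the new methods with append:

```python
s = [1, 2, 3]

last = s.pop()          # removes and returns the last item: 3
first = s.pop(0)        # an index selects a different item: 1
s.extend([7, 8])        # adds each item of the list: s is [2, 7, 8]
s.append([9, 10])       # adds one item, which happens to be a list
```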

A.3.1.8 "Raw" string constants

In support of regular expressions and Windows, Python allows string constants to be written in the form r"...\...", which works like a normal string except that Python leaves any backslashes in the string alone. They remain as literal \ characters rather than being interpreted as special escape codes by Python.
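The difference is easiest to see with a Windows-style path (the path itself is made up):

```python
normal = "C:\new\table.txt"      # \n and \t are read as newline and tab
raw = r"C:\new\table.txt"        # backslashes are kept as typed
escaped = "C:\\new\\table.txt"   # the equivalent without the r prefix
```

The raw and doubled-backslash forms produce the same 16-character string; the normal form silently yields a 14-character string containing a newline and a tab.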

A.3.1.9 Complex number type

Python now supports complex number constants (e.g., 1+3j) and complex arithmetic operations (normal math operators, plus a cmath module with many of the math module's functions for complex numbers).
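A few complex operations, sketched with arbitrary values:

```python
import cmath

z = 1 + 3j
total = z + (2 - 1j)            # (3+2j)
product = z * 2j                # 2j + 6j*j, which is (-6+2j)
parts = (z.real, z.imag)        # components are floats
root = cmath.sqrt(-4)           # math.sqrt would raise an error here
```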

A.3.1.10 Printing cyclic objects doesn't core dump

Objects created with code like L.append(L) are now detected and printed specially by the interpreter. In the past, trying to print cyclic objects caused the interpreter to loop recursively (which eventually led to a core dump).
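You can check this safely with a sketch like the following, which builds a self-containing list and asks for its printable form:

```python
L = [1, 2]
L.append(L)             # the list now contains a reference to itself
shown = repr(L)         # the self-reference is abbreviated, not expanded
```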

A.3.1.11 raise without arguments: re-raise

A raise statement without any exception or extra-data arguments now makes Python re-raise the most recently raised uncaught exception.
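A typical use is to note an error locally and then let it continue on its way. The convert function and log list below are hypothetical:

```python
log = []

def convert(text):
    try:
        return int(text)
    except ValueError:
        log.append("bad input: " + text)
        raise                   # propagate the same exception to the caller

try:
    convert("spam")
except ValueError:
    propagated = True
else:
    propagated = False
```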

A.3.1.12 raise forms for class exceptions

Because exceptions can now either be string objects or classes and class instances, you can use any of the following raise statement forms:

raise string # matches except with same string object
raise string, data # same, with optional data
 
raise class, instance # matches except with class or its superclass
raise instance # same as: raise instance.__class__, instance
 
raise # reraise last exception

You can also use the following three forms, which are for backwards-compatibility with earlier releases where all built-in exceptions were strings:

raise class # same as: raise class( ) (and: raise class, instance)
raise class, arg # same as: raise class(arg)
raise class, (arg,...) # same as: raise class(args...)

A.3.1.13 Power operator X ** Y

The new ** binary operator computes the left operand raised to the power of the right operand. It works much like the built-in pow function.

A.3.1.14 Generalized sequence assignments

In an assignment (= statements and other assignment contexts), you can now assign any sort of sequence on the right to a list or tuple on the left (e.g., (A,B) = seq, [A,B] = seq ). In the past, the sequence types had to match.
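For example, each of the following mixes sequence types across the assignment:

```python
(A, B) = [1, 2]         # list on the right, tuple on the left
[C, D] = (3, 4)         # tuple on the right, list on the left
E, F = "ab"             # a string is a sequence too
```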

A.3.1.15 It's faster

Python 1.5 has been clocked at almost twice the speed of its predecessors on the Lib/test/pystone.py benchmark. (I've seen almost a threefold speedup in other tests.)

A.3.2 Library Changes

The following sections describe changes made to the Python standard library.

A.3.2.1 dir(X) now works on more objects

The built-in dir function now reports attributes for modules, classes, and class instances, as well as for built-in objects such as lists, dictionaries, and files. You don't need to use members like __methods__ (but you still can).

A.3.2.2 New conversions: int(X), float(X), list(S)

The int and float built-in functions now accept string arguments, and convert from strings to numbers exactly like string.atoi/atof. The new list(S) built-in function converts any sequence to a list, much like the older and obscure map(None, S) trick.
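A quick sketch of the three conversions:

```python
count = int("42")               # was: string.atoi("42")
ratio = float("1.5")            # was: string.atof("1.5")
chars = list("spam")            # string to list of characters
items = list((1, 2, 3))         # tuple to list
```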

A.3.2.3 The new re regular expression module

A new regular expression module, re, offers full-blown Perl-style regular expression matching. See Chapter 18 for details. The older regex module described in the first edition is still available, but considered obsolete.

A.3.2.4 splitfields/joinfields became split/join

The split and join functions in the string module were generalized to do the same work as the original splitfields and joinfields.

A.3.2.5 Persistence: unpickler no longer calls __init__

Beginning in Python 1.5, the pickle module's unpickler (loader) no longer calls class __init__ methods to recreate pickled class instance objects. This means that classes no longer need defaults for all constructor arguments to be used for persistent objects. To force Python to call the __init__ method (as it did before), classes must provide a special __getinitargs__ method; see the library manual for details.
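The effect can be observed with a sketch like the one below, run as a top-level script so that pickle can locate the class by name. The Record class and its init_calls counter are hypothetical:

```python
import pickle

class Record:
    """Hypothetical class with a required constructor argument."""
    init_calls = 0
    def __init__(self, name):           # no default needed for pickling
        Record.init_calls += 1
        self.name = name

original = Record("bob")                # the only time __init__ runs
restored = pickle.loads(pickle.dumps(original))
```

After the round trip, restored has the same name attribute as original, while the counter shows that the constructor ran only once.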

A.3.2.6 Object pickler coded in C: cPickle

An implementation of the pickle module in C is now a standard part of Python. It's called cPickle and is reportedly many times faster than the original pickle. If present, the shelve module loads it instead of pickle automatically.

A.3.2.7 anydbm.open now expects a "c" second argument for prior behavior

To open a DBM file in "create new or open existing for read+write" mode, pass a "c" in argument 2 to anydbm.open. This changed as of Python 1.5.2; passing a "c" now does what passing no second argument used to do (the second argument now defaults to "r" -- read-only). This does not impact shelve.open.

A.3.2.8 rand module replaced by random module

The rand module is now deprecated; use random instead.

A.3.2.9 Assorted Tkinter changes

Tkinter became portable to and sprouted native look-and-feel for all major platforms (Windows, X, Macs). There has been a variety of changes in the Tkinter GUI interface:

StringVar objects can't be called

The __call__ method for StringVar class objects was dropped in Python 1.4; that means you need to explicitly call their get( )/set( ) methods, instead of calling them with or without arguments.

ScrolledText changed

The ScrolledText widget went through a minor interface change in Python 1.4, which was apparently backed out in release 1.5 due to code breakage (so never mind).

Gridded geometry manager

Tkinter now supports Tk's new grid geometry manager. To use it, call the grid method of widget objects (much like pack, but pass row and column numbers, not constraints).

New Tkinter documentation site

Fredrik Lundh now maintains a nice set of Tkinter documentation at http://www.pythonware.com, which provides references and tutorials.

A.3.2.10 CGI module interface change

The CGI interface changed. An older FormContent interface was deprecated in favor of the FieldStorage object's interface. See the library manual for details.

A.3.2.11 site.py, user.py, and PYTHONHOME

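These are used to tailor the initial path configuration: the site.py script is run automatically by Python on startup, the user.py script runs a per-user customization file when imported, and the PYTHONHOME environment variable tells Python where its standard libraries are installed. See the library manuals for details.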

A.3.2.12 Assignment to os.environ[key] calls putenv

Assigning to a key in the os.environ dictionary now updates the corresponding environment variable in the C environment. It triggers a call to the C library's putenv routine such that the changes are reflected in integrated C code layers as well as in the environment of any child processes spawned by the Python program. putenv is now exposed in the os module too (os.putenv).

A.3.2.13 New sys.exc_info( ) tuple

The new exc_info( ) function in the sys module returns a tuple of three values (the exception type, the exception value, and a traceback object) corresponding to the older sys.exc_type, sys.exc_value, and sys.exc_traceback names. Those older names access a single global exception; exc_info is specific to the calling thread.
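A minimal sketch of calling it inside a handler. The tuple holds the exception type, the exception value, and a traceback object:

```python
import sys

try:
    1 / 0
except ZeroDivisionError:
    exc_type, exc_value, exc_tb = sys.exc_info()
    type_name = exc_type.__name__
    same_class = exc_value.__class__ is exc_type
    line_number = exc_tb.tb_lineno      # where the exception was raised
    del exc_tb                          # avoid keeping the frame alive
```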

A.3.2.14 The new operator module

There is a new standard module called operator, which provides functions that implement most of the built-in Python expression operators. For instance, operator.add(X,Y) does the same thing as X+Y, but because operator module exports are functions, they are sometimes handy to use in things like map, so you don't have to create a function or use a lambda form.
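For instance, the following sketch adds and multiplies two lists item by item. The list( ) calls are only needed on later Pythons, where map is lazy:

```python
import operator

sums = list(map(operator.add, [1, 2, 3], [10, 20, 30]))
products = list(map(operator.mul, [1, 2, 3], [10, 20, 30]))

# The lambda that operator.add makes unnecessary.
sums_lambda = list(map(lambda x, y: x + y, [1, 2, 3], [10, 20, 30]))
```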

A.3.3 Tool Changes

The following sections describe major Python tool-related changes.

A.3.3.1 JPython (a.k.a. Jython): a Python-to-Java compiler

The new JPython system is an alternative Python implementation that compiles Python programs to Java Virtual Machine (JVM) bytecode and provides hooks for integrating Python and Java programs. See Chapter 15.

A.3.3.2 MS-Windows ports: COM, Tkinter

The COM interfaces in the Python Windows ports have evolved substantially since the first edition's descriptions (it was "OLE" back then); see Chapter 15. Python also now ships as a self-installer for Windows, with built-in support for the Tkinter interface, DBM-style files, and more; it's a simple double-click to install today.

A.3.3.3 SWIG growth, C++ shadow classes

The SWIG system has become a primary extension writers' tool, with new "shadow classes" for wrapping C++ classes. See Chapter 19.

A.3.3.4 Zope (formerly Bobo): Python objects for the Web

This system for publishing Python objects on the Web has grown to become a popular tool for CGI programmers and web scripters in general. See the Zope section in Chapter 15.

A.3.3.5 HTMLgen: making HTML from Python classes

This tool for generating correct HTML files (web page layouts) from Python class object trees has grown to maturity. See Chapter 15.

A.3.3.6 PMW: Python mega-widgets for Tkinter

The PMW system provides powerful, higher-level widgets for Tkinter-based GUIs in Python. See Chapter 6.

A.3.3.7 IDLE: an integrated development environment GUI

Python now ships with a point-and-click development interface named IDLE. Written in Python using the Tkinter GUI library, IDLE either comes in the source library's Tools directory or is automatically installed with Python itself (on Windows, see IDLE's entry in the Python menu within your Start button menus). IDLE offers a syntax-coloring text editor, a graphical debugger, an object browser, and more. If you have Python with Tk support enabled and are accustomed to more advanced development interfaces, IDLE provides a feature-rich alternative to the traditional Python command line. IDLE does not provide a GUI builder today.

A.3.3.8 Other tool growth: PIL, NumPy, Database API

The PIL image processing and NumPy numeric programming systems have matured considerably, and a portable database API for Python has been released. See Chapter 6 and Chapter 16.

A.3.4 Python/C Integration API Changes

The following sections describe changes made to the Python C API.

A.3.4.1 A single Python.h header file

All useful Python symbols are now exported in the single Python.h header file; no other header files need be imported in most cases.

A.3.4.2 A single libpython*.a C library file

All Python interpreter code is now packaged in a single library file when you build Python. For instance, under Python 1.5, you need only link in libpython1.5.a when embedding Python (instead of the older scheme's four libraries plus .o's).

A.3.4.3 The "Great (Grand?) Renaming" is complete

All exposed Python symbols now start with a "Py" prefix.

A.3.4.4 Threading support, multiple interpreters

A handful of new API tools provide better support for threads when embedding Python. For instance, there are tools for finalizing Python (Py_Finalize) and for creating "multiple interpreters" (Py_NewInterpreter).

Note that spawning Python language threads may be a viable alternative to C-level threads, and multiple namespaces are often sufficient to isolate names used in independent system components; both schemes are easier to manage than multiple interpreters and threads. But in some threaded programs, it's also useful to have one copy of system modules and structures per thread, and this is where multiple interpreters come in handy (e.g., without one copy per thread, imports might find an already-loaded module in the sys.modules table if it was imported by a different thread). See the new C API documentation manuals for details.

A.3.4.5 New Python C API documentation

There is a new reference manual that ships with Python and documents major C API tools and behavior. It's not fully fleshed out yet, but it's a useful start.
