17.2 Debugging
Since
Python's development cycle is so fast, the most
effective way to debug is often to edit your code to make it output
relevant information at key points. Python has many ways to let your
code explore its own state in order to extract information that may
be relevant for debugging. The inspect and
traceback modules specifically support such
exploration, which is also known as reflection or introspection.
Once you have obtained debugging-relevant information, the print
statement is often the simplest way to display it. You
can also log debugging information to files. Logging is particularly
useful for programs that run unattended for a long time, as is
typically the case for server programs. Displaying debugging
information is like displaying other kinds of information, as covered
in Chapter 10 and Chapter 16. Logging debugging information is likewise
like logging any other information, as covered in
Chapter 10 and Chapter 11. Python 2.3 will also include a module
specifically dedicated to logging. As covered in Chapter 8, rebinding attribute
excepthook of module sys lets
your program log detailed error information just before it
is terminated by a propagating
exception.
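As a minimal sketch of the excepthook technique just mentioned, the following builds a replacement hook that writes the traceback to a log file. The names make_logging_excepthook and errors.log are hypothetical, chosen only for illustration, and the code is written for a current Python 3 interpreter rather than the Python release this book describes.

```python
import sys
import traceback

def make_logging_excepthook(log_file):
    """Return a hook that writes uncaught-exception details to log_file."""
    def hook(exc_type, exc_value, tb):
        log_file.write('Uncaught exception, program terminating:\n')
        # Write the same traceback text Python would send to stderr.
        traceback.print_exception(exc_type, exc_value, tb, file=log_file)
        log_file.flush()
    return hook

# Typical installation (errors.log is a hypothetical filename):
# sys.excepthook = make_logging_excepthook(open('errors.log', 'a'))
```

Python calls sys.excepthook with the exception type, value, and traceback object, so the hook can also be called by hand from an except clause in order to test it.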
Python also offers hooks enabling interactive debugging. Module
pdb supplies a simple text-mode interactive
debugger. Other interactive debuggers for Python are part of
integrated development environments (IDEs), such as IDLE and various
commercial offerings. However, I do not cover IDEs in this book.
17.2.1 The inspect Module
The inspect module
supplies functions to extract information from all kinds of objects,
including the Python call stack (which records all function calls
currently executing) and source files. At the time of this writing,
module inspect is not yet available for Jython.
The most frequently used functions of module
inspect are as follows.
getargspec, formatargspec
f is a
function object. getargspec returns a tuple with
four items
(arg_names,
extra_args,
extra_kwds,
arg_defaults).
arg_names is the sequence of names of
f's formal arguments.
extra_args is the name of the special
formal argument of the form
*args, or
None if f has no such
special argument. extra_kwds is the name
of the special formal argument of the form
**kwds, or
None if f has no such
special argument. arg_defaults is the
tuple of default values for
f's arguments. You can
deduce other details about
f's signature from
getargspec's results. For
example, f has
len(arg_names)-len(arg_defaults)
mandatory arguments, and the names of
f's optional arguments
are the strings that are the items of the list slice
arg_names[-len(arg_defaults):].
formatargspec accepts one to four arguments that
are the same as the items of the tuple that
getargspec returns, and returns a formatted string
that displays this information. Thus,
formatargspec(*getargspec(f))
returns a formatted string with
f's formal arguments
(i.e., f's
signature) in parentheses, as used in the
def statement that created
f.
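The arithmetic on arg_names and arg_defaults described above can be sketched as follows. One assumption to note: recent Python 3 releases no longer supply getargspec, so this sketch substitutes inspect.getfullargspec, whose first four items carry the same meaning. The helper name split_arguments and the function sample are hypothetical. The sketch also guards against a function that has no default values at all, for which the defaults item is None.

```python
import inspect

def split_arguments(func):
    """Return (mandatory_names, optional_names) for func's named arguments."""
    spec = inspect.getfullargspec(func)
    arg_names, extra_args, extra_kwds, arg_defaults = spec[:4]
    # arg_defaults is None when func has no default values.
    n_defaults = len(arg_defaults or ())
    n_mandatory = len(arg_names) - n_defaults
    return arg_names[:n_mandatory], arg_names[n_mandatory:]

def sample(a, b, c=3, d=4, *args, **kwds):
    pass

mandatory, optional = split_arguments(sample)
# mandatory == ['a', 'b'] and optional == ['c', 'd']
```

Slicing at n_mandatory rather than at -len(arg_defaults) avoids the corner case in which a slice starting at -0 would return the whole list.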
getargvalues, formatargvalues
f is a frame object, for example the
result of a call to the function _getframe in
module sys (covered in Chapter 8) or to function
currentframe in module inspect.
getargvalues returns a tuple with four items
(arg_names,
extra_args,
extra_kwds,
locals).
arg_names is the sequence of names of
f's
function's formal arguments.
extra_args is the name of the special
formal argument of form
*args, or
None if
f's function has no such
special argument. extra_kwds is the name
of the special formal argument of form
**kwds, or
None if
f's function has no such
special argument. locals is the dictionary
of local variables for f. Since arguments,
in particular, are local variables, the value of each actual argument
can be obtained from locals by indexing
the locals dictionary with the
argument's name.
formatargvalues accepts one to four arguments that
are the same as the items of the tuple that
getargvalues returns, and returns a formatted
string that displays this information.
formatargvalues(*getargvalues(f))
returns a formatted string with
f's actual arguments in
parentheses, in named (keyword) form, as used in the call statement
that created f. For example:
    import inspect
    def f(x=23): return inspect.currentframe( )
    print inspect.formatargvalues(*inspect.getargvalues(f( )))
    # prints: (x=23)
currentframe( )
Returns the frame object for the current function (caller of
currentframe).
formatargvalues(*getargvalues(currentframe( ))), for
example, returns a formatted string with the actual arguments of the
calling function.
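A small sketch of combining currentframe with getargvalues to look up actual argument values by name. The helper caller_arguments and the function area are hypothetical names. The sketch uses the frame attribute f_back to reach the caller's frame, and it is written for a current Python 3 interpreter.

```python
import inspect

def caller_arguments():
    """Return a dict mapping the calling function's argument names to values."""
    frame = inspect.currentframe().f_back   # frame of whoever called us
    try:
        arg_names, extra_args, extra_kwds, local_vars = inspect.getargvalues(frame)
        # Arguments are local variables, so index locals by argument name.
        return dict((name, local_vars[name]) for name in arg_names)
    finally:
        del frame   # do not keep the frame object alive longer than needed

def area(width, height=2):
    return caller_arguments()

result = area(3)
# result == {'width': 3, 'height': 2}
```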
getdoc(obj)
Returns the docstring for obj, with tabs
expanded to spaces and redundant whitespace stripped from each line.
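To illustrate the cleanup that getdoc performs, here is a brief sketch with a hypothetical function named documented whose docstring is indented along with the function body. It assumes a current Python 3 interpreter.

```python
import inspect

def documented():
    """Summarize the data.

        Indented detail line.
    Trailing line.
    """

clean = inspect.getdoc(documented)
# The common leading indentation is removed, relative indentation is kept,
# and the trailing blank line is dropped.
```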
getfile, getsourcefile
Returns the name of the file that defined
obj, and raises
TypeError when unable to determine the file. For
example, getfile raises
TypeError if obj is
built-in. getfile returns the name of a binary or
source file. getsourcefile returns the name of a
source file, and raises TypeError when it can
determine only a binary file, not the corresponding source file.
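Because getfile raises TypeError for built-in objects, diagnostic code usually wraps the call. The following is one possible wrapper, with the hypothetical name defining_file, written for a current Python 3 interpreter.

```python
import inspect

def defining_file(obj):
    """Return the name of the file that defined obj, or None if unknown."""
    try:
        return inspect.getfile(obj)
    except TypeError:
        # Raised, for example, when obj is a built-in such as len.
        return None

builtin_result = defining_file(len)       # None: len is built-in
module_result = defining_file(inspect)    # path of the inspect module's file
```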
getmembers(obj, filter=None)
Returns all attributes (members) of obj, as a
list of (name, value) pairs sorted by name.
When filter is not
None, returns only attributes for which callable
filter returns a true result when called
on the attribute's value,
like:
    [ (n, v) for n, v in getmembers(obj) if filter(v) ]
getmodule(obj)
Returns the module object that defined
obj, or None if unable
to determine it.
getmro(c)
Returns a tuple of bases and ancestors of class
c in method resolution order.
c is the first item in the tuple. Each
class appears only once in the tuple.
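For instance, with a diamond-shaped hierarchy of hypothetical classes (Base, Left, Right, Bottom), getmro shows the order in which Python searches for attributes. This sketch assumes new-style classes, as in a current Python 3 interpreter.

```python
import inspect

class Base(object): pass
class Left(Base): pass
class Right(Base): pass
class Bottom(Left, Right): pass

mro_names = [cls.__name__ for cls in inspect.getmro(Bottom)]
# mro_names == ['Bottom', 'Left', 'Right', 'Base', 'object']
```

Base appears only once, after both of the classes that inherit from it.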
getsource, getsourcelines
getsource returns a single multiline string that is the source code for
obj, and raises IOError
if unable to determine or fetch it. getsourcelines
returns a pair: the first item is the source code for
obj (a list of lines), and the second item
is the line number of the list's first line in the
source file it comes from.
isbuiltin, isclass, iscode, isframe, isfunction, ismethod, ismodule, isroutine
Each of these functions accepts a
single argument obj and returns
True if obj belongs to
the type indicated in the function name. Accepted objects are,
respectively: built-in (C-coded) functions, class objects, code
objects, frame objects, Python-coded functions (including
lambda expressions), methods, modules, and, for
isroutine, all methods or functions, either
C-coded or Python-coded. These functions are often used as the
filter argument to
getmembers.
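A short sketch of passing one of these predicates as the filter argument of getmembers. The class Greeter is hypothetical, and the code is written for a current Python 3 interpreter, where ismethod accepts bound methods written in Python.

```python
import inspect

class Greeter(object):
    greeting = 'hello'
    def greet(self):
        return self.greeting
    def shout(self):
        return self.greeting.upper()

# Keep only the members of the instance that are Python-coded bound methods.
method_names = [name for name, value
                in inspect.getmembers(Greeter(), inspect.ismethod)]
# method_names == ['greet', 'shout']
```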
stack(context=1)
Returns a list of six-item tuples. The first tuple is about
stack's caller, the second tuple
is about the caller's caller, and so on. Each
tuple's items, in order, are: frame object,
filename, line number, function name, list of
context source code lines around the
current line, and index of current line within the list.
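As an illustration of indexing into the result of stack, the following hypothetical helper reports the name of the function that called it. It is a sketch for a current Python 3 interpreter, where each record can still be indexed like a six-item tuple.

```python
import inspect

def calling_function_name():
    """Return the name of the function that called this one."""
    # stack()[0] describes this function; stack()[1] describes its caller.
    caller_record = inspect.stack()[1]
    return caller_record[3]   # fourth item: function name

def outer():
    return calling_function_name()

name = outer()
# name == 'outer'
```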
For example, suppose that at some point in your program you execute a
statement such as:
    x.f( )
and unexpectedly receive an AttributeError
informing you that object x has no
attribute named f. This means that object
x is not as you expected, so you want to
determine more about x as a preliminary to
ascertaining why x is that way and what
you should do about it. Change the statement to:
    try: x.f( )
    except AttributeError:
        import sys, inspect
        sys.stderr.write('x is type %s(%r)\n'%(type(x),x))
        sys.stderr.write("x's methods are: ")
        for n, v in inspect.getmembers(x, callable):
            sys.stderr.write('%s '%n)
        sys.stderr.write('\n')
        raise
This example uses sys.stderr (covered in Chapter 8), since it's displaying
diagnostic information related to an error, not program results.
Function getmembers of module
inspect obtains the names of all methods available
on x in order to display them. Of course,
if you need this kind of diagnostic functionality often, you should
package it up into a separate function, such as:
    import sys, inspect
    def show_obj_methods(obj, name, show=sys.stderr.write):
        show('%s is type %s(%r)\n'%(name,type(obj),obj))
        show("%s's methods are: "%name)
        for n, v in inspect.getmembers(obj, callable):
            show('%s '%n)
        show('\n')
And then the example becomes just:
    try: x.f( )
    except AttributeError:
        show_obj_methods(x, 'x')
        raise
Good program structure and organization are just as necessary in code
intended for diagnostic and debugging purposes as they are in code
that implements your program's functionality. See
also Section 6.6.4 in Chapter 6 for a good
technique to use when defining diagnostic and debugging
functions.
17.2.2 The traceback Module
The traceback
module lets you extract, format, and output information about
tracebacks as normally produced by uncaught exceptions. By default,
module traceback reproduces the formatting Python
uses for tracebacks. However, module traceback
also lets you exert fine-grained control. The module supplies many
functions, but in typical use you will use only one of them.
print_exc(limit=None, file=sys.stderr)
Call print_exc from an exception handler or a
function directly or indirectly called by an exception handler.
print_exc outputs to file-like object
file the traceback information that Python
outputs to stderr for uncaught exceptions. When
limit is not None,
print_exc outputs only
limit traceback nesting levels. For
example, when, in an exception handler, you want to emit a
diagnostic message just as if the exception had propagated, but actually
stop the exception from propagating any further (so that your program
keeps running, and no further handlers are involved), call
traceback.print_exc( ).
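A minimal sketch of this usage follows. The function names risky and run_and_report are hypothetical, and the traceback is sent to an in-memory file object rather than stderr so that the result can be examined. The code is written for a current Python 3 interpreter.

```python
import io
import traceback

def risky():
    return 1 / 0

def run_and_report(func, out):
    """Call func; on failure, write the traceback to out and keep running."""
    try:
        return func()
    except ZeroDivisionError:
        # Same text Python would print for an uncaught exception,
        # but the exception stops propagating here.
        traceback.print_exc(file=out)
        return None

report_file = io.StringIO()
result = run_and_report(risky, report_file)
report = report_file.getvalue()
```

After the call, result is None and report holds the full traceback text, starting with the usual "Traceback (most recent call last):" line.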
17.2.3 The pdb Module
The
pdb module exploits the Python
interpreter's debugging and tracing hooks to
implement a simple, command-line-oriented interactive debugger.
pdb lets you set breakpoints, single-step through
source code, examine stack frames, and so on.
To run some code under pdb's
control, you import pdb and then call
pdb.run, passing as the single argument a string
of code to execute. To use pdb for post-mortem
debugging (meaning debugging of code that terminated by propagating
an exception at an interactive prompt), call pdb.pm( )
without arguments. When pdb starts, it
first reads text files named .pdbrc in your home
directory and in the current directory. Such files can contain any
pdb commands, but most often they use the
alias command in order to define useful synonyms
and abbreviations for other commands.
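pdb.run normally waits for commands typed at its prompt. Purely for illustration, the sketch below feeds the debugger a fixed list of commands instead, by passing file-like objects as the stdin and stdout arguments of class pdb.Pdb. It assumes a current Python 3 interpreter, where pdb.Pdb also accepts readrc=False (to skip any .pdbrc files) and nosigint=True. The names commands, transcript, and namespace are hypothetical.

```python
import io
import pdb

# Commands that a person would otherwise type at the (Pdb) prompt.
commands = io.StringIO('p limit * 2\ncontinue\n')
transcript = io.StringIO()

# The code under the debugger's control runs in this namespace.
namespace = {'limit': 10}

debugger = pdb.Pdb(stdin=commands, stdout=transcript,
                   readrc=False, nosigint=True)
debugger.run('result = limit + 1', namespace)

session_text = transcript.getvalue()
```

When the run completes, session_text contains the prompts and the value that command p displayed, and namespace holds the variable result that the debugged statement bound.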
When pdb is in control, it prompts you with the
string '(Pdb) ', and you can
enter pdb commands. Command
help (which you can also enter in the abbreviated
form h) lists all available commands. Call
help with an argument (separated by a space) to
get help about any specific command. You can abbreviate most commands
to the first one or two letters, but you must always enter commands
in lowercase: pdb, like Python itself, is
case-sensitive. Entering an empty line repeats the previous command.
The most frequently used pdb commands are the
following.
! statement
Executes
Python statement statement in the
currently debugged context.
alias [ name [ command ] ]
alias without arguments lists currently defined
aliases. alias name
outputs the current definition of the alias
name. In the full form,
command is any pdb
command, with arguments, and may contain %1,
%2, and so on to refer to arguments passed to the
new alias name being defined, or
%* to refer to all such arguments together.
Command unalias name
removes an alias.
args
Lists all actual arguments passed to the function you are currently
debugging.
break [ location [ ,condition ] ]
break without arguments lists currently defined
breakpoints and the number of times each breakpoint has triggered.
With an argument, break sets a breakpoint at the
given location.
location can be a line number or a
function name, optionally preceded by
filename: to set a
breakpoint in a file that is not the current one or at the start of a
function whose name is ambiguous (i.e., a function that exists in
more than one file). When condition is
present, condition is an expression to
evaluate (in the debugged context) each time the given line or
function is about to execute; execution breaks only when the
expression returns a true value. When setting a new breakpoint,
break returns a breakpoint number, which you can
then use to refer to the new breakpoint in any other
breakpoint-related pdb command.
clear [ breakpoint-numbers ]
Clears (removes) one or more breakpoints. clear
without arguments removes all breakpoints after asking for
confirmation. To deactivate a breakpoint without removing it, see
disable.
condition breakpoint-number [ expression ]
condition n
expression sets or changes the condition
on breakpoint n.
condition n, without
expression, makes breakpoint
n unconditional.
continue
Continues execution of the code being debugged, up to a breakpoint if
any.
disable [ breakpoint-numbers ]
Disables one or more breakpoints. disable without
arguments disables all breakpoints (after asking for confirmation).
This differs from clear in that the debugger
remembers the breakpoint, and you can reactivate it via
enable.
down
Moves one frame down in the stack (i.e., toward the most recent
function call). Normally, the current position in the stack is at the
bottom (i.e., at the function that was called most recently and is
now being debugged). Therefore, command down
can't go further down. However, command
down is useful if you have previously executed
command up, which moves the current position
upward.
enable [ breakpoint-numbers ]
Enables one or more breakpoints. enable without
arguments enables all breakpoints after asking for confirmation.
ignore breakpoint-number [ count ]
Sets the breakpoint's ignore count (to
0, if count is
omitted). Triggering a breakpoint whose ignore count is greater than
0 just decrements the count. Execution stops,
presenting you with an interactive pdb prompt,
only when you trigger a breakpoint whose ignore count is
0. For example, say that module
fob.py contains the following code:
    def f( ):
        for i in range(1000):
            g(i)

    def g(i):
        pass
Now, consider the following interactive pdb
session:
    >>> import pdb
    >>> import fob
    >>> pdb.run('fob.f( )')
    > <string>(0)?( )
    (Pdb) break fob.g
    Breakpoint 1 at C:\mydir\fob.py:6
    (Pdb) ignore 1 500
    Will ignore next 500 crossings of breakpoint 1.
    (Pdb) continue
    > <string>(1)?( )
    (Pdb) continue
    > C:\mydir\fob.py(6)g( )
    -> pass
    (Pdb) print i
    500
The ignore command, as pdb
shows in response to it, asks pdb to ignore the
next 500 hits on breakpoint 1,
which we just set at fob.g in the previous
break statement. Therefore, when execution finally
stops, function g has already been called
500 times, as we show by printing its argument
i, which indeed is now 500.
Note that the ignore count of breakpoint 1 is now
0; if we give another continue
and print i,
i will then show as 501. In
other words, once the ignore count is decremented back to
0, execution stops every time the breakpoint is
hit. If we want to skip some more hits, we need to give
pdb another ignore command, in
order to set the ignore count of breakpoint 1 at
some value greater than 0 yet again.
list [ first [ , last ] ]
list without arguments lists 11
lines centered on the current one, or the next 11
lines if the previous command was also a list. By
giving arguments to the list command, you may
explicitly specify the first and last lines to list within the
current file. The list command deals with physical
lines, including comments and empty lines, not with logical lines.
next
Executes the current line, without stepping into any function called
from the current line. However, hitting breakpoints in functions
called directly or indirectly from the current line does stop
execution.
p expression
Evaluates expression in the current
context and displays the result.
quit
Immediately terminates both pdb and the program
being debugged.
return
Executes the rest of the current function, stopping only at
breakpoints if any.
step
Executes the current line, stepping into any function called from the
current line.
tbreak [ location [ ,condition ] ]
Like break, but the breakpoint is temporary (i.e.,
pdb automatically removes the breakpoint as soon
as the breakpoint is triggered).
up
Moves one frame up in the stack (i.e., away from the most recent
function call and toward the calling function).
where
Shows the stack of frames and indicates the current one (i.e., the
frame in whose context command !
executes statements, command args shows arguments,
command p evaluates expressions,
etc.).
17.2.4 Debugging in IDLE
IDLE, the Interactive DeveLopment
Environment that comes with Python, offers debugging functionality
similar to that of pdb, although not quite as
powerful. Thanks to IDLE's GUI, however, you may
find the functionality easier to access. For example, instead of
having to ask for source lists and stack lists explicitly with such
pdb commands as list and
where, you just activate one or more of four
checkboxes in the Debug Control window to see source, stack, locals,
and globals always displayed in the same window at each step.
To start IDLE's interactive debugger, use menu Debug →
Debugger in IDLE's *Python Shell*
window. IDLE opens the Debug Control window, outputs
[DEBUG ON] in the shell window,
and gives you another >>> prompt in the
shell window. Keep using the shell window as you normally
would; any command you give at the shell
window's prompt now runs under the debugger. To
deactivate the debugger, use Debug → Debugger again.
IDLE then toggles the debug state, closes the Debug Control window,
and outputs [DEBUG OFF] in the
shell window. To control the debugger when the debugger is active,
use the GUI controls in the Debug Control window. You can toggle the
debugger away only when it is not busy actively tracking code:
otherwise, IDLE disables the Quit button in the Debug Control window.