11.4 The Python Database API (DBAPI) 2.0As I mentioned earlier, the Python standard library does not come with an RDBMS interface, but there are many free third-party modules that let your Python programs access specific databases. Such modules mostly follow the Python Database API 2.0 standard, also known as the DBAPI. At the time of this writing, Python's DBAPI Special Interest Group (SIG) was busy preparing a new version of the DBAPI (possibly to be known as 3.0 when it is ready). Programs written against DBAPI 2.0 should work with minimal or no changes with the future DBAPI 3.0, although 3.0 will no doubt offer further enhancements that future programs will be able to take advantage of. If your Python program runs only on Windows, you may prefer to access databases by using Microsoft's ADO package through COM. For more information on using Python on Windows, see the book Python Programming on Win32, by Mark Hammond and Andy Robinson (O'Reilly). Since ADO and COM are platform-specific, and this book focuses on cross-platform use of Python, I do not cover ADO nor COM further in this book. After importing a DBAPI-compliant module, you call the module's connect function with suitable parameters. connect returns an instance of class Connection, which represents a connection to the database. This instance supplies commit and rollback methods to let you deal with transactions, a close method to call as soon as you're done with the database, and a cursor method that returns an instance of class Cursor. This instance supplies the methods and attributes that you'll use for all database operations. A DBAPI-compliant module also supplies exception classes, descriptive attributes, factory functions, and type-description attributes. 11.4.1 Exception ClassesA DBAPI-compliant module supplies exception classes Warning, Error, and several subclasses of Error. Warning indicates such anomalies as data truncation during insertion. Error's subclasses indicate various kinds of errors that your program can encounter when dealing with the database and the DBAPI-compliant module that interfaces to it. Generally, your code uses a statement of the form: try: ... except module.Error, err: ... in order to trap all database-related errors that you need to handle without terminating. 11.4.2 Thread SafetyWhen a DBAPI-compliant module has an attribute threadsafety that is greater than 0, the module is asserting some specific level of thread safety for database interfacing. Rather than relying on this, it's safer and more portable to ensure that a single thread has exclusive access to any given external resource, such as a database, as outlined in Chapter 14. 11.4.3 Parameter StyleA DBAPI-compliant module has an attribute paramstyle that identifies the style of markers to use as placeholders for parameters. You insert such markers in SQL statement strings that you pass to methods of Cursor instances, such as method execute, in order to use runtime-determined parameter values. Say, for example, that you need to fetch the rows of database table ATABLE where field AFIELD equals the current value of Python variable x. Assuming the cursor instance is named c, you could perform this task by using Python's string formatting operator % as follows: c.execute('SELECT * FROM ATABLE WHERE AFIELD=%r' % x) However, this is not the recommended approach. This approach generates a different statement string for each value of x, requiring such statements to be parsed and prepared anew each time. With parameter substitution, you pass to execute a single statement string, with a placeholder instead of the parameter value. This lets execute perform parsing and preparation just once, giving potentially better performance. For example, if a module's paramstyle attribute is 'qmark', you can express the above query as: c.execute('SELECT * FROM ATABLE WHERE AFIELD=?', [x]) The read-only attribute paramstyle is meant to inform your program about the way to use parameter substitution with that module. The possible values of paramstyle are:
When paramstyle does not imply named parameters, the second argument of method execute is a sequence. When parameters are named, the second argument of method execute is a dictionary. 11.4.4 Factory FunctionsParameters passed to the database via placeholders must typically be of the right type. This means Python numbers (integers or floating-point values), strings (plain or Unicode), and None to represent SQL NULL. Python has no specific types to represent dates, times, and binary large objects (BLOBs). A DBAPI-compliant module supplies factory functions to build such objects. The types used for this purpose by most DBAPI-compliant modules are those supplied by module mxDateTime, covered in Chapter 12, and strings or buffer types for BLOBs. The factory functions are as follows.
Returns an object representing the given string of bytes as a BLOB.
Returns an object representing the specified date.
Returns an object representing the date that is s seconds after the epoch of module time, covered in Chapter 12. For example, DateFromTicks(time.time( )) is today's date.
Returns an object representing the specified time.
Returns an object representing the time that is s seconds after the epoch of module time, covered in Chapter 12. For example, TimeFromTicks(time.time( )) is the current time.
Returns an object representing the specified date and time.
Returns an object representing the date and time that is s seconds after the epoch of module time, covered in Chapter 12. For example, TimestampFromTicks(time.time( )) is the current date and time. 11.4.5 Type Description AttributesA Cursor instance's attribute description describes the types and other characteristics of each column of a query. Each column's type (the second item of the tuple describing the column) equals one of the following attributes of the DBAPI-compliant module:
A cursor's description, and in particular each column's type, is mostly useful for introspection about the database your program is working with. Such introspection can help you write general modules that are able to work with databases that have different schemas, schemas that may not be fully known at the time you are writing your code. 11.4.6 The connect FunctionA DBAPI-compliant module's connect function accepts arguments that vary depending on the kind of database and the specific module involved. The DBAPI standard recommends, but does not mandate, that connect accept named arguments. In particular, connect should at least accept optional arguments with the following names:
11.4.7 Connection ObjectsA DBAPI-compliant module's connect function returns an object x that is an instance of class Connection. x supplies the following methods.
Terminates the database connection and releases all related resources. Call close as soon as you're done with the database, since keeping database connections uselessly open can be a serious resource drain on the system.
Commits the current transaction in the database. If the database does not support transactions, x.commit( ) is an innocuous no-op.
Returns a new instance of class Cursor, covered later in this section.
Rolls back the current transaction in the database. If the database does not support transactions, x.rollback( ) raises an exception. The DBAPI recommends, but does not mandate, that for databases that do not support transactions class Connection supplies no rollback method, so that x.rollback( ) raises AttributeError. You can test whether transaction support is present with hasattr(x,'rollback'). 11.4.8 Cursor ObjectsA Connection instance provides a cursor method that returns an object c that is an instance of class Cursor. A SQL cursor represents the set of results of a query and lets you work with the records in that set, in sequence, one at a time. A cursor as modeled by the DBAPI is a richer concept, since it also represents the only way in which your program executes SQL queries in the first place. On the other hand, a DBAPI cursor allows you only to advance in the sequence of results (some relational databases, but not all, also provide richer cursors that are able to go backward as well as forward), and does not support the SQL clause WHERE CURRENT OF CURSOR. These limitations of DBAPI cursors enable DBAPI-compliant modules to provide cursors even on RDBMSes that provide no real SQL cursors at all. An instance of class Cursor c supplies many attributes and methods; the most frequently used ones are documented here.
Closes the cursor and releases all related resources.
A read-only attribute that is a sequence of seven-item tuples, one per column in the last query executed: name, typecode, displaysize, internalsize, precision, scale, nullable c.description is None if the last operation on c was not a query or returned no usable description of the columns involved. A cursor's description is mostly useful for introspection about the database your program is working with. Such introspection can help you write general modules that are able to work with databases that have different schemas, including schemas that may not be fully known at the time you are writing your code.
Executes a SQL statement on the database with the given parameters. parameters is a sequence when the module's paramstyle is 'format', 'numeric', or 'qmark', and a dictionary when 'named' or 'pyformat'.
Executes a SQL statement on the database, once for each item of the given parameters. parameters is a sequence of sequences when the module's paramstyle is 'format', 'numeric', or 'qmark', and a sequence of dictionaries when 'named' or 'pyformat'. For example, the statement: c.executemany('UPDATE atable SET x=? WHERE y=?', (12,23),(23,34)) that uses a module whose paramstyle is 'qmark' is equivalent to, but probably faster than, the two statements: c.execute('UPDATE atable SET x=12 WHERE y=23') c.execute('UPDATE atable SET x=23 WHERE y=34')
Returns all remaining result rows from the last query as a sequence of tuples. Raises an exception if the last operation was not a SELECT query.
Returns up to n remaining result rows from the last query as a sequence of tuples. Raises an exception if the last operation was not a SELECT query.
Returns the next result row from the last query as a tuple. Raises an exception if the last operation was not a SELECT query.
A read-only attribute that specifies the number of rows fetched or affected by the last operation, or -1 if the module is unable to determine this value. 11.4.9 DBAPI-Compliant ModulesWhatever relational database you want to use, there's at least one (and often more than one) DBAPI-compliant module that you can download from the Internet. All modules listed in the following sections, except mxODBC, have liberal licenses that are mostly similar to Python's license (the SAP DB, however, is licensed under GPL) and that let you use them freely in either open source or closed source programs. mxODBC can be used freely for noncommercial purposes, but you must purchase a license for any commercial use. There are so many relational databases that it's impossible to list them all, but here are some of the most popular ones:
11.4.10 GadflyGadfly, available at http://gadfly.sf.net, is not an interface to some other RDBMS, but rather a complete RDBMS engine written in Python. Gadfly supports a large subset of standard SQL. For example, Gadfly lacks NULL, but it does support VIEW, which is a crucial SQL feature that engines such as MySQL still lack at the time of this writing. Gadfly can run as a daemon server, to which clients connect with TCP/IP. Alternatively, you can run the Gadfly engine directly in your application's process, if you don't need other processes to be able to access the same database concurrently. The gadfly module has several discrepancies from the DBAPI 2.0 covered in this chapter because Gadfly implements a variant of the older DBAPI 1.0. The concepts are quite close, but several details differ. The main differences are:
The gadfly module supplies the following functions.
Returns a connection object for the database named dbname, which must have been previously created in the directory indicated by string dirpath. The database engine runs in the same process as your application.
Returns a connection object for the database served by a gfserve process on the given host and port. policyname identifies the level of access required, and is often 'admin' to specify unlimited access. |