URL of the article:

Issue: 04.2004
PEAR DB: Ripe for the Picking
An Overview of the sweet New Features in the 1.6.x Releases
by Daniel Convissor
Two solid months of tilling the proverbial soil, committing over 500 bits of fertilizer and squishing 90 documented bugs -- and countless undocumented ones -- has brought PEAR's most popular package to a new state of maturity. It is stable and fully tested on most of the supported DBMS's. The new release focuses on portability, allowing you to write applications that can be used with a wide variety of database back-ends.

Introduction
PEAR DB provides a common object oriented Application Programming Interface (API) for 13 of PHP's database extensions: dBase, FrontBase, InterBase/Firebird, Informix, mSQL, MS SQL Server, MySQL, MySQLi, Oracle, ODBC (tested with DB2 and MS Access), PostgreSQL, SQLite, and Sybase. Just to make sure we're all on the same page, PEAR (PHP Extension and Application Repository) is a community and the infrastructure for developing, storing, and distributing reusable PHP classes.

Stig Bakken, Tomas Cox, and Sterling Hughes established a tremendous foundation for PEAR DB. Unfortunately, none of the developers had more than a couple of the Database Management Systems (DBMSs) installed, therefore serious cross-platform testing couldn't happen. I got involved with DB in order to make interoperability a reality. So I installed eight DBMSs to complement the two I was already running and got down to business. The ensuing changes were so dramatic and the existing code base was already given release candidate status, that it seemed prudent to skip over a stable 1.5.0 release and head directly into 1.6.0 release cycle.

The Importance of Portability
Some people wonder why database portability is helpful. While changing the database server behind an existing application is rare, it's not unheard of. Some reasons for migrating between DBMSs include new licensing terms, price increases, and the need for different features as your demands evolve.

While you probably know that the most popular data store for PHP systems is MySQL, do you realize that MySQL 4.1 relies on a different API? Accessing the modernized API via PHP requires using the recently released mysqli extension, which has a different syntax. So, eventually all MySQL-based scripts will need to be updated as the old API is phased out. Programming with a database portability layer now will save you the headache of porting everything later.
The most important reason to me for using database abstraction is the ability to broadly distribute my applications because I don't need to worry about which software end-users have on their machines.

Along these same lines, some DBMS makers require licensing fees for programs which can only operate with their software. If your application runs on an assortment of back-ends, you're not liable to pay for such licenses. Using a database abstraction layer alone doesn't get you over this hurdle. You also need to ensure your table structures and SQL statements actually work on multiple platforms.

New Portability Modes
Each DBMS has it's own behaviors. For example, some databases capitalize field names in their output, some lowercase them, while others leave them alone. These quirks make it difficult to port your scripts over to another server type. PEAR DB overcomes most of these differences so programs (well, at lease ones that are well thought out) can switch between DBMSs without any changes.

DB's portability features used to be handled by the optimize configuration option. The effect of setting optimize to portability was limited and inconsistent, so optimize has been deprecated and replaced with an option called portability. The portability system is comprised of eight modes. Each mode is represented by a bitwise constant, so each mode can be easily enabled (via |) and disabled (via ^).

Configuration options, including portability modes, are controlled by using the connect() and setOption() methods. Following is an example of utilizing setOption() to invoke portability for lowercasing and trimming:

<?php
// Once you have a valid DB object named $db...
$db->setOption('portability',
DB_PORTABILITY_ALL ^ DB_PORTABILITY_RTRIM);
?>


To activate all portability options except trimming by way of setOption():

<?php
// Once you have a valid DB object named $db...
$db->setOption('portability',
DB_PORTABILITY_LOWERCASE |
DB_PORTABILITY_RTRIM);
?>


Most of the portability mode descriptions are accompanied by sample code. The scripts all refer to a file named connect.inc (see Fig. 1). This file includes DB.php from the include_path and instantiates a DB object for each database type the examples use. The examples then use variable variables to reference the appropriate object for the test in question.


Fig. 1: Instantiating the DB Objects Used in the Examples

DB_PORTABILITY_LOWERCASE
InterBase and Oracle converts the names of tables and columns to upper case, PostgreSQL makes them lower case, while the remainder of the DBMSs leave the case alone. This causes havoc when fetching rows into associative arrays due to PHP's variable name case sensitivity: $row[`NAME'], $row[`name'] and $row[`Name'] are all different. Referring to an element via the wrong case produces an undefined index notice. The DB_PORTABILITY_LOWERCASE mode resolves this issue by converting the names of tables and fields to lower case when using the get*(), fetch*(), and tableInfo() methods (see Fig. 2).


Fig. 2: Converting Array Keys to Lower Case so Scripts Always Find the Data

DB_PORTABILITY_RTRIM
Fixed width character columns right pad the unused portions with spaces when storing the data. Most databases return that padding in query results, though MySQL and SQLite never do, while MS SQL and Sybase do sometimes. This makes string comparisons impossible on a cross-platform basis. Right trimming the strings output by get*() and fetch*() eliminates this headache (see Fig. 3).


Fig. 3: Trimming Extra Right Hand Space ensures String Comparisons Work

DB_PORTABILITY_DELETE_COUNT
Some systems don't count the number of rows deleted when performing simple DELETE FROM queries. The Delete Count portability mode tricks such DBMSs into returning the deleted row count by appending WHERE 1=1 to these DELETE queries (see Fig. 4). This issue applies to FrontBase, SQLite and MySQL (at or below version 3.23).


Fig. 4: The Delete Count Hack Comes In Handy

DB_PORTABILITY_NUMROWS
Oracle's API doesn't have a means of determining the number of rows in a result set. Therefore, DBs numRows() method won't work for Oracle. DB includes a hack to figure out the number of rows by running a second query containing the original query. This workaround is disabled for performance considerations unless DB_PORTABILITY_NUMROWS is turned on.

DB_PORTABILITY_NULL_TO_EMPTY
Another charming Oracle feature is the inability to distinguish between NULL and an empty string. Saving an empty string to a column results in it being converted to NULL. Query results of such data from Oracle naturally contain NULL as well. But, all other database systems properly represent the data as an empty string. This leads to problems in PHP scripts if they rely on the data type being a string (see Fig. 5). If the Null to Empty portability mode is enabled, (yes, you guessed it) NULLs get converted to empty strings.


Fig. 5: Null to Empty conversion for Oracle

DB_PORTABILITY_ERRORS
Consistent error reporting is essential for any platform independent application. Setting the DB_PORTABILITY_ERRORS mode on takes care of this concern by remapping three error constants. The MySQL driver's error codes for primary and unique key conflicts are inappropriately mapped to DB_ERROR_ALREADY_EXISTS rather than the DB_ERROR_CONSTRAINT constant used in other drivers (see Fig. 6). Microsoft's ODBC implementation for its Access database program mistakenly reports no such field errors as code 07001, but that code really means too few parameters. DB's Error Mapping portability redirects that error code from DB_ERROR_MISMATCH to DB_ERROR_NOSUCHFIELD.


Fig. 6: Error Mapping Creates Consistency

DB_PORTABILITY_ALL
If you like all of these features you can turn them on using the DB_PORTABILITY_ALL constant.

DB_PORTABILITY_NONE
The default configuration has the portability system turned off. If you need to turn the system off in the middle of your scripts, use the DB_PORTABILITY_NONE constant.

Error Code Fixes
Everyone using DB with Microsoft SQL or Sybase back-ends must be careful when upgrading. Prior versions of DB's mssql and Sybase drivers inadvertently raised errors using the database's codes. Version 1.6.0 brings errorCode() into conformity with every other driver by having it return DB's error codes. An errorNative() method was added to the mssql class so users can obtain MS SQL's error code. Sybase can now return error codes each time an error raised rather than just the first time. I fixed the location of data in DB_Error objects created by the Sybase driver - native messages went to $userinfo and needlessly tried sending DB's own message to $native.

Several native error messages were mapped to existing DB error constants. Most notable is tying syntax errors to DB_ERROR_SYNTAX as well as primary and unique index violations to DB_ERROR_CONSTRAINT. The updated error reporting brings much needed consistency between the ibase, ifx, mssql, oci8, odbc(db2), pgsql, sqlite, and sybase drivers. This uniformity extends to mysql, mysqli, and odbc(access) for the most errors, and reaches all errors when error code portability mode is turned on.
The new DB_ERROR_CONSTRAINT_NOT_NULL constant handles null values in NOT NULL columns.

tableInfo()
The protracted transition of tableInfo() from the DB_result class (which represents query results) to the DB_common class (which is the base class for all of the database specific classes) has been completed. The old configuration had users looking for information on tables in a DB_common object while examining query results via a DB_result object. Everything is now accessed via the method in DB_common. Explore tables by passing a table name:

<?php
// Once you have a valid DB object named $db...
$info = $db->tableInfo('tbl');
print_r($info);
?>


Probe query results by passing a DB_result object:

<?php
// Once you have a valid DB object named $db...
$res =& $db->query('SELECT * FROM tbl');
$info = $db->tableInfo($res);
print_r($info);
?>


Note that most of PHP's database drivers aren't capable of determining table names from queries.
Sybase and Informix now have tableInfo() methods. Efforts were also made in each driver to make the method's output as similar as possible. Column and table names are lowercased if the DB_PORTABILITY_LOWERCASE portability mode is turned on. Metadata in the flags element is consistently space separated and the DEFAULT value of a column, when it can be determined, is passed through rawurlencode() to avoid problems caused by potential spaces in the value.

MySQLi
Users may have noticed that a mysqli class was added in the 1.6.0RC2 release. It didn't work because the code was essentially copied from the mysql driver, which has a different API. Fortunately, PHP's mysqli extension in the snapshots is now working, so I took the opportunity to rework DB_mysqli and test it out. I'm happy to report that DB's MySQLi support is fully functional as of the 1.6.3 release. It passes DB's test suite but is still marked as experimental because we still need to implement the native API calls for prepare/execute and commit/revert, let alone the driver hasn't received exposure to a wide user base yet. In case it matters, my PHP snapshot is from April 28, 2004 and MySQL is at version 4.1.1a-alpha.

Prepare / Execute
Significant revisions have been made to the various prepare() and execute() methods in order to make them more robust and behave the same way despite which DBMS is being used. Adjustments to placeholders include the ability to use any of them (?, !, and &) in each driver, the capacity to escape them for use in literals and the choice of not using them at all. Parameters passed to execute() can now be scalars and associative arrays in addition to the original behavior of accepting ordered arrays. Two changes were made when execute() is given file names. First, fopen($file, 'rb') is used to ensure we get binary data properly on Windows platforms. Second, bad file name now raise an DB_ERROR_ACCESS_VIOLATION error.

Quoting
Prior to 1.6.0, DB provided the quote() method for sanitizing input to be used as literals in queries. This method's behavior varied widely from database to database. MySQL, ODBC, and PostgreSQL had complete data type handling, (for example, strings were quoted, integers were not). The remaining drivers just quoted anything that came in, except NULLs, which were turned into the string NULL.

To instil uniformity, quote() was deprecated and two new methods were added. escapeSimple() escapes a string according to the current DBMS's standards. quoteSmart() escapes the data using escapeSimple() and, if the input is a string, encapsulates the string between single quotes.

Notable Bug Fixes
Following are the major bug fixes in the latest release:

(a) All Drivers - free() now works, eliminating some memory leaks. All methods that return by reference always return a variable in order to avoid memory corruption in PHP 4. Implemented loads of small code optimizations found by Zend Studio's Code Analyzer.

(b) Informix - fetchRow() now adds 1 to the requested $rownum because the index for ifx_fetch_row() starts at 1. Zero is returned by affectedRows() in response to SELECT queries, just like the other DBMS's.

(c) Oracle - Several improvements have been made to limitQuery() so it works (what a concept), accepts parameters for use in prepare/execute mode and handles delimited identifiers. numRows() now functions when called after execute(). Removed ocifreestatement() from freePrepared() to allow examination of result objects returned when using prepare/execute mode.

(d) ODBC - DB's normal DSN syntax can now be used, eliminating the need for the workaround of placing ODBC DSNs in the hostspec rather than the database segments.

(e) Sybase - Empty passwords are now permissible. Sequences have been implemented. Numbered keys are dropped from arrays when using DB_FETCHMODE_ASSOC.

Test Suite
There is an expanded range of tests, covering more options and additional methods. Adjustments ensure the tests run regardless of the operating system or PHP version. The exams work the same way when checked out from CVS or installed via pear install DB.
Each test creates and drops all needed tables, even when tests fail. New shell scripts simplify test execution. The pear/DB/doc/TESTERS file contains complete step-by-step instructions for installing and running the test rig.

Documentation
PEAR DB's documentation was already pretty good. Now it's great. Data types of parameters and returns are accurate. Additional examples ease the learning process. New features are covered. French and Japanese translations are in place, though they need minor updates. Anyone interested in providing more languages is encouraged to contact the 'pear-doc' mailing list.

Plans for the Next Crop
There are two immediate goals for DB development. First is getting dBase, FrontBase, Informix, and mSQL running on my system in order to ensure DB's portability on those platforms. Second is refactoring the code base to improve performance.

Discussions have begun about potentially merging DB and MDB2 to create the next major version of PEAR's database interface. It is hard to say now where this will end up, but it certainly will lead to interesting improvements.

About the Author
Daniel Convissor runs a consulting firm in New York City, The Analysis and Solutions Company. He is PEAR DB's lead maintainer, a frequent presenter for the NYPHP user group and the author of several popular tutorials on topics including PHP, XML, and databases.

Links und Literature

© 2004 Software & Support Verlag GmbH. Reproduction has to be permitted by the publisher. Questions?