Issue: Special.2004
Benchmarking PHP with no BS
Understanding the Performance Characteristics of PHP
by John Lim
This article focuses on a rather difficult topic - understanding the performance characteristics of PHP. It is difficult, not because PHP is complex, but because PHP's performance depends on many outside factors such as software versions, available hardware, operating system settings, and network configuration.
Writing high performance PHP code involves many factors. Some of the more important ones include:
PHP is a scripting language designed to favour ease of use over performance. This, however, does not mean that PHP is slow. Much of PHP's important functionality is written as high-speed C extensions. As you will see in the benchmarks, the fastest PHP code makes use of extension functionality as much as possible.
Further, due to PHP's dynamic nature, PHP scripts are recompiled every time they are invoked. The most important optimisation you can perform is to avoid this recompilation by installing an opcode cache such as Zend Accelerator or Turck MMCache. You will typically get a 250% or more increase in performance for short-lived PHP scripts.
An understanding of how PHP internally stores variables is also helpful in analysing the benchmarks. PHP variables are stored in a C data structure called a zval (described in zend.h). For integers and floats, the numeric value is stored in the zval itself. For arrays, strings and objects, additional storage has to be allocated outside of the zval. This means that benchmarks involving numeric variables should run faster than ones with string, object, or array variables.
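To illustrate the point about extension functionality, here is a minimal sketch that times a hand-written PHP loop against the built-in array_sum(). The helper names sum_userland() and time_sum() and the array size are my own assumptions for illustration; they are not part of the benchmark suite, and the timings will vary with your PHP version and hardware.

```php
<?php
// Illustrative only: sum_userland() and time_sum() are hypothetical helpers,
// not functions from the benchmark suite.

// Sums the array with a loop written in PHP.
function sum_userland(array $values)
{
    $total = 0;
    foreach ($values as $v) {
        $total += $v;
    }
    return $total;
}

// Calls $fn on $values $iterations times.
// Returns array(total seconds, last result).
function time_sum($fn, array $values, $iterations)
{
    $result = null;
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $result = call_user_func($fn, $values);
    }
    return array(microtime(true) - $start, $result);
}

$data = range(1, 1000);

list($tUser, $rUser) = time_sum('sum_userland', $data, 1000);
list($tExt, $rExt)   = time_sum('array_sum', $data, 1000); // implemented in C

printf("userland loop: %.4f s (result %d)\n", $tUser, $rUser);
printf("array_sum():   %.4f s (result %d)\n", $tExt, $rExt);
```

Both calls return the same total, so any difference in the timings comes from where the loop runs, not from what it computes.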
To install the suite, unpack the archive into a directory accessible by your Web server. Then open index.php in a Web browser. You will see a list of tests to run. Select one of the tests, and the benchmark results will be displayed.
The benchmark results follow a standard format, as shown in the following table. Each test file contains two benchmark functions: bench1() and bench2(). Both functions are run 1000 times, and the total execution time for each function is shown in seconds. In all benchmarks, faster is better.
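The timing scheme described above can be sketched roughly as follows. This is not the suite's actual harness: run_benchmark() is a hypothetical helper, and the two bench functions are placeholders I made up so that the script runs on its own.

```php
<?php
// Sketch of a timing loop in the spirit of the suite.
// run_benchmark() is a hypothetical name, not the code in bench.inc.php.

// Placeholder benchmark functions; each returns a value that can be checked.
function bench1()
{
    return strlen(str_repeat('a', 100));
}

function bench2()
{
    return count(array_fill(0, 100, 'a'));
}

// Runs $fn $iterations times. Returns array(total seconds, last return value).
function run_benchmark($fn, $iterations = 1000)
{
    $result = null;
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $result = $fn();
    }
    return array(microtime(true) - $start, $result);
}

foreach (array('bench1', 'bench2') as $name) {
    list($seconds, $value) = run_benchmark($name);
    printf("%s(): %.6f s for 1000 runs (returned %s)\n",
        $name, $seconds, var_export($value, true));
}
```

Printing the return value alongside the time makes it easy to spot a benchmark function that is fast only because it did not do its work.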
All tests were done on 3 machines.
These tests can only be run on PHP5b4, and are meant to test exception overhead. By default, the total time in seconds for 1000 executions is displayed.
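As a rough idea of what an exception-overhead test can look like, the sketch below compares signalling a failure with an exception against signalling it with a return value. The function names are hypothetical and the code is my own illustration, not one of the suite's test files; it needs PHP5 or later.

```php
<?php
// Illustrative only: these functions are hypothetical, not taken from the suite.

function fail_with_exception()
{
    throw new Exception('failed');
}

function fail_with_return()
{
    return false;
}

// Error handling through try/catch.
function handle_exception_style()
{
    try {
        fail_with_exception();
        return 'no error';
    } catch (Exception $e) {
        return 'caught';
    }
}

// Error handling through a checked return value.
function handle_return_style()
{
    if (fail_with_return() === false) {
        return 'caught';
    }
    return 'no error';
}

foreach (array('handle_exception_style', 'handle_return_style') as $name) {
    $result = null;
    $start = microtime(true);
    for ($i = 0; $i < 1000; $i++) {
        $result = $name();
    }
    printf("%s(): %.6f s, returned %s\n", $name, microtime(true) - $start, $result);
}
```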
Software programs spend most of their time in loops, so loops need to be highly optimised. In PHP, I believe that the most common looping operation is to perform some operation on each element of an array. Since PHP provides several techniques for doing this, it makes choosing the optimal method extremely confusing. In this test suite, a 50 element array $ARR, indexed from 0 to 49, is processed using different methods. In the first set of tests, we process $ARR when the array contains integers:
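To make the comparison concrete, here is a minimal sketch of three loop styles applied to a 50 element array like $ARR. The process_element() body and the loop_* function names are assumptions of mine, as is the use of count() to initialise the counters; the suite's own test code may differ.

```php
<?php
// $ARR mirrors the 50 element array indexed 0..49 described in the text.
// process_element() and the loop_* names are hypothetical stand-ins.
$ARR = range(0, 49);

function process_element($value)
{
    return $value + 1;
}

// foreach: no index bookkeeping at all.
function loop_foreach(array $array)
{
    $sum = 0;
    foreach ($array as $value) {
        $sum += process_element($value);
    }
    return $sum;
}

// Forward for loop with the element count hoisted out of the condition.
function loop_for_forward(array $array)
{
    $sum = 0;
    $n = count($array);
    for ($i = 0; $i < $n; $i++) {
        $sum += process_element($array[$i]);
    }
    return $sum;
}

// Reverse for loop: the decrement and the test against zero share one expression.
function loop_for_reverse(array $array)
{
    $sum = 0;
    for ($i = count($array); --$i >= 0; ) {
        $sum += process_element($array[$i]);
    }
    return $sum;
}

foreach (array('loop_foreach', 'loop_for_forward', 'loop_for_reverse') as $name) {
    $result = null;
    $start = microtime(true);
    for ($run = 0; $run < 1000; $run++) {
        $result = $name($ARR);
    }
    printf("%-18s %.6f s (sum %d)\n", $name, microtime(true) - $start, $result);
}
```

All three functions return the same sum, which confirms that the reverse loop visits every element even though it walks the array backwards.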
; --$i>=0 ; ) process_element($array[$i]);
Most programming languages are tuned for very fast evaluation of zero and non-zero values, and the looping condition here (--$i>=0) takes advantage of this fact. This technique is particularly useful when the processing order of the elements is not important, for example when you are calculating the average of an array of numbers.
Some programmers might find the above construct ugly. In fact, there is nothing unusual about this technique; it is a standard pattern for high performance loops among C programmers.
The foreach loop has pretty good performance too. It came second for string processing and is reasonably fast when handling numbers. In general, I would stick to using foreach because I feel code clarity is more important, with judicious use of the faster for loop technique (; --$i >= 0;) only when speed is really essential.
Regular Expression Suite
The following figures are taken from PHP4. PHP5b4 results are similar:
This is a mixed bag of tests. The results are taken from PHP4. We will highlight results that differ in PHP5b4. The first test measures whether accessing constants or global variables is faster. In PHP 4.2.3, globals were faster. In PHP 4.3.3 and PHP5b4, the reverse is true.
This tests various parameter-passing techniques. The next two results test passing parameters by value and by reference when calling a function. In PHP4, for both arrays and objects, passing by reference is substantially faster.
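The two calling styles being compared look roughly like this. The function names and the 100 element array are assumptions for illustration, and the relative timings depend heavily on the PHP version, so treat any numbers you get as specific to your own setup.

```php
<?php
// Illustrative only: total_by_value() and total_by_reference() are
// hypothetical helpers, not functions from the suite.

// Receives the array by value.
function total_by_value(array $items)
{
    return array_sum($items);
}

// Receives the array by reference (note the & in the signature).
function total_by_reference(array &$items)
{
    return array_sum($items);
}

$data = range(1, 100);

$byValue = null;
$start = microtime(true);
for ($i = 0; $i < 1000; $i++) {
    $byValue = total_by_value($data);
}
$valueSeconds = microtime(true) - $start;

$byReference = null;
$start = microtime(true);
for ($i = 0; $i < 1000; $i++) {
    $byReference = total_by_reference($data);
}
$referenceSeconds = microtime(true) - $start;

printf("by value:     %.6f s (total %d)\n", $valueSeconds, $byValue);
printf("by reference: %.6f s (total %d)\n", $referenceSeconds, $byReference);
```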
{ var myarray = array(); }
There are several other benchmarks in this section. It appears to make no difference whether we pass strings by reference or by value. We also tested long function names and short function names. Calling functions with short function names was again faster.
There are two common ways to return multiple values from a function. You could pass reference parameters into the function and update those parameters, or return using an array:
return $array;
It seems using an array is faster.
XML Suite
These tests measure the time taken to parse an RSS newsfeed for all title tags. I tested using regular expressions, explode with substr, DOM, XPath, and SAX. I was surprised to find that the following regular expression gave the best performance:
preg_match_all( '/<title>([^<]*)/', $XML, $titles_array)
SAX came last in performance. This was another surprise, as people have always said that SAX gives better performance than DOM, because DOM recreates the XML structure in memory, while SAX does not.
Regular expressions are fastest because they have no knowledge of XML. This means that tag validation is neither required nor performed. The other surprise, that DOM is faster than SAX, can be explained like this: a large percentage of the time in SAX processing is spent in callbacks to slower PHP code. In contrast, DOM generates the XML structures in very fast C code with no PHP callbacks required. So the conventional wisdom that SAX is faster than DOM is true when everything is written in C, but not in hybrid environments such as PHP with C extensions.
Following are the Windows results for PHP4:
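For reference, the regular-expression approach described in the XML Suite can be sketched as follows. The sample feed is invented for illustration and is not the newsfeed used in the benchmark; only the preg_match_all() call is taken from the article.

```php
<?php
// Invented sample feed, for illustration only.
$XML = <<<XML
<rss version="2.0">
<channel>
<title>Example Feed</title>
<item><title>First post</title></item>
<item><title>Second post</title></item>
</channel>
</rss>
XML;

// Capture everything after each <title> tag up to the next '<'.
// No XML validation takes place; the pattern only looks at raw text.
$matched = preg_match_all('/<title>([^<]*)/', $XML, $titles_array);

// $titles_array[0] holds the full matches, $titles_array[1] the captured titles.
foreach ($titles_array[1] as $title) {
    echo $title, "\n";
}
```

Because the pattern knows nothing about XML, it would also pick up title tags inside comments or CDATA sections, which is the price paid for skipping validation.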
This suite was recently added to measure various algorithms. One common task is encoding binary data for XML, or data storage. The naïve way is to use rawurlencode(). It turns out that it's dead slow. base64_encode() appears to be the fastest.
PHP has multiple functions for generating hashes. This is useful for generating unique passwords, or generating a security checksum for verification purposes. We compared crc32 and md5. crc32 was faster, though md5 is known to generate more unique hashes.
And lastly, we test storing configuration parameters in an INI file or in a PHP file as an associative array. We found that storing in an INI file and using parse_ini_file() is faster than parsing a PHP file.
Benchmark Suite System Design
The BS System is designed as a set of independent test suites. Each suite is stored in its own sub-directory and each test is a separate PHP script file. You don't need to perform any special setup to add new tests. Create a new sub-directory, and the BS System will auto-detect the new test suite. Any .php script files found in a suite directory will be executed in alphabetical order.
Each script file should have one or two benchmark functions defined, bench1() and bench2(). Each function should return a value that can be used to verify that the function actually worked. There is also some required metadata embedded as a comment in the source code. This is read by DoScanDir() in bench.php. The metadata format is (in one line):
//~~ Title to display, // description of bench1(), // description of bench2(), // # iterations
Note that #iterations is optional, and will default to 1000 if not defined. Check out the sample test script in Listing 1.
Listing 1
<?php
include_once("../init.inc.php");
// METADATA THAT IS READ BY ../bench.php:
//~~ Reading CONSTANTS versus Global Variables, Constants, Globals, 1000
//================================================ INIT
define('CONSTANT',1);
$CONSTANT = 1;
//================================================ TEST CODE
function bench1()
{
    return CONSTANT+CONSTANT+CONSTANT;
}
function bench2()
{
    global $CONSTANT;
    return $CONSTANT+$CONSTANT+$CONSTANT;
}
//================================================ BENCHMARK!
include_once("../bench.inc.php");
?>
You can also cache the results by setting $CACHE=1 in config.inc.php. The results are cached in the _cache directory. You can save the contents of this directory for a permanent record.
John runs the popular php.weblogs.com Web site, and is the lead developer of the phplens app server.