The Apache Software Foundation has released a new version of its HTTP server for the first time in more than six years. So…is this a big deal? Yes, it’s a very big deal. Apache’s HTTP server powers more than 65% of the active websites on the Internet, with a majority of those sites living in either a shared or managed hosting environment.
Apache is a standard installed item on most Linux servers. Mac OS X comes installed with Apache as well. Because it is open-source, many branches and variations of Apache are out there in the ecosystem, from small embedded systems to large parallel computing initiatives.
Apache 2.4 comes with several new and improved features. It looks to me as though Apache is working hard to stem the flow of people migrating over to nginx (pronounced engine-X) by improving support for asynchronous operations. A server that supports asynchronous operation is more scalable, and Apache has been criticized in the past for its noticeable lack of scalability compared to newcomers like nginx and lighttpd. The latest trend in server infrastructures is to run Apache as your “process server” and use nginx to serve static content. Developers and system administrators don’t normally want to manage an additional server process, so we welcome any solution to help prevent that.
One of the barriers to Apache running with asynchronous I/O was its wonky multithreading support. PHP has specifically warned users against running on an Apache server with a threaded MPM (Multi-Processing Module) enabled. Not exactly a ringing endorsement. The new baked-in MPM support will hopefully remedy those issues.
Another improvement in Apache 2.4 is a modest reduction in memory usage. Apache 2.2 wasn’t really known as a memory hog, but it’s nice to see that the dev team took steps to optimize their codebase. Normally, when versioned systems are released (especially systems with large developer teams), subsequent releases tend to take up more memory.
There are several new modules baked into the new Apache. I will focus on only a few here, but I would like to point out that Apache is obviously focusing on scalability issues with this release. The mod_lbmethod_heartbeat modules provide a more responsive feedback mechanism to load balancers so that they can more efficiently handle incoming traffic.
The new session module is a welcome addition to Apache. Currently, middleware tiers like PHP, Python, and Ruby have to implement their own session-generation mechanisms. Although I don’t expect this to change, it’s good that this option is now available so that future platforms can take advantage of it. There is still the issue of sharing sessions across multiple servers, something that is not unique to Apache. Apache implements the same mitigation as other server setups by storing session data in a cookie. There are security implications here, but cookies can be easily encrypted via the
Typically, when a user logs into a website, the application code has to run to verify the user’s identity. It then sets a cookie in the user’s browser, and subsequent visits to the website are authenticated through that cookie. From the end-user perspective, this is how it almost always works. There may be some trickery going on server-side, but the user isn’t aware of it. In previous iterations of Apache, the HTTP request would get passed from the server to the next layer.
For example, when you log into a WordPress site, the HTML form data gets parsed by the server, and the server invokes a PHP handler to process the request. The PHP then executes and returns its output back to the server, which then sends the response to your browser.
A colorful example
Charlie goes to a Swedish furniture store and purchases a bookshelf. Charlie straps the bookshelf to the roof of his car and drives home. When Charlie gets home, he gives the bookshelf to Deandra. Deandra is expecting a red bookshelf, but upon opening the box, she discovers it’s blue. She hands the box over to Charlie, and Charlie drives the bookshelf back to the Swedish furniture store.
This is analogous to a failed authorization in WordPress and most other platforms. The client (you) is the furniture store, Charlie is the server, and Deandra is the application working in the background.
In Apache 2.4, using the mod_auth_form module, Charlie could inspect the contents of the box before he even bothers Deandra, who charges an exorbitant amount of money just to take a look at the things Charlie brings home for her to build. Now, if Charlie discovers that the bookshelf is the wrong color, he could just return it.
The main advantage of this type of authentication is that it separates the actual authentication mechanism from the underlying application code, meaning that the application code is no longer a vector for intrusion (SQL and command injections, brute force attacks, etc.) during login. It also means that servers could potentially be more protected against DoS (Denial of Service) attacks by not passing every single request through to the middle tier. In my opinion, this is the best enhancement in the new Apache.
Apache has included, albeit on an experimental basis, support for the Lua programming language. Lua is syntactically similar to Python and Ruby, and seems to me to be well-suited for writing web applications.
Although there are a few web frameworks built on Lua, they all must either implement their own server daemon or run as a CGI script. Bringing Lua under the Apache umbrella means that web applications built in Lua can now take full advantage of the power of Apache.
I think the Apache Software Foundation is making great progress in improving their flagship project to meet the needs of the next generation of web applications. Considering that Apache is free (as in beer) and has a very liberal license, it has definitely more than proven itself as one of the few keystones in the Internet ecosystem. I think the changes in Apache 2.4 prove that this is an evolving project that’s eager to meet the future demands of a shifting industry.
I wanted to take Apache 2.4 for a spin, mainly to try out the new form authentication module. Here are all the gory details.
~$ sudo apt-get install gcc subversion gedit chromium-browser
The gedit and Chromium apps are just helpful for viewing and editing files, and gcc is necessary to build the software we will download. Once things are set up, open up Terminal and SSH into the machine you just created. Be sure to use the -X option, which enables X11 forwarding, so that graphical programs on the VM can open windows on your desktop.
~$ ssh -X email@example.com
This will allow you to open a text editor and web browser directly from the new VM.
This will open up a Chromium window over XTerm. Once that is open, navigate to the Apache 2.4 downloads page and grab the .tar.gz version of the source. This will download directly to your home folder. Going back to the command line:
~$ cd ~/
~$ mkdir apache24
~$ mv httpd-2.4.X.tar.gz apache24/
~$ cd apache24
~$ tar -zxvf httpd-2.4.X.tar.gz
~$ rm httpd-2.4.X.tar.gz
~$ cd httpd-2.4.X
So we made an apache24 directory in our home folder and put the Apache source in there. Now we can go into the source and grab the Apache Portable Runtime (APR) library from Subversion. This is not installed on the system by default.
~$ svn co http://svn.apache.org/repos/asf/apr/apr/trunk srclib/apr
We’re now ready to configure Apache.
~$ ./configure --with-included-apr --prefix=/home/steve/apache24
Substituting your home directory for
~$ make
~$ make install
If you’re missing any dependencies, here is where you’ll find out. Just install them, do a make clean and retry.
~$ gedit /home/steve/apache24/conf/httpd.conf
Somewhere around line 52 is the directive Listen 80. Change that to Listen 8080 (binding to ports below 1024 requires root privileges) and be sure to save the file. Back in the command line, start up Apache, then go to your new web server.
~$ ~/apache24/bin/apachectl start
~$ chromium-browser
Point your browser (the Chromium over XTerm) to http://localhost:8080. You should get the familiar It works! message.
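The Listen change described above can also be scripted instead of made by hand in gedit. This is only a sketch: set_listen_port is a hypothetical helper name, and it assumes the config file contains a bare "Listen 80" line as the stock httpd.conf does.

```shell
# Hypothetical helper: rewrite the bare "Listen 80" line in a config file
# so the server listens on another port. Commented-out Listen lines are
# left alone because the pattern is anchored at both ends.
set_listen_port() {
    conf=$1
    port=$2
    tmp=$(mktemp) || return 1
    # Write to a temporary file first, then copy back over the original
    # so the original file keeps its permissions.
    sed "s/^Listen 80\$/Listen $port/" "$conf" > "$tmp" && cat "$tmp" > "$conf"
    rm -f "$tmp"
}
```

Assuming the install prefix used in this post, it would be called as set_listen_port ~/apache24/conf/httpd.conf 8080 before starting the server.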
php-5.3.10.tar.gz. This will again download to your home folder.
~$ cd ~/
~$ mkdir php5
~$ mv php-5.3.10.tar.gz php5/
~$ cd php5
~$ tar -zxvf php-5.3.10.tar.gz
~$ rm php-5.3.10.tar.gz
~$ cd php-5.3.10
You will want to configure PHP to use the Apache Extension Tool (APXS) that was built when you built the server. Configure PHP like so:
~$ ./configure --with-apxs2=/home/steve/apache24/bin/apxs --with-mysql
~$ make
~$ make install
PHP will build and insert a LoadModule directive into your server’s configuration file. All you need to do is edit httpd.conf again by adding the following line:
AddType application/x-httpd-php .php
It doesn’t really matter where you add this directive, but I like to place it right under the newly created LoadModule directive. Now, go to your server’s document root and create an index.php file.
~$ cd ~/apache24/htdocs
~$ touch index.php
~$ gedit index.php
This will open up the gedit editor. Enter the following:
Be sure to add the opening <?php tag. WordPress removes this from blog posts. Restarting the server and going to http://localhost:8080/index.php should give you the familiar PHP info page.
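The contents of index.php are not shown above. Assuming the file is the usual one-line call to phpinfo(), which is what produces the PHP info page, a sketch of writing it from the shell could look like this. The write_phpinfo name is made up for illustration.

```shell
# Hypothetical helper: write a minimal index.php into the directory given
# as the first argument. The file content is an assumption (a plain
# phpinfo() call), not the author's original snippet.
write_phpinfo() {
    # The quoted 'EOF' stops the shell from expanding anything in the PHP.
    cat > "$1/index.php" <<'EOF'
<?php
phpinfo();
EOF
}
```

With the layout used in this post, the call would be write_phpinfo ~/apache24/htdocs.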
<form method="POST" action="doLogin.php">
  <h2>LOGIN!</h2>
  <input type="text" name="username" />
  <input type="password" name="password" />
  <input type="submit" value="submit" />
</form>
The mod_auth_form module requires the enabling of the mod_session module, which makes sense, because mod_session will create a session cookie and handle it for you.
For mod_auth_form to work properly, you should combine it with an authentication module and an authorization module. Since there are multiple types of each module, administrators have some freedom in choosing which scheme to employ when handling user logins.
To authenticate a user means that the system recognizes the supplied credentials and verifies that they do indeed belong to a valid user. Authorizing a user means that a system will determine what permissions the user has once they are authenticated. Authorization is generally used to separate administrators from other registered users in a protected space.
With all of this in mind, I decided to authenticate users via the mod_authn_file module. This module uses the same authentication mechanism that is resident on any Unix system: a flat file with a username and hashed password. This does not lend itself well to a dynamic website, where people will be creating new user accounts on the fly, since the password file will need to be manually updated every time an account is created. But for this exercise, it will do just fine.
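A flat password file of this kind is normally managed with the htpasswd tool that is built alongside the server. The following is a sketch only: the function names and the APACHE_PREFIX and PASSWD_FILE variables are hypothetical, and the default paths assume the home-directory install used in this post.

```shell
# Assumed locations; override them in the environment if yours differ.
APACHE_PREFIX="${APACHE_PREFIX:-$HOME/apache24}"
PASSWD_FILE="${PASSWD_FILE:-$HOME/apachePasswd}"

# Create the password file with a first user. htpasswd prompts for the
# password; -c creates the file and overwrites any existing one.
create_passwd_file() {
    "$APACHE_PREFIX/bin/htpasswd" -c "$PASSWD_FILE" "$1"
}

# Add another user. No -c here, so existing entries are kept.
add_user() {
    "$APACHE_PREFIX/bin/htpasswd" "$PASSWD_FILE" "$1"
}

# Succeed if the named user already has an entry (lines look like "name:hash").
has_user() {
    grep -q "^$1:" "$PASSWD_FILE"
}
```

For example, create_passwd_file steve would set up the single account used in this test, and has_user steve can confirm that the entry was written.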
I decided to forgo the authorization in this case, since my test is aimed at the authentication portion of the login process.
I edited my Apache config file (httpd.conf) by adding the following to the end:
<Location /doLogin.php>
    SetHandler form-login-handler
    AuthFormLoginRequiredLocation http://localhost:8080/login.php
    AuthFormLoginSuccessLocation http://localhost:8080/success.html
    AuthFormProvider file
    AuthUserFile /home/steve/apachePasswd
    AuthType form
    AuthName Apache_Test
    Session On
    SessionCookieName session path=/
    SessionCryptoPassphrase MyPassPhrase
</Location>
The SetHandler directive tells Apache which internal module to invoke when requests are made to the page doLogin.php. Note that at no time did I ever create a page called doLogin.php. You could call this file anything, as long as you specify it as the action in the form that submits to the server.
The next two directives are unique to mod_auth_form. AuthFormLoginRequiredLocation tells Apache which page to redirect the browser to when a login is required, for example after a failed attempt. It is not a check on where the request came from. The AuthFormLoginSuccessLocation directive tells Apache where to redirect the user after a successful login. Note that if the user disables cookies, the login will not succeed, even if it is correct.
The AuthFormProvider directive tells Apache how the authentication will take place (a file in this case), and the AuthUserFile directive gives the location of the file.
AuthType tells Apache that we are using the new form-based authentication. AuthName is just the name supplied in the header when the server issues the auth challenge. Think of this as a sign on a locked door labeled “Private”.
The next three directives pertain to mod_session. The first directive simply tells Apache that we are indeed using the mod_session module. The second directive names the session cookie and sets its scope (e.g., where within the website this cookie is valid), and the third directive is the passphrase that is used to encrypt and decrypt the cookie data.
mod_auth_form was simple enough. Invalid logins were pointed to the HTTP 401 (unauthorized) status handler. This is default Apache behavior. Successful logins were redirected to success.html, which is just a blank web page. In this exercise, success.html is not protected, but in the real world, you would point your Require user directives at whatever content you wanted to protect.
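As a rough illustration of that last point, the sketch below writes a config fragment that protects a hypothetical /private location with the same form-login settings used in this test. The /private path, the write_protected_conf name, and the output file name are all assumptions, not part of the setup described above.

```shell
# Hypothetical helper: write a config fragment to the file given as the
# first argument. It reuses the auth and session directives from the login
# handler and adds a Require line so that only user "steve" gets through.
write_protected_conf() {
    cat > "$1" <<'EOF'
<Location /private>
    AuthType form
    AuthName Apache_Test
    AuthFormProvider file
    AuthUserFile /home/steve/apachePasswd
    AuthFormLoginRequiredLocation http://localhost:8080/login.php
    Session On
    SessionCookieName session path=/
    SessionCryptoPassphrase MyPassPhrase
    Require user steve
</Location>
EOF
}
```

One way to use it would be write_protected_conf ~/apache24/conf/extra/protected.conf, followed by adding an Include conf/extra/protected.conf line to httpd.conf and restarting the server.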
The real result I am looking for is a performance boost. To measure this, I created a simple .ini file with a plaintext password.
I wrote a line of PHP to open this file and match its contents against what is posted within the request. This operation involves significantly less overhead than connecting and authenticating to a database, which most modern websites do in order to verify user identity. From the command line, I spammed the following command:
~$ curl \
     -d 'username=steve&password=password' \
     -e http://localhost:8080/login.php \
     -j \
     -c /home/steve/cookie \
     -L \
     http://localhost:8080/doLogin.php
This curl command will emulate a form submitted from login.php. I created a shell script to send 1,000 of these requests one after another.
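The script itself is not reproduced here, so the following is only a sketch of what such a sequential benchmark might look like. The function names and the LOGIN_URL and COOKIE_JAR variables are hypothetical. The timing uses whole seconds from date, which is coarse but adequate when averaged over 1,000 requests.

```shell
#!/bin/sh
# Assumed endpoints and paths, mirroring the curl command in this post.
LOGIN_URL="${LOGIN_URL:-http://localhost:8080/doLogin.php}"
COOKIE_JAR="${COOKIE_JAR:-$HOME/cookie}"

# Send one login request and discard the response body. The form data is
# quoted so the shell does not treat "&" as a background operator.
send_login() {
    curl -s -o /dev/null \
        -d 'username=steve&password=password' \
        -e http://localhost:8080/login.php \
        -j -c "$COOKIE_JAR" -L \
        "$LOGIN_URL"
}

# Integer average: total milliseconds divided by the number of requests.
average_ms() {
    echo $(( $1 / $2 ))
}

# Send $1 requests one after another and print the average time per
# request in milliseconds.
run_benchmark() {
    count=$1
    i=0
    start=$(date +%s)
    while [ "$i" -lt "$count" ]; do
        send_login
        i=$(( i + 1 ))
    done
    end=$(date +%s)
    average_ms $(( (end - start) * 1000 )) "$count"
}
```

With the server running, run_benchmark 1000 would print the average per-request time. Running it once with mod_auth_form configured and once without gives the two figures to compare.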
I expected successful logins to take slightly longer using mod_auth_form than without it. This assumption was correct. On the VM, the requests using mod_auth_form took an average of 43 milliseconds each to complete, whereas requests not using mod_auth_form took about 32 milliseconds. So the overhead incurred by using mod_auth_form averages out to about 11 milliseconds per request.
It was with unsuccessful logins that the advantage of mod_auth_form became apparent. Requests averaged 14 milliseconds each with mod_auth_form enabled, and the same 32 milliseconds with it disabled. That’s a savings of 18 milliseconds per request, but more important, that’s a huge savings in memory usage because the PHP handler is never called, and the thread is never created on the machine to handle the PHP output. Handling a request in PHP can cost anywhere between 65 kilobytes and 2 megabytes of RAM, depending on how PHP is being run and what modules are involved. This does not take other processes into account, particularly the slow database connection involved in most logins.
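To make the arithmetic explicit, here is a small sketch that recomputes the differences from the averages reported above and scales the failed-login saving to the 1,000 requests in the test. The variable names are made up for illustration.

```shell
# Average times reported above, in milliseconds per request.
with_success=43    # successful login through mod_auth_form
without=32         # login handled by PHP alone
with_failure=14    # failed login rejected by mod_auth_form
requests=1000      # size of the test run

overhead=$(( with_success - without ))    # extra cost of a successful login
saving=$(( without - with_failure ))      # time saved on a failed login
total_saving=$(( saving * requests ))     # saving across the whole run

echo "overhead per successful login: ${overhead} ms"
echo "saving per failed login: ${saving} ms"
echo "saving over ${requests} failed logins: ${total_saving} ms"
```

Across the full run, the 18 millisecond saving adds up to 18,000 milliseconds, or 18 seconds of server time that never reaches PHP.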
In my opinion, mod_auth_form is a great benefit to application security, as it does mitigate DoS attacks. I believe that along with the new rate limiting modules, mod_auth_form will prove itself as a strong antispam tool for web admins.
Apache 2.4 is a leap forward for the 17-year-old project, and definitely worth the upgrade.