Performance tuning a CentOS LAMP web server for high traffic volumes

In August 2010 I was contracted to performance tune a LAMP server to handle approximately 70 full page loads per second, which equated to 4,250 concurrent virtual users. We ended up doubling that to 140 full page loads per second without striking any issues. Sustained for 24 hours, that rate equates to over 12 million full page loads per day. This article explains how we achieved it.

The load tests were conducted using HP Performance Center, a technology HP obtained as part of its approximately USD$4.5 billion acquisition of Mercury Interactive in 2006.

To find out more about the load testing software visit http://en.wikipedia.org/wiki/HP_LoadRunner

Goal:
Handle 4,250 concurrent users generating approximately 70 full page loads per second.

1 full page load consisted of:
– 1 dynamically generated PHP file using MySQL
– 4 JavaScript files
– 7 CSS files
– 8 image files
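
If you don't have access to a commercial load test farm, you can get a rough sanity check with ApacheBench (the ab tool that ships with the Apache httpd package). Note that ab only requests a single URL per run, so it approximates the dynamic page on its own rather than the full 20-request page load; the URL below is a placeholder for your own page:

#ab -n 1000 -c 50 http://www.example.com/index.php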

Original starting environment:
– ServerModel: Dell R300
– RAM: 2GB (2 x 1GB chips)
– Operating System: CentOS release 5.5 (Final)
– Apache: v2.2.3 (running in prefork mode)
– MySQL: v5.0.77
– PHP: v5.1.6 (as an apache module)
– eAccelerator: v0.9.5.3
– 120Mbits of bandwidth

Round 1: Initial Test

Round 1: Configuration

At the start of the process we were pretty much using the default configurations for the entire LAMP stack. Linux was running iptables and ip6tables in their default configuration. eAccelerator was operating with 32MB of memory, with optimization and caching enabled.
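
For reference, the eAccelerator settings live in a small ini file loaded by PHP (on a setup like this, typically something like /etc/php.d/eaccelerator.ini, though the exact path depends on how it was packaged). A configuration matching the description above would look roughly like:

eaccelerator.enable = "1"
eaccelerator.optimizer = "1"
eaccelerator.shm_size = "32"
eaccelerator.cache_dir = "/var/cache/eaccelerator"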

Apache (/etc/httpd/conf/httpd.conf):
For more info on variables for Apache 2.0.x go to: http://httpd.apache.org/docs/2.0/mod/mpm_common.html

<IfModule prefork.c>
StartServers       8
MinSpareServers    5
MaxSpareServers   20
ServerLimit      256
MaxClients       256
MaxRequestsPerChild  4000
</IfModule>

MySQL (/etc/my.cnf):
For more info on variables for MySQL 5.0.x go to: http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html

[mysqld]
max_connections = 100
max_user_connections = 0
max_connect_errors = 10
max_allowed_packet = 1M
table_cache = 64
sort_buffer_size = 2M
read_buffer_size = 131072
read_rnd_buffer_size = 262144
myisam_sort_buffer_size = 8M
thread_cache_size = 0
query_cache_size = 0
thread_concurrency = 10

Round 1: Results

With these settings we got up to 30 page loads per second which was 42% of our target. Interestingly, we were only operating at about 8% CPU and about 50% of our memory capacity when we hit this limit.

Round 1: Review

Looking at the Apache error logs, we were getting a large number of MySQL errors:

mysql_connect() [<a href='function.mysql-connect'>function.mysql-connect</a>]: Too many connections in xxx.php on line 15

So the MySQL configuration seemed to be our bottleneck.
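
If you want to confirm this kind of limit before changing anything, MySQL can report both the configured ceiling and the high-water mark it has reached since the last restart (adjust the credentials to suit your setup):

#mysql -u root -p -e "SHOW VARIABLES LIKE 'max_connections'; SHOW STATUS LIKE 'Max_used_connections';"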

Round 2

Round 2: Configuration

We did our first major review of the Apache and MySQL performance settings and adjusted them accordingly. We doubled the Apache settings and used the ‘huge’ sample configuration supplied with MySQL (/usr/share/doc/mysql-server-5.0.77/my-huge.cnf).
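
If you want to start from the same template, the sample configuration files ship with the mysql-server package; back up your existing config first, then copy it into place and restart MySQL:

#cp /etc/my.cnf /etc/my.cnf.bak
#cp /usr/share/doc/mysql-server-5.0.77/my-huge.cnf /etc/my.cnf
#service mysqld restart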

Apache (/etc/httpd/conf/httpd.conf):
For more info on variables for Apache 2.0.x go to: http://httpd.apache.org/docs/2.0/mod/mpm_common.html

<IfModule prefork.c>
StartServers       16
MinSpareServers    10
MaxSpareServers   40
ServerLimit      512
MaxClients       512
MaxRequestsPerChild  8000
</IfModule>

MySQL (/etc/my.cnf):
For more info on variables for MySQL 5.0.x go to: http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html

[mysqld]
# Memory usage
skip-locking
max_connections = 500
max_user_connections = 500
max_connect_errors = 999999
key_buffer = 384M
max_allowed_packet = 1M
table_cache = 512
sort_buffer_size = 2M
read_buffer_size = 2M
read_rnd_buffer_size = 8M
myisam_sort_buffer_size = 64M
thread_cache_size = 8
query_cache_size = 32M
# Try number of CPU's*2 for thread_concurrency (eHound has 4 CPU's)
thread_concurrency = 8

# Disable Federated by default
skip-federated

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

[mysqldump]
quick
max_allowed_packet = 16M

[mysql]
no-auto-rehash

[isamchk]
key_buffer = 256M
sort_buffer_size = 256M
read_buffer = 2M
write_buffer = 2M

[myisamchk]
key_buffer = 256M
sort_buffer_size = 256M
read_buffer = 2M
write_buffer = 2M

[mysqlhotcopy]
interactive-timeout
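
With the query cache now enabled, it's worth confirming that it is actually being hit once traffic is flowing (again, adjust credentials as needed):

#mysql -u root -p -e "SHOW STATUS LIKE 'Qcache%';"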

As an extra precaution we locked the server's network card to 1Gbit full duplex:

#ethtool -s eth0 speed 1000 duplex full

Edit the configuration for the network card:

#vim /etc/sysconfig/network-scripts/ifcfg-eth0

Add the following line:

ETHTOOL_OPTS='autoneg on speed 1000 duplex full'

Restart the network:

#service network restart
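
You can verify the result with ethtool; the output should report Speed: 1000Mb/s and Duplex: Full:

#ethtool eth0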

Round 2: Results

With these settings we got up to 58 full page loads per second, which was about 83% of our target. Interestingly, we were still only operating at about 10% CPU capacity when we hit this limit, but we were using approximately 70-80% of our memory.

Our MySQL errors had disappeared and there were no more errors in the Apache logs.

Round 2: Review

We were concerned that the system was starting to use swap memory, which was grinding the server to a halt.
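
A quick way to check whether swap is actually being used, and to estimate how many Apache prefork processes the available RAM can realistically sustain (a rough rule of thumb is MaxClients ≈ free RAM divided by the average httpd process size), is:

#free -m

#ps -ylC httpd --sort=rss | awk 'NR>1 {sum+=$8; n++} END {if (n) print n" httpd processes, average RSS "sum/n" KB"}'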

Round 3

Round 3: Configuration

We added an additional 2GB of RAM to the server so it now contained 4 x 1GB chips.

Round 3: Results

With the new RAM we still only got up to 58 full page loads per second, about 83% of our target. We were still only operating at about 10% CPU capacity, but now we were only using about 40% of our memory.

Round 3: Review

There were still no errors in the Apache logs, and the load test farm was not receiving Apache errors either; in fact, it reported that it could not even connect to the server. This led us to believe it was either a lack of bandwidth or a NIC/network/firewall configuration issue. After checking with our datacenter, we found no inhibiting factors on their side that would cause the problem described.
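
One thing that can help narrow this down is to watch the TCP connections on the server itself while the test runs; if the load farm cannot connect but Apache shows no errors, a per-state count of connections on port 80 (assuming Apache is on the default port) shows whether requests are reaching the box at all:

#netstat -ant | awk '/:80 /{print $6}' | sort | uniq -c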

We increased the Apache and MySQL limits and ran a different style of test.

Round 4

Round 4: Configuration

In this test we only loaded the dynamic components of the page as generated by PHP and MySQL and served by Apache. This meant that we told the load test farm not to download static content such as images, CSS or JavaScript files.

Also we increased the MySQL and Apache limits as follows:

Apache (/etc/httpd/conf/httpd.conf):
For more info on variables for Apache 2.0.x go to: http://httpd.apache.org/docs/2.0/mod/mpm_common.html

<IfModule prefork.c>
StartServers     280
MinSpareServers   100
MaxSpareServers   300
ServerLimit      1536
MaxClients       1536
MaxRequestsPerChild  32000
</IfModule>

MySQL (/etc/my.cnf):
For more info on variables for MySQL 5.0.x go to: http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html

[mysqld]
# Memory usage
skip-locking
max_connections = 764
max_user_connections = 764
max_connect_errors = 999999
key_buffer = 256M
max_allowed_packet = 1M
table_cache = 256
sort_buffer_size = 1M
read_buffer_size = 1M
read_rnd_buffer_size = 4M
myisam_sort_buffer_size = 64M
thread_cache_size = 8
query_cache_size = 16M
# Try number of CPU's*2 for thread_concurrency (eHound has 4 CPU's)
thread_concurrency = 8

# Disable Federated by default
skip-federated

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

[mysqldump]
quick
max_allowed_packet = 16M

[mysql]
no-auto-rehash

[isamchk]
key_buffer = 128M
sort_buffer_size = 128M
read_buffer = 2M
write_buffer = 2M

[myisamchk]
key_buffer = 128M
sort_buffer_size = 128M
read_buffer = 2M
write_buffer = 2M

[mysqlhotcopy]
interactive-timeout

Round 4: Results

The results of this test were very interesting. We got up to 263 page loads per second without any issue. This consumed a lot more bandwidth than the Round 3 test, so we knew that bandwidth was not the issue. However, the number of concurrent connections at which both tests started to fail was very similar.

Round 4: Review

So we knew we had a connection limit issue.

We also knew that the eAccelerator opcode cache was not dying at these high volumes, and neither were MySQL, PHP or Apache.

We reviewed the kernel messages and found thousands of entries like the following logged at the time of testing:

#cat /var/log/messages* | grep 'Aug 15'
...
Aug 15 01:04:27 localhost kernel: printk: 1395 messages suppressed.
Aug 15 01:04:27 localhost kernel: ip_conntrack: table full, dropping packet.
Aug 15 01:04:32 localhost kernel: printk: 1561 messages suppressed.
Aug 15 01:04:32 localhost kernel: ip_conntrack: table full, dropping packet.
Aug 15 01:04:37 localhost kernel: printk: 1274 messages suppressed.
Aug 15 01:04:37 localhost kernel: ip_conntrack: table full, dropping packet.
Aug 15 01:04:42 localhost kernel: printk: 1412 messages suppressed.
...

Further investigation revealed that iptables/ip6tables were active and limiting the number of connections to the box because the connection tracking table was full. Ordinarily when I set up a Linux server I turn iptables off, because I place hardware firewalls in front of the servers. However, I didn't have the opportunity to set this box up initially, so they were still active. Since I didn't need them, I deactivated them.

If you still need to keep iptables running you can simply adjust the following settings:
Check the current connections limit (only works if iptables is running):

#sysctl net.ipv4.netfilter.ip_conntrack_max

65536

Change the connections limit:

#vim /etc/sysctl.conf

Add the following lines:

# conntrack limits
# net.ipv4.netfilter.ip_conntrack_max = 65536 is the default
net.ipv4.netfilter.ip_conntrack_max = 196608

Reload the config file:

#sysctl -p

Check the new connections limit:

#sysctl net.ipv4.netfilter.ip_conntrack_max

196608

Check the current buckets limit (only works if iptables is running):

#cat /proc/sys/net/ipv4/netfilter/ip_conntrack_buckets

8192

To change the buckets limit:

#vim /etc/modprobe.conf

Add the following lines:

options ip_conntrack hashsize=32768

Reboot the server:

#shutdown -r now

Check the new buckets limit:

#cat /proc/sys/net/ipv4/netfilter/ip_conntrack_buckets

24576


Alternatively, if like me you don't need iptables, you can simply disable both services:

#service iptables stop

#service ip6tables stop

#chkconfig iptables off

#chkconfig ip6tables off
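
As a quick sanity check that the firewall really is out of the picture, both services should report that the firewall is stopped, and the ip_conntrack module should no longer appear in the loaded modules (if it is still listed, a reboot will clear it):

#service iptables status

#lsmod | grep ip_conntrack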

Round 5

Round 5: Configuration

This test used exactly the same configuration with iptables disabled.

Round 5: Results

Success!!! We got to 4,250 concurrent users, which is about 70 full page loads per second (loading all the additional image, CSS and JavaScript files as well), with zero errors and a 0.7 second average response time. This used about 120Mbit/s of bandwidth. The datacenter ended up running out of pipe before the server had any issues.

At this rate we were running at about:
– 15% CPU utilisation
– 30% Memory usage (with 4GB RAM installed)
– 400 Apache processes
– 100% Bandwidth

Round 5: Review

Key findings:
– Increase your Apache and MySQL limits
– Turn off iptables
– Ensure that you have enough RAM
– Ensure that you are checking logs from MySQL, Apache, and the kernel to pick up any errors and give you clues as to how to best solve them

Round 6

Round 6: Configuration

This test used exactly the same configuration as round 5 with 250Mbit pipe instead of a 120Mbit pipe.

Round 6: Results

Success!!! We got to 140 full page loads per second (including the additional images, CSS and JavaScript files) with zero errors and a still stable 0.7 second average response time. This used the full 250Mbit/s of bandwidth. The datacenter ended up running out of pipe again before the server had any issues.

At this rate we were running at about:
– 30% CPU utilisation
– 40% Memory usage (with 4GB RAM installed)
– 800 Apache processes
– 100% Bandwidth

Round 6: Review

Key findings:
– Even with 250Mbits of pipe, bandwidth is still the bottleneck in this configuration.
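
To put that in perspective: 250Mbit/s divided by 140 page loads per second works out to roughly 1.8Mbit, or a little over 200KB, per full page load, which lines up with Round 5 (120Mbit/s for 70 pages per second is about 1.7Mbit per page). In other words, the weight of each page, not the server, is what caps throughput at these levels.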

Round 7

Round 7: Configuration

Even though our server was performing fine, we were given another server with much higher specs to experiment on.

It was a Dell R710 with 48GB of RAM and eight 2.53GHz Xeon cores running in hyper-threading mode (essentially giving us 16 logical processors).

We also had this box connected to a dedicated 4Gbit optical internet feed to give it as much bandwidth as it needed.

Everything on the box was configured the same except for Apache and MySQL (for which we took the previous settings and multiplied them by four) and sysctl.

Apache (/etc/httpd/conf/httpd.conf):
For more info on variables for Apache 2.0.x go to: http://httpd.apache.org/docs/2.0/mod/mpm_common.html

<IfModule prefork.c>
StartServers     1120
MinSpareServers   400
MaxSpareServers   1200
ServerLimit      6144
MaxClients       6144
MaxRequestsPerChild  128000
</IfModule>

MySQL (/etc/my.cnf):
For more info on variables for MySQL 5.0.x go to: http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html

[mysqld]
# Memory usage
skip-locking
max_connections = 3056
max_user_connections = 3056
max_connect_errors = 999999
key_buffer = 1024M
max_allowed_packet = 4M
table_cache = 1024
sort_buffer_size = 4M
read_buffer_size = 4M
read_rnd_buffer_size = 16M
myisam_sort_buffer_size = 256M
thread_cache_size = 32
query_cache_size = 64M
# Try number of CPU's*2 for thread_concurrency (eHound has 4 CPU's)
thread_concurrency = 32

# Disable Federated by default
skip-federated

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

[mysqldump]
quick
max_allowed_packet = 64M

[mysql]
no-auto-rehash

[isamchk]
key_buffer = 512M
sort_buffer_size = 512M
read_buffer = 8M
write_buffer = 8M

[myisamchk]
key_buffer = 512M
sort_buffer_size = 512M
read_buffer = 8M
write_buffer = 8M

[mysqlhotcopy]
interactive-timeout

We also added the following lines to sysctl (/etc/sysctl.conf):
net.ipv4.netfilter.ip_conntrack_max = 196608
net.ipv4.ip_local_port_range = 1025 65535
net.ipv4.tcp_max_tw_buckets = 1000000
net.core.somaxconn = 10000
net.ipv4.tcp_max_syn_backlog = 2000
net.ipv4.tcp_fin_timeout = 30
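
As in Round 4, reload the file and spot-check that the new values have taken effect:

#sysctl -p

#sysctl net.core.somaxconn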

Round 7: Results

We got to 200 full page loads per second (including the additional images, CSS and JavaScript files) with zero errors and a still stable 0.8 second average response time. This test used 330Mbit/s, or about 8% of the bandwidth available. We stopped the test simply because we didn't need to go any higher; we could potentially have gone much further.

At this rate we were running at about:
– 16% CPU utilisation
– 6% Memory usage (with 48GB RAM installed)
– 1227 Apache processes
– 8% Bandwidth

Round 7: Review

Key findings:
– Bandwidth seems to be a much bigger bottleneck than server capability.

 
