Debian LAMP Failover Cluster: Implementation

Today, I had to build a redundant LAMP failover cluster. I will report here how I achieved it and which software I used.

Requirements:

  • 2 nodes (identical hardware, if possible);
  • 2 network cards per node;
  • 2 hard drives per node.

Software used:

  • Debian GNU/Linux (any other Linux distribution would work);
  • Software RAID-1 (md);
  • Heartbeat-2;
  • DRBD-8;
  • Apache-2;
  • PHP5;
  • MySQL;
  • BIND9.

This setup provides the following functionality:

  • Survive hard-disk failure (RAID-1);
  • Survive network failure (bonding; in my case, two interfaces connected to redundant network switches);
  • Survive node failure (DRBD + Heartbeat: files are replicated to the second node and services fail over when a node goes down).

Assumptions:

  • node1 is at 42.42.42.20;
  • node2 is at 42.42.42.21;
  • the clustered IP is 42.42.42.22;
  • the gateway is 42.42.42.254.

Read on to see exactly how the implementation went…

Here is, step by step, how to build this failover cluster; the steps specific to Debian are marked with a “(Debian)” header.

Install (Debian) GNU/Linux

  • Don’t forget to use RAID-1 during the installation;
  • Also, keep a dedicated RAID-1 partition for all your data (www, mysql, bind); you can verify the arrays as sketched below.
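After the installer finishes, a quick way to confirm the arrays are assembled and in sync is to look at /proc/mdstat (md7 here is the data array that will back DRBD later in this post):

    cat /proc/mdstat         # every md array should show [UU] once the mirrors are in sync
    mdadm --detail /dev/md7  # detailed state of the data array used for DRBD below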

(Optional) Compile your own kernel and include the following:

  • the bonding driver (as a module);
  • the cryptographic algorithm you want to use for disk synchronization (sha1 in the DRBD config below);
  • Personally, I decided to build everything into the kernel (no modules), except the bonding driver.

(Debian) Boot your system and upgrade it to the testing (lenny) distribution. Then activate bonding for your interfaces:

  • (Debian) Add the following lines to your /etc/network/interfaces:
    allow-hotplug bond0
    iface bond0 inet static
        address 42.42.42.20
        netmask 255.255.255.0
        network 42.42.42.0
        gateway 42.42.42.254
        up /sbin/ifenslave bond0 eth0
        up /sbin/ifenslave bond0 eth1
        dns-nameservers 10.42.253.254
        dns-search espix.org
  • (Debian) Add the following line to /etc/modules:
        bonding mode=active-backup miimon=100 downdelay=200 updelay=200 
  • Reboot and check that bonding works fine (unplug one cable, then the other, …); a quick check is sketched below.
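The kernel exposes the bond state under /proc, which is enough to see which slave interface is currently active and whether both MII links are up:

    cat /proc/net/bonding/bond0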

(Debian) Install the following software (and check that the module builds and loads, as sketched after this list):

  • apt-get install drbd8-utils drbd8-module-source drbd8-source build-essential module-assistant
  • m-a -t build drbd8-source
  • dpkg -i drbd8-2.6.26.6-espix-p4_8.0.13-2+2.6.26.6-espix-p4-10.00.Custom_i386.deb
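Before configuring anything, it is worth checking that the freshly built module actually loads on the running kernel:

    modprobe drbd        # load the module built by module-assistant
    lsmod | grep drbd    # confirm it is loaded
    cat /proc/drbd       # this status file appears once the module is in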

Configure DRBD-8

  • Make /etc/drbd.conf look like this on both nodes:
global {
    usage-count no;
}

common {
    protocol C;
    syncer {
        rate 4M; # I've put a 4 MB/s resync rate to avoid taking up all the available bandwidth
        al-extents 257;
    }
}

resource srv {
    startup {
        become-primary-on node1;
    }
    net {
        cram-hmac-alg sha1;
        shared-secret "mysharedsecretsentence";
    }
    on node1 {
        device    /dev/drbd0;
        disk      /dev/md7;
        address   42.42.42.20:7789;
        meta-disk internal;
    }
    on node2 {
        device    /dev/drbd0;
        disk      /dev/md7;
        address   42.42.42.21:7789;
        meta-disk internal;
    }
}
  • Start DRBD and launch the resync (the first three commands run on both nodes, the last one on node1 only):
drbdadm create-md srv
drbdadm attach srv
drbdadm connect srv
drbdadm -- --overwrite-data-of-peer primary srv
  • You can check in /proc/drbd that the resync is going well.
  • You can also speed up the resync by temporarily overriding the pre-defined resync rate (see the watch one-liner below):
drbdsetup /dev/drbd0 syncer -r 110M 
This setting will not survive a reboot :-)
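To keep an eye on the resync progress without retyping the command, watch works nicely:

    watch -n2 cat /proc/drbd    # refreshes the sync status every 2 seconds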

Configuring your DRBD partition

  • mkreiserfs /dev/drbd0
  • mount /dev/drbd0 /srv
  • Add an entry to /etc/fstab; don’t forget the “noauto” option, and avoid fscking the partition at boot (pass number 0).
/dev/drbd0      /srv            reiserfs        noauto,noatime,nodev,nosuid,usrquota,grpquota 0 0 
  • cd /srv ; mkdir www mysql dns etc
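At this point the replicated filesystem should be mounted and writable on node1:

    df -h /srv          # /dev/drbd0 should show up as the mounted device
    mount | grep drbd0  # double-check the mount options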

Installing Services

Please note that every service will first be configured on node1; however, every symlink made in /etc must be replicated on node2.

MySQL
  • (on both nodes) apt-get install mysql-server
  • Edit /etc/mysql/my.cnf and make the following change:
datadir         = /srv/mysql 
  • cp -Rp /etc/mysql /srv/etc/
  • (on both nodes) rm -rf /etc/mysql
  • (on both nodes) ln -s /srv/etc/mysql /etc/mysql
  • cp -Rp /var/lib/mysql/* /srv/mysql/
  • rm -rf /var/lib/mysql
  • /etc/init.d/mysql stop ; /etc/init.d/mysql start
  • Check that MySQL is running properly (a quick check is sketched after this list).
  • (on both nodes) update-rc.d -f mysql remove
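A minimal sanity check, assuming the default root account is usable (-p prompts for its password): ask the server for its status and make sure the databases landed in the new datadir:

    mysqladmin -u root -p status            # the server answers with uptime and thread count
    mysql -u root -p -e "SHOW DATABASES;"   # the databases copied into /srv/mysql should be listed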
Apache2
  • (on both nodes) apt-get install apache2 apache2-utils libapache2-mod-php5
  • cp -Rp /etc/apache2 /srv/etc/
  • (on both nodes) rm -rf /etc/apache2
  • (on both nodes) ln -sf /srv/etc/apache2 /etc/apache2
  • Modify the config for your webpages to point inside /srv/www (a minimal vhost is sketched after this list);
  • cp -Rp /etc/php5 /srv/etc/
  • (on both nodes) rm -rf /etc/php5
  • (on both nodes) ln -sf /srv/etc/php5 /etc/php5
  • (on both nodes) update-rc.d -f apache2 remove
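For reference, here is a minimal sketch of such a vhost, where www.example.org is a placeholder for whatever site the cluster actually serves:

    <VirtualHost *:80>
        # placeholder name, replace with the real site name
        ServerName www.example.org
        # serve pages from the replicated DRBD partition
        DocumentRoot /srv/www
        <Directory /srv/www>
            Options FollowSymLinks
            AllowOverride All
        </Directory>
    </VirtualHost>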
Bind9
  • (on both nodes) apt-get install bind9
  • cp -Rp /etc/bind /srv/etc/
  • (on both nodes) rm -rf /etc/bind
  • (on both nodes) ln -sf /srv/etc/bind /etc/bind
  • (on both nodes) update-rc.d -f bind9 remove
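The /srv/dns directory created earlier is meant for the zone files, so that they fail over together with the service. A sketch of a zone declaration in /etc/bind/named.conf.local, where example.org and its zone file are placeholders:

    zone "example.org" {
        type master;
        // keep the zone data on the replicated partition
        file "/srv/dns/db.example.org";
    };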

When every service you want is located inside /srv, you can go on to the next step…

Installing Heartbeat

  • apt-get install heartbeat-2
  • Edit /etc/ha.d/ha.cf
mcast bond0 239.0.0.2 694 1 0
keepalive 2
warntime 10
deadtime 15
initdead 120
node node1
node node2
respawn hacluster /usr/lib/heartbeat/ipfail
apiauth default uid=nobody gid=haclient
apiauth ipfail uid=hacluster
apiauth ping gid=nogroup uid=nobody,hacluster
auto_failback on
ping 42.42.42.254
deadping 15
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
  • Edit /etc/ha.d/haresources (MUST BE EXACTLY THE SAME ON BOTH NODES)
node1 IPaddr::42.42.42.22/24/bond0
node1 drbddisk::srv Filesystem::/dev/drbd0::/srv::reiserfs::noatime,nodev,nosuid,usrquota,grpquota mysql apache2 bind9
  • Edit /etc/ha.d/authkeys
auth 1
1 sha1 mysecretkey
  • chmod 600 /etc/ha.d/authkeys

Start everything up:

  • (on both nodes) /etc/init.d/heartbeat start
  • Look at /var/log/ha-log and check that /srv gets mounted…
  • Play with your nodes to check that everything works; a simple failover test is sketched below.
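A simple way to exercise the failover by hand: gracefully stop Heartbeat on the active node, then confirm on the other node that the resources moved over:

    # on node1 (currently active): release the resources
    /etc/init.d/heartbeat stop
    # on node2: the clustered IP should show up on bond0 shortly after
    ip addr show bond0 | grep 42.42.42.22
    # ...and the replicated partition should now be mounted here
    df -h /srv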

Tuning

There is a lot of tuning that could be done to this setup… it is a good starting point for those who want to build a failover service. We should, for example, take more care of DRBD disk errors and handle them by stopping Heartbeat when the disk is failing…

Maybe that will be the topic of a future post 😉

Have a nice day
