[icinga-users] Setup Icinga for high availability and scalability

Carl R. Friend crfriend at rcn.com
Mon Mar 11 22:02:26 CET 2013

    On Mon, 11 Mar 2013, Peter Albrecht wrote:

> A more general question: What would be the preferred method of making
> Icinga highly available? Using the default cluster setup using Pacemaker
> (on SUSE Linux Enterprise Server) and defining Icinga as a resource? The
> database would (most probably) be on a separate system.

    There are a few ways to achieve HA with Icinga (or any other app
for that matter), but most of them come down to either shared storage
or replication of some ilk and appropriate control harnesses that
ensure that only one instance is active at any given point in time.
Shared storage has the benefit of being easy to understand; 
replication, though, protects you if the SAN heads south (which iSCSI
ones are oft wont to do) and can be done with off-the-shelf iron.
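
    As a rough illustration of what such a control harness has to
decide, here is a minimal sketch in Python.  Everything in it is an
assumption made for illustration: the class name, the threshold of
three missed checks, and the idea that some external probe reports
whether the primary is alive.  It is not how Pacemaker or any other
cluster manager is implemented, and it does nothing about fencing or
split-brain, which a real harness must handle.

```python
class StandbyMonitor:
    """Tracks primary health checks and decides when the standby may start.

    Hypothetical helper for illustration only; no fencing is performed.
    """

    def __init__(self, threshold=3):
        self.threshold = threshold  # consecutive failures required to promote
        self.failures = 0           # current run of failed checks
        self.active = False         # True once the standby has taken over

    def record(self, primary_alive):
        """Feed one health-check result; return True if the standby should run."""
        if self.active:
            # Once promoted, stay active until an operator intervenes.
            return True
        if primary_alive:
            self.failures = 0
        else:
            self.failures += 1
            if self.failures >= self.threshold:
                self.active = True
        return self.active
```

    Requiring several consecutive failures keeps a single dropped
probe from starting a second instance while the primary is still up.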

    On databases: IDOUtils essentially uses a "write-only" model for
the database; it doesn't read anything from it save to ascertain its
"instance ID" on startup.  Most of the important dynamic tables, in
fact, are cleared and recreated when the Icinga instance starts.  So
the database is not really a good mechanism for backing up current
status data: you'd have to read the data back out of the database and
reformat it into the files (status.dat, retention.dat, &c.) that
Icinga reads (and feeds into the database) when it starts.  Using the
database for this purpose would be a complex undertaking indeed, as
it's "out of model" for the way that Icinga was designed.

    A poor man's replication setup could be as simple as running rsyncs
every minute or so on the /var areas of the operational Icinga system
and every 15 minutes or so for the less dynamic configuration area.
Then, in the event of a failure, the backup system assumes the IP
address of the failed primary and tries to start Icinga with the last
rsync'ed data.
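
    The rsync schedule described above might be sketched along these
lines in Python.  The paths (/var/lib/icinga/, /etc/icinga/), the
standby host name, and the function names are all assumptions for
illustration; substitute whatever the local installation uses.  The
sketch only covers the copying side, not the IP takeover or the
starting of Icinga on the backup.

```python
import subprocess
import time

# Hypothetical host name and paths; adjust to the local installation.
STANDBY = "icinga-standby"
JOBS = [
    # (source directory, interval in seconds)
    ("/var/lib/icinga/", 60),      # dynamic state: status and retention data
    ("/etc/icinga/", 15 * 60),     # configuration changes less often
]

def build_rsync_cmd(src, host):
    """Return an rsync command mirroring src to the same path on host."""
    return ["rsync", "-a", "--delete", src, f"{host}:{src}"]

def is_due(last_run, interval, now):
    """True if the job has never run or its interval has elapsed."""
    return last_run is None or now - last_run >= interval

def run_forever(jobs=JOBS, host=STANDBY):
    """Loop indefinitely, syncing each directory at its own interval."""
    last_run = {src: None for src, _ in jobs}
    while True:
        now = time.monotonic()
        for src, interval in jobs:
            if is_due(last_run[src], interval, now):
                # check=False: a failed sync is simply retried next interval.
                subprocess.run(build_rsync_cmd(src, host), check=False)
                last_run[src] = now
        time.sleep(5)
```

    Calling run_forever() on the operational system starts the loop;
a pair of cron entries invoking rsync directly would serve equally
well.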

    The more ambitious would probably gravitate to block-level
filesystem replication via metadevices that keep a remote mirror of
the local running filesystems upon which the Icinga data are stored.

> And finally: I need a scalable setup for Icinga. Is there a comparison
> available for mod_gearman and DNX? Could someone share their experience
> with either of these solutions?

    We cheated where I work and threw hardware at the problem.  24
cores can do a lot of work, and there are multiple Icinga instances
(well, Nagios at the moment, but I'm pushing for an upgrade) on
each system, all of which roll up into a global master which
does all the alerting and handles interaction with the operators.


| Carl Richard Friend (UNIX Sysadmin)            | West Boylston       |
| Minicomputer Collector / Enthusiast            | Massachusetts, USA  |
| mailto:crfriend at rcn.com                     +---------------------+
| http://users.rcn.com/crfriend/museum           | ICBM: 42:22N 71:47W |
