[icinga-devel] Icinga Core API

Gunnar Beutner gunnar.beutner at netways.de
Thu Jun 30 09:59:39 CEST 2011

On 29.06.2011 18:51, Michael Friedrich wrote:
> Hi,

> Gunnar Beutner wrote:
>> Hello,
>> as requested by Michael Friedrich I'm pushing the discussion that's so
>> far taken place on our internal team list to the icinga-devel list. For
>> anyone new to join this discussion here's a short introduction to what
>> this is all about:
> i do think that such things are way off the hook to be kept internal. 
> those are major changes to be discussed, and not just implemented and 
> thrown at someone.
Something as complex as our MQ idea should definitely be able to stand 
up to a certain amount of scrutiny. I'd rather spend more time on 
discussing issues with our design (we're not infallible after all :)) 
than to start coding only to realize later on that we've missed some 
major problems.

>> During the Icinga meet-up last weekend we've discussed some ideas for an
>> Icinga Core API and a way to deal with scalability issues in large-scale
>> monitoring setups.
> i wasn't taking part during the meeting, so my opinion only reflects 
> what i was reading and reflecting in the past 3 days.
> those things are 2 separate portions to take care of.
> 1) a core api
> * adding things like mklivestatus already proposed (GET)
> * allowing the SET/POST bidirectional communication
> * don't introduce external dependencies
> * providing compatibility to existing addons
> 2) a distributed message queue system
> * can be done either as (neb) module (mod_gearman!) or on the new core 
> api layer
> * stays independent from the existing setup, can be provided "as is" 
> and loaded if demanded
> * does the internal message queueing itself and what might be on the 
> top layer
> * doesn't bypass the core api
It would definitely make sense to break down the MQ design into 
individual milestones which can be reviewed independently - we're 
certainly not going to throw a big blob of code at you once everything 
is "finished".

>> The basic idea revolves around using a message queue to decouple
>> components like service checks and API methods from the Icinga Core
>> while providing a unified interface to these components that can be used
>> either locally (i.e. in-process or via IPC) or remotely (via TCP). For
>> now we've been thinking about using ZeroMQ for this which is an
>> embeddable, (mostly) platform-agnostic message queue library written in
>> C++ (with a C API).
> i'd love to hear opinions why exactly this should be zeromq and not 
> rabbitmq or other candidates like described here e.g.
> http://wiki.secondlife.com/wiki/Message_Queue_Evaluation_Notes
It doesn't necessarily have to be ZeroMQ. The description on the wiki is 
actually pretty much implementation-agnostic. ZeroMQ is merely what we 
came up with during the weekend and it seems to have a number of 
properties other MQ implementations do not have. For one it doesn't have 
any external dependencies (like Java or Oracle DB) and can be embedded 
directly into applications - without having to run a separate message 
broker process.

Obviously RabbitMQ and other MQs have their own advantages too. I'll try 
and come up with a more detailed comparison matrix for the MQ 
implementations we've been investigating so far.
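
To make the decoupling idea a bit more concrete, here is a minimal sketch. 
It deliberately uses only Python's standard library queue rather than 
ZeroMQ or any other MQ, and every name in it (check_worker, run_demo, the 
message fields) is hypothetical and not part of any Icinga design. The 
point is only that the "core" publishes check requests and reads results 
without ever calling the check component directly.

```python
import queue
import threading

def check_worker(requests, results):
    """Hypothetical check component: executes checks pulled from the queue."""
    while True:
        msg = requests.get()
        if msg is None:  # sentinel: shut the worker down
            break
        # A real worker would run a check plugin here; we fake an OK result.
        results.put({"host": msg["host"], "service": msg["service"], "state": "OK"})

def run_demo():
    requests, results = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=check_worker, args=(requests, results))
    worker.start()
    # The "core" only publishes messages; it never calls the worker directly.
    requests.put({"host": "web01", "service": "http"})
    requests.put({"host": "db01", "service": "mysql"})
    requests.put(None)
    worker.join()
    return [results.get() for _ in range(2)]

if __name__ == "__main__":
    for result in run_demo():
        print(result)
```

With a real MQ library the two queues would presumably become endpoints 
that can live in-process, on a local socket or on a remote host, while 
the worker code stays the same.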

>> You can find an updated in-depth description of the proposed changes
>> involved in the MQ implementation at
>> https://wiki.icinga.org/display/Dev/Ideas+for+new+Core+API  . I'd love to
>> hear your feedback about the design specifications which for now are
>> primarily being worked on by Ricardo and me.
> summarizing the points i've already posted in recent internal mails,
> * the newly introduced core api should be used as poc for classicui 
> and integrated live search, providing simple GET hosts in json and 
> nothing more basically
While the first proof-of-concept version of the API/MQ implementation 
will most likely not feature the whole spectrum of API commands, we 
shouldn't just focus on read-only GET queries. In doing so we might 
easily miss issues we'll encounter later on (like the whole 
thread-safety issue when updating/creating Icinga objects via the MQ).
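
One possible way to approach the thread-safety problem, sketched here 
under the assumption that all reads and writes are funnelled through a 
single loop that owns the objects. This is not a description of how the 
Icinga Core works today; the Core class and the GET_HOSTS / ADD_HOST / 
STOP command names are made up for illustration.

```python
import json
import queue
import threading

class Core:
    """Hypothetical stand-in for the Icinga Core's object store."""
    def __init__(self):
        self.hosts = {"web01": {"state": "UP"}}
        self.commands = queue.Queue()

    def submit(self, command, **args):
        """Called from any API thread; returns a queue the reply arrives on."""
        reply = queue.Queue(maxsize=1)
        self.commands.put((command, args, reply))
        return reply

    def run(self):
        """Only this loop touches self.hosts, so the data needs no locking."""
        while True:
            command, args, reply = self.commands.get()
            if command == "STOP":
                reply.put("stopped")
                break
            if command == "GET_HOSTS":
                reply.put(json.dumps(self.hosts, sort_keys=True))
            elif command == "ADD_HOST":
                self.hosts[args["name"]] = {"state": "PENDING"}
                reply.put("ok")
            else:
                reply.put("unknown command")

def run_demo():
    core = Core()
    loop = threading.Thread(target=core.run)
    loop.start()
    added = core.submit("ADD_HOST", name="db01").get()
    hosts = core.submit("GET_HOSTS").get()
    core.submit("STOP").get()
    loop.join()
    return added, hosts
```

Even in this toy version a write command exercises a code path (mutating 
shared state) that a GET-only prototype would never hit.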

> * the libicinga is on top of the mq architecture, which then remains 
> rather independent of the underlying core architecture. so whatever 
> you are wrapping around, you could still do it a layer down below, if 
> you know what you are doing .... "libicinga is built on zeromq and 
> people tend to use other technologies - like gearman or livestatus 
> wrappers for nagvis, nagiosbp and such. the idea is *not* to reinvent 
> the wheel, but stay compatible to existing solutions. addons devs 
> won't jump onto that just because it's cool and costs dev time."
> * the proof of concept for distributed monitoring should take place 
> with the current neb architecture .... "proof of concept as core 
> module - take a look at mod_gearman, and replace that parts with 
> zeromq, realize the proof of concept as neb module with 
> * the core api should be extendable. not only for icingamq but also 
> for ...
> ** idoutils replacement, with a cached worker
> ** add configs via api, store internally, provide a config editor (or 
> at least a community addon) like demanded over 
> http://feedback.icinga.org/forums/50329-general/suggestions/754152-api-to-add-new-hosts-groups-services-commands-auto?ref=title
Ideally these components wouldn't be accessing the Core API directly but 
would instead use libicinga. One benefit of that would be that idoutils 
could asynchronously pull event notifications from the message queue and 
process these events without blocking the Icinga Core. In that scenario 
the message queue would also take care of buffering events - which I 
guess is what you meant by "cached worker".
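
A rough sketch of that buffering behaviour, again using only Python's 
standard library and hypothetical names (EventBuffer, publish, pull). 
What should happen when the buffer is full is an open design question; 
dropping and counting events is just the assumption made here so that 
the publisher never blocks.

```python
import queue

class EventBuffer:
    """Hypothetical bounded buffer between the core and a slow consumer."""
    def __init__(self, capacity):
        self._events = queue.Queue(maxsize=capacity)
        self.dropped = 0

    def publish(self, event):
        """Called by the core; never blocks, counts events it cannot buffer."""
        try:
            self._events.put_nowait(event)
        except queue.Full:
            self.dropped += 1

    def pull(self, max_events):
        """Called by the consumer whenever it is ready for more work."""
        batch = []
        while len(batch) < max_events:
            try:
                batch.append(self._events.get_nowait())
            except queue.Empty:
                break
        return batch

buf = EventBuffer(capacity=3)
for i in range(5):
    buf.publish({"type": "statechange", "seq": i})
first = buf.pull(max_events=2)   # consumer takes a small batch
rest = buf.pull(max_events=10)   # and later drains whatever is left
```

The consumer decides when and how much to pull, so a slow database 
writer only delays its own processing and not the publisher.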

Being able to dynamically reconfigure Icinga would be a very cool 
feature. However, the changes required for this lie somewhat outside the 
scope of the API. In order to implement this, the Icinga Core would first 
have to provide basic support for this sort of reconfiguration 
(persisting dynamically added objects, ability to remove objects, etc.). 
One thing I'm concerned about here is backwards-compatibility with 
modules like mk_livestatus (which probably wouldn't be able to 
gracefully handle deleted objects - not without modifications anyway).
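
To illustrate what "basic support" might involve, here is a small sketch 
of an object registry that persists dynamically added objects and tells 
interested modules when an object is removed. Everything in it is an 
assumption made for the example: the ObjectRegistry class, the JSON file 
format and the on_remove callback list do not exist in Icinga.

```python
import json
import os
import tempfile

class ObjectRegistry:
    """Hypothetical store for objects added or removed at runtime."""
    def __init__(self, path):
        self.path = path
        self.objects = {}
        self.on_remove = []  # callbacks so modules can drop stale references
        if os.path.exists(path):
            with open(path) as f:
                self.objects = json.load(f)

    def _persist(self):
        with open(self.path, "w") as f:
            json.dump(self.objects, f)

    def add(self, name, attrs):
        self.objects[name] = attrs
        self._persist()

    def remove(self, name):
        attrs = self.objects.pop(name)
        for callback in self.on_remove:
            callback(name)
        self._persist()
        return attrs

def run_demo():
    path = os.path.join(tempfile.mkdtemp(), "dynamic_objects.json")
    removed = []
    registry = ObjectRegistry(path)
    registry.on_remove.append(removed.append)
    registry.add("web01", {"address": "192.0.2.10"})
    registry.add("db01", {"address": "192.0.2.11"})
    registry.remove("web01")
    reloaded = ObjectRegistry(path)  # simulates a restart
    return removed, reloaded.objects
```

A module that caches pointers to objects would need some equivalent of 
the removal callback, which is exactly the kind of modification existing 
modules don't have yet.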

> [--SNIP--]
> 1) design a core api capable of that (icingamq doesn't include that by
> default as far as i have read)
> 2) allow simple configs to be added into the core, and still written
> somewhere
> 3) adopt a config data layout
> 4) build a custom config web onto that
> 5) integrate that web in classicui and icingaweb
> [--SNIP--]
> * the core api as well as the icingamq design *must* entirely stay out 
> of the current 1.x tree and probably provided as package to be tested, 
> but not provided officially (and also not supported).
> and tbh, this would be something for icinga 2.x where new features can 
> be introduced and some show stoppers can be deprecated whilst 
> introducing new things.
While it should be made obvious to our users that the Core API/MQ 
feature-set is highly experimental in the first couple of versions (and 
therefore should be disabled by default), having these components as an 
external set of patches or a NEB module would impede our efforts to 
actually get people to test these changes - which is pretty much the 
only way of finding bugs and other problems.

> * the message format for icingamq must be discussed either on 
> icinga-devel or icinga-users mailinglists (or a public poll if there 
> was a decision), allowing the community to add their ideas to a 
> general RFC - which needs to be written in the first place.
> * evaluate the core logic on the eventloop block for api fetches, or 
> threaded synchronisation. this includes feeding checkresults e.g.
> * keep the code clean, follow the style guide and developer 
> guidelines. this includes doxygen and also further documentation / rfcs.
> kind regards,
> michael
Best regards,

> -- 
> DI (FH) Michael Friedrich
> Vienna University Computer Center
> Universitaetsstrasse 7 A-1010 Vienna, Austria
> email: 	michael.friedrich at univie.ac.at
> phone: 	+43 1 4277 14359
> mobile: +43 664 60277 14359
> fax: 	+43 1 4277 14338
> web:	http://www.univie.ac.at/zid
> 	http://www.aco.net
> Icinga Core & IDOUtils Developer
> http://www.icinga.org