[icinga-devel] Icinga redesign

Michael Friedrich michael.friedrich at univie.ac.at
Wed Jun 9 01:01:21 CEST 2010


Hi Vitali,

On 2010-06-08 23:05, Vitali Voroth wrote:
> Sorry, did I miss a thing?
>    

Not really. We had been discussing via email, IRC, even Twitter and
Mumble what we could do, and decided to drop this on icinga-devel in
order to start a public discussion about our thoughts and ideas.
There is also an open issue for collecting and brainstorming ideas:

https://dev.icinga.org/issues/378

We had this discussion at the very beginning of the fork too, but there
were other things to focus on first: patching the core (for which I
should finally write a readable Changelog covering the past year for
all of you), a new API, a new web interface, new DocBook docs and,
finally, the announced new reporting. Personally speaking, I also
needed to extend my knowledge of both IDOUtils and, now, the core.

It is now about time to think about the future roadmap - we recently
decided to release 1.0.4 as a unified stable version, merging all parts
back together, and to present that at OSMC 2010.

But what's up afterwards? Where should Icinga go? What can we do with 
the current implementations?

One thing is a core redesign and rewrite. Not the approach Shinken
takes with full Python, though: we are currently discussing a core
implementation in C or C++, using Python for scripting and also for API
stuff. As for my personal background, I am currently following the
development of bind10 and nmsg, both DNS-related applications - so it
remains to be discussed whether this direction will find a consensus
among all of us. Maybe some of you want to join this proposal and bring
your own know-how and code into it (I know that you already provided
the escalation conditions patch, Vitali, so I am pretty sure you might
be interested ;-))

> So, you want to write icinga completely new from scratch?
>    

Yep, given that the current code is only patchable, not really
readable, and has grown horribly over the years. Looking at the current
source tree, it costs me anything from several minutes to hours to find
the right place and think a change to the core through. If we are to
maintain those things, we should take care of the code and take it to
the next level. One remaining problem is that the current code is not
very well commented; there are no typical function headers (e.g. for
automated docs generation like Java offers, which would be doxygen here
iirc). But it's not only the code itself - it's also the grown
algorithms, here a condition, over there a function call. And e.g.
sending an event or processing results still requires waiting for the
function to return an answer. In other words, we should use the
possibilities of today's multicore systems to our advantage - see the
multithreading attempt by Steven D. Morrey collected on the dev
tracker: https://dev.icinga.org/issues/309

This won't be done in 1 month, 2 months or half a year. It really needs
to be discussed, designed, re-thought and finally created, modularized,
tested with test frameworks etc.
And it won't be done without the community's help, of course. That is
also why this is being posted here. Icinga should be open development
for everyone. And to lower the barrier to entry, the code needs to be
adapted in several ways (see above ;-))

> What about backward compatibility to nagios?
>    

This should be kept in any case. In some places it might be difficult,
but e.g. the data sources (IDO, Livestatus) can easily be hidden behind
the API's blackbox view. Based on that view, addons can ask for data or
even send commands, like NagVis does using those data sources. The same
goes for Icinga Web - we can't break the things we built during this
year ;) Furthermore, known behavior, historical and status data should
stay more or less the same - but not the performance, and not the
internal processing of data. The interfaces should be kept too: some
kind of old NEB API for compatibility, but a fully rewritten IDOUtils
module bound (buffered) to the core, which would also make modules
easier to use - the icinga.cfg:event_broker=.... line is horrible - ask
packagers.
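
To make the "blackbox view" a bit more concrete, here is a minimal
sketch in Python. Everything in it is an assumption for illustration
only: the names DataSource, InMemorySource and StatusApi, the method
signatures, and the command string are made up and are not an existing
Icinga API. It only shows the idea that addons talk to one API while
the backend behind it stays exchangeable.

```python
from abc import ABC, abstractmethod


class DataSource(ABC):
    """Hypothetical backend interface; IDO or Livestatus would each implement it."""

    @abstractmethod
    def query(self, object_type, filters):
        """Return a list of status rows (dicts) for the given object type."""

    @abstractmethod
    def send_command(self, command):
        """Hand an external command string over to the core."""


class InMemorySource(DataSource):
    """Stand-in backend for this sketch; a real one would talk to IDO or Livestatus."""

    def __init__(self, rows):
        self._rows = rows          # e.g. {"service": [ {...}, {...} ]}
        self.commands = []         # commands received so far

    def query(self, object_type, filters):
        rows = self._rows.get(object_type, [])
        return [r for r in rows if all(r.get(k) == v for k, v in filters.items())]

    def send_command(self, command):
        self.commands.append(command)


class StatusApi:
    """What an addon would use; it never sees which backend is configured."""

    def __init__(self, source):
        self._source = source

    def services(self, **filters):
        return self._source.query("service", filters)

    def command(self, command):
        self._source.send_command(command)
```

An addon would then only call something like api.services(state=2),
and swapping the backend would not change that call.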

> Shouldn't switching from nagios to icinga be as easy as possible?
>    
It is and will be. And if some adaptions are needed, like putting your
config through a small wrapper script that transforms it into a new
one - the others and I will take care of that. I don't like difficult
migration paths, so I like hacking such scripts and writing nifty
little howtos for that (check the dev tracker for my activities). So
this won't be a real problem - switching from Nagios to Icinga will be
kept as easy as possible. Switching back is another story, which
concerns current Nagios development, but not Icinga ;-)

> Shouldn't one using nagios be able to (at least) reuse his old
> configuration files?
>    

We are not talking about extensively rewriting the configs. The config
format and syntax are good; they can be extended and enhanced by
several features. Recently I saw people complaining about tabs and
whitespace in their configs not being recognized correctly, or about
UTF-8 encoding not really working.
Those are small things that concern the config parsing, not really the
format itself. On top of that come slight enhancements, opt-ins like
the escalation conditions or the event profiler - in both the general
and the object configs.
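
As an illustration of the tabs-and-whitespace point, a minimal sketch
in Python of parsing "define <type> { ... }" blocks so that any mix of
tabs and spaces between key and value is accepted. The function name
parse_objects and the "_type" key are made up for this sketch, and it
assumes that ';' starts an inline comment and '#' a comment line. It is
not how the current core parses its files.

```python
import re

# Matches "define host{" as well as "define host {" or "define host".
DEFINE_RE = re.compile(r"^define\s+(\w+)\s*\{?\s*$")


def parse_objects(text):
    """Parse object definitions into a list of dicts.

    Keys and values may be separated by any run of tabs and/or spaces.
    """
    objects = []
    current = None
    for raw in text.splitlines():
        line = raw.split(";", 1)[0].strip()      # drop inline ';' comments
        if not line or line.startswith("#"):
            continue
        match = DEFINE_RE.match(line)
        if match:
            current = {"_type": match.group(1)}
        elif line == "}":
            if current is not None:
                objects.append(current)
            current = None
        elif current is not None and line != "{":
            parts = line.split(None, 1)          # split on any whitespace run
            current[parts[0]] = parts[1] if len(parts) > 1 else ""
    return objects
```

For the UTF-8 part, reading the file with an explicit encoding, e.g.
open(path, encoding="utf-8"), before handing the text to such a parser
would be one option.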


Anyone who is interested in being part of this (r)evolution is very 
welcome to do so.

Kind regards,
Michael


> Am 08.06.2010 22:15, schrieb Hiren Patel:
>    
>> posting this to devel for input and ideas, discussions etc.
>>
>>      
>>>> --------
>>>> before we start any dev, perhaps a small outline of coding style we will use
>>>>
>>>>          
>>> before we even code a line, we need to clarify the exact coding style like
>>>
>>> if(condition) {
>>>       test();
>>> }
>>>
>>>        
>> cool, I'll put together a small list in the week to come.
>>
>>      
>>> even more, the comments and (function) headers should be written in
>>> doxygen format.
>>>
>>>
>>>        
>>>> my thought is to have lists and each thread processing specific lists, locking where appropriate,
>>>> eg, a notification list, the notification thread will continuously watch for anything on this list,
>>>> if/when there are jobs on this list, the thread will lock, remove job, unlock, and process job.
>>>> job being a notification to send out, the list will point to an object with all the details.
>>>> as such each thread handles such a list, and others add onto it where need be.
>>>>
>>>>          
>>> Sounds good. We should talk in depth about what should be possible
>>> with those lists, and whether we can create that simply with C, or
>>> whether we should switch over to C++ in this regard. That's more or
>>> less the basic question before starting to code.
>>> I can see a small problem with C - we could begin looking at the old
>>> code and just copy-paste things. That's not the way it's meant to be.
>>>
>>>        
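
The list-per-thread idea quoted above (lock, remove job, unlock,
process job) can be sketched as follows. This is Python only for
brevity, not a proposal for the core language, and the names JobList
and notification_worker as well as the close() shutdown signal are
assumptions made up for this sketch.

```python
import threading
from collections import deque


class JobList:
    """A list of pending jobs guarded by one lock, as in the proposal."""

    def __init__(self):
        self._jobs = deque()
        self._cond = threading.Condition()   # a lock plus a wakeup signal
        self._closed = False

    def add(self, job):
        """Called by any other thread that wants a job processed."""
        with self._cond:
            self._jobs.append(job)
            self._cond.notify()

    def close(self):
        """Tell the worker to stop once the list is drained."""
        with self._cond:
            self._closed = True
            self._cond.notify_all()

    def take(self):
        """Lock, remove one job, unlock. Returns None once closed and empty."""
        with self._cond:
            while not self._jobs and not self._closed:
                self._cond.wait()
            return self._jobs.popleft() if self._jobs else None


def notification_worker(jobs, send):
    """Process each job outside the lock, so producers never wait on sending."""
    while True:
        job = jobs.take()
        if job is None:
            break
        send(job)
```

A thread started with threading.Thread(target=notification_worker,
args=(jobs, send)) then sleeps until another thread calls jobs.add().
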
>> I haven't really done c++, but I'm sure I could learn it in a few weeks.
>> is there any real advantage to switch to c++ beside resisting copy/paste from existing core?
>>
>>      
>>>> was thinking, these lists will have global locks that threads would lock and unlock, and the main data structure storage
>>>> could possibly have locks per object if feasible, so one thread wanting to update a service struct with new perf data
>>>> for eg, and another wanting to read from a different service struct to populate macros, could do so at the same time.
>>>>
>>>>          
>>> Yep of course. As a matter of fact you would run into deadlocks if only
>>> having a global lock.
>>>
>>>        
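
A minimal sketch of the per-object locking quoted above, again in
Python for brevity. The Service class and the two helper functions are
hypothetical; only the macro names SERVICEDESC and SERVICEPERFDATA are
taken from the existing macro set.

```python
import threading


class Service:
    """Hypothetical service record that carries its own lock."""

    def __init__(self, name):
        self.name = name
        self.perfdata = ""
        self.lock = threading.Lock()


def update_perfdata(service, perfdata):
    """Writer: blocks only other users of this one service."""
    with service.lock:
        service.perfdata = perfdata


def read_macros(service):
    """Reader: takes a consistent snapshot of one service for macro expansion."""
    with service.lock:
        return {"SERVICEDESC": service.name, "SERVICEPERFDATA": service.perfdata}
```

While one thread holds the lock of service A, another can still read
service B. If a thread ever needs two objects at once, taking their
locks in a fixed order (e.g. sorted by name) avoids deadlocks.
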
>>>> we do away with as much global variables as possible
>>>>
>>>> make macro functions reentrant, so that thread jobs can request macros etc without conflict
>>>>
>>>> separate threads to:
>>>> =======
>>>> run host/service checks
>>>>    with ability to active check on slave nodes like dnx does
>>>>    possibly have a built in ping module to thread on instead of fork (since ping is common)
>>>> reschedule checks
>>>> external command checks
>>>> check reaper
>>>> retention save
>>>> notifications
>>>> event handlers
>>>> performance data processing
>>>> module handling
>>>> general tasks:
>>>>    schedule downtime
>>>>    freshness check
>>>>    comment handling
>>>>    flapping calc
>>>>    stats handling including profiling
>>>> general low pri tasks:
>>>>    log rotation
>>>>
>>>>          
>>> status api (like livestatus currently is)
>>> event broker
>>>
>>>        
>> yep the module handling listed was meant to refer to event brokering.
>>
>>      
>>>> do away with status.dat file:
>>>> ========
>>>> fifo/socket to listen on to dump live object data to
>>>> anything that queries it
>>>> dump all or parts of objects depending on request etc
>>>>
>>>>          
>>> ok, sounds even better. So in fact the current livestatus approach,
>>> built directly into a core api. This should work in both directions, so that
>>>
>>> * question for livestatus data, answer
>>> * dumping all data like idomod does, based on different settings
>>> (config, live, historical)
>>> * adding commands not via pipe but on this api
>>> * secure that in every possible way in regard of performance
>>> * adapt api output in an easy format for icinga api, e.g. add sth like a
>>> JSON writer, or couchDB (NoSQL DB) and livestatus
>>>
>>>        
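
The "dump all or parts of objects depending on request" idea with a
JSON writer could look roughly like this in Python. The request format
(a JSON object with optional "type" and "columns" keys) and the
function name answer_request are invented for this sketch; this is not
the Livestatus query language, and the socket handling is left out.

```python
import json


def answer_request(objects, request_line):
    """Answer one JSON request, e.g. {"type": "service", "columns": ["host_name"]}.

    Without "type" all objects are returned; without "columns" every
    attribute of each matching object is dumped.
    """
    request = json.loads(request_line)
    wanted_type = request.get("type")
    columns = request.get("columns")          # None means "dump everything"
    rows = []
    for obj in objects:
        if wanted_type is not None and obj.get("type") != wanted_type:
            continue
        if columns is None:
            rows.append(dict(obj))
        else:
            rows.append({c: obj.get(c) for c in columns})
    return json.dumps(rows, sort_keys=True)
```

A listener on a socket would read one request line, call such a
function on the live object data, and write the result back.
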
>> sounds good to me.
>>
>>      
>>>> do away with xdata:
>>>> =========
>>>>
>>>>          
>>> xdata is horrible design. this needs to be reworked fully into the core.
>>>
>>>        
>>>> one base/config/ dir with all the config handling routines
>>>> read conf files, resolve inheritance etc, and add straight
>>>> to data structures (if feasible)
>>>>
>>>>          
>>> also handling modules, extra plugin configs (!), and so on.
>>>
>>>        
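
For the "resolve inheritance" step, a minimal Python sketch of how
template resolution via the "use" directive could work on already
parsed objects. The function name resolve and the dict layout are
assumptions for this sketch; additive ("+") inheritance and other
special cases are deliberately left out.

```python
# Directives that describe the template itself and are not passed on.
NOT_INHERITED = {"name", "use", "register"}


def resolve(obj, templates, _seen=()):
    """Return a copy of obj with values from its 'use' templates filled in.

    Local values win over template values; with several templates,
    earlier ones win over later ones.
    """
    merged = {}
    parents = [p.strip() for p in obj.get("use", "").split(",") if p.strip()]
    for parent in reversed(parents):          # later first, so earlier overrides
        if parent in _seen:
            raise ValueError("circular template chain at " + parent)
        inherited = resolve(templates[parent], templates, _seen + (parent,))
        merged.update({k: v for k, v in inherited.items() if k not in NOT_INHERITED})
    merged.update(obj)                        # the object's own values win
    return merged
```

The resolved dict could then be added straight to the data structures,
as suggested above.
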
>>>> one base/retention/ dir with all the retention routines
>>>> one base/perfdata/ dir with all perf data routines
>>>>
>>>>          
>>> Ok then modules/ dir with all modules routines (aka neb init, callback etc)
>>>
>>> e.g.
>>>
>>> modules/broker/ dir with event broker stuff
>>> modules/api/ dir with everything regarding a status and command api
>>>
>>> Hmmm and even more, the overall structure needs to be fully written down
>>> before starting to code.
>>>
>>> But in fact, we should sum our mails up into a single one and drop that
>>> onto the devel mailinglist.
>>>
>>>        
>> doing so in this mail.
>> any input and ideas welcome.
>> I'll start documenting a design as you suggest above, and we can take it from there.
>>
