[icinga-users] Icinga2: implicit service/parent dependency

Christophe HAEN christophe.haen at cern.ch
Mon Aug 4 10:27:49 CEST 2014


Really? I was quite sure that the checks were not executed when the host
was down...
Okay, if that is the normal behavior, no need to worry :-) Can I ask,
though, when such a case would be useful? I can hardly imagine one.

Hum, that's a cool feature, thanks for the hint and the follow-up!

Cheers,
Chris


2014-07-30 16:13 GMT+02:00 Michael Friedrich <michael.friedrich at netways.de>:

> On 29.07.2014 16:00, Christophe HAEN wrote:
>
>> Hi,
>>
>> is there anything I could do/debug/look at/provide to make the case
>> easier to solve?
>> This bug is almost the last step before going fully into production
>> with Icinga2 :-(
>>
>
> I was mistaken (thx Gunnar & Tom) in thinking that Icinga 1.x prevents
> service checks when a host is down - the implicit dependency does not.
> It does suppress notifications, though.
>
> The behavior is therefore the same in Icinga 2: service checks are still
> executed while the host is considered down, and that is normal and
> required behavior.
>
> The documentation is still unclear about this, though, and we will fix
> that accordingly. The same goes for the general dependency that
> suppresses service notifications while a host is down.
>
> https://dev.icinga.org/issues/6725
> https://dev.icinga.org/issues/6813
>
> As for your question about preventing service checks when a host is
> down: add an apply rule that creates a dependency for every service,
> like this.
>
> apply Dependency "disable-host-service-checks" to Service {
>   // Skip checks of the service while this dependency fails (host down).
>   disable_checks = true
>   // Apply to every service.
>   assign where true
> }
>
> If parent_host_name is omitted, the dependency uses the host of the
> service the apply rule matched. That's a complex and as yet
> undocumented feature; I wasn't aware of it either.
>
>
> Kind regards,
> Michael
>
>
>
>> Cheers,
>> Chris
>>
>>
>> 2014-07-17 7:12 GMT+02:00 Christophe HAEN <christophe.haen at cern.ch
>> <mailto:christophe.haen at cern.ch>>:
>>
>>
>>     Good morning,
>>
>>     Some news. I did another test: I have my host X, on which I have
>>     many services (A, B, C, D) running. All the other services depend
>>     on one service of that machine (say B, C and D depend on A). Host X
>>     is down.
>>     The observed behavior is that service A is still checked (which
>>     it should not be because of the implicit dependency), but all the
>>     other checks are skipped because A is CRITICAL. In that case I can
>>     also observe the 'Skipping' line.
>>     It simply does not take the implicit dependency into account...
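>>
>>     For illustration only, the test setup could be expressed roughly
>>     as follows. This is a sketch, not the actual configuration: the
>>     rule name is hypothetical, and X and A stand in for the real host
>>     and service names.
>>
>>     ```
>>     apply Dependency "depends-on-A" to Service {
>>       // Parent is service A; parent_host_name is omitted, so the
>>       // parent is assumed to be looked up on the service's own host.
>>       parent_service_name = "A"
>>       // Skip checks of B, C and D while A is failing.
>>       disable_checks = true
>>       // Every service on host X except A itself.
>>       assign where host.name == "X" && service.name != "A"
>>     }
>>     ```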
>>
>>     Cheers,
>>     Chris
>>
>>
>>     2014-07-15 13:06 GMT+02:00 Christophe HAEN <christophe.haen at cern.ch
>>     <mailto:christophe.haen at cern.ch>>:
>>
>>
>>         hum...
>>         [root@icinga2 ~]# grep -i CheckerComponent /var/log/icinga2/icinga2.log | wc -l
>>         530
>>         [root@icinga2 ~]# grep -i Skipping /var/log/icinga2/icinga2.log | wc -l
>>         0
>>
>>         The CheckerComponent lines look like that:
>>         [2014-07-15 13:04:18 +0200] notice/CheckerComponent: Pending checkables: 0; Idle checkables: 27; Checks/s: 0.6
>>         [2014-07-15 13:04:23 +0200] notice/CheckerComponent: Pending checkables: 0; Idle checkables: 27; Checks/s: 0.6
>>         [2014-07-15 13:04:28 +0200] notice/CheckerComponent: Pending checkables: 0; Idle checkables: 27; Checks/s: 0.8
>>         [2014-07-15 13:04:33 +0200] notice/CheckerComponent: Pending checkables: 0; Idle checkables: 27; Checks/s: 1.6
>>         [2014-07-15 13:04:38 +0200] notice/CheckerComponent: Pending checkables: 0; Idle checkables: 27; Checks/s: 2.6
>>         [2014-07-15 13:04:43 +0200] notice/CheckerComponent: Pending checkables: 0; Idle checkables: 27; Checks/s: 2.2
>>         [2014-07-15 13:04:48 +0200] notice/CheckerComponent: Pending checkables: 0; Idle checkables: 27; Checks/s: 2.2
>>         [2014-07-15 13:04:53 +0200] notice/CheckerComponent: Pending checkables: 0; Idle checkables: 27; Checks/s: 2.2
>>         [2014-07-15 13:04:58 +0200] notice/CheckerComponent: Pending checkables: 0; Idle checkables: 27; Checks/s: 0.6
>>         [2014-07-15 13:05:03 +0200] notice/CheckerComponent: Pending checkables: 0; Idle checkables: 27; Checks/s: 0.6
>>
>>         I'll enable the debug mode. Any pattern that you recommend to
>>         look for?
>>
>>         Thanks for your help
>>         Chris
>>
>>
>>
>>
>>         2014-07-15 10:37 GMT+02:00 Michael Friedrich
>>         <michael.friedrich at netways.de
>>         <mailto:michael.friedrich at netways.de>>:
>>
>>
>>             On 15.07.2014 08:02, Christophe HAEN wrote:
>>
>>>             Hi,
>>>
>>>             yes, I understand that it is inside the code, but I would
>>>             assume that it is treated like any other dependency in the
>>>             logic, or is that not the case?
>>>
>>>             Ticket opened.
>>>
>>>             I don't have logs, no. However, I can give an example by
>>>             querying the DB:
>>>
>>
>>             Please set the log level to "notice" and look for lines
>>             containing "CheckerComponent" and "Skipping check for object
>>             'HOSTNAME!SERVICENAME': Dependency failed." (replace
>>             HOSTNAME/SERVICENAME with your values).
>>
>>             There is a reachability check upon check execution, and it
>>             should be triggered, preventing the additional service check.
>>             If these lines do not exist, something is really fishy and
>>             we need to go further with the 'debug' log level (and disable
>>             the ido-mysql feature and any other unwanted ones too).
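>>
>>             A minimal sketch of such a logger, assuming the stock
>>             "main-log" FileLogger object name and the default log path
>>             /var/log/icinga2/icinga2.log (adjust both to your setup):
>>
>>             ```
>>             object FileLogger "main-log" {
>>               // "notice" is more verbose than the default
>>               // "information" and less verbose than "debug".
>>               severity = "notice"
>>               path = "/var/log/icinga2/icinga2.log"
>>             }
>>             ```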
>>
>>             thanks,
>>             Michael
>>
>>
>>
>>>             MariaDB [icinga]> select status_update_time,
>>>             last_state_change, current_state, last_check, output from
>>>             icinga_hoststatus s , icinga_hosts h where
>>>             s.host_object_id = h.host_object_id and h.display_name =
>>>             'hlta0104';
>>>             +---------------------+---------------------+---------------+---------------------+----------------------------------------+
>>>             | status_update_time  | last_state_change   | current_state | last_check          | output                                 |
>>>             +---------------------+---------------------+---------------+---------------------+----------------------------------------+
>>>             | 2014-07-15 07:57:27 | 2014-07-09 09:30:55 |             1 | 2014-07-15 07:57:27 | CRITICAL - Host Unreachable (hlta0104) |
>>>             +---------------------+---------------------+---------------+---------------------+----------------------------------------+
>>>             1 row in set (0.00 sec)
>>>
>>>             MariaDB [icinga]> select s.display_name,
>>>             last_state_change, status_update_time, current_state,
>>>             last_check from icinga_servicestatus ss , icinga_services
>>>             s, icinga_hosts h where s.host_object_id =
>>>             h.host_object_id and s.service_object_id =
>>>             ss.service_object_id and h.display_name = 'hlta0104';
>>>             +--------------------+---------------------+---------------------+---------------+---------------------+
>>>             | display_name       | last_state_change   | status_update_time  | current_state | last_check          |
>>>             +--------------------+---------------------+---------------------+---------------+---------------------+
>>>             | cvmfs              | 2014-07-15 07:52:41 | 2014-07-15 07:57:38 |             3 | 2014-07-15 07:57:38 |
>>>             | disk_root          | 2014-07-15 07:52:45 | 2014-07-15 07:57:42 |             3 | 2014-07-15 07:57:42 |
>>>             | jumbo_vs_storerecv | 2014-07-15 07:56:34 | 2014-07-15 07:57:34 |             2 | 2014-07-15 07:57:34 |
>>>             | fmc_tmsrv          | 2014-07-15 07:57:25 | 2014-07-15 07:57:25 |             3 | 2014-07-15 07:57:25 |
>>>             | mem                | 2014-07-15 07:57:25 | 2014-07-15 07:57:25 |             2 | 2014-07-15 07:57:25 |
>>>             | jumbo_vs_storestrm | 2014-07-15 07:55:27 | 2014-07-15 07:57:27 |             2 | 2014-07-15 07:57:27 |
>>>             | mountpoint         | 2014-07-15 07:52:25 | 2014-07-15 07:57:28 |             3 | 2014-07-15 07:57:28 |
>>>             | uptime             | 2014-07-15 07:57:45 | 2014-07-15 07:57:45 |             2 | 2014-07-15 07:57:45 |
>>>             | mount              | 2014-07-15 07:52:11 | 2014-07-15 07:57:09 |             3 | 2014-07-15 07:57:09 |
>>>             | load               | 2014-07-15 07:57:45 | 2014-07-15 07:57:45 |             2 | 2014-07-15 07:57:45 |
>>>             | nrpe-users         | 2014-07-15 07:56:36 | 2014-07-15 07:57:36 |             3 | 2014-07-15 07:57:36 |
>>>             | ping               | 2014-07-09 09:31:15 | 2014-07-15 07:56:59 |             2 | 2014-07-15 07:56:59 |
>>>             | ssh                | 2014-07-11 18:08:37 | 2014-07-15 07:57:11 |             2 | 2014-07-15 07:57:11 |
>>>             | rsyslog            | 2014-07-15 07:56:44 | 2014-07-15 07:57:40 |             3 | 2014-07-15 07:57:40 |
>>>             | swap               | 2014-07-15 07:52:09 | 2014-07-15 07:57:07 |             3 | 2014-07-15 07:57:07 |
>>>             +--------------------+---------------------+---------------------+---------------+---------------------+
>>>             15 rows in set (0.00 sec)
>>>
>>>             What is interesting is the last_state_change of the
>>>             services. In fact, their state oscillates between
>>>             UNKNOWN  : connect to address x.x.x.x port y : No route to host
>>>             CRITICAL : CHECK_NRPE: Socket timeout after 10 seconds.
>>>
>>>             This is also a bit strange.
>>>
>>>             Cheers,
>>>             Chris
>>>
>>>
>>>
>>>             2014-07-14 19:42 GMT+02:00 Michael Friedrich
>>>             <michael.friedrich at netways.de
>>>             <mailto:michael.friedrich at netways.de>>:
>>>
>>>
>>>                 On 14.07.2014 16:55, Christophe HAEN wrote:
>>>
>>>>                 Hi,
>>>>
>>>>                 The documentation states that there is an implicit
>>>>                 dependency between the services and the host on which
>>>>                 they run. However, it does not give many details
>>>>                 regarding the configuration of this dependency.
>>>>
>>>
>>>                 There is no configuration for that; it is
>>>                 implemented inside the reachability logic.
>>>
>>>
>>>>                 Namely:
>>>>                 - the value of disable_checks
>>>>                 - the value of disable_notifications
>>>>                 - the value of states
>>>>                 - what happens to this implicit dependency when there
>>>>                 are extra dependencies on the services/hosts? Are the
>>>>                 dependencies combined by an OR? An AND? Is it just
>>>>                 overwritten?
>>>>
>>>
>>>                 If any one of these dependencies fails, the condition
>>>                 is met, so I'd say that's an OR. If the implicit
>>>                 dependency does not trigger, I'd say that's a bug.
>>>
>>>
>>>
>>>>                 Could we have extra details on that point please?
>>>>
>>>
>>>                 Open a documentation ticket please.
>>>
>>>
>>>
>>>>                 I am asking this because we observe funny behavior,
>>>>                 like checks being executed despite the host being
>>>>                 down, which is not what I would expect.
>>>>
>>>
>>>                 Any debug (logs) to get an idea of your described
>>>                 behavior?
>>>
>>>                 kind regards,
>>>                 Michael
>>>
>>>
>>>>                 Thanks and cheers,
>>>>                 Chris
>>>>
>>>>
>>>>                 _______________________________________________
>>>>                 icinga-users mailing list
>>>>                 icinga-users at lists.icinga.org
>>>>                 https://lists.icinga.org/mailman/listinfo/icinga-users
>>>>
>>>
>>>
>>>                 --
>>>                 Michael Friedrich, DI (FH)
>>>                 Application Developer
>>>
>>>                 NETWAYS GmbH | Deutschherrnstr. 15-19 | D-90429 Nuernberg
>>>                 Tel: +49 911 92885-0 | Fax: +49 911 92885-77
>>>                 GF: Julian Hein, Bernd Erk | AG Nuernberg HRB18461
>>>                 http://www.netways.de | Michael.Friedrich at netways.de
>>>                 <mailto:Michael.Friedrich at netways.de>
>>>
>>>
>>>                 ** Open Source Backup Conference 2014 - September -
>>>                 osbconf.org <http://osbconf.org> **
>>>
>>>                 ** Puppet Camp Duesseldorf 2014 - Oktober -
>>>                 netways.de/puppetcamp <http://netways.de/puppetcamp> **
>>>
>>>                 ** OSMC 2014 - November - netways.de/osmc
>>>                 <http://netways.de/osmc> **
>>>
>>>                 ** OpenNebula Conf 2014 - Dezember -
>>>                 opennebulaconf.com <http://opennebulaconf.com> **
>>>
>>>             --
>>>             Christophe HAEN
>>>             CERN PH-LBC 2/R022
>>>             Phone : +41 (0)2 27 67 31 25
>>>             Mobile : +41 (0)7 64 87 88 57
>>>
>>>
>>
>>
>



-- 
Christophe HAEN
CERN PH-LBC 2/R022
Phone : +41 (0)2 27 67 31 25
Mobile : +41 (0)7 64 87 88 57