[icinga-checkins] icinga.org: icinga-core/mfriedrich/ido: core: fix lockup on DEL_DOWNTIME_BY_HOST_NAME #1572

git at icinga.org git at icinga.org
Wed Oct 3 01:14:52 CEST 2012


Module: icinga-core
Branch: mfriedrich/ido
Commit: 4298e14939ef60c19b24caffc56c0e3de203d224
URL:    https://git.icinga.org/?p=icinga-core.git;a=commit;h=4298e14939ef60c19b24caffc56c0e3de203d224

Author: Michael Friedrich <michael.friedrich at gmail.com>
Date:   Sun Sep 23 12:29:00 2012 +0200

core: fix lockup on DEL_DOWNTIME_BY_HOST_NAME #1572

basically, we added a lock to
delete_downtime_by_hostname_service_description_start_time_comment(),
which traverses the downtime list looking for downtimes to delete. what
we did not think of: it calls unschedule_downtime() for each matching
downtime. unschedule_downtime() does not take the mutex itself, but the
underlying delete_*_downtime functions, especially delete_downtime(),
try to acquire the mutex once again. since the mutex is a globally
shared resource, we locked ourselves up on the first downtime that was
due to be deleted.
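
a minimal sketch of that self-lockup, not the icinga code: demo_lock, demo_init(), demo_delete() and demo_traverse_locked() are hypothetical stand-ins. it assumes an error-checking mutex so the second lock attempt reports EDEADLK; with a plain mutex the same call would typically hang instead.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>

/* hypothetical stand-in for the global icinga_downtime_lock */
pthread_mutex_t demo_lock;

/* set up demo_lock as an error-checking mutex, so a relock by the
 * owning thread returns EDEADLK instead of blocking forever */
int demo_init(void) {
	pthread_mutexattr_t attr;
	int rc = pthread_mutexattr_init(&attr);

	if (rc != 0)
		return rc;
	rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (rc == 0)
		rc = pthread_mutex_init(&demo_lock, &attr);
	pthread_mutexattr_destroy(&attr);
	return rc;
}

/* stand-in for delete_downtime(): takes the lock on its own */
int demo_delete(void) {
	int rc = pthread_mutex_lock(&demo_lock);

	if (rc != 0)
		return rc;
	/* ... list manipulation would happen here ... */
	pthread_mutex_unlock(&demo_lock);
	return 0;
}

/* stand-in for the pre-fix traversal: calls demo_delete() with the lock held */
int demo_traverse_locked(void) {
	int rc;

	pthread_mutex_lock(&demo_lock);
	rc = demo_delete();
	pthread_mutex_unlock(&demo_lock);
	return rc;
}
```

after demo_init(), demo_traverse_locked() returns the error from the nested lock attempt, while calling demo_delete() without holding the lock succeeds.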

in order to stay sane while traversing the first and second list, we'll
release the mutex before invoking unschedule_downtime() and lock it
again after it returns. this way, we'll keep everything safe, but do not
lock up when deleting downtimes by a given hostname.
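
the unlock / call / relock pattern can be sketched on a toy list as follows. all demo_* names are hypothetical and the list handling is simplified. remembering the successor before the entry is freed is an assumption of this sketch; if another thread changes the list while the lock is released, that pointer can go stale, so this is only exercised single-threaded here.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdlib.h>

/* hypothetical, stripped-down downtime entry */
typedef struct demo_downtime {
	unsigned long id;
	struct demo_downtime *next;
} demo_downtime;

pthread_mutex_t demo_list_lock = PTHREAD_MUTEX_INITIALIZER;
demo_downtime *demo_list = NULL;

/* add an entry at the head of the list; takes the lock itself */
int demo_add(unsigned long id) {
	demo_downtime *d = malloc(sizeof(*d));

	if (d == NULL)
		return -1;
	d->id = id;
	pthread_mutex_lock(&demo_list_lock);
	d->next = demo_list;
	demo_list = d;
	pthread_mutex_unlock(&demo_list_lock);
	return 0;
}

/* stand-in for unschedule_downtime()/delete_downtime(): locks on its own */
int demo_remove(unsigned long id) {
	demo_downtime **pp;
	int removed = 0;

	pthread_mutex_lock(&demo_list_lock);
	for (pp = &demo_list; *pp != NULL; pp = &(*pp)->next) {
		if ((*pp)->id == id) {
			demo_downtime *dead = *pp;
			*pp = dead->next;
			free(dead);
			removed = 1;
			break;
		}
	}
	pthread_mutex_unlock(&demo_list_lock);
	return removed;
}

/* stand-in for the fixed traversal: removes all entries with id >= min_id */
int demo_remove_from(unsigned long min_id) {
	demo_downtime *cur, *next;
	int deleted = 0;

	pthread_mutex_lock(&demo_list_lock);
	for (cur = demo_list; cur != NULL; cur = next) {
		unsigned long id = cur->id;

		next = cur->next;	/* remember successor before cur is freed */
		if (id < min_id)
			continue;
		/* unlock here, because demo_remove() will lock by itself */
		pthread_mutex_unlock(&demo_list_lock);
		deleted += demo_remove(id);
		pthread_mutex_lock(&demo_list_lock);
	}
	pthread_mutex_unlock(&demo_list_lock);
	return deleted;
}

/* count the remaining entries under the lock */
int demo_count(void) {
	demo_downtime *cur;
	int n = 0;

	pthread_mutex_lock(&demo_list_lock);
	for (cur = demo_list; cur != NULL; cur = cur->next)
		n++;
	pthread_mutex_unlock(&demo_list_lock);
	return n;
}
```

the lock is never held across the call into demo_remove(), so the nested lock attempt cannot collide with the traversal's own lock.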

btw - DEL_DOWNTIME_BY_HOST_NAME can be extended further to

DEL_DOWNTIME_BY_HOST_NAME;hostname;svcdesc;starttime;comment

the hostname is mandatory; adding a service description, start time and
comment afterwards narrows the filter on which downtimes get deleted.
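
a small sketch of building such a command line. demo_format_del_downtime() is a hypothetical helper, not part of icinga. it assumes the usual "[timestamp] COMMAND;args" external command layout, a unix timestamp for the start time, and that omitted filters are passed as empty fields, with trailing empty fields dropped.

```c
#include <stdio.h>

/* format a DEL_DOWNTIME_BY_HOST_NAME command into buf.
 * svc and comment may be NULL, start <= 0 means "no start time filter".
 * returns the string length, or -1 on error. */
int demo_format_del_downtime(char *buf, size_t len, long now,
                             const char *host, const char *svc,
                             long start, const char *comment) {
	char start_buf[32] = "";
	int n;

	/* the hostname is the only mandatory field */
	if (buf == NULL || host == NULL || *host == '\0')
		return -1;
	if (start > 0)
		snprintf(start_buf, sizeof(start_buf), "%ld", start);

	n = snprintf(buf, len, "[%ld] DEL_DOWNTIME_BY_HOST_NAME;%s;%s;%s;%s",
	             now, host, svc ? svc : "", start_buf,
	             comment ? comment : "");
	if (n < 0 || (size_t)n >= len)
		return -1;

	/* drop trailing separators left over from omitted filters */
	while (n > 0 && buf[n - 1] == ';')
		buf[--n] = '\0';
	return n;
}
```

with only a hostname this yields the short form of the command; with a service description and start time the extra fields are appended in the order given above.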

this possibly solves lockups on DEL_DOWNTIME_BY_HOSTGROUP_NAME and
DEL_DOWNTIME_BY_STARTTIME as well.

refs #1572

---

 common/downtime.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/common/downtime.c b/common/downtime.c
index c9587ad..374a127 100644
--- a/common/downtime.c
+++ b/common/downtime.c
@@ -914,7 +914,14 @@ int delete_downtime_by_hostname_service_description_start_time_comment(char *hos
 				continue;
 		}
 
+#ifdef NSCORE
+		/* unlock here, because delete_*_downtime will try to lock itsself */
+		pthread_mutex_unlock(&icinga_downtime_lock);
+#endif
 		unschedule_downtime(temp_downtime->type, temp_downtime->downtime_id);
+#ifdef NSCORE
+		pthread_mutex_lock(&icinga_downtime_lock);
+#endif
 		deleted++;
 	}
 #ifdef NSCORE




