author Andrew Geissler <geissonator@yahoo.com> 2018-09-17 18:36:08 +0300
committer Brad Bishop <bradleyb@fuzziesquirrel.com> 2018-09-24 14:43:49 +0300
commit 0ee690fcb712718ca7ad179ec1a29a9803c80ed6
tree caad817235ecb34529d49913fa0c599b42baaafb /meta-phosphor/recipes-phosphor/systemd-policy
parent edb619229fb988cd72a919e4068d820c632a7f5e
Increase StartLimitIntervalSec to 240s
The DefaultTimeoutStartSec is 90s. If a service hits this timeout repeatedly, StartLimitIntervalSec must be long enough to cover this worst-case scenario so that the service which is timing out does not get restarted continuously. This means it needs to be set to:

  StartLimitBurst*DefaultTimeoutStartSec + StartLimitBurst*<worst case processing time> (30s)

which currently would be 2x90 + 2x30 = 240s.

Ref: systemd-system.conf

Tested: Verified that a service hitting the 90s timeout is no longer restarted after 2 attempts.

Resolves openbmc/openbmc#3379

(From meta-phosphor rev: ee52526c80eaca65a581c01bcf703861ec1a80b6)

Change-Id: I8ff4febeb46a746dd3e5e625c5bdc3735963799b
Signed-off-by: Andrew Geissler <geissonator@yahoo.com>
Signed-off-by: Brad Bishop <bradleyb@fuzziesquirrel.com>
Diffstat (limited to 'meta-phosphor/recipes-phosphor/systemd-policy')
-rw-r--r-- meta-phosphor/recipes-phosphor/systemd-policy/phosphor-systemd-policy/service-restart-policy.conf | 14
1 files changed, 9 insertions, 5 deletions
diff --git a/meta-phosphor/recipes-phosphor/systemd-policy/phosphor-systemd-policy/service-restart-policy.conf b/meta-phosphor/recipes-phosphor/systemd-policy/phosphor-systemd-policy/service-restart-policy.conf
index 54516c2d47..17c9e6beae 100644
--- a/meta-phosphor/recipes-phosphor/systemd-policy/phosphor-systemd-policy/service-restart-policy.conf
+++ b/meta-phosphor/recipes-phosphor/systemd-policy/phosphor-systemd-policy/service-restart-policy.conf
@@ -13,19 +13,23 @@
# restarting once does the job or restarting all 5 times does not help
# and we just end up hitting the 5 limit anyway.
#
-# - Change the StartLimitIntervalSec to 30s
+# - Change the StartLimitIntervalSec to 240s
# The BMC CPU performance is already challenged. When a service is
# failing and a core dump is being generated and collected into a dump,
# it's even more challenged. Recent failures have shown situations where
# the service does not fail again until 15-20 seconds after the initial
# failure which means the default of 10s for this results in the service
-# being restarted indefinitely. Change this to 30s to only allow a service
-# to be restarted StartLimitBurst times within a 30s interval before
-# being put in a permanent fail state.
+# being restarted indefinitely.
+# Another issue that has cropped up recently is that the DefaultTimeoutStartSec
+# is 90s. If a service is hitting this timeout repeatedly then there
+# is a similar issue as noted above. Because of this, the StartLimitIntervalSec
+# needs to be StartLimitBurst*DefaultTimeoutStartSec +
+# StartLimitBurst* worst case processing time (30s)
+# which currently would be 2x90 + 2x30
#
# See systemd-system.conf(5) for details on the conf files
[Manager]
DefaultRestartSec=1s
DefaultStartLimitBurst=2
-DefaultStartLimitIntervalSec=30s
+DefaultStartLimitIntervalSec=240s
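
The sizing rule described in the commit message can be checked with a few lines of Python. This is only a sketch of the arithmetic, not part of the change itself: the function name `required_start_limit_interval` and its parameter names are hypothetical, and the 30s worst-case processing time is the figure quoted in the commit message.

```python
def required_start_limit_interval(burst: int,
                                  timeout_start_sec: int,
                                  worst_case_processing_sec: int) -> int:
    """Return the interval (in seconds) given by the commit's sizing rule.

    Hypothetical helper: it only evaluates
    StartLimitBurst*DefaultTimeoutStartSec + StartLimitBurst*<worst case processing time>.
    """
    return burst * timeout_start_sec + burst * worst_case_processing_sec


# Values taken from the commit: DefaultStartLimitBurst=2,
# DefaultTimeoutStartSec=90s, worst-case processing time 30s.
interval = required_start_limit_interval(burst=2,
                                         timeout_start_sec=90,
                                         worst_case_processing_sec=30)

# The previous setting (30s) is far below the value the rule yields.
previous_interval = 30

print(f"DefaultStartLimitIntervalSec={interval}s")  # prints DefaultStartLimitIntervalSec=240s
print(f"previous value too short: {previous_interval < interval}")  # prints previous value too short: True
```

If DefaultStartLimitBurst or the start timeout changes later, re-running the same calculation gives the interval this rule would call for.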