author | Andrew Geissler <geissonator@yahoo.com> | 2018-09-17 18:36:08 +0300 |
---|---|---|
committer | Brad Bishop <bradleyb@fuzziesquirrel.com> | 2018-09-24 14:43:49 +0300 |
commit | 0ee690fcb712718ca7ad179ec1a29a9803c80ed6 (patch) | |
tree | caad817235ecb34529d49913fa0c599b42baaafb /meta-phosphor | |
parent | edb619229fb988cd72a919e4068d820c632a7f5e (diff) | |
download | openbmc-0ee690fcb712718ca7ad179ec1a29a9803c80ed6.tar.xz |
Increase StartLimitIntervalSec to 240s
The DefaultTimeoutStartSec is 90s. If a service is hitting
this timeout repeatedly then the StartLimitIntervalSec needs
to be set in a way to handle this worst-case scenario so
that the service which is timing out does not continuously
get restarted.
This means it needs to be set to:
StartLimitBurst*DefaultTimeoutStartSec +
StartLimitBurst*<worst case processing time> (30s)
which currently would be 2x90 + 2x30
Ref: systemd-system.conf
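
The arithmetic above can be checked with a short sketch. The function and parameter names are illustrative assumptions and not part of the commit; only the values (burst of 2, 90s start timeout, 30s worst-case processing time) come from the message.

```python
def start_limit_interval(burst: int, timeout_start_sec: int,
                         worst_case_processing_sec: int) -> int:
    """Interval long enough to cover `burst` timed-out starts plus processing.

    Hypothetical helper: it only mirrors the formula in the commit message.
    """
    return burst * timeout_start_sec + burst * worst_case_processing_sec


# Values taken from the commit message: 2x90 + 2x30.
interval = start_limit_interval(burst=2, timeout_start_sec=90,
                                worst_case_processing_sec=30)
print(f"DefaultStartLimitIntervalSec={interval}s")  # prints DefaultStartLimitIntervalSec=240s
```

If StartLimitBurst or the start timeout changes later, the interval would need to be recomputed the same way.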
Tested: Verified that a service which hits the 90s timeout is
no longer restarted after 2 attempts.
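
To illustrate why the old 30s interval never stopped a service that keeps timing out, here is a rough simulation. It assumes a simplified fixed-window rate limiter (the window opens at the first counted start and resets once the interval has elapsed); this is a model for intuition, not systemd's actual implementation. The function name and the start times (one attempt every 91s, i.e. a 90s timeout plus DefaultRestartSec=1s, with processing time ignored) are assumptions.

```python
def allowed_starts(start_times, interval, burst):
    """Count starts permitted before the limiter refuses one.

    Simplified model (assumption): a window begins at the first counted
    start; once more than `interval` seconds have passed, the window and
    its counter reset. At most `burst` starts are allowed per window.
    """
    window_begin = None
    count = 0
    allowed = 0
    for t in start_times:
        if window_begin is None or t - window_begin > interval:
            window_begin, count = t, 0  # open a fresh window
        if count >= burst:
            break  # limit hit: the service stays failed
        count += 1
        allowed += 1
    return allowed


# Ten start attempts, each hitting the 90s timeout and restarting 1s later.
attempts = [91 * i for i in range(10)]

print(allowed_starts(attempts, interval=30, burst=2))   # 10: never rate limited
print(allowed_starts(attempts, interval=240, burst=2))  # 2: third start refused
```

Under this model the 30s window always expires before the next attempt, so the counter never reaches the burst limit, while the 240s window catches the third attempt.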
Resolves openbmc/openbmc#3379
(From meta-phosphor rev: ee52526c80eaca65a581c01bcf703861ec1a80b6)
Change-Id: I8ff4febeb46a746dd3e5e625c5bdc3735963799b
Signed-off-by: Andrew Geissler <geissonator@yahoo.com>
Signed-off-by: Brad Bishop <bradleyb@fuzziesquirrel.com>
Diffstat (limited to 'meta-phosphor')
-rw-r--r-- | meta-phosphor/recipes-phosphor/systemd-policy/phosphor-systemd-policy/service-restart-policy.conf | 14 |
1 files changed, 9 insertions, 5 deletions
diff --git a/meta-phosphor/recipes-phosphor/systemd-policy/phosphor-systemd-policy/service-restart-policy.conf b/meta-phosphor/recipes-phosphor/systemd-policy/phosphor-systemd-policy/service-restart-policy.conf
index 54516c2d4..17c9e6bea 100644
--- a/meta-phosphor/recipes-phosphor/systemd-policy/phosphor-systemd-policy/service-restart-policy.conf
+++ b/meta-phosphor/recipes-phosphor/systemd-policy/phosphor-systemd-policy/service-restart-policy.conf
@@ -13,19 +13,23 @@
 # restarting once does the job or restarting all 5 times does not help
 # and we just end up hitting the 5 limit anyway.
 #
-# - Change the StartLimitIntervalSec to 30s
+# - Change the StartLimitIntervalSec to 240s
 # The BMC CPU performance is already challenged. When a service is
 # failing and a core dump is being generated and collected into a dump,
 # it's even more challenged. Recent failures have shown situations where
 # the service does not fail again until 15-20 seconds after the initial
 # failure which means the default of 10s for this results in the service
-# being restarted indefinitely. Change this to 30s to only allow a service
-# to be restarted StartLimitBurst times within a 30s interval before
-# being put in a permanent fail state.
+# being restarted indefinitely.
+# Another issue that has cropped up recently is that the DefaultTimeoutStartSec
+# is 90s. If a service is hitting this timeout repeatedly then there
+# is a similar issue as noted above. Because of this, the StartLimitIntervalSec
+# needs to be StartLimitBurst*DefaultTimeoutStartSec +
+# StartLimitBurst* worst case processing time (30s)
+# which currently would be 2x90 + 2x30
 #
 # See systemd-system.conf(5) for details on the conf files
 
 [Manager]
 DefaultRestartSec=1s
 DefaultStartLimitBurst=2
-DefaultStartLimitIntervalSec=30s
+DefaultStartLimitIntervalSec=240s