Commit f16a797
watchdog: fix watchdog may detect false positive of softlockup
commit 7123dbb upstream.
When updating `watchdog_thresh`, there is a race condition between writing
the new `watchdog_thresh` value and stopping the old watchdog timer. If
the old timer fires during this window, it may falsely detect a
softlockup because the old interval is used together with the new
`watchdog_thresh` value. The problem can be described as follows:
    # We assume the previous watchdog_thresh is 60, so the watchdog timer
    # fires every 24s.
echo 10 > /proc/sys/kernel/watchdog_thresh (User space)
|
+------>+ update watchdog_thresh (We are in kernel now)
        |
        |         # using old interval and new `watchdog_thresh`
        +------>+ watchdog hrtimer (irq context: detect softlockup)
                |
                |
        +-------+
        |
        |
        + softlockup_stop_all
To fix this problem, introduce a shadow variable for `watchdog_thresh`.
The update to the actual `watchdog_thresh` is delayed until after the old
timer is stopped, preventing false positives.
The following test case may help in understanding this problem.
---------------------------------------------
echo RT_RUNTIME_SHARE > /sys/kernel/debug/sched/features
echo -1 > /proc/sys/kernel/sched_rt_runtime_us
echo 0 > /sys/kernel/debug/sched/fair_server/cpu3/runtime
echo 60 > /proc/sys/kernel/watchdog_thresh
taskset -c 3 chrt -r 99 /bin/bash -c "while true;do true; done" &
echo 10 > /proc/sys/kernel/watchdog_thresh &
---------------------------------------------
The test case above first removes the throttling restrictions for
real-time tasks. It then sets watchdog_thresh to 60 and runs a
real-time task, a simple while(1) loop, on cpu3. Consequently, the final
command blocks because this real-time thread prevents kworker:3 from
being selected by the scheduler. This eventually triggers a softlockup
detection on cpu3 because watchdog_timer_fn operates with inconsistent
variables: the old interval and the updated watchdog_thresh at the same
time.
[nysal@linux.ibm.com: fix the SOFTLOCKUP_DETECTOR=n case]
Link: https://lkml.kernel.org/r/20250502111120.282690-1-nysal@linux.ibm.com
Link: https://lkml.kernel.org/r/20250421035021.3507649-1-luogengkun@huaweicloud.com
Signed-off-by: Luo Gengkun <luogengkun@huaweicloud.com>
Signed-off-by: Nysal Jan K.A. <nysal@linux.ibm.com>
Cc: Doug Anderson <dianders@chromium.org>
Cc: Joel Granados <joel.granados@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "Nysal Jan K.A." <nysal@linux.ibm.com>
Cc: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
1 parent 68c173e commit f16a797
1 file changed, 27 insertions(+), 14 deletions(-)