Fault injection capabilities infrastructure
===========================================

See also drivers/md/faulty.c and "every_nth" module option for scsi_debug.


Available fault injection capabilities
--------------------------------------

o failslab

  injects slab allocation failures. (kmalloc(), kmem_cache_alloc(), ...)

o fail_page_alloc

  injects page allocation failures. (alloc_pages(), __get_free_pages(), ...)

o fail_futex

  injects futex deadlock and uaddr fault errors.

o fail_make_request

  injects disk IO errors on devices permitted by setting
  /sys/block/<device>/make-it-fail or
  /sys/block/<device>/<partition>/make-it-fail. (generic_make_request())

o fail_mmc_request

  injects MMC data errors on devices permitted by setting
  debugfs entries under /sys/kernel/debug/mmc0/fail_mmc_request

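Which of these capabilities exist on a running system depends on how the
kernel was built. The following is a minimal sketch for checking, assuming
that debugfs is mounted at /sys/kernel/debug and that each capability shows
up as a directory of the same name. The names fi_list and FAULT_DEBUGFS are
hypothetical and exist only for this illustration. fail_mmc_request is left
out because it lives under a per-host directory instead.

```shell
#!/bin/bash
# Hypothetical helper, not part of the kernel tree.
# FAULT_DEBUGFS can be overridden if debugfs is mounted elsewhere.
FAULT_DEBUGFS=${FAULT_DEBUGFS:-/sys/kernel/debug}

# Print one line per capability, saying whether its debugfs directory exists.
fi_list()
{
	local name
	for name in failslab fail_page_alloc fail_futex fail_make_request
	do
		if [ -d "$FAULT_DEBUGFS/$name" ]
		then
			echo "$name: available"
		else
			echo "$name: not available"
		fi
	done
}
```

A missing directory may also simply mean that debugfs is not mounted, so
treat "not available" as a hint rather than a definite answer.
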
Configure fault-injection capabilities behavior
-----------------------------------------------

o debugfs entries

fault-inject-debugfs kernel module provides some debugfs entries for runtime
configuration of fault-injection capabilities.

- /sys/kernel/debug/fail*/probability:

	likelihood of failure injection, in percent.
	Format: <percent>

	Note that one-failure-per-hundred is a very high error rate
	for some testcases.  Consider setting probability=100 and configuring
	/sys/kernel/debug/fail*/interval for such testcases.

- /sys/kernel/debug/fail*/interval:

	specifies the interval between failures, for calls to
	should_fail() that pass all the other tests.

	Note that if you enable this, by setting interval>1, you will
	probably want to set probability=100.

- /sys/kernel/debug/fail*/times:

	specifies how many times failures may happen at most.
	A value of -1 means "no limit".

- /sys/kernel/debug/fail*/space:

	specifies an initial resource "budget", decremented by "size"
	on each call to should_fail(,size).  Failure injection is
	suppressed until "space" reaches zero.

- /sys/kernel/debug/fail*/verbose:

	Format: { 0 | 1 | 2 }
	specifies the verbosity of the messages when failure is
	injected.  '0' means no messages; '1' will print only a single
	log line per failure; '2' will print a call trace too -- useful
	to debug the problems revealed by fault injection.

- /sys/kernel/debug/fail*/task-filter:

	Format: { 'Y' | 'N' }
	A value of 'N' disables filtering by process (default).
	A value of 'Y' limits failures to only processes indicated by
	/proc/<pid>/make-it-fail==1.

- /sys/kernel/debug/fail*/require-start:
- /sys/kernel/debug/fail*/require-end:
- /sys/kernel/debug/fail*/reject-start:
- /sys/kernel/debug/fail*/reject-end:

	specifies the range of virtual addresses tested during
	stacktrace walking.  Failure is injected only if some caller
	in the walked stacktrace lies within the required range, and
	none lies within the rejected range.
	Default required range is [0,ULONG_MAX) (whole of virtual address space).
	Default rejected range is [0,0).

- /sys/kernel/debug/fail*/stacktrace-depth:

	specifies the maximum stacktrace depth walked during search
	for a caller within [require-start,require-end) OR
	[reject-start,reject-end).

- /sys/kernel/debug/fail_page_alloc/ignore-gfp-highmem:

	Format: { 'Y' | 'N' }
	default is 'N', setting it to 'Y' won't inject failures into
	highmem/user allocations.

- /sys/kernel/debug/failslab/ignore-gfp-wait:
- /sys/kernel/debug/fail_page_alloc/ignore-gfp-wait:

	Format: { 'Y' | 'N' }
	default is 'N', setting it to 'Y' will inject failures
	only into non-sleep allocations (GFP_ATOMIC allocations).

- /sys/kernel/debug/fail_page_alloc/min-order:

	specifies the minimum page allocation order for which failures
	are injected.

- /sys/kernel/debug/fail_futex/ignore-private:

	Format: { 'Y' | 'N' }
	default is 'N', setting it to 'Y' will disable failure injections
	when dealing with private (address space) futexes.

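The entries above are ordinary files, so a capability can be configured with
a few writes. The following is a minimal sketch, not part of the kernel tree:
the helper name fi_set and the FAULT_DEBUGFS variable are hypothetical, and
the root directory is a variable only so that it can be pointed elsewhere.

```shell
#!/bin/bash
# Hypothetical helper, not part of the kernel tree.
# FAULT_DEBUGFS defaults to the usual debugfs mount point.
FAULT_DEBUGFS=${FAULT_DEBUGFS:-/sys/kernel/debug}

# fi_set <capability> <entry>=<value> ...
# Writes each value to $FAULT_DEBUGFS/<capability>/<entry> and stops at
# the first write that fails.
fi_set()
{
	local dir="$FAULT_DEBUGFS/$1"
	local kv
	shift
	for kv in "$@"
	do
		# ${kv%%=*} is the entry name, ${kv#*=} is the value.
		printf '%s\n' "${kv#*=}" > "$dir/${kv%%=*}" || return 1
	done
}
```

For example, "fi_set failslab probability=100 interval=1000 times=-1" follows
the advice given for probability and interval: with probability=100 the rate
is controlled by interval alone, which allows rates far below one failure
per hundred calls.
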
o Boot option

In order to inject faults while debugfs is not available (early boot time),
use the boot option:

	failslab=
	fail_page_alloc=
	fail_make_request=
	fail_futex=
	mmc_core.fail_request=<interval>,<probability>,<space>,<times>

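The value format shown on the last line is easy to get wrong because the
field order differs from the order of the debugfs entries. As a small
illustration, assuming every option in the list takes the same four
comma-separated fields, the hypothetical helper fi_bootarg below (not an
existing tool) assembles such a string from named positions.

```shell
#!/bin/bash
# Hypothetical helper, not part of the kernel tree.
# fi_bootarg <option> <interval> <probability> <space> <times>
# Prints a boot parameter in the form <option>=<interval>,<probability>,...
fi_bootarg()
{
	printf '%s=%s,%s,%s,%s\n' "$1" "$2" "$3" "$4" "$5"
}

# interval=100, probability=10, space=0, times=-1 (no limit)
fi_bootarg failslab 100 10 0 -1
```

The printed string, failslab=100,10,0,-1, would then be appended to the
kernel command line through the boot loader configuration.
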
o proc entries

- /proc/self/task/<current-tid>/fail-nth:

	Writing an integer N to this file makes the N-th call in the task fail.
	Reading from this file returns a single char, 'Y' or 'N', that says
	whether the fault set up by a previous write to this file was injected,
	and disables the fault if it has not been injected yet.
	Note that this file enables all types of faults (slab, futex, etc).
	This setting takes precedence over all other generic debugfs settings
	like probability, interval, times, etc. But per-capability settings
	(e.g. fail_futex/ignore-private) take precedence over it.

	This feature is intended for systematic testing of faults in a single
	system call. See an example below.

How to add new fault injection capability
-----------------------------------------

o #include <linux/fault-inject.h>

o define the fault attributes

  DECLARE_FAULT_ATTR(name);

  Please see the definition of struct fault_attr in fault-inject.h
  for details.

o provide a way to configure fault attributes

- boot option

  If you need to enable the fault injection capability from boot time, you can
  provide a boot option to configure it. There is a helper function for it:

	setup_fault_attr(attr, str);

- debugfs entries

  failslab, fail_page_alloc, and fail_make_request use this way.
  Helper function:

	fault_create_debugfs_attr(name, parent, attr);

- module parameters

  If the scope of the fault injection capability is limited to a
  single kernel module, it is better to provide module parameters to
  configure the fault attributes.

o add a hook to insert failures

  Upon should_fail() returning true, client code should inject a failure.

	should_fail(attr, size);

Application Examples
--------------------

o Inject slab allocation failures into module init/exit code

#!/bin/bash

FAILTYPE=failslab
echo Y > /sys/kernel/debug/$FAILTYPE/task-filter
echo 10 > /sys/kernel/debug/$FAILTYPE/probability
echo 100 > /sys/kernel/debug/$FAILTYPE/interval
echo -1 > /sys/kernel/debug/$FAILTYPE/times
echo 0 > /sys/kernel/debug/$FAILTYPE/space
echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait

faulty_system()
{
	bash -c "echo 1 > /proc/self/make-it-fail && exec $*"
}

if [ $# -eq 0 ]
then
	echo "Usage: $0 modulename [ modulename ... ]"
	exit 1
fi

for m in "$@"
do
	echo "inserting $m..."
	faulty_system modprobe "$m"

	echo "removing $m..."
	faulty_system modprobe -r "$m"
done

------------------------------------------------------------------------------

o Inject page allocation failures only for a specific module

#!/bin/bash

FAILTYPE=fail_page_alloc
module=$1

if [ -z "$module" ]
then
	echo "Usage: $0 <modulename>"
	exit 1
fi

modprobe $module

if [ ! -d /sys/module/$module/sections ]
then
	echo Module $module is not loaded
	exit 1
fi

cat /sys/module/$module/sections/.text > /sys/kernel/debug/$FAILTYPE/require-start
cat /sys/module/$module/sections/.data > /sys/kernel/debug/$FAILTYPE/require-end

echo N > /sys/kernel/debug/$FAILTYPE/task-filter
echo 10 > /sys/kernel/debug/$FAILTYPE/probability
echo 100 > /sys/kernel/debug/$FAILTYPE/interval
echo -1 > /sys/kernel/debug/$FAILTYPE/times
echo 0 > /sys/kernel/debug/$FAILTYPE/space
echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-highmem
echo 10 > /sys/kernel/debug/$FAILTYPE/stacktrace-depth

trap "echo 0 > /sys/kernel/debug/$FAILTYPE/probability" SIGINT SIGTERM EXIT

echo "Injecting errors into the module $module... (interrupt to stop)"
sleep 1000000

Tool to run command with failslab or fail_page_alloc
----------------------------------------------------
In order to make it easier to accomplish the tasks mentioned above, we can use
tools/testing/fault-injection/failcmd.sh.  Please run the command
"./tools/testing/fault-injection/failcmd.sh --help" for more information and
see the following examples.

Examples:

Run the command "make -C tools/testing/selftests/ run_tests" while injecting
slab allocation failures.

	# ./tools/testing/fault-injection/failcmd.sh \
		-- make -C tools/testing/selftests/ run_tests

Same as above, except allow at most 100 failures instead of the default of
at most one.

	# ./tools/testing/fault-injection/failcmd.sh --times=100 \
		-- make -C tools/testing/selftests/ run_tests

Same as above, except inject page allocation failures instead of slab
allocation failures.

	# env FAILCMD_TYPE=fail_page_alloc \
		./tools/testing/fault-injection/failcmd.sh --times=100 \
		-- make -C tools/testing/selftests/ run_tests

Systematic faults using fail-nth
--------------------------------

The following code systematically faults the 1-st, 2-nd and so on
capabilities in the socketpair() system call.

#include <sys/types.h>
#include <sys/stat.h>
#include <sys/socket.h>
#include <sys/syscall.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>

int main(void)
{
	int i, err, res, fail_nth, fds[2];
	char buf[128];

	/* Let failslab hit sleeping allocations too, not only atomic ones. */
	system("echo N > /sys/kernel/debug/failslab/ignore-gfp-wait");
	sprintf(buf, "/proc/self/task/%ld/fail-nth", syscall(SYS_gettid));
	fail_nth = open(buf, O_RDWR);
	if (fail_nth < 0) {
		perror("open fail-nth");
		return 1;
	}
	for (i = 1;; i++) {
		/* Arm a fault for the i-th call in this task. */
		sprintf(buf, "%d", i);
		write(fail_nth, buf, strlen(buf));
		res = socketpair(AF_LOCAL, SOCK_STREAM, 0, fds);
		err = errno;
		/* 'Y' if the fault was injected, 'N' otherwise. */
		read(fail_nth, buf, 1);
		if (res == 0) {
			close(fds[0]);
			close(fds[1]);
		}
		printf("%d-th fault %c: res=%d/%d\n", i, buf[0], res, err);
		if (buf[0] != 'Y')
			break;
	}
	return 0;
}

An example output (errno 12 is ENOMEM and 23 is ENFILE; the errno value on
the last line is stale, since that call succeeded):

1-th fault Y: res=-1/23
2-th fault Y: res=-1/23
3-th fault Y: res=-1/12
4-th fault Y: res=-1/12
5-th fault Y: res=-1/23
6-th fault Y: res=-1/23
7-th fault Y: res=-1/23
8-th fault Y: res=-1/12
9-th fault Y: res=-1/12
10-th fault Y: res=-1/12
11-th fault Y: res=-1/12
12-th fault Y: res=-1/12
13-th fault Y: res=-1/12
14-th fault Y: res=-1/12
15-th fault Y: res=-1/12
16-th fault N: res=0/12