io_uring/io-wq: close io-wq full-stop gap

There is an old problem with io-wq cancellation where requests that should be killed are in io-wq but are not discoverable, e.g. in the @next_hashed or @linked vars of io_worker_handle_work(). It adds some unreliability to individual request cancellation, but may also potentially get __io_uring_cancel() stuck. For instance:

1) An __io_uring_cancel() cancellation round has not found any requests, but there are some as described.
2) __io_uring_cancel() goes to sleep.
3) Then workers wake up and try to execute those hidden requests, which happen to be unbound.

As we already cancel all requests of io-wq there, set IO_WQ_BIT_EXIT in advance, thereby preventing 3) from executing unbound requests. The workers will initially break looping because of getting a signal, as they are threads of the dying/exec()'ing user task.

Cc: stable@vger.kernel.org
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/abfcf8c54cb9e8f7bfbad7e9a0cc5433cc70bdc2.1621781238.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit: 17a91051fe63b40ec651b80097c9fff5b093fdc5
author: Pavel Begunkov <asml.silence@gmail.com> Sun May 23 15:48:39 2021 +0100
committer: Jens Axboe <axboe@kernel.dk> Tue May 25 19:39:58 2021 -0600
tree: 2ff23057d8bb5cbbed0b123eb9a8b3dd803c1586
parent: ba5ef6dc8a827a904794210a227cdb94828e8ae7
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 5f82954..6af8ca0 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c

@@ -9078,6 +9078,9 @@ static void io_uring_cancel_sqpoll(struct io_sq_data *sqd)
 
 	if (!current->io_uring)
 		return;
+	if (tctx->io_wq)
+		io_wq_exit_start(tctx->io_wq);
+
 	WARN_ON_ONCE(!sqd || sqd->thread != current);
 
 	atomic_inc(&tctx->in_idle);
@@ -9112,6 +9115,9 @@ void __io_uring_cancel(struct files_struct *files)
 	DEFINE_WAIT(wait);
 	s64 inflight;
 
+	if (tctx->io_wq)
+		io_wq_exit_start(tctx->io_wq);
+
 	/* make sure overflow events are dropped */
 	atomic_inc(&tctx->in_idle);
 	do {
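
The effect of setting the exit bit early can be illustrated with a small userspace model. This is only a sketch and not the kernel implementation: the names `wq_exit_start`, `worker_handle`, `struct wq`, `struct work` and `WQ_BIT_EXIT` are hypothetical stand-ins, and the assumption that a worker checks the exit bit right before running a privately held work item is a simplification of what io_worker_handle_work() does.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model; names loosely mirror the commit but are not kernel APIs. */
enum { WQ_BIT_EXIT = 1u << 0 };

struct work {
	bool unbound;   /* may run for an unbounded amount of time */
	bool executed;  /* set when the worker actually ran the work */
	bool cancelled; /* set when the worker cancelled it instead */
};

struct wq {
	atomic_uint state;
};

/* Model of the early "exit start": only marks the queue as exiting. */
void wq_exit_start(struct wq *wq)
{
	atomic_fetch_or(&wq->state, WQ_BIT_EXIT);
}

/*
 * Model of a worker handling a work item it already holds privately
 * (comparable to a hashed or linked "next" request that a cancellation
 * round cannot find). Assumption: the exit bit is checked just before
 * execution, and unbound work is cancelled rather than run.
 */
void worker_handle(struct wq *wq, struct work *w)
{
	bool exiting = atomic_load(&wq->state) & WQ_BIT_EXIT;

	if (exiting && w->unbound) {
		w->cancelled = true;
		return;
	}
	w->executed = true;
}
```

In this model, a hidden unbound item handled after `wq_exit_start()` is cancelled instead of executed, which corresponds to preventing step 3) of the scenario in the commit message.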