[lvc-project] [PATCH] net/9p: fix infinite loop in p9_client_rpc on fatal signal

Dominique Martinet asmadeus at codewreck.org
Fri Apr 17 01:52:52 MSK 2026


Vasiliy Kovalev wrote on Thu, Apr 16, 2026 at 03:49:59PM +0300:
> > It's unfortunately not that simple, just leaving here will leak things.
> 
> Following up on your concern about leaking things (as mentioned in the
> thread with Shigeru Yoshida's patch) - I did some tracing with `printk`
> inside `p9_req_put` and `p9_tag_remove` to see exactly what happens to
> the request lifecycle with my proposed patch vs Shigeru's approach:
> 
> diff --git a/net/9p/client.c b/net/9p/client.c
> index 748b92d3f0c1..4e8f4e0195b4 100644
> --- a/net/9p/client.c
> +++ b/net/9p/client.c
> @@ -274,10 +274,13 @@ static void p9_tag_remove(struct p9_client *c, struct
> p9_req_t *r)
>         spin_lock_irqsave(&c->lock, flags);
>         idr_remove(&c->reqs, tag);
>         spin_unlock_irqrestore(&c->lock, flags);
> +       printk("tag %d removed from IDR\n", tag);
>  }
> 
>  int p9_req_put(struct p9_client *c, struct p9_req_t *r)
>  {
> +       printk("p9_req_put req %p tag %d refcount before=%d\n",
> +         r, r->tc.tag, refcount_read(&r->refcount));
>         if (refcount_dec_and_test(&r->refcount)) {
>                 p9_tag_remove(c, r);
> 
> There is a significant difference.
> 
> If we use Shigeru's approach [1] (preventing the flush from being
> sent/processed using `!fatal_signal_pending(current)` in the condition),
> the original request (tag 0) is never freed from the IDR. It creates a
> definitive memory and tag leak:
> 
> [root at localhost]# ./repro
> executing program
> [   50.613200] p9_req_put req ff11000110157cb0 tag 65535 refcount before=3
> [   50.663513] p9_req_put req ff11000110157cb0 tag 65535 refcount before=2
> [   50.665156] p9_req_put req ff11000110157cb0 tag 65535 refcount before=1
> [   50.666593] tag 65535 removed from IDR
> executing program
> [   50.688232] p9_req_put req ff11000110157ba0 tag 65535 refcount before=3
> [   50.739634] p9_req_put req ff11000110157ba0 tag 65535 refcount before=2
> [   50.741170] p9_req_put req ff11000110157ba0 tag 65535 refcount before=1
> [   50.742587] tag 65535 removed from IDR
> executing program
> [   50.775708] p9_req_put req ff11000110157a90 tag 65535 refcount before=3
> [   50.825104] p9_req_put req ff11000110157a90 tag 65535 refcount before=2
> [   50.827719] p9_req_put req ff11000110157a90 tag 65535 refcount before=1
> [   50.830432] tag 65535 removed from IDR
> executing program
> ...
> 
> [1]
> https://lore.kernel.org/all/20260322111848.2837057-1-syoshida@redhat.com/
> 
> However, with my patch (jumping to `recalc_sigpending` *inside* the
> `P9_TFLUSH` wait loop), the logic works cleanly due to the `refcount`
> mechanism:
> 
> [root at localhost]# ./repro
> executing program
> [   48.875093] p9_req_put req ff110001056fdcb0 tag 65535 refcount before=3
> [   48.925981] p9_req_put req ff110001056fdba0 tag 0 refcount before=3
> [   53.873969] p9_req_put req ff110001056fdba0 tag 0 refcount before=2
> [   53.876638] p9_req_put req ff110001056fdcb0 tag 65535 refcount before=2
> [   53.879882] p9_req_put req ff110001056fdba0 tag 0 refcount before=1
> [   53.882505] tag 0 removed from IDR
> [   53.884184] p9_req_put req ff110001056fdcb0 tag 65535 refcount before=1
> [   53.886905] tag 65535 removed from IDR
> executing program
> [   53.914764] p9_req_put req ff1100010ca97a90 tag 65535 refcount before=3
> [   53.967191] p9_req_put req ff1100010ca97980 tag 0 refcount before=3
> [   58.909626] p9_req_put req ff1100010ca97980 tag 0 refcount before=2
> [   58.912420] p9_req_put req ff1100010ca97a90 tag 65535 refcount before=2
> [   58.915358] p9_req_put req ff1100010ca97980 tag 0 refcount before=1
> [   58.918478] tag 0 removed from IDR
> [   58.920730] p9_req_put req ff1100010ca97a90 tag 65535 refcount before=1
> [   58.923715] tag 65535 removed from IDR
> executing program
> [   58.946905] p9_req_put req ff1100010c5a3870 tag 65535 refcount before=3
> [   58.998260] p9_req_put req ff1100010c5a3760 tag 0 refcount before=3
> [   63.945960] p9_req_put req ff1100010c5a3760 tag 0 refcount before=2
> [   63.948549] p9_req_put req ff1100010c5a3870 tag 65535 refcount before=2
> [   63.951153] p9_req_put req ff1100010c5a3760 tag 0 refcount before=1
> [   63.953651] tag 0 removed from IDR
> [   63.957254] p9_req_put req ff1100010c5a3870 tag 65535 refcount before=1
> [   63.960655] tag 65535 removed from IDR
> executing program
> ...
> 
> 1. The dying thread breaks out of the loop and calls `p9_req_put()`.
> The refcount correctly drops from 2 to 1.
> 
> 2. The tag is NOT removed from the IDR yet, preventing any Use-After-Free.
> The request is safely "orphaned".
> 
> 3. Shortly after, when the parent process closes the pipe/socket (or the
> transport is torn down due to process exit), the transport layer cleans up
> the pending requests. It calls the final `p9_req_put()`, refcount hits 0,
> and the tag is properly removed from the IDR.

Thanks for taking the time to verify

So if I understand this right, with a sane server, the tag will be used
until umount or a reply (to either the flush or the original request)?

That sounds sane enough (and definitely better then Shigeru's patch)

> Since this safely resolves the permanent D-state hang during
> `coredump_wait()` without causing the UAF or leaks that plagued earlier
> kernels (pre-`728356dedeff` ("9p: Add refcount to p9_req_t")), do you
> think this pragmatic fix is acceptable for mainline and stable?

I'll need to find time (or someone) to perform some heavy tests with
this -- basically my problem is that these hangs are easy to test with
syzcaller but these are not ever supposed to happen with a real
(correct) server, so while I agree not having stuck threads would be
helpful to improve test efficiency I don't see it as a "real" problem
(as in, 9p mount is a privileged operation, so normal users aren't
supposed to mess around with a pipe/fd mount like syzcaller does)

Basically I'd want to "just" mess around with a real system on a 9p
mount and play with ^C/sigkills and see what happens, but I don't have
time for that short term, so this will likely be a few months

>  While the ideal long-term goal is the asynchronous implementation (as
> seen in your 9p-async-v2 branch [2]), this patch serves as a reliable
> intermediate solution for a critical regression.
> [2] https://github.com/martinetd/linux/commits/9p-async-v2

iirc one of the problem with the async branch is that the process would
quit immediately on, say, ^C, before the IO has completed, but it's
possible for the server to process the IO (and not the flush) afterwards
and you'd get something that's not supposed to happen e.g.

p1             p2

write(1)
^C/sigkill
flush sent but process exit without waiting for server ack
               1 not written yet
               write(2) in same spot
               write(2) done
write(1) completes
data isn't 2 as expected after p2 completed


So it's quite possible async isn't the way to go, but that there is no
good solution for this
(given this is true even without async on sigkill: if we have something
that works safely, there's no reason to wait only for non-fatal signals...)

> > (By the way, what's with the rekindled interest on this?...)
> > 
> 
> The issue was identified by Syzkaller, and I have a stable C reproducer
> that consistently triggers the hang. There is also a similar report on
> Syzbot with a nearly identical reproducer:
> https://syzkaller.appspot.com/text?tag=ReproC&x=156aa534580000

Yes, it's been a known problem since syzcaller started testing
9p... many years ago.
Back then requests didn't have a refcount, so it's quite likely nobody
looked hard enough after this became possible.

Thanks,
-- 
Dominique Martinet | Asmadeus



More information about the lvc-project mailing list