[lvc-project] [PATCH] [RFC] net: smc: fix fasync leak in smc_release()

Wen Gu guwen at linux.alibaba.com
Fri Feb 23 06:36:14 MSK 2024



On 2024/2/21 23:02, Antipov, Dmitriy wrote:
> On Wed, 2024-02-21 at 21:09 +0800, Wen Gu wrote:
> 
>> 1. on = 1; ioctl(sock, FIOASYNC, &on), a fasync entry is added to
>>      smc->sk.sk_socket->wq.fasync_list;
>>
>> 2. Then fallback happend, and swapped the socket:
>>      smc->clcsock->file = smc->sk.sk_socket->file;
>>      smc->clcsock->file->private_data = smc->clcsock;
>>      smc->clcsock->wq.fasync_list = smc->sk.sk_socket->wq.fasync_list;
>>      smc->sk.sk_socket->wq.fasync_list = NULL;
>>
>> 3. on = 0; ioctl(sock, FIOASYNC, &on), the fasync entry is removed
>>      from smc->clcsock->wq.fasync_list,
>> (Is there a race between 2 and 3 ?)
> 
> 1) IIUC yes. The following sequence from smc_switch_to_fallback():
> 
> smc->clcsock->file = smc->sk.sk_socket->file;
> smc->clcsock->file->private_data = smc->clcsock;
> 
> is executed with locked smc->sk.sk_socket but unlocked smc->clcsock.
> This way,
> 
> struct socket *sock = filp->private_data;
> 
> in sock_fasync() introduces an undefined behavior (because an
> actual value of 'private_data' is unpredictable). So there are
> two possible scenarios. When
> 
> on = 1; ioctl(sock, FIOASYNC, &on);
> on = 0; ioctl(sock, FIOASYNC, &on);
> 
> actually modifies fasync list of the same socket, this works as
> expected. If kernel sockets behind 'sock' gets swapped between
> calls to ioctl(), fasync list of the first socket has an entry
> which can't be removed by the second ioctl().
> 

Thank you for the explanation, Dmitriy.

So the race could be:

sock_fasync                         | smc_switch_to_fallback
------------------------------------------------------------------
/* smc->sk.sk_socket->wq.fasync_list|
is empty at the beginning */        |
                                     |
/* attempt to add fasync_struct */  |
sock = filp->private_data;          |
(smc->sk.sk_socket)                 | /* fallback happens */
                                     | lock_sock(sk);
                                     | file->private_data = smc->clcsock
                                     | smc->clcsock->wq.fasync_list = smc->sk.sk_socket->wq.fasync_list
                                     | smc->sk.sk_socket->wq.fasync_list = NULL
                                     | release_sock(sk);
lock_sock(sk);                      |
fasync_helper(on=1)                 |
(successed to add the entry in      |
smc->sk.sk_socket->wq.fasync_list)  |
release_sock(sk);                   |
                                     |
/* the fasync_struct entry can't be |
removed later, since                |
file->private_data has been changed |
to clcsock */                       |

Now clcsock->wq.fasync_list is empty and the fasync_struct entry is
always left in smc->sk.sk_socket->wq.fasync_list.

If we only remove fasync_struct entries in smc->sk.sk_socket->wq.fasync_list
during smc_release, it indeed solves the leak, but it fails to address
the problem of fasync_struct entries being incorrectly inserted into the
old socket (they should have been placed in smc->clcsock->wq.fasync_list
if fallback happens).

One solution to this issue I can think of is to check whether
filp->private_data has been changed when the sock_fasync holds the sock lock,
but it inevitably changes the general code..

diff --git a/net/socket.c b/net/socket.c
index ed3df2f749bf..a28435195854 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1443,6 +1443,11 @@ static int sock_fasync(int fd, struct file *filp, int on)
                 return -EINVAL;

         lock_sock(sk);
+       /* filp->private_data has changed */
+       if (on && unlikely(sock != filp->private_data)) {
+               release_sock(sk);
+               return -EAGAIN;
+       }
         fasync_helper(fd, filp, on, &wq->fasync_list);

         if (!wq->fasync_list)

Let's see if anyone else has a better idea.

Best regards,
Wen Gu

> 
>> 4. Then close the file, __fput() calls file->f_op->fasync(-1, file, 0),
>>      then sock_fasync() calls fasync_helper(fd, filp, on, &wq->fasync_list)
>>      and fasync_remove_entry() removes entries in smc->clcsock->wq.fasync_list.
>>      Now smc->clcsock->wq.fasync_list is empty.
> 
> 2) No. In the second (bad) scenario from the above, an attempt to remove
> fasync entry from smc->clcsock->wq.fasync_list always fails because
> fasync entry was actually linked to smc->sk.sk_socket->wq.fasync_list.
> 
> Note sock_fasync() doesn't check the value returned from fasync_helper().
> How dumb.
> 
>> 5. __fput() calls file->f_op->release(inode, file), then sock_close calls
>>      __sock_release, then ops->release calls smc_release(), and __smc_release()
>>      calls smc_restore_fallback_changes() to restore socket:
>>      if (smc->clcsock->file) { /* non-accepted sockets have no file yet */
>>           smc->clcsock->file->private_data = smc->sk.sk_socket;
>>           smc->clcsock->file = NULL;
>>           smc_fback_restore_callbacks(smc);
>>      }
> 
> 3) Yes. And it's too late because __fput() calls file->f_op->fasync(-1, ...,
> 0) _before_ file->f_op->release(). So even if you have sockets swapped back,
> no one will take care about freeing fasync list.
> 
> 
>> 6. Then back to __sock_release, check if sock->wq.fasync_list (that is
>>      smc->sk.sk_socket->wq.fasync_list) is empty and it is empty.
> 
> 4) No. It's not empty because an attempt to remove fasync entry has failed
> at [2] just because it was made against the wrong socket.
> 
> 
> For your convenience, there is a human-readable reproducer loosely modeled
> around the one generated by syzkaller. You can try it on any system running
> recently enough kernel with CONFIG_SMC enabled (root is not required), and
> receiving a few (or may be a lot of) "__sock_release: fasync list not empty"
> messages clearly indicates an issue. NOTE: this shouldn't crash the system
> and/or make it unusable, but remember that each message comes with a small
> kernel memory leak.
> 
> Dmitry
> 
> #include <signal.h>
> #include <unistd.h>
> #include <pthread.h>
> #include <sys/ioctl.h>
> #include <sys/socket.h>
> 
> int sock;
> 
> void *
> loop0 (void *arg)
> {
>    struct msghdr msg = { 0 };
> 
>    while (1)
>      {
>        sock = socket (AF_SMC, SOCK_STREAM, 0);
>        sendmsg (sock, &msg, MSG_FASTOPEN);
>        close (sock);
>      }
>    return NULL;
> }
> 
> void *
> loop1 (void *arg)
> {
>    int on;
> 
>    while (1)
>      {
>        on = 1;
>        ioctl (sock, FIOASYNC, &on);
>        on = 0;
>        ioctl (sock, FIOASYNC, &on);
>      }
> 
>    return NULL;
> }
> 
> int
> main (int argc, char *argv[])
> {
>    pthread_t a, b;
>    struct sigaction sa = { 0 };
> 
>    sa.sa_handler = SIG_IGN;
>    sigaction (SIGIO, &sa, NULL);
> 
>    pthread_create (&a, NULL, loop0, NULL);
>    pthread_create (&b, NULL, loop1, NULL);
> 
>    pause ();
> 
>    pthread_join (a, NULL);
>    pthread_join (b, NULL);
> 
>    return 0;
> }
> 
> 



More information about the lvc-project mailing list