[lvc-project] [PATCH] [RFC] net: smc: fix fasync leak in smc_release()

Wen Gu guwen at linux.alibaba.com
Wed Feb 21 16:09:01 MSK 2024



On 2024/2/21 13:16, Dmitry Antipov wrote:
> I've tracked https://syzkaller.appspot.com/bug?extid=5f1acda7e06a2298fae6
> down to the problem which may be illustrated by the following pseudocode:
> 
> int sock;
> 
> /* thread 1 */
> 
> while (1) {
>         struct msghdr msg = { ... };
>         sock = socket(AF_SMC, SOCK_STREAM, 0);
>         sendmsg(sock, &msg, MSG_FASTOPEN);
>         close(sock);
> }
> 
> /* thread 2 */
> 
> while (1) {
>         int on = 1;
>         ioctl(sock, FIOASYNC, &on);
>         on = 0;
>         ioctl(sock, FIOASYNC, &on);
> }
> 
> That is, something in thread 1 may cause 'smc_switch_to_fallback()' to
> swap the kernel sockets (of 'struct smc_sock') behind 'sock' between the
> 'ioctl()' calls in thread 2, so this becomes an attempt to add a fasync
> entry to one socket but remove it from another one. When 'sock' is closing,
> '__fput()' calls 'f_op->fasync()' _before_ 'f_op->release()', and it's too
> late to revert the trick performed by 'smc_switch_to_fallback()' in
> 'smc_release()' and below. Finally we end up with a leaked
> 'struct fasync_struct' object linked to the base socket, and this object is
> noticed by '__sock_release()' ("fasync list not empty"). Of course, using
> 'fasync_remove_entry()' in such a way is extremely ugly, but what else can
> we do without touching generic socket code, '__fput()', etc.? Comments are
> highly appreciated.
> 
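
To make the interleaving described above concrete, here is a toy user-space
model in C. It is only a sketch under assumptions: the types and helpers
(toy_sock, toy_file, toy_fasync_add() and so on) are hypothetical stand-ins
rather than kernel code, the fasync list is reduced to a counter, and the
ordering shown (thread 2 resolving the socket before the swap and adding its
entry after it) is one guess at an ordering that fits the description, not
something confirmed in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the kernel structures. */
struct toy_sock { int fasync_entries; };   /* models the length of wq.fasync_list */
struct toy_file { struct toy_sock *private_data; };

/* Models adding a fasync entry to an already resolved socket. */
static void toy_fasync_add(struct toy_sock *s)
{
    s->fasync_entries++;
}

/* Models removing this file's fasync entry, if the socket has one. */
static void toy_fasync_remove(struct toy_sock *s)
{
    if (s->fasync_entries > 0)
        s->fasync_entries--;
}

/* Models the swap done on fallback: retarget the file, move the list. */
static void toy_switch_to_fallback(struct toy_file *f,
                                   struct toy_sock *smc,
                                   struct toy_sock *clc)
{
    f->private_data = clc;
    clc->fasync_entries = smc->fasync_entries;
    smc->fasync_entries = 0;
}

/* Returns the number of entries left on the base socket at release time. */
int toy_racy_interleaving(void)
{
    struct toy_sock smc = { 0 }, clc = { 0 };
    struct toy_file file = { &smc };

    /* thread 2, ioctl(on = 1): the socket is looked up before the swap */
    struct toy_sock *seen = file.private_data;
    assert(seen == &smc);

    /* thread 1: fallback swaps the sockets behind the file */
    toy_switch_to_fallback(&file, &smc, &clc);

    /* thread 2: the entry lands on the socket looked up earlier */
    toy_fasync_add(seen);

    /* thread 2, ioctl(on = 0): resolves to clcsock, nothing to remove */
    toy_fasync_remove(file.private_data);

    /* close: the fasync(-1) call from __fput() also resolves to clcsock */
    toy_fasync_remove(file.private_data);

    /* release path: private_data is pointed back at the base socket */
    file.private_data = &smc;

    return file.private_data->fasync_entries;
}
```

In this model toy_racy_interleaving() returns 1: the entry stays on the base
socket because every removal is directed at the other socket.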

Hi Dmitry. Just to confirm that I understand correctly:

1. on = 1; ioctl(sock, FIOASYNC, &on): a fasync entry is added to
    smc->sk.sk_socket->wq.fasync_list;

2. Then fallback happens and swaps the socket:
    smc->clcsock->file = smc->sk.sk_socket->file;
    smc->clcsock->file->private_data = smc->clcsock;
    smc->clcsock->wq.fasync_list = smc->sk.sk_socket->wq.fasync_list;
    smc->sk.sk_socket->wq.fasync_list = NULL;

3. on = 0; ioctl(sock, FIOASYNC, &on): the fasync entry is removed
    from smc->clcsock->wq.fasync_list.
    (Is there a race between steps 2 and 3?)

4. Then close the file, __fput() calls file->f_op->fasync(-1, file, 0),
    then sock_fasync() calls fasync_helper(fd, filp, on, &wq->fasync_list)
    and fasync_remove_entry() removes entries in smc->clcsock->wq.fasync_list.
    Now smc->clcsock->wq.fasync_list is empty.

5. __fput() calls file->f_op->release(inode, file), then sock_close() calls
    __sock_release(), then ops->release calls smc_release(), and __smc_release()
    calls smc_restore_fallback_changes() to restore the socket:
    if (smc->clcsock->file) { /* non-accepted sockets have no file yet */
         smc->clcsock->file->private_data = smc->sk.sk_socket;
         smc->clcsock->file = NULL;
         smc_fback_restore_callbacks(smc);
    }

6. Back in __sock_release(), sock->wq.fasync_list (that is,
    smc->sk.sk_socket->wq.fasync_list) is checked, and it is empty.

So in which step is the fasync_struct entry in smc->sk.sk_socket->wq.fasync_list leaked?
It looks like I missed something; could you please point it out to me?
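
The six steps above can be modeled in user space roughly as follows. This is
only a sketch: toy_socket, toy_file, toy_fasync and the helper functions are
hypothetical names, not kernel code, and the steps are assumed to run strictly
one after another with no concurrent ioctl() in between.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the kernel structures. */
struct toy_fasync { struct toy_fasync *next; };
struct toy_socket { struct toy_fasync *fasync_list; };
struct toy_file   { struct toy_socket *private_data; };

/* Models adding one fasync entry to the socket's list. */
static void toy_add(struct toy_socket *s)
{
    struct toy_fasync *fa = calloc(1, sizeof(*fa));

    if (!fa)
        return;
    fa->next = s->fasync_list;
    s->fasync_list = fa;
}

/* Models removing one fasync entry; a no-op on an empty list. */
static void toy_remove(struct toy_socket *s)
{
    struct toy_fasync *fa = s->fasync_list;

    if (!fa)
        return;
    s->fasync_list = fa->next;
    free(fa);
}

/* Returns 1 if the base socket's list is non-empty at the final check. */
int toy_sequential_walkthrough(void)
{
    struct toy_socket sk_socket = { NULL }, clcsock = { NULL };
    struct toy_file file = { &sk_socket };

    /* step 1: on = 1, entry added via the file's current socket */
    toy_add(file.private_data);

    /* step 2: fallback retargets the file and moves the list */
    file.private_data = &clcsock;
    clcsock.fasync_list = sk_socket.fasync_list;
    sk_socket.fasync_list = NULL;
    assert(file.private_data == &clcsock);

    /* step 3: on = 0, entry removed from the clcsock list */
    toy_remove(file.private_data);

    /* step 4: fasync(-1) on close finds nothing left to remove */
    toy_remove(file.private_data);

    /* step 5: release path points the file back at the base socket */
    file.private_data = &sk_socket;

    /* step 6: the "fasync list not empty" check on the base socket */
    return sk_socket.fasync_list != NULL;
}
```

With this strictly sequential ordering toy_sequential_walkthrough() returns 0,
which matches the reading above that no entry is left on the base socket.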

Thanks!

> Signed-off-by: Dmitry Antipov <dmantipov at yandex.ru>
> ---
>   net/smc/af_smc.c | 8 ++++++--
>   1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> index 0f53a5c6fd9d..68cde9db5d2f 100644
> --- a/net/smc/af_smc.c
> +++ b/net/smc/af_smc.c
> @@ -337,9 +337,13 @@ static int smc_release(struct socket *sock)
>   	else
>   		lock_sock(sk);
>   
> -	if (old_state == SMC_INIT && sk->sk_state == SMC_ACTIVE &&
> -	    !smc->use_fallback)
> +	if (smc->use_fallback) {
> +		/* FIXME: ugly and should be done in some other way */
> +		if (sock->wq.fasync_list)
> +			fasync_remove_entry(sock->file, &sock->wq.fasync_list);
> +	} else if (old_state == SMC_INIT && sk->sk_state == SMC_ACTIVE) {
>   		smc_close_active_abort(smc);
> +	}
>   
>   	rc = __smc_release(smc);
>   


