[lvc-project] [PATCH 5.10/5.15/6.1] nfsd: cancel nfsd_shrinker_work using sync mode in nfs4_state_shutdown_net

Chuck Lever chuck.lever at oracle.com
Sun Dec 29 18:45:45 MSK 2024


On 12/29/24 9:45 AM, Vasiliy Kovalev wrote:
> From: Yang Erkun <yangerkun at huaweicloud.com>
> 
> [ Upstream commit d5ff2fb2e7167e9483846e34148e60c0c016a1f6 ]
> 
> In the normal case, when we excute `echo 0 > /proc/fs/nfsd/threads`, the
> function `nfs4_state_destroy_net` in `nfs4_state_shutdown_net` will
> release all resources related to the hashed `nfs4_client`. If the
> `nfsd_client_shrinker` is running concurrently, the `expire_client`
> function will first unhash this client and then destroy it. This can
> lead to the following warning. Additionally, numerous use-after-free
> errors may occur as well.
> 
> nfsd_client_shrinker         echo 0 > /proc/fs/nfsd/threads
> 
> expire_client                nfsd_shutdown_net
>    unhash_client                ...
>                                 nfs4_state_shutdown_net
>                                   /* won't wait shrinker exit */
>    /*                             cancel_work(&nn->nfsd_shrinker_work)
>     * nfsd_file for this          /* won't destroy unhashed client1 */
>     * client1 still alive         nfs4_state_destroy_net
>     */
> 
>                                 nfsd_file_cache_shutdown
>                                   /* trigger warning */
>                                   kmem_cache_destroy(nfsd_file_slab)
>                                   kmem_cache_destroy(nfsd_file_mark_slab)
>    /* release nfsd_file and mark */
>    __destroy_client
> 
> ====================================================================
> BUG nfsd_file (Not tainted): Objects remaining in nfsd_file on
> __kmem_cache_shutdown()
> --------------------------------------------------------------------
> CPU: 4 UID: 0 PID: 764 Comm: sh Not tainted 6.12.0-rc3+ #1
> 
>   dump_stack_lvl+0x53/0x70
>   slab_err+0xb0/0xf0
>   __kmem_cache_shutdown+0x15c/0x310
>   kmem_cache_destroy+0x66/0x160
>   nfsd_file_cache_shutdown+0xac/0x210 [nfsd]
>   nfsd_destroy_serv+0x251/0x2a0 [nfsd]
>   nfsd_svc+0x125/0x1e0 [nfsd]
>   write_threads+0x16a/0x2a0 [nfsd]
>   nfsctl_transaction_write+0x74/0xa0 [nfsd]
>   vfs_write+0x1a5/0x6d0
>   ksys_write+0xc1/0x160
>   do_syscall_64+0x5f/0x170
>   entry_SYSCALL_64_after_hwframe+0x76/0x7e
> 
> ====================================================================
> BUG nfsd_file_mark (Tainted: G    B   W         ): Objects remaining
> nfsd_file_mark on __kmem_cache_shutdown()
> --------------------------------------------------------------------
> 
>   dump_stack_lvl+0x53/0x70
>   slab_err+0xb0/0xf0
>   __kmem_cache_shutdown+0x15c/0x310
>   kmem_cache_destroy+0x66/0x160
>   nfsd_file_cache_shutdown+0xc8/0x210 [nfsd]
>   nfsd_destroy_serv+0x251/0x2a0 [nfsd]
>   nfsd_svc+0x125/0x1e0 [nfsd]
>   write_threads+0x16a/0x2a0 [nfsd]
>   nfsctl_transaction_write+0x74/0xa0 [nfsd]
>   vfs_write+0x1a5/0x6d0
>   ksys_write+0xc1/0x160
>   do_syscall_64+0x5f/0x170
>   entry_SYSCALL_64_after_hwframe+0x76/0x7e
> 
> To resolve this issue, cancel `nfsd_shrinker_work` using synchronous
> mode in nfs4_state_shutdown_net.
> 
> Fixes: 7c24fa225081 ("NFSD: replace delayed_work with work_struct for nfsd_client_shrinker")
> Signed-off-by: Yang Erkun <yangerkun at huaweicloud.com>
> Reviewed-by: Jeff Layton <jlayton at kernel.org>
> Signed-off-by: Chuck Lever <chuck.lever at oracle.com>
> (cherry picked from commit f965dc0f099a54fca100acf6909abe52d0c85328)
> Signed-off-by: Vasiliy Kovalev <kovalev at altlinux.org>
> ---
> Backport to fix CVE-2024-50121
> Link: https://www.cve.org/CVERecord/?id=CVE-2024-50121
> ---
>   fs/nfsd/nfs4state.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 8bceae771c1c75..f6fa719ee32668 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -8208,7 +8208,7 @@ nfs4_state_shutdown_net(struct net *net)
>   	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
>   
>   	unregister_shrinker(&nn->nfsd_client_shrinker);
> -	cancel_work(&nn->nfsd_shrinker_work);
> +	cancel_work_sync(&nn->nfsd_shrinker_work);
>   	cancel_delayed_work_sync(&nn->laundromat_work);
>   	locks_end_grace(&nn->nfsd4_manager);
>   

Backport Acked-by: Chuck Lever <chuck.lever at oracle.com>

Not sure why automation didn't pick this one up.


-- 
Chuck Lever



More information about the lvc-project mailing list