[lvc-project] [PATCH 5.10/5.15/6.1] nfsd: cancel nfsd_shrinker_work using sync mode in nfs4_state_shutdown_net
Chuck Lever
chuck.lever at oracle.com
Sun Dec 29 18:45:45 MSK 2024
On 12/29/24 9:45 AM, Vasiliy Kovalev wrote:
> From: Yang Erkun <yangerkun at huaweicloud.com>
>
> [ Upstream commit d5ff2fb2e7167e9483846e34148e60c0c016a1f6 ]
>
> In the normal case, when we execute `echo 0 > /proc/fs/nfsd/threads`, the
> function `nfs4_state_destroy_net` in `nfs4_state_shutdown_net` will
> release all resources related to the hashed `nfs4_client`. If the
> `nfsd_client_shrinker` is running concurrently, the `expire_client`
> function will first unhash this client and then destroy it. The
> shutdown path does not destroy the unhashed client, which can lead to
> the following warning. Numerous use-after-free errors may occur as well.
>
> nfsd_client_shrinker         echo 0 > /proc/fs/nfsd/threads
>
> expire_client                nfsd_shutdown_net
>   unhash_client                ...
>                                nfs4_state_shutdown_net
>                                  /* won't wait shrinker exit */
> /*                               cancel_work(&nn->nfsd_shrinker_work)
>  * nfsd_file for this            /* won't destroy unhashed client1 */
>  * client1 still alive           nfs4_state_destroy_net
>  */
>
>                              nfsd_file_cache_shutdown
>                                /* trigger warning */
>                                kmem_cache_destroy(nfsd_file_slab)
>                                kmem_cache_destroy(nfsd_file_mark_slab)
> /* release nfsd_file and mark */
> __destroy_client
>
> ====================================================================
> BUG nfsd_file (Not tainted): Objects remaining in nfsd_file on
> __kmem_cache_shutdown()
> --------------------------------------------------------------------
> CPU: 4 UID: 0 PID: 764 Comm: sh Not tainted 6.12.0-rc3+ #1
>
> dump_stack_lvl+0x53/0x70
> slab_err+0xb0/0xf0
> __kmem_cache_shutdown+0x15c/0x310
> kmem_cache_destroy+0x66/0x160
> nfsd_file_cache_shutdown+0xac/0x210 [nfsd]
> nfsd_destroy_serv+0x251/0x2a0 [nfsd]
> nfsd_svc+0x125/0x1e0 [nfsd]
> write_threads+0x16a/0x2a0 [nfsd]
> nfsctl_transaction_write+0x74/0xa0 [nfsd]
> vfs_write+0x1a5/0x6d0
> ksys_write+0xc1/0x160
> do_syscall_64+0x5f/0x170
> entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
> ====================================================================
> BUG nfsd_file_mark (Tainted: G B W ): Objects remaining
> nfsd_file_mark on __kmem_cache_shutdown()
> --------------------------------------------------------------------
>
> dump_stack_lvl+0x53/0x70
> slab_err+0xb0/0xf0
> __kmem_cache_shutdown+0x15c/0x310
> kmem_cache_destroy+0x66/0x160
> nfsd_file_cache_shutdown+0xc8/0x210 [nfsd]
> nfsd_destroy_serv+0x251/0x2a0 [nfsd]
> nfsd_svc+0x125/0x1e0 [nfsd]
> write_threads+0x16a/0x2a0 [nfsd]
> nfsctl_transaction_write+0x74/0xa0 [nfsd]
> vfs_write+0x1a5/0x6d0
> ksys_write+0xc1/0x160
> do_syscall_64+0x5f/0x170
> entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
> To resolve this issue, cancel `nfsd_shrinker_work` synchronously in
> `nfs4_state_shutdown_net`, so that shutdown waits for a running
> shrinker to finish.
>
> Fixes: 7c24fa225081 ("NFSD: replace delayed_work with work_struct for nfsd_client_shrinker")
> Signed-off-by: Yang Erkun <yangerkun at huaweicloud.com>
> Reviewed-by: Jeff Layton <jlayton at kernel.org>
> Signed-off-by: Chuck Lever <chuck.lever at oracle.com>
> (cherry picked from commit f965dc0f099a54fca100acf6909abe52d0c85328)
> Signed-off-by: Vasiliy Kovalev <kovalev at altlinux.org>
> ---
> Backport to fix CVE-2024-50121
> Link: https://www.cve.org/CVERecord/?id=CVE-2024-50121
> ---
> fs/nfsd/nfs4state.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 8bceae771c1c75..f6fa719ee32668 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -8208,7 +8208,7 @@ nfs4_state_shutdown_net(struct net *net)
>  	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
>
>  	unregister_shrinker(&nn->nfsd_client_shrinker);
> -	cancel_work(&nn->nfsd_shrinker_work);
> +	cancel_work_sync(&nn->nfsd_shrinker_work);
>  	cancel_delayed_work_sync(&nn->laundromat_work);
>  	locks_end_grace(&nn->nfsd4_manager);
>
Backport Acked-by: Chuck Lever <chuck.lever at oracle.com>
Not sure why automation didn't pick this one up.
--
Chuck Lever
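
The ordering problem in the quoted commit message can be reproduced outside the kernel. The following is a minimal userspace sketch using POSIX threads, not the nfsd code. `demo_state`, `demo_worker`, and `demo_shutdown` are hypothetical names, and the `may_finish` semaphore is an assumption added only to pin the bad interleaving so the result is repeatable. Joining the worker before teardown plays the role of `cancel_work_sync()`; skipping the join plays the role of `cancel_work()`.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these names exist in the kernel. */
struct demo_state {
	int live_objects;	/* objects not yet freed (think nfsd_file) */
	pthread_mutex_t lock;	/* protects live_objects */
	sem_t holding;		/* posted once the worker owns the object */
	sem_t may_finish;	/* demo-only gate that pins the interleaving */
};

/* Plays the role of the shrinker work: owns one object and frees it late. */
static void *demo_worker(void *arg)
{
	struct demo_state *st = arg;

	sem_post(&st->holding);		/* object is "unhashed" and ours */
	sem_wait(&st->may_finish);	/* still in flight during shutdown */

	pthread_mutex_lock(&st->lock);
	st->live_objects--;		/* late free, like __destroy_client */
	pthread_mutex_unlock(&st->lock);
	return NULL;
}

/*
 * Returns how many objects are still alive at the point where the
 * shutdown path would destroy the cache. A nonzero value corresponds
 * to the "Objects remaining" warning.
 */
int demo_shutdown(bool wait_for_worker)
{
	struct demo_state st = { .live_objects = 1 };
	pthread_t worker;
	int remaining;
	int rc;

	pthread_mutex_init(&st.lock, NULL);
	sem_init(&st.holding, 0, 0);
	sem_init(&st.may_finish, 0, 0);
	rc = pthread_create(&worker, NULL, demo_worker, &st);
	assert(rc == 0);
	sem_wait(&st.holding);		/* worker is now mid-flight */

	if (wait_for_worker) {
		/* Analogous to cancel_work_sync(): wait for the worker. */
		sem_post(&st.may_finish);
		pthread_join(worker, NULL);
	}

	pthread_mutex_lock(&st.lock);
	remaining = st.live_objects;	/* "kmem_cache_destroy" would run here */
	pthread_mutex_unlock(&st.lock);

	if (!wait_for_worker) {
		/* Analogous to cancel_work(): nobody waited; reap afterwards. */
		sem_post(&st.may_finish);
		pthread_join(worker, NULL);
	}

	sem_destroy(&st.holding);
	sem_destroy(&st.may_finish);
	pthread_mutex_destroy(&st.lock);
	return remaining;
}
```

Calling `demo_shutdown(false)` reports one object still alive at teardown, while `demo_shutdown(true)` reports none. On Linux this builds with `cc -pthread`.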