[lvc-project] [PATCH] [RFC] dma-buf: fix race condition between poll and close

Christian König christian.koenig at amd.com
Fri May 3 11:18:39 MSK 2024


Am 03.05.24 um 09:07 schrieb Dmitry Antipov:
> On 4/24/24 2:28 PM, Christian König wrote:
>
>> I don't fully understand how that happens either, it could be that 
>> there is some bug in the EPOLL_FD code. Maybe it's a race when the 
>> EPOLL file descriptor is closed or something like that.
>
> IIUC the race condition looks like the following:
>
> Thread 0                        Thread 1
> -> do_epoll_ctl()
>    f_count++, now 2
>    ...
>    ...                          -> vfs_poll(), f_count == 2
>    ...                          ...
> <- do_epoll_ctl()               ...
>    f_count--, now 1             ...
> -> filp_close(), f_count == 1   ...
>    ...                            -> dma_buf_poll(), f_count == 1
>    -> fput()                      ... [*** race window ***]
>       f_count--, now 0              -> maybe get_file(), now ???
>       -> __fput() (delayed)
>
> E.g. dma_buf_poll() may be entered in thread 1 with f->count == 1
> and call to get_file() shortly later (and may even skip this if
> there is nothing to EPOLLIN or EPOLLOUT). During this time window,
> thread 0 may call fput() (on behalf of close() in this example)
> and (since it sees f->count == 1) file is scheduled to delayed_fput().

Wow, this is indeed looks like a bug in the epoll implementation.

Basically Thread 1 needs to make sure that the file reference can't 
vanish. Otherwise it is illegal to call vfs_poll().

I only skimmed over the epoll implementation and never looked at the 
code before, but of hand it looks like the implementation uses a mutex 
in the eventpoll structure which makes sure that the epitem structures 
don't go away during the vfs_poll() call.

But when I look closer at the do_epoll_ctl() function I can see that the 
file reference acquired isn't handed over to the epitem structure, but 
rather dropped when returning from the function.

That seems to be a buggy behavior because it means that vfs_poll() can 
be called with a stale file pointer. That in turn can lead to all kind 
of use after free bugs.

Attached is a compile only tested patch, please verify if it fixes your 
problem.

Regards,
Christian.





>
> Dmitry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-epoll-fix-file-reference-counting.patch
Type: text/x-patch
Size: 2019 bytes
Desc: not available
URL: <http://linuxtesting.org/pipermail/lvc-project/attachments/20240503/b9720a2f/attachment.bin>


More information about the lvc-project mailing list